MGS 8040: Data Mining
Syllabus for Spring 2012
Instructor: Dr. Satish Nargundkar
Office Hours: By appointment
E-Mail: snargundkar@gmail.com
Computer# 12494, ALC-313, 7:15 – 9:45 PM, Mondays.
Prerequisites: MBA 7025 or equivalent or permission of instructor. You must already
have knowledge of basic statistics, including Regression Analysis, to succeed
in this course.
Text:
The following texts are listed in order of importance for the course.
They are optional.
1. Making Sense of Data II by Glenn Myatt & Wayne Johnson, John Wiley & Sons, 2009.
2. Multivariate Data Analysis by Hair et al.
3. http://statsoft.com/textbook/stathome.html.
4. The Little SAS Book by Delwiche and Slaughter.
Course Catalog Description
This course covers various
analytical techniques to extract managerial information from large data
warehouses. A number of well-defined data mining tasks such as classification,
estimation, prediction, affinity grouping and clustering, and data
visualization are discussed. Design and implementation issues for corporate
data warehousing are also addressed.
Detailed Course Description
Data mining supports
decision making by detecting patterns, devising rules, identifying new decision
alternatives and making predictions. This course is organized around a number
of well-defined data mining tasks: description,
classification, estimation, prediction, and affinity grouping and clustering. Students will learn to use techniques such as
Rule Induction (classification trees), Logistic Regression, Discriminant
Analysis, and Neural Networks. Data visualization techniques will be used
whenever possible to reveal patterns and relationships. Students will use commercially available
software tools to mine large databases. Team-based projects will be conducted.
The course is organized into
3 broad areas as follows:
1) Context: Decision Support for Strategic/Tactical Decision-making. Data/Information Organization. Data Warehouse Design.
2) Exploratory Analysis: Segmentation Techniques
3) Forecasting: Modeling Techniques, Transforming analysis into actions
Learning Outcomes/Course Objectives
Upon completion of the
course, students will be able to:
Overview section:
1. Develop a general framework
for decision support within organizations.
Data Warehousing Section:
2. Compare and contrast RDBMS and data warehouses; specifically, compare relational structure with star schema.
3. Discuss the sources of data,
problems with data, and how to overcome them (Data Cleaning).
4. Clean a dataset for further
analysis.
Data Mining Section:
5. Explain the data mining
methodology.
6. Use visual techniques to
describe data.
7. Explain the assumptions of
K-Means Clustering.
8. Segment data using Cluster Analysis, and interpret the output.
9. Explain the assumptions of
various techniques such as Cluster Analysis, Multiple Regression, Discriminant Analysis, Logistic Regression, and Artificial
Neural Networks.
10. Build multiple regression, discriminant analysis, and logistic regression models for forecasting.
11. Validate models using the Kolmogorov-Smirnov
(K-S) test.
12. Compare and contrast Neural Networks with statistical techniques.
13. Interpret Classification
trees.
14. Use interaction detection methods such as CART and CHAID for classification.
15. Discuss issues of
implementation of the results of various techniques.
16. Develop methods to monitor
the ongoing performance of implemented models.
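As an illustration of objective 11, the sketch below computes a two-sample K-S statistic for a binary scoring model: the largest gap between the cumulative score distributions of the two outcome classes. The syllabus does not say how the K-S test is applied in the course, so this is only one common reading; the function name `ks_statistic` and the sample data are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of the two classes.

    scores: model scores, one per observation.
    labels: true outcomes, coded 0 or 1.
    """
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one observation in each class")

    best_gap = 0.0
    # Evaluate both empirical CDFs at every distinct observed score.
    for cutoff in sorted(set(scores)):
        cdf_pos = bisect_right(pos, cutoff) / len(pos)
        cdf_neg = bisect_right(neg, cutoff) / len(neg)
        best_gap = max(best_gap, abs(cdf_pos - cdf_neg))
    return best_gap


# Hypothetical holdout sample (illustrative values, not course data).
holdout_scores = [0.10, 0.40, 0.35, 0.80]
holdout_labels = [0, 0, 1, 1]
print(ks_statistic(holdout_scores, holdout_labels))
```

A value near 0 means the model's scores barely separate the two classes; a value near 1 means they separate almost perfectly.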
Methods of Instruction:
The course will combine
lectures and discussion, plus guest lectures from
industry experts. The team-based project will be emphasized, and case studies
will be discussed.
Grading:
Component | Weight
Assignments | 20%
Tests (2) | 40%
Team Project | 30%
Final Exam | 10%

Course Average | Grade
97+ | A+
94-96 | A
90-93 | A-
87-89 | B+
83-86 | B
80-82 | B-
77-79 | C+
73-76 | C
70-72 | C-
60-69 | D
Less than 60 | F
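The weights and cutoffs in the grading table can be combined as in the sketch below. It assumes each component is scored on a 0-100 scale and that the course average is rounded to the nearest whole number before the letter grade is looked up; the syllabus states neither, so treat both as assumptions. The names `course_average` and `letter_grade` and the sample scores are hypothetical.

```python
import math

# Component weights from the grading table, in percent.
WEIGHTS = {"assignments": 20, "tests": 40, "team_project": 30, "final_exam": 10}

# Lowest course average for each letter grade, from the grading table.
CUTOFFS = [
    (97, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]


def course_average(component_scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS) / 100


def letter_grade(average):
    """Map a course average to a letter grade.

    Assumption: averages are rounded half-up to a whole number first.
    """
    rounded = math.floor(average + 0.5)
    for minimum, grade in CUTOFFS:
        if rounded >= minimum:
            return grade
    return "F"


# Hypothetical student scores (illustrative only).
example = {"assignments": 90, "tests": 85, "team_project": 92, "final_exam": 88}
average = course_average(example)
print(average, letter_grade(average))
```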
All assignments should be
posted to myrobinson.gsu.edu. Late work will get partial credit only.
Software: Students are encouraged
(not required!) to do project work in SAS in order to develop a marketable
skill. You may choose other software (SPSS is available at GSU) if you wish.
General Policies:
1. Students are expected to
attend each class (who knows, you may actually enjoy the class!), arrive on
time and participate in class discussions.
2. Turn off cell phones, pagers, stereos, TVs, etc. when in class. Treat the instructor and each other with courtesy.
Course Assessment:
Your constructive assessment
of this course plays an indispensable role in shaping education at