
 

MGS 8040: Data Mining

Syllabus for Spring 2012

 

Instructor: Dr. Satish Nargundkar 
Office: 827 College of Business 

Office Hours:  By appointment 
Phone: (678) 644 6838  

E-Mail: snargundkar@gmail.com
Website: www.nargund.com/gsu

Computer# 12494, ALC-313

7:15 – 9:45 PM, Mondays.

 

Prerequisites: MBA 7025 or equivalent or permission of instructor. You must already have knowledge of basic statistics, including Regression Analysis, to succeed in this course.

 

Text:

The following texts are listed in order of importance for the course. They are optional.

 

1.      Making Sense of Data II by Glenn Myatt & Wayne Johnson, John Wiley & Sons, 2009.

2.      Multivariate Data Analysis by Hair, Anderson, Tatham, & Black, Prentice Hall.

3.      http://statsoft.com/textbook/stathome.html.

4.      The Little SAS Book by Delwiche and Slaughter.

 

Course Catalog Description

This course covers various analytical techniques to extract managerial information from large data warehouses. A number of well-defined data mining tasks such as classification, estimation, prediction, affinity grouping and clustering, and data visualization are discussed. Design and implementation issues for corporate data warehousing are also addressed.

 

Detailed Course Description

Data mining supports decision making by detecting patterns, devising rules, identifying new decision alternatives and making predictions. This course is organized around a number of well-defined data mining tasks: description, classification, estimation, prediction, and affinity grouping and clustering.  Students will learn to use techniques such as Rule Induction (classification trees), Logistic Regression, Discriminant Analysis, and Neural Networks. Data visualization techniques will be used whenever possible to reveal patterns and relationships.  Students will use commercially available software tools to mine large databases. Team-based projects will be conducted.

 

The course is organized into 3 broad areas as follows:

1)      Context: Decision Support for Strategic/Tactical Decision-making; Data/Information Organization; Data Warehouse Design.

2)      Exploratory Analysis: Segmentation Techniques

3)      Forecasting: Modeling Techniques, Transforming analysis into actions

 

Learning Outcomes/Course Objectives

Upon completion of the course, students will be able to:

 

Overview section:

1.      Develop a general framework for decision support within organizations.

 

 

 

 

Data Warehousing Section:

2.      Compare and contrast RDBMS and data warehouses; specifically, compare relational structure with star schema.

3.      Discuss the sources of data, problems with data, and how to overcome them (Data Cleaning).

4.      Clean a dataset for further analysis.

 

 Data Mining Section:

5.      Explain the data mining methodology.

6.      Use visual techniques to describe data.

7.      Explain the assumptions of K-Means Clustering.

8.      Segment data using Cluster Analysis, and interpret the output.

9.      Explain the assumptions of various techniques such as Cluster Analysis, Multiple Regression, Discriminant Analysis, Logistic Regression, and Artificial Neural Networks.

10.  Build multiple regression, discriminant analysis, and logistic regression models for forecasting.

11.  Validate models using the Kolmogorov-Smirnov (K-S) test.

12.  Compare and Contrast Neural Networks with Statistical techniques.

13.  Interpret Classification trees.

14.  Use interaction detection methods such as CART and CHAID for classification.

15.  Discuss issues of implementation of the results of various techniques.

16.  Develop methods to monitor the ongoing performance of implemented models.
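
As an illustration of outcome 11, here is a minimal sketch of the two-sample K-S statistic, which is one common way to check how well a scoring model separates two groups. The syllabus does not specify how the test is applied in class, so the function name ks_statistic and the sample scores are assumptions made only for this example.

```python
def ks_statistic(scores_pos, scores_neg):
    """Two-sample K-S statistic: the largest gap between the
    empirical cumulative distributions of the two score lists."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups need at least one score")

    # Evaluate both empirical CDFs at every observed score value.
    thresholds = sorted(set(scores_pos) | set(scores_neg))
    max_gap = 0.0
    for t in thresholds:
        cdf_pos = sum(s <= t for s in scores_pos) / len(scores_pos)
        cdf_neg = sum(s <= t for s in scores_neg) / len(scores_neg)
        max_gap = max(max_gap, abs(cdf_pos - cdf_neg))
    return max_gap


# Hypothetical model scores for two outcome groups (made-up values).
responders = [1, 2, 3, 4]
non_responders = [3, 4, 5, 6]
ks = ks_statistic(responders, non_responders)
```

A value near 0 means the model's scores barely distinguish the two groups; a value near 1 means the score distributions hardly overlap.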

 

Methods of Instruction:

The course will combine lectures and discussion, plus guest lectures from industry experts. The team-based project will be emphasized, and case studies will be discussed.

 

Grading:

           

                                   

 

 

Component       Weight      Course Average   Grade    Course Average   Grade

Assignments     20%         94-96, 97+       A, A+    77-79            C+
Tests (2)       40%         90-93            A-       73-76            C
Team Project    30%         87-89            B+       70-72            C-
Final Exam      10%         83-86            B        60-69            D
                            80-82            B-       Less than 60     F
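
The weights and grade bands above can be combined as in the following sketch. It assumes that each component is scored on a 0-100 scale, that the two tests are averaged into one score, and that each grade band begins at its listed lower bound (so an average of 89.5 would still be a B+). The syllabus states none of these details, and the names used here are hypothetical.

```python
# Component weights from the grading table.
WEIGHTS = {
    "assignments": 0.20,
    "tests": 0.40,         # assumed: average of the two tests
    "team_project": 0.30,
    "final_exam": 0.10,
}

# (minimum course average, letter grade), highest band first.
GRADE_SCALE = [
    (97, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (60, "D"),
]


def course_average(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(average):
    """Map a course average to a letter grade.

    Assumption: a band starts at its listed lower bound; the syllabus
    does not say how fractional averages are rounded.
    """
    for minimum, letter in GRADE_SCALE:
        if average >= minimum:
            return letter
    return "F"


# Made-up scores for illustration only.
example = {"assignments": 90, "tests": 85, "team_project": 92, "final_exam": 80}
average = course_average(example)   # 18 + 34 + 27.6 + 8, about 87.6
grade = letter_grade(average)       # falls in the 87-89 band
```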

 

All assignments should be posted to myrobinson.gsu.edu. Late work will get partial credit only.

 

Software: Students are encouraged (not required!) to do project work in SAS in order to develop a marketable skill. You may choose other software (SPSS is available at GSU) if you wish.

 

General Policies:

1.      Students are expected to attend each class (who knows, you may actually enjoy the class!), arrive on time and participate in class discussions.

2.       Turn off cell phones, pagers, stereos, TVs, etc. when in class. Treat the instructor and each other with courtesy.

 

Course Assessment:

Your constructive assessment of this course plays an indispensable role in shaping education at Georgia State. Upon completing the course, please take the time to fill out the online course evaluation.