
 

MGS 8040: Data Mining Summer 2019

 

Wednesdays 5:30 – 9:45 PM

Buckhead Center

(200 Tower Place) Room 404

CRN: 53355

            Date

Topic

Readings

Assignments Due (Post to iCollege)

Week 1: 6/12

CRISP-DM

Introduction – DM Overview

Regression Review 

Simple Regression

Shoe size data

Height - Multicollinearity

Data and Cows

Notes

Notes – Simple Regression

Notes – Multiple Regression

Exercise

Solution to Exercise

Does p-value matter?

Multicollinearity

Interaction Effect

1.   Regression Analysis

Regression Dataset

 

(due Sunday, 6/16 by 11:59 PM. Post to iCollege.)

Week 2: 6/19

Understanding Credit Data –

Equifax / Experian / TransUnion

 

The Initial Client Meeting

 

Introduction to SAS

SAS Training at UCLA

Notes – Initial Client Meeting

Sample Design Exercise

Solution to Exercise

Data Cleaning

Notes – Basic SAS Analysis
The Little SAS Book

By Delwiche & Slaughter

Data1 subset in Excel

2.  Application – Dep. Var, Outcome, Sample time frame

 

3. SAS assignment

Folder Instructions

 

(both due Sunday, 6/23 by 11:59 PM. Post to iCollege as separate assignments.)

Week 3: 6/26

Data Cleaning

Dummy Variable Definition

 

Class Handout

Solution to Handout

 

Intro to Logistic Regression,

Discriminant Analysis

Data Warehouse introduction

Books by Edward Tufte.

Gallery of Data Visualization

WHO visualization

 

SAS Programs for dummies, Regression and Scoring

 

www.statsoft.com

Logistic Regression

Logistic Reg SAS program

4. Crosstabs, Dummy decisions

(due Sunday 6/30 by 11:59 PM)

Week 4: 7/3

 

Test 1   (5:30 – 7:00 PM)

 

7:15 – 9:45 PM

Model Validation – KS Test

 

 

 

 

PROJECT Dataset Sources

 

1. Dr. Miller’s List

2. UCI Machine Learning

3. Kaggle

Week 5: 7/10

Effectiveness of models – A review of methods

 

Segmentation

   Cluster Analysis

   SPSS Output (Cluster)

Research Paper on Model Effectiveness

www.statsoft.com


Factor Analysis

Clustering Paper

5. Discrim, KS

(due 7/07 by 11:59 PM)

 

SAS Programs for Regression/Scoring

Week 6: 7/17

Test 2 (5:30 – 7:00 PM)

 

Association Techniques, Monitoring Reports, Review

Classification Trees

Week 7: 7/24

Project Work

 

Week 8: 7/31

Project Presentations – 5:30 to 7:30/7:45 PM

Conclusion

Project Reports Due [Guidelines]

Sample Final Project