Syllabus for DSC 491: Introduction to Data Mining in Business

Course Goals

1. To give students confidence in using the computer to analyze very large data sets to discern, understand, and interpret the truth about populations and processes.
2. To promote critical thinking by examining the appropriate uses of, and the conclusions drawn from, some of the most important statistical data mining methods (specifically cluster analysis, market basket analysis, tree diagrams, logistic regression, and neural nets).
3. To help students gain perspective on the general use of scientific methods by discussing both the assumptions behind statistical methods and the remedial actions needed when those assumptions are violated.
4. To promote an understanding of context by emphasizing that the current state of statistical practice is a product of its history and continues to change.
5. To give students the opportunity to engage with other learners by discussing the practical and theoretical meanings of statistical data analyses. Also, to give students practice in communicating statistical results by reporting their conclusions in writing to the professor, who will provide feedback.
6. To demonstrate how statistical analyses allow the evaluation of alternative choices in a business context, and to show how these methods help us to reflect and act in that context.

Text:

Required: Applied Analytics Using SAS Enterprise Miner 5
Recommended: Discovering Knowledge in Data: An Introduction to Data Mining, by Daniel T. Larose

Cheating, as outlined in the 2008-2009 Student Handbook, shall be dealt with severely in accordance with the guidelines in that document. Copying from another person's exam, using unapproved information on an exam, and receiving unauthorized help from another student are all cheating. Discussing a homework assignment with others is permitted, but directly copying another student's homework is cheating.

Proposed Topics

1. Data Mining vs. Data Warehousing: what are they, and how are they different? What kinds of business questions can be answered with data mining methods?
2. DMAIC, SEMMA, Virtuous Cycle, Vardeman and Jobe plan
3. Supervised vs. Unsupervised (Directed vs. Undirected) Methods
4. Types of variables: target vs. input; interval, nominal, ordinal, categorical, dummy
5. Hypothesis testing, p-values, χ² statistic, F statistic, logworth
6. Cluster Analysis
7. Market Basket Analysis, i.e., affinity grouping; lift, confidence, support
8. Decision Trees: lift, cumulative gains, model evaluation, decision-making
9. Logistic Regression: model evaluation, decision-making
10. Neural Nets: model evaluation, decision-making

SAS Enterprise Miner 5.2 will be used extensively. Time permitting, JMP will be used as well.

Grading Policy

You will be graded no tougher than the following:

Component     Weight
3 Exams       50%
Final Exam    35%
Homework      15%

Grade    Lowest score
A        90
B        75
C        62
D        50
F        below 50
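
As an illustration only, the weights and cutoffs in the grading policy above can be combined as in the following Python sketch. It assumes that all scores are on a 0-100 scale and that the three exams count equally within their 50% share; the syllabus states neither. The function names are hypothetical, and because the policy says grading will be "no tougher than" these cutoffs, the letter returned here is the strictest possible outcome, not necessarily the final grade.

```python
# Weights and cutoffs as stated in the grading policy.
WEIGHTS = {"exams": 50, "final": 35, "homework": 15}  # percent of the course grade
CUTOFFS = [(90, "A"), (75, "B"), (62, "C"), (50, "D")]  # lowest score for each letter


def course_score(exam_scores, final_score, homework_score):
    """Return the weighted course score on a 0-100 scale.

    Assumption (not stated in the syllabus): the exams are averaged
    with equal weight before the 50% share is applied.
    """
    exam_average = sum(exam_scores) / len(exam_scores)
    weighted = (
        WEIGHTS["exams"] * exam_average
        + WEIGHTS["final"] * final_score
        + WEIGHTS["homework"] * homework_score
    )
    return weighted / 100


def letter_grade(score):
    """Map a score to a letter using the published cutoffs (the strictest case)."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical student: exams 80, 70, 90; final exam 85; homework 95.
score = course_score([80, 70, 90], 85, 95)
print(score, letter_grade(score))
```

For this hypothetical student the exam average is 80, so the weighted score is (50 × 80 + 35 × 85 + 15 × 95) / 100 = 84, which falls between the B cutoff of 75 and the A cutoff of 90.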