CS 668 Advanced Topics in Database Technologies
Spring 2013

Course Hours: Tuesday 6:00 - 8:35 PM
Classroom: LLC 207 (Cook Lab)

Textbook for CS 668 (Required):
Data Mining: Concepts and Techniques, 3rd Edition, Jiawei Han, Micheline Kamber, and Jian Pei, Morgan Kaufmann (The Morgan Kaufmann Series in Data Management Systems), ISBN 978-0-12-381479-1, 2011.

Useful Web Resources:
1. http://www.cs.uiuc.edu/~hanj/
2. Modern Data Warehousing, Mining, and Visualization - Core Concepts, Prentice-Hall.
3. Practical Applications of Data Mining, Sang C. Suh

Prerequisite for CS 668: CS 649 Database Management Systems.

Instructor: Prof. Ping-Tsai Chung
Contact Information:
Office: LLC 206P
E-mail: [email protected]
Tel: (718) 488-1073
Office Hours: Tuesday & Wednesday 5:00 - 6:00 PM (LLC 203) or by appointment.

Participation/Course Grade: Assignments & Project(s): 70%, Exam: 30%

Approximate Schedule of Topics:
(Schedule number, Topics Covered, Reading Assignments)

1 Introduction - Chapter 1 & Notes
   1. What Motivated Data Mining? Why Is It Important?
   2. So, What Is Data Mining?
   3. Data Mining: On What Kind of Data?
   4. Data Mining Functionalities: What Kinds of Patterns Can Be Mined?
   5. Are All of the Patterns Interesting?
   6. Classification of Data Mining Systems
   7. Data Mining Task Primitives
   8. Integration of a Data Mining System with a Database or Data Warehouse System
   9. Major Issues in Data Mining

2 Getting to Know Your Data - Chapter 2 & Notes
   1. Types of Data Sets and Attribute Values
   2. Basic Statistical Descriptions of Data
   3. Data Visualization
   4. Measuring Data Similarity

3 Preprocessing - Chapter 3 & Notes
   1. Data Quality
   2. Major Tasks in Data Preprocessing
   3. Data Reduction
   4. Data Transformation and Data Discretization
   5. Data Cleaning and Data Integration

4 Data Warehousing and On-Line Analytical Processing - Chapter 4 & Notes
   1. Data Warehouse: Basic Concepts
   2. Data Warehouse Modeling: Data Cube and OLAP
   3. Data Warehouse Design and Usage
   4. Data Warehouse Implementation
   5. Data Generalization by Attribute-Oriented Induction

5 Mining Frequent Patterns, Associations and Correlations: Concepts and Methods - Chapter 6 & Notes
   1. Basic Concepts
   2. Frequent Itemset Mining Methods
   3. Which Patterns Are Interesting? Pattern Evaluation Methods
   4. Association Rules - Notes

6 Classification Learning: Basic Concepts - Chapter 8 & Notes
   1. Classification Learning: Basic Concepts
   2. Decision Tree Induction
   3. Rough Sets & Bayes Theories - Notes
   4. Rule-Based Classification
   5. Model Evaluation and Selection
   6. Techniques to Improve Classification Accuracy: Ensemble Methods

7 Cluster Analysis: Basic Concepts and Methods - Chapter 10 & Notes
   1. Cluster Analysis: Basic Concepts
   2. Clustering Structures
   3. Major Clustering Approaches
   4. Partitioning Methods
   5. Hierarchical Methods
   6. Graph Theory Algorithm with the Single-link Method - Notes
   7. Density-Based Methods
   8. Association Rule Algorithm - Notes

8 Trends and Research Frontiers in Data Mining - Chapter 13 & Notes
   1. Mining Complex Types of Data
   2. Advanced Data Mining Applications
   3. Data Mining System Products and Research Prototypes
   4. Social Impacts of Data Mining
   5. Trends in Data Mining

9 PROJECT PRESENTATIONS

10 FINAL EXAM - Contents will be discussed in class