SAK 5609 DATA MINING
Assoc. Prof. Dr. Md. Nasir bin Sulaiman
[email protected]
03-89466514

Synopsis
Credit: 3(3+0)
Contact hours: 3 x 1 hour per week
Semester: I

The course emphasises the concepts of data mining. It covers the principles of data mining, data mining functions, data mining processes, and data mining techniques such as K-nearest neighbour and clustering algorithms, rule induction, decision tree algorithms, association rule mining, neural networks and genetic algorithms, together with data mining examples. Industrial and scientific applications will be presented.

Assessment & References
Assessment:
– Exercises (10%)
– Project I (15%) + presentation I (5%), Week 7
– Project II (15%) + presentation II (5%), Week 14
– Mid-exam 20% (1 hour), Week 6
– Final exam 30% (1.5 hours), Weeks 15-17

References:
– Jiawei Han & Micheline Kamber (2006), “Data Mining: Concepts and Techniques”, 2nd ed., Morgan Kaufmann.
– Michael J. A. Berry & Gordon S. Linoff (2004), “Data Mining Techniques”, 2nd ed., Wiley.
– Other related articles

Course Contents

Chapter 1 Introduction
– Motivation
– Origin of data mining
– What it is/isn’t
– The KDD process
– Types of data

Chapter 2 Data mining tasks
– Classification
– Association rule mining
– Sequential pattern mining
– Clustering
– Anomaly detection

Chapter 3 Data issues
– What is a data set?
– Types of attributes
– Transformation for different types
– Types of data
  • Structured data, record data, data matrix, document data, transaction data, graph data, ordered data
– Data quality
  • Noise and outliers, missing values, inconsistent/duplicate data

Chapter 4 Data preprocessing
– Why Data Preprocessing?
– Why Is Data Preprocessing Important?
– Major Tasks in Data Preprocessing
  • Data cleaning
  • Data integration
  • Data transformation
  • Data reduction
  • Data discretization

Chapter 5 Association rule mining
– Introduction
– The Model
– Goal and Key Features
– Mining Algorithms
– Problems with the Association Rule Model
– Issues of association rules
– Other Main Works on Association Rules

Chapter 6 Sequential Pattern Mining
– Sequence databases and pattern analysis
– Mining algorithms
– Challenges on sequential mining
– Studies on sequential mining

Chapter 7 Classification and Prediction
– Classification Model
– General Approach
– Classification: A Two-Step Process
– Classification Techniques
– Evaluating classification methods
– Decision tree based classification, rule-based classifiers, nearest neighbor classifiers, etc.

Chapter 8 Clustering and Anomaly
– What is/is not cluster analysis?
– Examples of clustering applications
– Types of data in clustering analysis
– Types of clustering: hierarchical, partitional
– Major Clustering Techniques
– Approaches to anomaly detection
– Issues dealing with anomalies

Chapter 9 Data Mining Applications