SM-712: Data Mining and Data Warehousing
Credits: 4 (2-1-2)

Objective: The main objective of this course is to provide an understanding of data warehouse fundamentals and data mining techniques for business applications.

COURSE DESCRIPTION:

UNIT I: Data Warehousing: Introduction to data warehousing, Data Mart, Data Warehouse Architecture; Star, Snowflake and Galaxy Schemas for Multidimensional databases, Fact and dimension data, Partitioning Strategy: Horizontal and Vertical Partitioning. OLAP technology, Multidimensional data models and different OLAP Operations, OLAP Server: ROLAP, MOLAP, Data Warehouse implementation, Efficient Computation of Data Cubes, Processing of OLAP queries, Indexing data.

UNIT II: Data Mining: Basics of data mining, Data mining techniques, KDP (Knowledge Discovery Process), Application and Challenges of Data Mining; Data Processing: Data Cleaning, Data Integration and Transformation; Data Reduction: Data Cube Aggregation, Dimensionality reduction, Data Compression, Numerosity Reduction, Data Discretization and Concept hierarchy generation for numerical and categorical data. Web Mining: Introduction, Web Content Mining, Web Structure Mining, Web Usage Mining; Spatial Mining, Text Mining.

UNIT III: Mining Association Rules in Large Databases: Association Rule Mining, Single-Dimensional Boolean Association Rules, Multi-Level Association Rule, Apriori Algorithm, FP-Growth Algorithm, Time series mining association rules, latest trends in association rules mining.

UNIT IV: Classification methods: Decision tree, Bayesian Classification, Association Rule based; Prediction: Linear and non-linear regression; Categories of clustering methods, Partitioning methods: K-Means, K-Medoids. Hierarchical Clustering: Agglomerative and Divisive Clustering, BIRCH and ROCK methods, DBSCAN, Outlier Analysis. Data Mining for Business Intelligence Applications.

Text Books:
1. P. Ponniah, "Data Warehousing Fundamentals", John Wiley.
2. Han, Kamber, "Data Mining: Concepts and Techniques", Morgan Kaufmann.
3. P. N. Tan, M. Steinbach, Vipin Kumar, "Introduction to Data Mining", Pearson Education.
4. G. Shmueli, N. R. Patel, P. C. Bruce, "Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner", Wiley India.
5. Michael Berry and Gordon Linoff, "Data Mining Techniques", Wiley Publications.
6. M. H. Dunham, "Data Mining: Introductory and Advanced Topics", Pearson Education.

Course Plan:

| Week | Unit | Topics | Lecture | Tutorial | Practical |
|------|------|--------|---------|----------|-----------|
| 1 | I | Data Warehousing: Introduction to data warehousing, Data Mart, Data Warehouse Architecture; Star, Snowflake and Galaxy Schemas for Multidimensional databases. | 2 | 1 | 2 |
| 2 | I | OLAP technology, Multidimensional data models and different OLAP Operations. | 2 | 1 | 2 |
| 3 | I | Data Warehousing, Data Mining, OLTP, OLAP. | 2 | 1 | 2 |
| 4 | II | Data Mining: Basics of data mining, Data mining techniques, KDP (Knowledge Discovery Process), Application and Challenges of Data Mining. | 2 | 1 | 2 |
| 5 | II | Data Processing: Data Cleaning, Data Integration and Transformation; Data Reduction: Data Cube Aggregation, Dimensionality reduction. | 2 | 1 | 2 |
| 6 | II | Data Compression, Numerosity Reduction, Data Discretization and Concept hierarchy generation for numerical and categorical data. Introduction to Web Mining. | 2 | 1 | 2 |
| 7 | II | Web Content Mining, Web Structure Mining, Web Usage Mining; Spatial Mining, Text Mining. | 2 | 1 | 2 |
| 8 | III | Mining Association Rules in Large Databases: Association Rule Mining, Single-Dimensional Boolean Association Rules, Multi-Level Association Rule. | 2 | 1 | 2 |
| 9 | III | Apriori Algorithm, FP-Growth Algorithm, Time series mining association rules. | 2 | 1 | 2 |
| 10 | III | Latest trends in association rules mining, Practical case studies based on association rules. | 2 | 1 | 2 |
| 11 | IV | Classification methods: Decision tree, Bayesian Classification, Association Rule based; Prediction: Linear and non-linear regression. | 2 | 1 | 2 |
| 12 | IV | JD Edwards, QAD Inc, SSA Global, Lawson Software, Baan, Enterprise, Epicor, Intuitive. | 2 | 1 | 2 |
| 13 | IV | Partitioning methods: K-Means, K-Medoids. Hierarchical Clustering: Agglomerative and Divisive Clustering. | 2 | 1 | 2 |
| 14 | IV | BIRCH and ROCK methods, DBSCAN, Outlier Analysis. | 2 | 1 | 2 |
| 15 | IV | Data Mining for Business Intelligence Applications. Case studies of business analysis using data mining techniques. | 2 | 1 | 2 |
| 16 | | Revision | 2 | 1 | 2 |
| TOTAL | | | 32 Hours | 16 | 32 |