PEIT5302 DATA MINING & DATA WAREHOUSING (3-0-0)
Instructor: Prof. Puspanjali Mohapatra
No. of lectures: 40

Objective of the Course: After learning data mining, students can extract hidden predictive information from large databases. This extraction is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by the retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time-consuming to resolve.

Module - I (12 Hours)
Overview: Data warehousing, the compelling need for data warehousing, the building blocks of the data warehouse, data warehouses and data marts, overview of the components, metadata in the data warehouse, trends in data warehousing, emergence of standards, OLAP, web-enabled data warehouse, introduction to the data warehouse project, understanding data warehousing architecture, data warehousing implementation, from data warehousing to data mining.

Module - II (14 Hours)
Introduction to data mining, data mining functionalities, data preprocessing (data summarization, data cleaning, data integration and transformation, data reduction, data discretization).
Mining frequent patterns, associations, correlations (market basket analysis, the Apriori algorithm, mining various kinds of association rules, from association mining to correlation analysis).
Classification: classification by decision tree induction, rule-based classification, classification by neural networks, classification by genetic algorithm.

Module - III (10 Hours)
Cluster Analysis: types of data in cluster analysis, a categorization of major clustering methods (partitioning methods, hierarchical methods), clustering high-dimensional data, outlier analysis.
Advanced techniques: web mining, spatial mining, temporal mining.
Data mining applications in financial data analysis, the retail industry, the telecommunication industry, biological data analysis, intrusion detection, and other scientific applications.

Text Books:
1. Data Warehousing Fundamentals: Paulraj Ponniah, Wiley India.
2. Data Mining: Concepts and Techniques: J. Han and M. Kamber, Elsevier.

Reference books:
1. Data Mining: Arun Pujari, University Press.
2. Data Mining: A Tutorial-Based Primer: R. J. Roiger and M. W. Geatz, Pearson Education.
3. Data Mining & Data Warehousing Using OLAP: Berson, TMH.
4. Data Warehousing: Reema Thareja, Oxford University Press.

PEIT5302 COURSE DETAILS

SL. NO. | TOPICS | NO. OF HOURS
1 | Syllabus overview, introduction to data mining and data warehousing. Overview: data warehousing, the compelling need for data warehousing, the building blocks of the data warehouse, data warehouses and data marts, overview of the components, metadata in the data warehouse | 1
2 | Trends in data warehousing | 1
3 | Emergence of standards | 1
4 | OLAP | 1
5 | Web-enabled data warehouse | 1
6 | Introduction to the data warehouse project | 1
7 | Understanding data warehousing architecture | 1
8 | Data warehousing implementation | 1
9 | From data warehousing to data mining | 1
10 | Summary discussion of Module-1 | 1
11 | Introduction to data mining | 1
12 | Data mining functionalities | 1
13 | Data preprocessing: data summarization | 1
14 | Data cleaning | 1
15 | Data integration and transformation | 1
16 | Data reduction | 2
17 | Data discretization | 2
18 | Mining frequent patterns, associations, correlations (market basket analysis, the Apriori algorithm) | 2
19 | Mining various kinds of association rules | 1
20 | From association mining to correlation analysis | 1
21 | Classification: classification by decision tree induction | 2
22 | Rule-based classification | 2
23 | Classification by neural networks | 2
24 | Classification by genetic algorithm | 2
25 | Summary discussion of Module-2 | 2
26 | Cluster analysis: types of data in cluster analysis | 2
27 | A categorization of major clustering methods (partitioning methods) | 1
28 | Hierarchical methods, clustering high-dimensional data, outlier analysis | 1
29 | Advanced techniques: web mining | 1
30 | Spatial mining, temporal mining | 1
31 | Data mining applications in financial data analysis | 1
32 | Retail industry, telecommunication industry | 1
33 | Biological data analysis | 1
34 | Intrusion detection, other scientific applications | 1
35 | Summary discussion of Module-3 | 1