Annexure ‘AAB-CD-01a’

Course Title: Advance Data Mining & Warehousing
Course Code: CSE903
Credit Units: 04
Level: Ph.D. (Part Time)

L    T    P/S    SW/FW    TOTAL CREDIT UNITS
4    0    0      0        4

Please give your valuable feedback ratings (on a 6-point scale) for the following course curriculum with respect to its relevance to Industry / Profession:

6 Excellent   5 Very Good   4 Good   3 Moderate   2 Needs Improvement   1 Poor

#   Course Title   Feedback Rating (on scale of 6 points)   Comments (if any)

1 Course Objectives: To demonstrate new concepts for organizing a data warehouse and data mining techniques that derive useful information from large volumes of data. With the growth in the amount of data today, it has become necessary to explore and mine data to uncover hidden, useful information. This course will expose students to the process of extracting patterns and useful information from large data sets by combining methods from data mining, statistics and artificial intelligence with database management. It will also expose students to data analysis using data mining tools. The course also covers some advanced topics in data mining, such as opinion mining and web mining.

2 Prerequisites: Basic knowledge of database management systems and of algorithms for analyzing data.

3 Student Learning Outcomes:
• By the end of this course, students will be able to design and develop a data warehouse.
• They will be able to analyze and evaluate a data warehouse using a multidimensional model and various OLAP techniques.
• Students will be able to display a comprehensive understanding of different data mining tasks and the algorithms most appropriate for addressing them.
• Students will be able to evaluate models/algorithms with respect to their accuracy.
• Students will be able to demonstrate the capacity to perform a self-directed piece of practical work that requires the application of data mining techniques.
• Students will be able to analyze and critique the results of a data mining exercise.
• Students will be able to conceptualize a data mining solution to a practical problem.

Course Contents / Syllabus:

4 Module I: Introduction (Weightage: 15%)
Data Mining Functionalities: Concept/Class Description (Characterization and Discrimination); Mining Frequent Patterns, Associations, and Correlations; Classification and Prediction; Cluster Analysis; Outlier Analysis; Evolution Analysis. Classification of Data Mining Systems. Data Mining Task Primitives. Integration of a Data Mining System with a Database or Data Warehouse System.

5 Module II: Data Preprocessing (Weightage: 15%)
Data Cleaning: Missing Values, Noisy Data. Data Integration and Transformation. Data Reduction: Data Cube Aggregation, Attribute Subset Selection, Dimensionality Reduction.

6 Module III: Data Warehouse and OLAP Technology (Weightage: 20%)
Differences between Operational Database Systems and Data Warehouses. A Multidimensional Data Model: Data Cubes; Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Databases; Examples for Defining Star, Snowflake, and Fact Constellation Schemas; Measures: Their Categorization and Computation; Concept Hierarchies; OLAP Operations in the Multidimensional Data Model; A Starnet Query Model for Querying Multidimensional Databases. Data Warehouse Architecture. Data Warehouse Implementation: Efficient Computation of Data Cubes; Indexing OLAP Data; Efficient Processing of OLAP Queries. From Data Warehousing to Data Mining: Data Warehouse Usage; From On-Line Analytical Processing to On-Line Analytical Mining.

7 Module IV: Mining Frequent Patterns, Associations, and Correlations (Weightage: 20%)
Basic Concepts and a Road Map: Market Basket Analysis: A Motivating Example; Frequent Itemsets, Closed Itemsets, and Association Rules; Frequent Pattern Mining: A Road Map. Efficient and Scalable Frequent Itemset Mining Methods: The Apriori Algorithm: Finding Frequent Itemsets Using Candidate Generation; Generating Association Rules from Frequent Itemsets;
Improving the Efficiency of Apriori; Mining Frequent Itemsets without Candidate Generation; Mining Frequent Itemsets Using Vertical Data Format; Mining Closed Frequent Itemsets. Mining Various Kinds of Association Rules: Mining Multilevel Association Rules; Mining Multidimensional Association Rules from Relational Databases and Data Warehouses. From Association Mining to Correlation Analysis: Strong Rules Are Not Necessarily Interesting; From Association Analysis to Correlation Analysis.

8 Module V: Classification and Prediction (Weightage: 20%)
Issues Regarding Classification and Prediction: Preparing the Data for Classification and Prediction; Comparing Classification and Prediction Methods. Classification by Decision Tree Induction: Decision Tree Induction; Attribute Selection Measures; Tree Pruning; Scalability and Decision Tree Induction. Bayesian Classification: Bayes’ Theorem; Naïve Bayesian Classification; Bayesian Belief Networks; Training Bayesian Belief Networks. Rule-Based Classification: Using IF-THEN Rules for Classification; Rule Extraction from a Decision Tree; Rule Induction Using a Sequential Covering Algorithm. Classification by Backpropagation: A Multilayer Feed-Forward Neural Network; Defining a Network Topology; Backpropagation; Backpropagation and Interpretability. Associative Classification: Classification by Association Rule Analysis. Lazy Learners (or Learning from Your Neighbors): k-Nearest-Neighbor Classifiers. Prediction: Linear Regression; Nonlinear Regression; Other Regression-Based Methods. Accuracy and Error Measures: Classifier Accuracy Measures; Predictor Error Measures. Evaluating the Accuracy of a Classifier or Predictor: Holdout Method and Random Subsampling; Cross-Validation; Bootstrap. Ensemble Methods for Increasing the Accuracy: Bagging, Boosting.

9 Module VI: Advance Techniques of Data Mining (Weightage: 10%)
Active Learning, Reinforcement Learning, Text Mining, Graphical Models, Web Mining.

10 Pedagogy for Course Delivery:
1. Classroom teaching using whiteboard and presentations.
2. Assignments and tutorials for continuous assessment.

11 Assessment/Examination Scheme:

Theory L/T (%)    Lab/Practical/Studio (%)    End Term Examination
100%              NA                          70

Theory Assessment (L&T):

                         Continuous Assessment/Internal Assessment      End Term Examination
Components (Drop down)   MidTerm Exam   Project   Viva   Attendance
Weightage (%)            10             10        5      5              70

Text:
1 “Mastering Data Mining: The Art and Science of Customer Relationship Management”, by Berry and Linoff, John Wiley and Sons, 2001.
2 “Data Warehousing: Concepts, Techniques, Products and Applications”, by C.S.R. Prabhu, Prentice Hall of India, 2001.

References:
1 “Data Mining: Concepts and Techniques”, by J. Han and M. Kamber, Academic Press, Morgan Kaufmann Publishers, 2001.
2 “Data Mining”, by Pieter Adriaans and Dolf Zantinge, Addison Wesley, 2000.
3 “Data Mining with Microsoft SQL Server”, by Seidman, Prentice Hall of India, 2001.

Remarks and Suggestions: _______________________________

Date:
Name, Designation, Organisation