Annexure ‘AAB-CD-01a’
Course Title: Advance Data Mining & Warehousing
Course Code: CSE903
Credit Units: 04
Level: Ph.D. (Part Time)

L    T    P/S    SW/FW    TOTAL CREDIT UNITS
4    0    0      0        4

Please give your valuable feedback ratings (on a scale of 6 points) for the following course
curriculum with respect to its relevance to Industry / Profession:

6 Excellent    5 Very Good    4 Good    3 Moderate    2 Needs Improvement    1 Poor

#    Course Title    Feedback Rating (on scale of 6 points)    Comments (if any)

1 Course Objectives: To demonstrate new concepts of organizing a data warehouse and data
mining techniques to derive useful information from piles of data. With the growth in the
amount of data today, it has become necessary to explore and mine data to uncover hidden,
useful information. This course exposes students to the process of extracting patterns and
useful information from large data sets by combining methods from data mining, statistics
and artificial intelligence with database management. It also exposes students to data
analysis using data mining tools, and covers some advanced topics in data mining such as
opinion mining and web mining.

2 Prerequisites: Basic knowledge of database management systems and of algorithms for
analyzing data.

3 Student Learning Outcomes:
• By the end of this course, students will be able to design and develop a data
warehouse.
• Students will be able to analyze and evaluate a data warehouse using a multidimensional
model and various OLAP techniques.
• Students will be able to display a comprehensive understanding of different data
mining tasks and the algorithms most appropriate for addressing them.
• Students will be able to evaluate models/algorithms with respect to their accuracy.
• Students will be able to demonstrate the capacity to perform a self-directed piece of
practical work that requires the application of data mining techniques.
• Students will be able to analyze and critique the results of a data mining exercise.
• Students will be able to conceptualize a data mining solution to a practical problem.
Course Contents / Syllabus:
4 Module I Introduction (Weightage: 15%)
Data Mining Functionalities. Concept/Class Description: Characterization and
Discrimination, Mining Frequent Patterns, Associations, and Correlations, Classification
and Prediction, Cluster Analysis, Outlier Analysis, Evolution Analysis. Classification of
Data Mining Systems, Data Mining Task Primitives, Integration of a Data Mining System
with a Database or Data Warehouse System.
5 Module II Data Preprocessing (Weightage: 15%)
Data Cleaning: Missing Values, Noisy Data. Data Integration and Transformation. Data
Reduction: Data Cube Aggregation, Attribute Subset Selection, Dimensionality Reduction.
6 Module III Data Warehouse and OLAP Technology (Weightage: 20%)
Differences between Operational Database Systems and Data Warehouses. A
Multidimensional Data Model: Data Cubes; Stars, Snowflakes, and Fact Constellations:
Schemas for Multidimensional Databases; Examples for Defining Star, Snowflake, and
Fact Constellation Schemas; Measures: Their Categorization and Computation; Concept
Hierarchies; OLAP Operations in the Multidimensional Data Model; A Starnet Query
Model for Querying Multidimensional Databases. Data Warehouse Architecture. Data
Warehouse Implementation: Efficient Computation of Data Cubes, Indexing OLAP Data,
Efficient Processing of OLAP Queries. From Data Warehousing to Data Mining: Data
Warehouse Usage, From On-Line Analytical Processing to On-Line Analytical Mining.
7 Module IV Mining Frequent Patterns, Associations, and Correlations (Weightage: 20%)
Basic Concepts and a Road Map: Market Basket Analysis: A Motivating Example.
Frequent Itemsets, Closed Itemsets, and Association Rules. Frequent Pattern Mining: A
Road Map.
Efficient and Scalable Frequent Itemset Mining Methods: The Apriori Algorithm:
Finding Frequent Itemsets Using Candidate Generation. Generating Association Rules
from Frequent Itemsets. Improving the Efficiency of Apriori. Mining Frequent Itemsets
without Candidate Generation. Mining Frequent Itemsets Using Vertical Data Format.
Mining Closed Frequent Itemsets.
Mining Various Kinds of Association Rules: Mining Multilevel Association Rules.
Mining Multidimensional Association Rules from Relational Databases and Data
Warehouses.
From Association Mining to Correlation Analysis: Strong Rules Are Not Necessarily
Interesting. From Association Analysis to Correlation Analysis.
8 Module V Classification and Prediction (Weightage: 20%)
Issues Regarding Classification and Prediction: Preparing the Data for Classification and
Prediction, Comparing Classification and Prediction Methods. Classification by Decision
Tree Induction: Decision Tree Induction, Attribute Selection Measures, Tree Pruning,
Scalability and Decision Tree Induction. Bayesian Classification: Bayes’ Theorem, Naïve
Bayesian Classification, Bayesian Belief Networks, Training Bayesian Belief Networks.
Rule-Based Classification: Using IF-THEN Rules for Classification, Rule Extraction from
a Decision Tree, Rule Induction Using a Sequential Covering Algorithm. Classification by
Backpropagation: A Multilayer Feed-Forward Neural Network, Defining a Network
Topology, Backpropagation, Backpropagation and Interpretability. Associative
Classification: Classification by Association Rule Analysis. Lazy Learners (or Learning
from Your Neighbors): k-Nearest-Neighbor Classifiers. Prediction: Linear Regression,
Nonlinear Regression, Other Regression-Based Methods. Accuracy and Error Measures:
Classifier Accuracy Measures, Predictor Error Measures. Evaluating the Accuracy of a
Classifier or Predictor: Holdout Method and Random Subsampling, Cross-Validation,
Bootstrap. Ensemble Methods for Increasing Accuracy: Bagging, Boosting.
9 Module VI Advanced Techniques of Data Mining (Weightage: 10%)
Active Learning, Reinforcement Learning, Text Mining, Graphical Models, Web Mining.
10 Pedagogy for Course Delivery:
1. Classroom teaching using whiteboard and presentations.
2. Assignments and tutorials for continuous assessment.
11 Assessment/ Examination Scheme:

Theory L/T (%)    Lab/Practical/Studio (%)    End Term Examination
100%              NA                          70

Theory Assessment (L&T):

                          Continuous Assessment/Internal Assessment     End Term Examination
Components (Drop down)    MidTerm Exam    Project    Viva    Attendance
Weightage (%)             10              10         5       5          70

Text:
1 “Mastering Data Mining: The Art and Science of Customer Relationship Management”, by Berry and Linoff, John Wiley and Sons, 2001.
2 “Data Warehousing: Concepts, Techniques, Products and Applications”, by C.S.R. Prabhu, Prentice Hall of India, 2001.
References:
1 “Data Mining: Concepts and Techniques”, by J. Han, M. Kamber, Academic Press, Morgan Kaufmann Publishers, 2001.
2 “Data Mining”, by Pieter Adriaans, Dolf Zantinge, Addison Wesley, 2000.
3 “Data Mining with Microsoft SQL Server”, by Seidman, Prentice Hall of India, 2001.
Remarks and Suggestions:
_______________________________
Date:
Name, Designation, Organisation