DOC/LP/01/28.02.02
LP - CS1004
LESSON PLAN
LP Rev. No:00
Sub Code & Name: CS 1004 – DATA WAREHOUSING AND MINING
Unit: I
Branch: CSE
Semester: 6
Date: 7.12.09
Unit I: INTRODUCTION AND DATA WAREHOUSING
Introduction, Data Warehouse, Multidimensional Data Model, Data Warehouse
Architecture, Implementation, Further Development, Data Warehousing to Data Mining
Objective:
Students learn the basics of data mining and data warehousing. The
differences between a database and a data warehouse are discussed in detail.
Implementations of a data warehouse using DMQL are introduced to students.
Session No: 1 to 10
Topics to be covered (in listed order):
Motivation towards data mining. Data mining: definition, process of KDD.
Architecture of data mining systems.
Data mining on different databases.
Introduction to data mining functionalities.
Concept/class description: characterization and discrimination, association analysis, cluster analysis, classification and prediction, and outlier analysis.
Classification of data mining systems.
OLAP and OLTP, cuboids, star/snowflake and fact constellation schema.
Introducing concept hierarchies, OLAP operations on multidimensional data models.
A 3-tier data warehouse architecture, types of OLAP servers.
Data warehouse implementation: compute cube operator, partial materialization, multiway array aggregation of data cube computation.
Metadata repository, integrated OLAM and OLAP architecture.
Issues in OLAP indexing.
Time Allocation (min), in listed order: 50, 50, 50, 50, 50, 50, 50, 25, 50, 50, 25
Books Referred, in listed order: 1; 1; 1; 1; 1; 1,5; 1,5; 1,5; 1,5; 1,5
Teaching Method: BB for each session
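As a classroom illustration of the compute cube operator listed in the Unit I topics, the sketch below naively materializes every cuboid (one group-by per subset of dimensions) over a tiny fact table. The function name `compute_cube`, the row layout and the sample sales figures are hypothetical. This is full materialization by repeated scans, not the multiway array aggregation method named in the syllabus.

```python
from collections import defaultdict
from itertools import combinations


def compute_cube(rows, dimensions, measure):
    """Return {dimension-subset: {group-key: summed measure}} for all 2^n cuboids."""
    cube = {}
    for r in range(len(dimensions) + 1):
        for dims in combinations(dimensions, r):
            cuboid = defaultdict(int)
            for row in rows:
                # Group key is the tuple of values for the chosen dimensions.
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[measure]
            cube[dims] = dict(cuboid)
    return cube


# Hypothetical fact table: two dimensions (city, item) and one measure (sales).
sales = [
    {"city": "Chennai", "item": "TV", "sales": 10},
    {"city": "Chennai", "item": "Phone", "sales": 5},
    {"city": "Delhi", "item": "TV", "sales": 7},
]

cube = compute_cube(sales, ["city", "item"], "sales")
for dims, cuboid in cube.items():
    print(dims, cuboid)
```

The empty tuple of dimensions corresponds to the apex cuboid (the grand total), and the full tuple corresponds to the base cuboid.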
Unit II: DATA PREPROCESSING, LANGUAGE, ARCHITECTURES, CONCEPT DESCRIPTION
Why Preprocessing, Cleaning, Integration, Transformation, Reduction, Discretization,
Concept Hierarchy Generation, Data Mining Primitives, Query Language, Graphical User
Interfaces, Architectures, Concept Description, Data Generalization, Characterizations,
Class Comparisons, Descriptive Statistical Measures.
Objective:
To study and analyze preprocessing, cleaning and integration techniques.
Students get first-hand exposure to DMQL and its implementation
issues. Students also learn the functionalities of data mining.
Session No | Topics to be covered | Time Allocation (min) | Books Referred | Teaching Method
11 | Data cleaning and noisy data. | 50 | 1 | BB
12 | Data integration and transformation. | 50 | 1 | BB
13 | Data reduction, aggregation and dimension reduction, compression techniques. PCA, numerosity reduction. | 20 + 30 | 1 | BB
14 | Discretization and concept hierarchy generation. | 50 | 1 | BB
15 | Defining a DM task. DMQL: syntax and examples for major functionalities. | 50 | 1 | BB
16 | Architectures of DM systems. | 50 | 1 | BB
17 | Concept description, generalization and summarization. | 35 + 15 | 1 | BB
18 | Attribute-oriented induction. Presentation of derived generalizations. Attribute relevance analysis. | 50 | 1 | BB
19 | Descriptive statistical measures. Measuring central tendency, dispersion of data. | 15 + 35 | 1 | BB
20 | Graphical displays of DSM. | 50 | 1 | BB
21 | Problems on quartiles, boxplots, outliers. | 50 | 1 | BB
22 | CAT - I | 60 | |
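For the problem session on quartiles, boxplots and outliers, a small worked sketch may help. It computes the five-number summary and flags values outside the usual 1.5 x IQR fences. The function name `boxplot_summary` and the sample data are hypothetical, and the choice of the "inclusive" quartile method from Python's `statistics` module is an assumption; textbooks and tools use several slightly different quartile conventions.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Five-number summary plus outliers under the 1.5 * IQR rule."""
    data = sorted(values)
    # Assumption: "inclusive" interpolation; other conventions give other quartiles.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "iqr": iqr,
        "outliers": outliers,
    }


# Hypothetical sample with one suspiciously large value.
summary = boxplot_summary([2, 4, 4, 5, 6, 7, 8, 9, 40])
print(summary)
```

With this sample the quartiles are 4, 6 and 8, so the IQR is 4, the fences are -2 and 14, and only the value 40 is flagged.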
Unit III: ASSOCIATION RULES
Association Rule Mining, Single-Dimensional Boolean Association Rules from
Transactional Databases, Multi-Level Association Rules from Transaction Databases
Objective:
Students learn association rule mining and the algorithms that perform
single-dimensional and multilevel rule mining.
Session No | Topics to be covered | Time Allocation (min) | Books Referred | Teaching Method
23 | Association rule mining: an introduction. | 50 | 1 | BB
24 | Mining single-dimensional Boolean association rules. | 50 | 1 | BB
25 | The Apriori algorithm: finding frequent itemsets using candidate generation. | 35 + 15 | 1 | BB
26 | Mining frequent itemsets without candidate generation, frequent-pattern growth algorithm. | 50 | 1 | BB
27 | Iceberg queries, mining multilevel association rules from transactional databases. | 50 | 1 | BB
28 | Approaches to mining multilevel association rules. | 50 | 1 | BB
29 | Mining multidimensional association rules from relational databases. | 25 + 25 | 1 | BB
30 | Mining multidimensional association rules using static discretization of quantitative attributes. | 50 | 1 | BB
31 | Mining distance-based association rules. | 50 | 1 | BB
32 | Constraint-based association mining. | 50 | 1 | BB
33 | Metarule-guided mining of association rules. | 50 | 1 | BB
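To accompany the Apriori topic in Unit III, here is a minimal sketch of level-wise frequent itemset mining with candidate generation. The function name `apriori`, the use of an absolute support count and the toy market-basket transactions are hypothetical. It is an unoptimized teaching aid rather than the textbook's exact pseudocode.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Count single items to obtain the frequent 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            # Prune step: every (k-1)-subset must itself be frequent.
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Scan the transactions to count each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical transactions; minimum support is an absolute count of 3.
baskets = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]
result = apriori(baskets, min_support=3)
for itemset, count in result.items():
    print(sorted(itemset), count)
```

In this toy data all four single items are frequent, but {bread, milk} is the only pair that reaches the support threshold, so no 3-itemset candidates are generated.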
Unit IV: CLASSIFICATION AND CLUSTERING
Classification and Prediction, Issues, Decision Tree Induction, Bayesian Classification,
Association Rule Based, Other Classification Methods, Prediction, Classifier Accuracy,
Cluster Analysis, Types of data, Categorization of methods, Partitioning methods, Outlier
Analysis.
Objective:
To study various classification methods, such as Bayesian classification and decision
tree induction (DTI), and cluster analysis. Outlier analysis is also studied in detail.
Session No | Topics to be covered | Time Allocation (min) | Books Referred | Teaching Method
34 | Classification and prediction. | 50 | 1 | BB
35 | Classification by decision tree induction method. | 50 | 1 | BB
36 | Tree pruning. Extracting classification rules from decision trees. | 50 | 1 | BB
37 | Bayesian classification, Bayes' theorem, Bayesian belief networks. | 50 | 1 | BB
38 | A multilayer feed-forward neural network. | 50 | 1 | BB
39 | Association rule based classification. | 50 | 1 | BB
40 | Classifier accuracy and increasing accuracy. | 50 | 1 | BB
41 | Cluster analysis, types of data. | 50 | 1 | BB
42 | Partitioning methods: k-means and k-medoids. | 50 | 1 | BB
43 | Statistical-based, distance-based and deviation-based outlier detection. | 50 | 1 | BB
44 | CAT - II | 60 | |
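For the partitioning methods topic in Unit IV, the following is a minimal k-means sketch in plain Python. The function name `k_means`, the sample points and the choice of the first k points as initial centroids are assumptions made so the run is deterministic; practical implementations usually pick initial centroids randomly or with a seeding heuristic.

```python
def k_means(points, k, max_iter=100):
    """Cluster points (tuples of floats) into k groups; return (centroids, assignment)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # No point changed cluster, so the algorithm has converged.
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


# Hypothetical 2-D data with two visually separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = k_means(points, k=2)
print(centroids)
print(assignment)
```

On this data the first pass splits the points unevenly because both initial centroids lie in the same group; the second pass corrects the assignment, and the third confirms convergence.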
Unit V: RECENT TRENDS
Multidimensional Analysis and Descriptive Mining of Complex Data Objects, Spatial
Databases, Multimedia Databases, Time Series and Sequence Data, Text Databases,
World Wide Web, Applications and Trends in Data Mining
Objective:
Students get exposure to text databases, web databases, spatial
databases and multimedia databases. A thorough understanding of this unit would
help students carry out research work in this area.
Session No | Topics to be covered | Time Allocation (min) | Books Referred | Teaching Method
45 | Multidimensional analysis and descriptive mining of complex data objects. | 50 | 1 | BB
46 | Aggregation and approximation in spatial and multimedia data generalization. | 50 | 1 | BB
47 | Mining spatial databases, spatial OLAP. Spatial association and cluster analysis. | 20 + 30 | 1 | BB
48 | Mining multimedia databases. | 50 | 1 | BB
49 | Mining time-series and sequence data. Similarity search in time-series analysis. | 10 + 40 | 1 | BB
50 | Mining text databases. Text data analysis and information retrieval. | 25 + 25 | 1 | BB
51 | Mining the WWW. Identification of authoritative web pages. | 15 + 35 | 1 | BB
52 | Web usage mining. | 50 | 1 | BB
53 | CAT - III | 60 | |
Course Delivery Plan:
Week: 1 to 15, each week divided into halves I and II
Units:
TEXT BOOK
1. J. Han, M. Kamber, “Data Mining: Concepts and Techniques”, Harcourt India / Morgan Kaufmann, 2001.
REFERENCES
2. Margaret H. Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Education, 2004.
3. Sam Anahory, Dennis Murray, “Data Warehousing in the Real World”, Pearson Education, 2003.
4. David Hand, Heikki Mannila, Padhraic Smyth, “Principles of Data Mining”, PHI, 2004.
5. W. H. Inmon, “Building the Data Warehouse”, 3rd Edition, Wiley, 2003.
6. Alex Berson, Stephen J. Smith, “Data Warehousing, Data Mining & OLAP”, McGraw-Hill Edition, 2001.
 | Prepared by | Approved by
Signature | |
Name | Prof. R.NEDUNCHELIAN, Ms. S.PUSHPA | Dr. SUSAN ELIAS
Designation | Prof/CSE, Asst. Prof/CSE | HOD - CSE
Date | |