GUJARAT TECHNOLOGICAL UNIVERSITY
Master in Computer Application
Year II – (Semester-IV) (W.E.F. January 2017)
Subject Name: Data Mining
Subject Code: 3640006
1. Objective
• To understand the need for Data Mining and its advantages to the business world.
• To get a clear idea of the various classes of Data Mining techniques, their need, scenarios (situations) and scope of applicability.
• To learn the algorithms used for various types of Data Mining problems.
2. Prerequisites: Knowledge of RDBMS, OLTP and OLAP
3. Contents:

Unit 1: Data Mining: Introduction and Preprocessing
Weightage: 15%, Lectures: 07
• Data Mining and Knowledge Discovery
• Kinds of Data to be mined (Database data, Data Warehouse data, Transactional data and Other data)
• Patterns to be mined (Concept description, Frequent patterns, association, correlations, classification, clustering, outlier analysis) – up to 1.4.5
• Technology used in Mining (statistics, machine learning, database, information retrieval)
• Applications of Data Mining
• Data Preprocessing and its requirement, Preprocessing steps – Data Cleaning, Data Integration and Transformation, Data Reduction, Sampling.
• Data Transformation and Data Discretization

Unit 2: Mining Frequent Patterns, Associations, and Correlations
Weightage: 20%, Lectures: 09
• Basic Concepts: Market-Basket Analysis, Frequent Itemsets, Closed Itemsets and Association rules.
• Frequent Itemset Mining methods: Apriori algorithm, generating association rules from frequent itemsets, improving efficiency of Apriori.
• Pattern Evaluation methods
• CASE STUDY on Apriori algorithm

Unit 3: Classification: Basic Concepts and Methods
Weightage: 25%, Lectures: 09
• Introduction to Classification, general approach to classification: supervised learning, unsupervised learning, prediction and Regression analysis, Decision tree induction, attribute selection methods: information gain, Gain ratio, Gini index, tree pruning.
• CHAID (Chi-square Automatic Interaction Detection)
• CART (Classification and Regression Trees)
• Bayes Classification methods: Bayes’ Theorem, Naïve Bayesian Classification, Bayesian Belief Networks
• Rule-based Classification: Using IF-THEN Rules for Classification, Rule Extraction from a Decision Tree, Rule Induction Using a Sequential Covering Algorithm
• Classification by backpropagation
• Classification using frequent patterns
• K-Nearest neighbor classifier
• CASE STUDY on classification methods.

Unit 4: Cluster Analysis: Basic Concepts and Methods
Weightage: 25%, Lectures: 09
• Introduction to Cluster Analysis, Requirements for Cluster Analysis, Types of Data in Cluster Analysis, Partitioning Methods, Centroid-Based Technique: K-Means Method.
• Overview of Basic Clustering Methods: Partitioning method, Hierarchical method, Density-based method, Grid-based method.
• Partitioning Methods: k-Means, k-Medoids
• Density-based method: DBSCAN
• OPTICS, Clustering based on Graph partitioning
• CASE STUDY on clustering methods.

Unit 5: Data Mining Trends and Research
Weightage: 15%, Lectures: 06
• Frequent Pattern Mining: A Roadmap
• Applications of pattern mining.
• Data Mining for: (a) Financial Data Analysis, (b) The Retail Industry, (c) Science and Engineering, (d) Biological Data Analysis, (e) Other Scientific Applications, (f) Intrusion detection
• Mining Time-Series and Sequence Data, Graph Mining, Social Network Analysis and Multi-relational Data Mining.
• Overview of Advanced Techniques: Web Mining, Spatial Mining, and Text Mining.
4. Text Books:
1. Data Mining: Concepts & Techniques, Jiawei Han & Micheline Kamber, Morgan Kaufmann Publishers, Elsevier, Third edition.
2. Insight into Data Mining: Theory and Practice, K. P. Soman, Shyam Diwakar and V. Ajay, PHI Publication.
5. Other Reference Books:
1. Data Mining, Vikram Pudi & P. Radhakrishnan, Oxford University Press (2009).
2. Data Mining, Pieter Adriaans & Dolf Zantinge, Addison-Wesley, Pearson (2000).
3. Data Mining Methods & Models, Daniel T. Larose, Wiley-India (2007).
4. Data Mining Techniques, Michael J. A. Berry & Gordon S. Linoff, Wiley-India (2008).
5. Data Mining – a Tutorial-based Primer, Richard J. Roiger & Michael W. Geatz, Pearson Education (2005).
6. Data Mining: Introductory and Advanced Topics, Margaret H. Dunham & S. Sridhar, Pearson Education (2008).
7. Introduction to Data Mining with Case Studies, G. K. Gupta, EEE, PHI (2006).
6. Chapter wise Coverage from the Text Books:
Unit   Book   Topics/Subtopics
1      1      1.1, 1.2, 1.3, 1.4, 1.6, 3.1.2, 3.2 (3.2.1, 3.2.2, 3.2.3), 3.3, 3.4.1, 3.4.8, 3.5.1, 3.5.2, 3.5.3, 3.5.4, 3.5.5, 3.5.6
2      1      6.1, 6.2, 6.3, 7.1, 7.6.2
3      2      Chapter 6 – Datasets for practicals only
3      1      8.1, 8.2, 8.3, 8.4, 9.2, 9.4, 9.5
       2      Chapter 4 – 4.3 and 4.4
4      1      10.1, 10.2.1, 10.2.2, 10.3.1
       2      Chapter 11 – 11.6, 11.7, 11.8
5      1      13.1, 13.2, 13.3, 13.4