COMP H3027: Data Mining
Short Title:
Data Mining APPROVED
Full Title:
Data Mining
Module Code:
COMP H3027
ECTS credits:
5
NFQ Level:
7
Module Delivered in
no programmes
Module Contributor:
Geraldine Gray
Module Description:
The aim of this module is to give learners knowledge and understanding of the steps involved in
discovering information in data through the process of Data Mining. The module gives students an
in-depth understanding of how to prepare data for analysis and of a variety of data mining algorithms.
Learning Outcomes:
On successful completion of this module the learner will be able to:
1. Have knowledge and understanding of the following processes and procedures in Data Mining:
   - exploratory data analysis to identify data quality problems and detect interesting subsets of data to form hypotheses for hidden information, with use of visualisation
   - the data preparation phase and techniques used for data mining and the knowledge discovery process
   - the data modelling process, techniques and algorithms for classification and prediction
   - applications of data mining and its role in the market place, including case studies where data mining was used successfully
   - data mining products and methodologies
2. Implement and apply data mining for knowledge discovery on a dataset using the CRISP-DM methodology
3. Be competent at using and deriving algorithms to build Decision Tree and Back-Propagation Neural Network models for
classification and prediction in a data mining task
4. Have knowledge of other categories of mining algorithms such as clustering and association analysis
5. Assess the performance and patterns produced by a data mining process
Module Content & Assessment
Indicative Content
Introduction to Data Mining
Data Mining and Knowledge Discovery. Data Mining Tasks and Applications. Methodologies: CRISP-DM.
Data Understanding
Exploratory Data Analysis. Data Quality, Detecting Interesting Subsets of Data. Graphs, Cross-Tabulation.
Data Preparation
Data Cleaning: Handling Missing Data, Noisy Data and Outliers. Data Transformation: Smoothing, Normalisation. Data Reduction: Data
Aggregation, Dimensionality Reduction, Sampling.
Classification
Data Modelling Techniques. Training and Test Data. Decision Trees - C5.0, CART. Neural Networks - Back Propagation. Association
Analysis and Market Basket Analysis. Model Evaluation Techniques.
Clustering
Introduction to Clustering. K-means. Hierarchical.
Indicative Assessment Breakdown | %
Course Work Assessment % | 50.00%
Final Exam Assessment % | 50.00%
Course Work Assessment %
Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date
Lab work | Weekly lab work to complete data mining task in RapidMiner | 1,3,5 | 15.00 | n/a
Project | Complete all stages of CRISP-DM on a dataset provided by the lecturer | 1,2,3,5 | 35.00 | n/a
Final Exam Assessment %
Assessment Type | Assessment Description | Outcome addressed | % of total | Assessment Date
Formal Exam | End-of-Semester Final Examination | 1,3,5 | 50.00 | End-of-Semester
Indicative Reassessment Requirement
Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a
coursework element.
Reassessment Description
Students must repeat continuous assessment work and the final exam.
ITB reserves the right to alter the nature and timings of assessment.
Indicative Module Workload & Resources
Indicative Workload: Full Time
Frequency | Indicative Average Weekly Learner Workload
Every Week | 2.00
Every Week | 2.00
Resources
Recommended Book Resources
Daniel T. Larose 2005, Discovering knowledge in data, Wiley-Interscience Hoboken, N.J. [ISBN: 0471666572]
Jiawei Han and Micheline Kamber 2001, Data mining, Morgan Kaufmann Publishers San Francisco [ISBN: 1558604898]
Pang-Ning Tan, Michael Steinbach, Vipin Kumar 2006, Introduction to data mining, Pearson Addison Wesley Boston [ISBN: 0321321367]
Supplementary Book Resources
I. H. Witten 2005, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann
Robert Groth 2000, Data mining, Prentice Hall PTR Upper Saddle River, NJ [ISBN: 0130862711]
This module does not have any article/paper resources
Other Resources
Internet based resource: KD Nuggets
http://www.kdnuggets.com
Internet based resource: CRISP-DM Life Cycle
http://www.crisp-dm.org
Internet based resource: Training Material Repository
http://www.rapidminerresources.com
Internet based resource: Training Video Repository
http://www.neuralmarkettrends.com