COMP H6035: Data Mining
Short Title:
Data Mining APPROVED
Full Title:
Data Mining
Module Code:
COMP H6035
ECTS credits:
6
NFQ Level:
8
Module Delivered in
no programmes
Module Contributor:
Laura Keyes
Module Description:
The aim of this module is to provide learners with knowledge and understanding of the steps involved in the discovery of information in data through the process of Data Mining. This module will give learners an in-depth understanding of how to prepare data for analysis and of a variety of data mining algorithms.
Learning Outcomes:
On successful completion of this module the learner will be able to:
1. Identify and discuss applications of data mining and its role in the marketplace, including case studies where data mining was used successfully
2. Describe and conduct:
   - exploratory data analysis to identify data quality problems and detect interesting subsets of data to form hypotheses for hidden information, with the use of visualisation
   - the data preparation phase and feature selection techniques used for data mining and the knowledge discovery process
   - the data modelling process, paradigms and algorithms for classification and prediction
3. Be competent at using and deriving algorithms to build Decision Tree, Back-Propagation Neural Network and K-NN models for classification and prediction in a data mining task
4. Describe and assess clustering analysis and association rules for data analysis
5. Implement and apply data mining for information discovery on a dataset using the CRISP-DM methodology
6. Assess the performance and patterns produced by a data mining process
Module Content & Assessment
Indicative Content
Introduction to Data Mining
Data Mining and Knowledge Discovery. Data Mining Tasks and Applications. Methodologies: CRISP-DM.
Data Understanding
Exploratory Data Analysis. Data Quality, Detecting Interesting Subsets of Data. Graphs, Cross-Tabulation.
Data Preparation
Data Cleaning: Handling Missing Data, Noisy Data and Outliers. Data Transformation: Smoothing, Normalisation. Data Reduction: Data Aggregation, Dimensionality Reduction, Sampling.
Classification
Data Modelling Techniques, Training and Test Data. Decision Trees: C5.0, CART. Neural Networks: Back-Propagation. Association Analysis and Market Basket Analysis. Model Evaluation Techniques.
Clustering
Introduction to Clustering: Partitioning, Hierarchical, Grid-based and Density-based. Algorithms: K-means, DBSCAN.
Indicative Assessment Breakdown
Course Work Assessment: 100.00%
Course Work Assessment
Assessment Type: Lab work
Assessment Description: The students will complete laboratory practical work that is designed to introduce them to, and familiarise them with, using data mining software.
Outcome addressed: 2,3,5,6
% of total: 15.00
Assessment Date: n/a
Assessment Type: Project
Assessment Description: This assessment will include a theoretical and a practical component, bringing together many of the concepts addressed in the module. Theoretical: Students will research and report on relevant case studies of data mining and knowledge discovery. Practical: Application and evaluation of the different phases in a data mining project on a dataset using a data mining tool such as RapidMiner. Students will be required to document their work in a comprehensive manner.
Outcome addressed: 1,2,3,5,6
% of total: 70.00
Assessment Date: n/a
Assessment Type: Practical/Skills Evaluation
Assessment Description: In-class test assessing knowledge of topics including feature selection techniques, clustering analysis and association rule mining.
Outcome addressed: 2,4
% of total: 15.00
Assessment Date: n/a
No Final Exam Assessment %
Indicative Reassessment Requirement
Coursework Only
This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.
ITB reserves the right to alter the nature and timings of assessment.
Indicative Module Workload & Resources
Indicative Workload: Full Time
Frequency: Every Week
Indicative Average Weekly Learner Workload: 2.00
Frequency: Every Week
Indicative Average Weekly Learner Workload: 2.00
Resources
Recommended Book Resources
Daniel T. Larose 2005, Discovering knowledge in data, Wiley-Interscience Hoboken, N.J. [ISBN: 0471666]
Jiawei Han and Micheline Kamber 2001, Data mining, Morgan Kaufmann Publishers San Francisco [ISBN: 1558604]
Pang-Ning Tan, Michael Steinbach, Vipin Kumar 2006, Introduction to data mining, Pearson Addison Wesley Boston [ISBN: 0321321367]
Paolo Giudici, Silvia Figini 2009, Applied Data Mining for Business and Industry, Wiley [ISBN: 9780470058879]
Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, Third Edition, Morgan Kaufmann [ISBN: 9780123814791]
Supplementary Book Resources
Robert Groth 2000, Data mining, Prentice Hall PTR Upper Saddle River, NJ [ISBN: 0130862711]
Ian H. Witten, Eibe Frank, Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, Morgan Kaufmann [ISBN: 9780123748560]
This module does not have any article/paper resources
Other Resources
Internet based resource: KDnuggets
http://www.kdnuggets.com
Internet based resource: CRISP-DM Life Cycle
http://www.crisp-dm.org
Internet based resource: Training Material Repository
http://www.rapidminerresources.com