COMP H6035: Data Mining

Short Title: Data Mining
APPROVED
Full Title: Data Mining
Module Code: COMP H6035
ECTS credits: 6
NFQ Level: 8
Module Delivered in: no programmes
Module Contributor: Laura Keyes

Module Description:
The aim of this module is to provide learners with knowledge and understanding of the steps involved in discovering information in data through the process of Data Mining. This module will give learners an in-depth understanding of how to prepare data for analysis and of a variety of data mining algorithms.

Learning Outcomes:
On successful completion of this module the learner will be able to:
1. Identify and discuss applications of data mining and its role in the marketplace, including case studies where data mining was used successfully
2. Describe and conduct
   - exploratory data analysis to identify data quality problems and detect interesting subsets of data to form hypotheses for hidden information, with the use of visualisation
   - the data preparation phase and feature selection techniques used for data mining and the knowledge discovery process
   - the data modelling process, paradigms and algorithms for classification and prediction
3. Be competent at using and deriving algorithms to build Decision Tree, Back-Propagation Neural Network and K-NN models for classification and prediction in a data mining task
4. Describe and assess clustering analysis and association rules for data analysis
5. Implement and apply data mining for information discovery on a dataset using the CRISP-DM methodology
6. Assess the performance and patterns produced by a data mining process

Module Content & Assessment

Indicative Content

Introduction to Data Mining
Data Mining and Knowledge Discovery. Data Mining Tasks and Applications. Methodologies: CRISP-DM.

Data Understanding
Exploratory Data Analysis. Data Quality, Detecting Interesting Subsets of Data. Graphs, Cross-Tabulation.

Data Preparation
Data Cleaning: Handling Missing Data, Noisy Data and Outliers. Data Transformation: Smoothing, Normalisation. Data Reduction: Data Aggregation, Dimensionality Reduction, Sampling.

Classification
Data Modelling Techniques, Training and Test Data. Decision Trees - C5.0, CART. Neural Networks - Back Propagation. Association Analysis and Market Basket Analysis. Model Evaluation Techniques.

Clustering
Introduction to Clustering - Partitioning, Hierarchical, Grid-based and Density-based. Algorithms - K-means, DBSCAN.

Indicative Assessment Breakdown
Course Work Assessment: 100.00%

Course Work Assessment

Assessment Type: Lab work
Assessment Description: The students will complete laboratory practical work that is designed to introduce them to, and familiarise them with, data mining software.
Outcome addressed: 2,3,5,6
% of total: 15.00
Assessment Date: n/a

Assessment Type: Project
Assessment Description: This assessment will include a theoretical and a practical component, bringing together many of the concepts addressed in the module. Theoretical: Students will research and report on relevant case studies of data mining and knowledge discovery. Practical: Application and evaluation of the different phases in a data mining project on a dataset using a data mining tool such as RapidMiner. Students will be required to document their work in a comprehensive manner.
Outcome addressed: 1,2,3,5,6
% of total: 70.00
Assessment Date: n/a

Assessment Type: Practical/Skills Evaluation
Assessment Description: In-class test assessing knowledge of topics including feature selection techniques, clustering analysis and association rule mining.
Outcome addressed: 2,4
% of total: 15.00
Assessment Date: n/a

No Final Exam Assessment

Indicative Reassessment Requirement
Coursework Only: This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.

ITB reserves the right to alter the nature and timings of assessment.

Indicative Module Workload & Resources

Indicative Workload: Full Time
Frequency     Indicative Average Weekly Learner Workload
Every Week    2.00
Every Week    2.00

Resources

Recommended Book Resources
- Daniel T. Larose 2005, Discovering knowledge in data, Wiley-Interscience, Hoboken, N.J. [ISBN: 0471666]
- Jiawei Han and Micheline Kamber 2001, Data mining, Morgan Kaufmann Publishers, San Francisco [ISBN: 1558604]
- Pang-Ning Tan, Michael Steinbach, Vipin Kumar 2006, Introduction to data mining, Pearson Addison Wesley, Boston [ISBN: 0321321367]
- Paolo Giudici, Silvia Figini 2009, Applied Data Mining for Business and Industry, Wiley [ISBN: 9780470058879]
- Jiawei Han, Micheline Kamber, Jian Pei, Data Mining: Concepts and Techniques, Third Edition, Morgan Kaufmann [ISBN: 9780123814791]

Supplementary Book Resources
- Robert Groth 2000, Data mining, Prentice Hall PTR, Upper Saddle River, NJ [ISBN: 0130862711]
- Ian H. Witten, Eibe Frank, Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, Morgan Kaufmann [ISBN: 9780123748560]

This module does not have any article/paper resources.

Other Resources
- Internet based resource: KD Nuggets, http://www.kdnuggets.com
- Internet based resource: CRISP-DM Life Cycle, http://www.crisp-dm.org
- Internet based resource: Training Material Repository, http://www.rapidminerresources.com