COMP H3027: Data Mining

Short Title: Data Mining
APPROVED
Full Title: Data Mining
Module Code: COMP H3027
ECTS credits: 5
NFQ Level: 7
Module Delivered in no programmes
Module Contributor: Geraldine Gray

Module Description: The aim of this module is to provide learners with knowledge and understanding of the steps involved in discovering information in data through the process of Data Mining. The module gives students an in-depth understanding of how to prepare data for analysis and of a variety of data mining algorithms.

Learning Outcomes: On successful completion of this module the learner will be able to
1. Have knowledge and understanding of the following processes and procedures in Data Mining:
   - exploratory data analysis, using visualisation, to identify data quality problems and detect interesting subsets of data from which to form hypotheses about hidden information
   - the data preparation phase and techniques used for data mining and the knowledge discovery process
   - the data modelling process, techniques and algorithms for classification and prediction
   - applications of data mining and its role in the marketplace, including case studies where data mining was used successfully
   - data mining products and methodologies
2. Implement and apply data mining for knowledge discovery on a dataset using the CRISP-DM methodology
3. Be competent at using and deriving algorithms to build Decision Tree and Back-Propagation Neural Network models for classification and prediction in a data mining task
4. Have knowledge of other categories of mining algorithms such as clustering and association analysis
5. Assess the performance and patterns produced by a data mining process

Module Content & Assessment

Indicative Content

Introduction to Data Mining
   Data Mining and Knowledge Discovery
   Data Mining Tasks and Applications
   Methodologies: CRISP-DM
Data Understanding
   Exploratory Data Analysis
   Data Quality, Detecting Interesting Subsets of Data
   Graphs, Cross-Tabulation
Data Preparation
   Data Cleaning: Handling Missing Data, Noisy Data and Outliers
   Data Transformation: Smoothing, Normalisation
   Data Reduction: Data Aggregation, Dimensionality Reduction, Sampling
Classification
   Data Modelling Techniques
   Training and Test Data
   Decision Trees - C5.0, CART
   Neural Networks - Back Propagation
   Association Analysis and Market Basket Analysis
   Model Evaluation Techniques
Clustering
   Introduction to Clustering
   K-means
   Hierarchical

Indicative Assessment Breakdown
   Course Work Assessment: 50.00%
   Final Exam Assessment: 50.00%

Course Work Assessment

Assessment Type: Lab work
   Assessment Description: Weekly lab work to complete a data mining task in RapidMiner
   Outcome addressed: 1,3,5
   % of total: 15.00
   Assessment Date: n/a

Assessment Type: Project
   Assessment Description: Complete all stages of CRISP-DM on a dataset provided by the lecturer
   Outcome addressed: 1,2,3,5
   % of total: 35.00
   Assessment Date: n/a

Final Exam Assessment

Assessment Type: Formal Exam
   Assessment Description: End-of-Semester Final Examination
   Outcome addressed: 1,3,5
   % of total: 50.00
   Assessment Date: End-of-Semester

Indicative Reassessment Requirement: Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a coursework element.
Reassessment Description: Students must repeat continuous assessment work and the final exam.

ITB reserves the right to alter the nature and timings of assessment.

Indicative Module Workload & Resources

Indicative Workload: Full Time
   Frequency: Every Week, Indicative Average Weekly Learner Workload: 2.00
   Frequency: Every Week, Indicative Average Weekly Learner Workload: 2.00

Resources

Recommended Book Resources:
   - Daniel T. Larose 2005, Discovering knowledge in data, Wiley-Interscience, Hoboken, N.J. [ISBN: 0471666572]
   - Jiawei Han and Micheline Kamber 2001, Data mining, Morgan Kaufmann Publishers, San Francisco [ISBN: 1558604898]
   - Pang-Ning Tan, Michael Steinbach, Vipin Kumar 2006, Introduction to data mining, Pearson Addison Wesley, Boston [ISBN: 0321321367]

Supplementary Book Resources:
   - I. H. Witten 2005, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann
   - Robert Groth 2000, Data mining, Prentice Hall PTR, Upper Saddle River, NJ [ISBN: 0130862711]

This module does not have any article/paper resources.

Other Resources:
   - Internet based resource: KD Nuggets, http://www.kdnuggets.com
   - Internet based resource: CRISP-DM Life Cycle, http://www.crisp-dm.org
   - Internet based resource: Training Material Repository, http://www.rapidminerresources.com
   - Internet based resource: Training Video Repository, http://www.neuralmarkettrends.com