Lahore University of Management Sciences
CS 432 – Introduction to Data Mining
CS 536 – Data Mining
Spring 2015-2016

Instructor: Mian Muhammad Awais
Room No.: 9-115A
Office Hours:
Email: [email protected]
Telephone: 8188
Secretary/TA:
TA Office Hours:
Course URL (if any):

Course Basics
Credit Hours: 3
Lecture(s): Nbr of Lec(s) Per Week: 2; Duration: 75 minutes
Recitation/Lab (per week): Nbr of Lab(s) Per Week: 0; Duration:
Tutorial (per week): Nbr of Lec(s) Per Week: Optional; Duration:

Course Distribution
Core:
Elective: X
Open for Student Category: Junior, Senior
Closed for Student Category: Freshmen, Sophomore

COURSE DESCRIPTION
Data mining, or the discovery of knowledge in large datasets, has created a lot of interest in the business and research communities in recent years. The tremendous increase in the generation and collection of data has highlighted the need for systems that can extract useful and actionable knowledge from large datasets. This course will provide a comprehensive introduction to the data mining process; build theoretical and conceptual foundations of key data mining tasks such as itemset mining and clustering; discuss analysis and implementation of algorithms; and introduce major sub-areas such as text and web mining. Emphasis will be placed on the design and application of efficient and scalable algorithms. Students will get hands-on experience through the implementation of algorithms and the use of software in assignments and the course project.

COURSE PREREQUISITE(S)
CS 202 - Data Structures, or graduate standing

COURSE OBJECTIVES
- To develop the concepts of and the techniques in key data mining tasks
- To provide hands-on experience with data mining using tools
- To encourage innovative and useful applications of data mining tasks

Learning Outcomes
- Explore, visualize, and analyze large datasets
- Select and evaluate data mining techniques for the discovery of relevant knowledge from datasets
- Understand efficiency, scalability, and correctness challenges in data mining

Grading Breakup and Policy
Assignment(s): 10%
Quiz(zes): 15%
Midterm Exam: 25%
Project: 15%
Final Exam: 35%

Examination Detail
Midterm Exam: Yes
  Combine/Separate:
  Duration: 75 minutes
  Preferred Date:
  Exam Specifications: closed books/notes; help sheet and calculator allowed
Final Exam: Yes
  Combine/Separate:
  Duration: 2 hours
  Exam Specifications: closed books/notes; help sheet and calculator allowed

COURSE OVERVIEW
Lectures 1-2: Overview of Data Mining
  Topics: Need and motivation; data mining process; data mining tasks and functionalities; interestingness measures
  Recommended Readings: Ch. 1
Lectures 3-7: Data Understanding and Preprocessing
  Topics: Data exploration and visualization; basic statistics; data cleaning, data reduction, dimensionality reduction; discretization, concept hierarchies
  Recommended Readings: Ch. 2 & 3
Lectures 8-14: Mining Frequent Patterns and Associations
  Topics: Basic definitions, market basket analysis, Apriori algorithm, FP-growth algorithm, mining complex patterns, constrained itemset mining, sequential pattern mining
  Recommended Readings: Ch. 5; sections from WDM
Lecture 15: MIDTERM EXAM
Lectures 16-21: Cluster Analysis
  Topics: Similarity measures; partitioning methods (K-Means, K-Medoids); hierarchical methods; density-based methods; graph-based methods; outlier/anomaly detection
  Recommended Readings: Ch. 7, selected papers
Lectures 22-27: Applications
  Topics: Sentiment analysis, opinion mining, behavior modeling, etc.
  Recommended Readings: Handouts/Relevant Book Chapters
Lecture 28: Makeup and/or review

Textbook(s)/Supplementary Readings
- Data Mining: Concepts and Techniques, J. Han, M. Kamber, and J. Pei, Third Edition, Morgan Kaufmann Publishers, 2011.
- Web Data Mining (WDM), B. Liu, Springer, 2006.
- Introduction to Information Retrieval, C. Manning et al., Cambridge University Press, 2008. Available online.
- Reference: Introduction to Data Mining, P.-N. Tan et al., Addison-Wesley, 2006.