Data Warehousing/Mining
Comp 150DW
Course Overview
Instructor: Dan Hebert

Comp 150
Thursday 6:50 - 9:50 PM
Instructor - Mr. Dan Hebert
– email - [email protected]
– Location - Halligan Hall, rm. 108

Course Description
Fundamental concepts and techniques of data warehousing and data mining
– concepts, principles, architecture, design, implementation, and application of data warehousing and data mining
Topics: data warehousing and OLAP technology for data mining; data preprocessing; data mining primitives, languages, and systems; descriptive data mining (characterization and comparison); association analysis; classification and prediction; cluster analysis; mining complex types of data; and applications and trends in data mining

Course Prerequisite
Comp 115 – Introduction to RDBMS
– Familiarity with programming in C/C++ is assumed
Students should be comfortable with:
– relational model basics
– relational algebra
– SQL
– views
– security
– conceptual database design and ER models
– schema refinement and normal forms
– physical database design and tuning

Required Textbook
Data Mining: Concepts and Techniques
– Jiawei Han & Micheline Kamber
– Morgan Kaufmann Publishers; ISBN: 1-55860-489-8

Reading Schedule
Lecture Date   Topic                                                                          Reading: Text Chapter
January 22     Introduction to Comp 150, Introduction                                         1
January 29     Data Warehouse and OLAP Technology for Data Mining                             2
February 5     Aggregation in SQL, Data Warehousing Introduction, Data Warehousing Design     Not In Book
February 19    President’s Day Schedule Shift – No Class
February 26    Data Warehouse Semantics, Semistructured Data                                  Not In Book
March 4        Data Preprocessing                                                             3
March 11       Data Mining Primitives, Languages, and System Architectures – Midterm Review   4
March 18       Midterm Exam

Reading Schedule (continued)
Lecture Date   Topic                                                                          Reading: Text Chapter
April 1        Concept Description: Characterization and Comparison                           5
April 8        Mining Association Rules in Large Databases                                    6
April 15       Classification and Prediction                                                  7
April 14       Cluster Analysis                                                               8
April 22       Mining Complex Types of Data                                                   9
April 29       Applications and Trends in Data Mining – Final Exam Review                     10
May 6          Reading Period/Project Completion
May 13         Final Exam

Grading
Homework   30%
Project    10%
Midterm    25%
Final      35%

Homework
Assigned weekly (each Wednesday)
– Due at the start of lecture the following Wednesday
Late policy:
– Homework turned in up to one week after the due date: 20% penalty
– Homework turned in any later: 100% penalty
Typical homework assignment
– Exercises from the text
– “Hands-on” problems that involve building data warehouses and performing data mining
– Working with PostgreSQL

Project
Develop a data warehouse and perform data mining on it, using PostgreSQL as the underlying datastore
Additional details will be provided as the course progresses

Midterm & Final
Open book, open notes
Class time will be set aside before the midterm and before the final to review the material covered

Computing Environment
All students will have a computer account on psql.cs.tufts.edu
– Account will work on all workstations in the SUN lab
The RDBMS used will be PostgreSQL
– For information: http://www.postgresql.org/index.html

Course Homepage
Course web page will be available
Lectures/homework assignments will also be posted in my account
– ~dhebert/comp150dw
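
As a worked illustration of the grading weights and the homework late policy listed earlier, the arithmetic can be sketched as below. This is only a sketch under stated assumptions: the slides do not say how scores are scaled or how the late penalty is applied, so the code assumes every component is scored from 0 to 100 and that the 20% penalty is taken off the score actually earned. The names `WEIGHTS`, `late_homework_score`, and `course_grade` are hypothetical and not part of the course materials.

```python
# Weights from the Grading slide, in percent (they sum to 100).
WEIGHTS = {"homework": 30, "project": 10, "midterm": 25, "final": 35}


def late_homework_score(score, days_late):
    """Apply the late policy to one homework score.

    Assumption: the penalty is a fraction of the earned score.
    Up to one week late costs 20%; anything later costs 100%.
    """
    if days_late <= 0:
        return score
    if days_late <= 7:
        return score * 80 / 100
    return 0


def course_grade(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
```

For example, a student with 90 on homework, 80 on the project, 70 on the midterm, and 85 on the final would pass those four values to `course_grade` as a dictionary keyed by the component names in `WEIGHTS`.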