CISC 492: Data Analytics
Instructor: D.B. Skillicorn

Data mining builds inductive models from data. Almost all organisations, and many individuals, accumulate data from their interactions, and can use this data to improve service and, sometimes, profit. The algorithms used for data mining must be efficient, because of the huge volumes of data that have to be examined, and sophisticated, because the benefit of an extracted concept depends heavily on how subtle it is.

This is a project course, meeting for two 1.5-hour sessions each week. We will examine a number of datasets, with each participant using a particular technique to investigate each dataset and see what structure the technique discovers. You will have a chance to try several different techniques during the course, and you will present your progress to the class each week. You may want to look at kaggle.com for some ideas about the kinds of data mining problems we will look at.

Good working knowledge of standard software environments is required, especially the ability to develop scripts and plot data (e.g. Excel, MATLAB, OpenGL + Perl, Python, Awk).

Prerequisites: CISC 333.

Textbooks (suggested):
Tan, Steinbach, Kumar, Introduction to Data Mining, Addison-Wesley.
Hand, Mannila, Smyth, Principles of Data Mining, MIT Press.

Assessment:
In-class performance, based on assessments from all class participants: 70%.
Take-home examination involving a simulated data mining task: 30%.