CSCI-B565 Data Mining

Course Objective

The course objective is to study algorithmic and practical aspects of discovering patterns and relationships in large databases. This course is designed to introduce basic concepts of data mining and also provide hands-on experience in data analysis, clustering and prediction. Data mining is a dynamic field with wide applications in areas such as finance, life sciences, social sciences, and medicine.

Textbooks

Required: Data Mining: Concepts and Techniques, by J. Han et al., Morgan Kaufmann 2006. (ISBN: 978-1-55860-901-3; $54.47)
Recommended: Introduction to Data Mining, by P.-N. Tan et al., Pearson 2006. (ISBN: 0-321-32136-7; $87.99)

Topics

basic concepts
  o introduction to data mining, origins of data mining, data mining tasks
  o relational databases, transactional databases, data warehouses
data
  o types of data, data quality, similarity metrics, summary statistics
  o data preprocessing: cleaning, normalization, reduction, transformation, integration
data warehouse and OLAP technology for data mining
  o multidimensional data model and OLAP operations
  o warehouse architecture, implementations and relationship with data mining
association rule mining
  o basic concepts: frequent itemset generation, rule generation, apriori and FP-growth algorithms
  o advanced concepts: graph data, sequential patterns, infrequent patterns, concept hierarchies
classification and regression algorithms
  o Bayesian classification, k-nearest neighbor, neural networks, classification and regression trees, support vector machines, ensemble methods
  o handling biased data and class-imbalanced data
clustering
  o partitioning methods (k-means, k-medoids) and hierarchical methods (agglomerative/divisive clustering)
  o density-based, graph-based, prototype-based, and model-based clustering
  o clustering with constraints
anomaly detection
  o statistical approaches to outlier detection
  o density-based, proximity-based, clustering-based techniques
mining complex types of data
  o mining spatial, text, time-series and multimedia data
  o mining web data, mining graphs
  o mining streaming data
human factors and social issues
  o ethics of data mining and social impacts
  o privacy-preserving data mining
  o user interfaces, data and result visualization

Grading

Midterm exam: 25%
Final exam/project: 25%
Homework assignments: 40%
Class participation/activity: 10%

School of Informatics and Computing, Indiana University