Advanced Statistical Data Mining and Optimization Methods for Machine
Learning (Summer 2009)
Instructor
Dr. Myong K. (MK) Jeong (Assistant Professor, Department of Industrial & Systems
Engineering and RUTCOR (Rutgers Center for Operations Research), Rutgers University)
Contact: [email protected]; http://www.rci.rutgers.edu/~mjeong
Class information
Location: TBD (to be determined)
Class times: Mon and Wed, 1:30 pm – 3:30 pm
Tentative duration: July 6 – August 19 (7 weeks)
(Class times can be rearranged at the first class)
Course Description
This course will focus on statistical data mining and machine learning. It will also
introduce data mining formulations based on mathematical programming (e.g.,
Mangasarian's work).
Textbooks
1. No textbook is required. Class materials for each topic will be provided by the instructor.
2. Some recommended books
- The Elements of Statistical Learning by T. Hastie et al., Springer
- Pattern Recognition and Machine Learning by Christopher M. Bishop, Springer
- Bayesian Data Analysis by A. Gelman et al., Chapman & Hall
Topics to Be Covered
(The topics and the depth of their coverage may be adjusted according to the audience.)
O Classification
- Support vector machines (SVMs), convex optimization for machine learning (see the
sketch after this list)
- Bayesian version of SVMs: relevance vector machine (RVM)
- Variants of SVMs: introduction to Mangasarian's work (e.g., robust SVMs, sparse
SVMs, L0-norm SVM, ...)
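The following is a minimal sketch of SVM classification with scikit-learn; the dataset, the scaling step, and the value of C are illustrative assumptions, not part of the course materials.

# Illustrative SVM classification sketch (dataset and hyperparameters assumed).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize features, then fit a linear-kernel SVM; C trades margin width
# against training error.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))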
O Regression (Prediction or Calibration)
- Loss functions & regularization: ridge regression, Lasso, and others (see the sketch
after this list)
- SVMs for regression, Bayesian SVMs for regression
- Function approximation
- Robust regression
- Optimization-based modeling for regression
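A small sketch of L2 (ridge) versus L1 (Lasso) regularization with scikit-learn; the synthetic data and the alpha values are assumptions chosen only to show how the Lasso penalty drives coefficients to zero.

# Illustrative ridge vs. Lasso sketch (synthetic data and alphas assumed).
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X[:, 0] * 3.0 - X[:, 1] * 2.0 + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)   # shrinks all coefficients toward zero
lasso = Lasso(alpha=0.1).fit(X, y)   # sets many coefficients exactly to zero
print("ridge coefficients:", np.round(ridge.coef_, 2))
print("lasso coefficients:", np.round(lasso.coef_, 2))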
O Feature Selection and Extraction
(1) Criteria
- Metrics and distance functions depending on data types (numeric, text, ...)
- Distance between probability distributions (e.g., divergence)
- Dynamic time warping (see the sketch after this list)
- Separability measures
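As an illustration of the dynamic time warping entry above, here is a small pure-Python sketch of the standard dynamic-programming recursion; the example sequences are made up.

# Illustrative DTW sketch using the standard O(len(a) * len(b)) recursion.
def dtw_distance(a, b):
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = DTW distance between a[:i] and b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

print(dtw_distance([0, 1, 2, 3, 2, 1], [0, 1, 1, 2, 3, 2, 1]))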
(2) Optimization methods for feature selection
- Sequential forward/backward selection, recursive feature elimination (RFE; see the
sketch after this topic group)
- Forward/backward floating search, branch-and-bound method, genetic algorithms
(3) Other topics
- Bayesian variable selection
- SVM-based feature selection
- Huge-scale feature selection
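An illustrative sketch of recursive feature elimination with a linear SVM as the ranking model, using scikit-learn; the dataset and the number of retained features are assumptions.

# Illustrative RFE sketch (dataset and number of selected features assumed).
from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# RFE repeatedly fits the linear SVM and drops the lowest-weighted feature.
selector = RFE(SVC(kernel="linear"), n_features_to_select=2)
selector.fit(X, y)
print("selected feature mask:", selector.support_)
print("feature ranking:", selector.ranking_)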
O Transformation of variables
- Principal component analysis (PCA; see the sketch after this list)
- Variants of PCA: sparse PCA, probabilistic PCA, kernel PCA
- Preprocessing of spectral data: orthogonal signal correction (OSC)
- Wavelet transform
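A brief PCA sketch with scikit-learn on synthetic, correlated data; the data and the number of components are assumptions made for illustration.

# Illustrative PCA sketch: project onto directions of maximal variance.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]  # introduce a strong correlation

pca = PCA(n_components=2)
Z = pca.fit_transform(X)                 # scores in the 2-D principal subspace
print("explained variance ratio:", pca.explained_variance_ratio_)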
O Decision trees
- Classification trees
- Regression trees
O Ensemble Methods
- Bagging
- Boosting (AdaBoost) (see the sketch after this list)
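An illustrative comparison of bagging and AdaBoost with decision-tree base learners in scikit-learn; the dataset, base learners, and ensemble sizes are assumptions.

# Illustrative bagging vs. AdaBoost sketch (dataset and parameters assumed).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
# Bagging averages trees fit on bootstrap samples; AdaBoost reweights examples
# so each new stump focuses on previously misclassified points.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=50,
                         random_state=0)
print("bagging CV accuracy:", cross_val_score(bag, X, y, cv=5).mean())
print("AdaBoost CV accuracy:", cross_val_score(ada, X, y, cv=5).mean())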
O Logical Analysis of Data (LAD) for Classification
(E. Boros et al., IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 2, March/April 2000)
(1) Introduction to LAD and optimization models for LAD
- Introduction to Boolean functions
- Binarization of numerical variables & minimal support set identification through the
set covering problem (a binarization sketch follows this section)
- Pattern generation and selection, construction of a discriminant
(2) Comparison with CART and SVMs for classification
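A hedged sketch of one common binarization rule in the spirit of LAD: candidate cut points are placed between consecutive observations of different classes, and each cut point becomes a binary indicator feature. The full procedure of Boros et al. also selects a minimal support set of cut points via a set-covering model, which this sketch does not attempt.

# Illustrative LAD-style binarization of a single numerical variable.
def candidate_cut_points(values, labels):
    pairs = sorted(zip(values, labels))
    cuts = []
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if v1 != v2 and c1 != c2:
            cuts.append((v1 + v2) / 2.0)   # midpoint where the class changes
    return cuts

def binarize(values, cuts):
    # Each cut point c yields the indicator feature [value >= c].
    return [[int(v >= c) for c in cuts] for v in values]

x = [0.2, 0.5, 0.9, 1.4, 2.0, 2.3]
y = [0, 0, 1, 1, 0, 0]
cuts = candidate_cut_points(x, y)
print("cut points:", cuts)
print("binary features:", binarize(x, cuts))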
O Other topics
- Reinforcement learning
- Clustering (see the sketch after this list)
- Bayesian decision theory
- Spatial data mining
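A small clustering sketch (k-means via scikit-learn) for the clustering entry above; the synthetic data and the number of clusters are assumptions.

# Illustrative k-means sketch on two synthetic, well-separated blobs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=0.0, size=(50, 2)),
               rng.normal(loc=5.0, size=(50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster centers:")
print(km.cluster_centers_)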
Grading Policy:
Homework (50%) and class project with an oral presentation (50%)