Machine Learning and Data Mining
Reading List – Fall 2006
09/05: Introduction
• T. Mitchell. Chapter 1, Machine Learning, McGraw-Hill, 1998.
• C. E. Brodley, T. Lane, and T. Stough. “Knowledge discovery in databases.” American
Scientist, 87(1):54–61, 1999.
09/07: Decision Trees and Noisy Data
• T. Mitchell. Chapter 3, Machine Learning, McGraw-Hill, 1998.
09/12: Finish D-Trees, k-NN, Neural Networks
• Duda, Hart and Stork, Chapter 6: Multilayer Neural Networks, Pattern Classification,
second Ed, Wiley, 2001.
09/14: NN’s
09/19: Finish NN, Feature Selection and Evaluation
• CRITIQUE REQUIRED: G. H. John, R. Kohavi, and K. Pfleger. “Irrelevant
features and the subset selection problem.” In Machine Learning: Proceedings of the
Eleventh International Conference, pages 121–129, New Brunswick, NJ, 1994. Morgan
Kaufmann.
• S. Salzberg. “On comparing classifiers: Pitfalls to avoid and a recommended approach.”
Data Mining and Knowledge Discovery, 1:317–327, 1998.
09/21: Finish Evaluation, start Clustering/Unsupervised learning – K-means, EM
• J. Han and M. Kamber. Chapter 8, pages 335–362, Data Mining: Concepts and Techniques.
09/25: LUNCHTIME TUTORIAL (12:00 Halligan 106) – Lagrange multipliers
09/26: Clustering
09/28: Presentation Preferences DUE. Finish Clustering, start SVMs
• CRITIQUE REQUIRED: K. Bennett and C. Campbell. “Support Vector Machines:
Hype or Hallelujah?” SIGKDD Explorations, 2(2):1–13, 2000.
09/29: Poster check with D. Sculley
10/03: Guest Lecturer: Rob Holte
• CRITIQUE REQUIRED: Chris Drummond and Robert C. Holte, “Explicitly representing expected cost: an alternative to ROC representation,” KDD ’00: Proceedings
of the sixth ACM SIGKDD international conference on Knowledge discovery and data
mining, pp. 198-207, 2000.
10/04: 5-7pm POSTER Session (Don’t forget the colloquium at 3-4!)
10/05: Finish EM, start SVMs (Visit from Med school to talk about applications?)
• OPTIONAL reading: Christopher J. C. Burges, “A Tutorial on Support Vector Machines
for Pattern Recognition,” Data Min. Knowl. Discov., 2(2), 1998.
10/10: Finish SVMs!
Advanced Supervised Learning – Student Presentations
• Critique for ONE Poster paper of your choice due IFF you are NOT a poster
presenter
10/12: Active Learning – Student Presentations
• Presenter: Noah S. CRITIQUE REQUIRED: N. Roy and A. McCallum, “Toward
optimal active learning through sampling of error reduction,” ICML 2001.
• Presenter: Josh D. CRITIQUE REQUIRED: T. Luo, K. Kramer, D. Goldgof, L.
Hall, S. Samson, A. Remsen, and T. Hopkins, “Active learning to recognize multiple
types of Plankton,” Journal of Machine Learning Research, vol. 6, pp. 589–613, 2005.
10/13: FRIDAY TUTORIAL (10:30 Location TBA) – basic statistics (including Bayes’ Rule, etc.)
10/17: Advanced SVM – Student Presentations
• Presenter: John O. CRITIQUE REQUIRED: B. Schölkopf, R. Williamson, A.
Smola, J. Shawe-Taylor, J. Platt, “Support Vector Method for Novelty Detection,”
http://axiom.anu.edu.au/ williams/papers/P126.pdf
• Presenter: Erin T. CRITIQUE REQUIRED: J. Platt, N. Cristianini, J. Shawe-Taylor, “Large Margin DAGs for Multiclass Classification,” http://research.microsoft.com/users/jplatt/D
10/19: Proposals DUE. D. Sculley to lecture on Kernel Methods
• Reading TBA
10/24: Umaa Rebapragada to lecture on Time series
• CRITIQUE REQUIRED: Sakurai, Yoshikawa, and Faloutsos, “FTW: Fast similarity search under the time warping distance,” Proceedings of the twenty-fourth ACM
SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pp. 326–337, 2005.
10/25: Other Learning Criteria – Student Presentations
• Presenter: Ben R. CRITIQUE REQUIRED: I. Davidson, S.S. Ravi, “Clustering
with Constraints: Feasibility Issues and the k-Means Algorithm,” 5th SIAM Data Mining
Conference (best paper award winner)
• Presenter: Audrey G. CRITIQUE REQUIRED: B. Zadrozny and C. Elkan, “Learning and making decisions when costs and probabilities are both unknown,” KDD ’01:
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001, pp. 204–213.
10/31: Bias, Variance, Boosting and Bagging
• CRITIQUE REQUIRED: R. Schapire, “A brief introduction to boosting,” In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 1999.
11/02: Bayesian Learning, Naive Bayes Rule, Intro to PAC-Learning
• CRITIQUE REQUIRED: V. Metsis, I. Androutsopoulos, G. Paliouras, “Spam Filtering with Naive Bayes – Which Naive Bayes?” CEAS, 2006.
11/09: Semi-supervised learning and concept drift – Student Presentations
• Presenter: Andrew F. CRITIQUE REQUIRED: Sugato Basu, Arindam Banerjee,
and Raymond J. Mooney, “Active Semi-Supervision for Pairwise Constrained Clustering,”
Proceedings of the SIAM International Conference on Data Mining (SDM-2004), Lake
Buena Vista, FL, April 2004.
• Presenter: Jonathan K. CRITIQUE REQUIRED: Haixun Wang, Wei Fan, Philip S.
Yu, Jiawei Han, “Mining Concept-Drifting Data Streams Using Ensemble Classifiers,”
KDD 2003.
11/07: Virtual Friday
11/14: Advanced Supervised Learning – Student Presentations
• Presenter: Byron W. CRITIQUE REQUIRED: Keogh, Chakrabarti, Pazzani and
Mehrotra, “Locally adaptive dimensionality reduction for indexing large time series
databases,” SIGMOD ’01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data, pp. 151–162, 2001.
• Presenter: Andy B. CRITIQUE REQUIRED: S. Tong, E. Chang, “Support Vector
Machine Active Learning for Image Retrieval,” Proceedings of the ninth ACM International Conference on Multimedia.
11/16: TBA
11/21: Paper draft due, but no class!
11/28: (Reviews due) Project Presentations
11/30: Project Presentations
12/05: Project Presentations
12/07: (Final paper due) Project Presentations/Final Lecture