Machine Learning and Data Mining Reading List – Fall 2006

09/05: Introduction
• T. Mitchell. Chapter 1, Machine Learning, McGraw-Hill, 1998.
• C. E. Brodley, T. Lane, and T. Stough. “Knowledge discovery in databases.” American Scientist, 87(1):54–61, 1999.

09/07: Decision Trees and Noisy Data
• T. Mitchell. Chapter 3, Machine Learning, McGraw-Hill, 1998.

09/12: Finish D-Trees, k-NN, Neural Networks
• Duda, Hart and Stork. Chapter 6: Multilayer Neural Networks, Pattern Classification, second ed., Wiley, 2001.

09/14: NNs

09/19: Finish NN, Feature Selection and Evaluation
• CRITIQUE REQUIRED: G. H. John, R. Kohavi, and K. Pfleger. “Irrelevant features and the subset selection problem.” In Machine Learning: Proceedings of the Eleventh International Conference, pages 121–129, New Brunswick, NJ, 1994. Morgan Kaufmann.
• S. Salzberg. “On comparing classifiers: Pitfalls to avoid and a recommended approach.” Data Mining and Knowledge Discovery, 1:317–327, 1998.

09/21: Finish Evaluation, start Clustering/Unsupervised learning – K-means, EM
• J. Han and M. Kamber. Chapter 8, pages 335–362, Data Mining: Concepts and Techniques.

09/25: LUNCHTIME TUTORIAL (12:00, Halligan 106) – Lagrange multipliers

09/26: Clustering

09/28: Presentation Preferences DUE. Finish Clustering, start SVMs
• CRITIQUE REQUIRED: K. Bennett and C. Campbell. “Support Vector Machines: Hype or Hallelujah?” SIGKDD Explorations, 2(2):1–13, 2000.

09/29: Poster check with D. Sculley

10/03: Guest Lecturer: Rob Holte
• CRITIQUE REQUIRED: Chris Drummond and Robert C. Holte. “Explicitly representing expected cost: an alternative to ROC representation.” KDD ’00: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 198–207, 2000.

10/04: 5-7pm POSTER Session (Don’t forget the colloquium at 3-4!)

10/05: Finish EM, start SVMs (Visit from Med school to talk about applications?)
• OPTIONAL reading: Christopher J. C. Burges. “A Tutorial on Support Vector Machines for Pattern Recognition.” Data Min. Knowl. Discov., 2(2), 1998.

10/10: Finish SVMs! Advanced Supervised Learning – Student Presentations
• Critique for ONE Poster paper of your choice due IFF you are NOT a poster presenter

10/12: Active Learning – Student Presentations
• Presenter: Noah S. CRITIQUE REQUIRED: N. Roy and A. McCallum. “Toward optimal active learning through sampling of error reduction.” ICML 2001.
• Presenter: Josh D. CRITIQUE REQUIRED: T. Luo, K. Kramer, D. Goldgof, L. Hall, S. Samson, A. Remsen, and T. Hopkins. “Active learning to recognize multiple types of Plankton.” Journal of Machine Learning Research, vol. 6, pp. 589–613, 2005.

10/13: FRIDAY TUTORIAL (10:30, Location TBA) – basic statistics (including Bayes Rule, etc.)

10/17: Advanced SVM – Student Presentations
• Presenter: John O. CRITIQUE REQUIRED: B. Scholkopf, R. Williamson, A. Smola, J. Shawe-Taylor, J. Platt. “Support Vector Method for Novelty Detection.” http://axiom.anu.edu.au/ williams/papers/P126.pdf
• Presenter: Erin T. CRITIQUE REQUIRED: J. Platt, N. Cristianini, J. Shawe-Taylor. “Large Margin DAGs for Multiclass Classification.” http://research.microsoft.com/users/jplatt/D

10/19: Proposals DUE. D. Sculley to lecture on Kernel Methods
• Reading TBA

10/24: Umaa Rebapragada to lecture on Time series
• CRITIQUE REQUIRED: Sakurai, Yoshikawa, and Faloutsos. “FTW: Fast similarity search under the time warping distance.” Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pp. 326–337, 2005.

10/25: Other Learning Criteria – Student Presentations
• Presenter: Ben R. CRITIQUE REQUIRED: I. Davidson, S. S. Ravi. “Clustering with Constraints: Feasibility Issues and the k-Means Algorithm.” 5th SIAM Data Mining Conference (winner, best paper award).
• Presenter: Audrey G. CRITIQUE REQUIRED: B. Zadrozny and C. Elkan. “Learning and making decisions when costs and probabilities are both unknown.” KDD ’01: Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001, pp. 204–213.

10/31: Bias, Variance, Boosting and Bagging
• CRITIQUE REQUIRED: R. Schapire. “A brief introduction to boosting.” In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 1999.

11/02: Bayesian Learning, Naive Bayes Rule, Intro to PAC-Learning
• CRITIQUE REQUIRED: V. Metsis, I. Androutsopoulos, G. Paliouras. “Spam Filtering with Naive Bayes – Which Naive Bayes?” CEAS, 2006.

11/09: Semi-supervised learning and concept drift – Student Presentations
• Presenter: Andrew F. CRITIQUE REQUIRED: Sugato Basu, Arindam Banerjee, and Raymond J. Mooney. “Active Semi-Supervision for Pairwise Constrained Clustering.” Proceedings of the SIAM International Conference on Data Mining (SDM-2004), Lake Buena Vista, FL, April 2004.
• Presenter: Jonathan K. CRITIQUE REQUIRED: Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han. “Mining Concept-Drifting Data Streams Using Ensemble Classifiers.” KDD 2003.

11/07: Virtual Friday

11/14: Advanced Supervised Learning – Student Presentations
• Presenter: Byron W. CRITIQUE REQUIRED: Keogh, Chakrabarti, Pazzani and Mehrotra. “Locally adaptive dimensionality reduction for indexing large time series databases.” SIGMOD ’01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data, pp. 151–162, 2001.
• Presenter: Andy B. CRITIQUE REQUIRED: S. Tong, E. Chang. “Support Vector Machine Active Learning for Image Retrieval.” Proceedings of the ninth ACM International Conference on Multimedia.

11/16: TBA

11/21: Paper draft due, but no class!

11/28: (Reviews due) Project Presentations

11/30: Project Presentations

12/05: Project Presentations

12/07: (Final paper due) Project Presentations/Final Lecture