Exeter College Summer Programme
Introduction to Machine Learning

Course Description
Machine learning is the study of methods and algorithms that can automatically discover patterns and hidden structure in data, and then use these discoveries to predict future data and generalise to unseen situations or environments. It is widely used across many scientific and engineering disciplines. This course covers the fundamentals of machine learning, including unsupervised learning methods for dimensionality reduction and clustering, recommender systems, and supervised learning for classification and regression. Emphasis is placed on essential concepts such as empirical risk minimisation, the bias/variance tradeoff and probabilistic generative models.

Prerequisites
A basic background in algorithms and programming, as well as a good grasp of linear algebra, calculus and probability. Basic knowledge of optimisation techniques is desirable.

Teaching Methods and Assessment
• 12 x 1.25hr Lectures (15hrs)
• 6 x 1.25hr Seminar problem classes (7.5hrs)
• 2 x 1.25hr Tutorials (2.5hrs)

Performance Evaluation
Final examination: 60%
Problem sheets and research project: 30%
Participation and attendance: 10%

Core Reading
The recommended texts for the course are:
Murphy, Machine Learning: A Probabilistic Perspective, MIT Press, 2012.
Bishop, Pattern Recognition and Machine Learning, Springer, 2007.
Hastie, Tibshirani and Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009.

Lecture Schedule
1. Introduction
a) Types of machine learning
b) Parametric vs. nonparametric models
c) Probability distributions and inference
d) Properties of the Gaussian distribution
e) Nearest-neighbour classifier and the curse of dimensionality
Reading: Murphy 1; Bishop 2.3

2. Overview of Supervised Learning
a) Loss, risk and empirical risk minimisation
b) Regularisation and model complexity
c) Linear regression and ridge regression
d) Bias/variance tradeoff
Reading: Hastie et al. 2.4-6, 2.9; Bishop 3.1-2

3. Linear Methods for Classification
a) Discriminant analysis
b) Generative vs. discriminative classifiers
c) Logistic regression
Reading: Hastie et al. 4; Murphy 8

4. Support Vector Machines
a) Margin maximisation
b) Dual SVM
c) Solution sparsity and support vectors
Reading: Bishop 7.1; Hastie et al. 12.2

5. Kernel Methods
a) Feature maps and the kernel trick
b) Examples of kernels
c) Kernel ridge regression
d) Representer theorem
e) Kernel SVM
Reading: Murphy 14.1-4; Hastie et al. 12.3

6. Dimensionality Reduction
a) Feature selection and extraction
b) Principal Components Analysis (PCA)
c) Kernel PCA
d) Multidimensional scaling
Reading: Bishop 12.1, 12.3; Hastie et al. 14.8

7. Cluster Analysis
a) K-means algorithm
b) Hierarchical clustering
c) Spectral clustering
Reading: Hastie et al. 14.3; Murphy 25.4-6

8. Probabilistic Unsupervised Learning
a) Gaussian mixtures
b) Expectation-Maximisation algorithm
c) Probabilistic PCA
Reading: Bishop 9, 12.2; Murphy 11

9. Gaussian Processes
a) Prior distributions on function spaces
b) Gaussian process regression
c) Estimating the kernel parameters
d) Gaussian process classification
Reading: Murphy 15; Bishop 6.4

10. Random Forests
a) Criteria for learning decision trees
b) Bagging
c) Variable importance
Reading: Hastie et al. 15

11. Neural Networks
a) Single-layer perceptron
b) Multilayer perceptron
c) Weight decay
d) Backpropagation algorithm
Reading: Bishop 5.1-3; Murphy 16.5

12. Deep Learning
a) Deep generative models
b) Auto-encoders
c) Convolutional neural networks
Reading: Murphy 28
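
Illustrative Code Examples
The following short Python sketches give a flavour of two algorithms on the schedule; they are illustrative only and not part of the assessed material. The first treats ridge regression (Lecture 2) as empirical risk minimisation: it minimises the mean squared error plus a squared-norm penalty, using the closed-form solution. The data and the regularisation strength lam are synthetic, illustrative choices, not course materials.

    import numpy as np

    # Ridge regression as empirical risk minimisation (Lecture 2):
    # minimise (1/n) * ||y - X w||^2 + lam * ||w||^2 over w.
    # Closed form: w = (X^T X + n * lam * I)^{-1} X^T y.

    rng = np.random.default_rng(0)
    n, d = 100, 5
    X = rng.normal(size=(n, d))                 # synthetic design matrix
    w_true = rng.normal(size=d)                 # synthetic "true" weights
    y = X @ w_true + 0.1 * rng.normal(size=n)   # noisy targets

    lam = 0.1  # regularisation strength (illustrative value)
    w_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)

    print("estimated weights:", np.round(w_hat, 3))
    print("empirical risk (MSE):", np.mean((y - X @ w_hat) ** 2))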
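A second sketch, in the same spirit, implements the K-means algorithm (Lecture 7), alternating between assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points. The number of clusters and the synthetic two-dimensional blobs are again illustrative choices.

    import numpy as np

    def kmeans(X, k, n_iter=100, seed=0):
        """K-means (Lecture 7): alternate assignment and update steps
        until the centroids stop moving."""
        rng = np.random.default_rng(seed)
        # Initialise centroids with k distinct data points.
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Assignment step: nearest centroid in squared Euclidean distance.
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Update step: each centroid becomes the mean of its cluster
            # (an empty cluster keeps its previous centroid).
            new_centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return centroids, labels

    # Synthetic two-dimensional blobs (illustrative only).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
                   for c in ([0, 0], [3, 3], [0, 3])])
    centroids, labels = kmeans(X, k=3)
    print(np.round(centroids, 2))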