4.30 Machine Learning
Pádraig Cunningham
Machine Learning Group
University College Dublin
Intro to ML

Outline
* Week 1
  * Introduction & General Overview of Matrix Decomposition
  * Nearest Neighbour Classifiers
  * Tutorial
* Week 2: Neural Networks
  * Simple Perceptron, Backpropagation
  * Other Architectures: Hopfield, Self-Organising Maps
  * Tutorial
* Week 3
  * Support Vector Machines
  * Kernel Methods & Evaluation
  * Tutorial
* Week 4
  * Decision Trees
  * Naïve Bayes
  * Tutorial

Outline (continued)
* Week 5: Ensemble Techniques
  * Bagging
  * Boosting
  * Tutorial
* Coursework
  * 3-4 pieces, 15 hours, Weka & Java
* Week 6: Unsupervised Learning
  * Hierarchical Clustering
  * Other Clustering Algorithms: k-Means, Spectral Clustering
  * Tutorial
* Week 7: Dimension Reduction
  * Principal Components Analysis, LSI, SVD
  * Feature Selection
  * Tutorial
* Later
  * 2 revision tutorials

Why Machine Learning
* Recent progress in algorithms and theory
* Loads of processing power
* Computational power is available
* Growing flood of online data
  * Amazon
  * Google

3 niches for ML
* Data mining: using historical data to improve decisions
  * medical records → medical knowledge
* Software applications that cannot be programmed by hand
  * autonomous driving
  * speech recognition
  * i.e. weak theory domains
* Self-customising programs
  * Personalised Newspaper
  * E-mail filtering

Data-mining in medical records
* Quality Assurance in Maternity Care.
* http://svr-www.eng.cam.ac.uk/projects/qamc/qamc.html

Rule Learning
* The QAMC system uses decision trees (I think!)
* It is also possible to extract rules from data:
  If
    No previous normal delivery, and
    Abnormal 2nd Trimester Ultrasound, and
    Malpresentation at admission
  Then
    Probability of Emergency C-Section is 0.6
  * Over training data: 26/41 = 0.63
  * Over test data: 12/20 = 0.6
  (Rule taken from Machine Learning by Tom Mitchell)

Spam Filtering
* For Machine Learning…
  * Lots of training data
  * High dimensionality data (lots of features)
  * Email is a diverse concept
    * Porn, mortgage, religion, cheap drugs…
    * Work, family, play…
* Spam Filtering is a challenge because…
  * Arms race: spammers vs filters
  * False Positives are unacceptable
  * Spam is a changing concept

ALVINN
* Problems too difficult to program by hand
* ALVINN drives at 70mph on motorways

Autonomous Vehicles
* DARPA Grand Challenge 2005
* Winner: Stanley from Stanford
* Various modules use ML

Smart Radio
* Internet-based music radio
* Personalised
* Collaborative Recommendation
  * supported by knowledge discovery from log data
* Content-Based Recommendation
  * supported by feature extraction from sound files
  * feature selection refinement

Smart Radio
* Smart Radio is a web-based client-server music application which allows listeners to build, manage and share music programmes
* The project was set up to look at a possible model for:
  * The regulated distribution of music on the web
  * A personalised stream of music service
* To provide an architecture and data to test our data mining and collaborative filtering algorithms

ML Dimensions
* Lazy vs Eager
  * k-NN vs rule learning
* Supervised vs Unsupervised
* Symbolic vs Sub-symbolic
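
The lazy vs eager dimension can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy two-feature data, the class names `LazyKNN` and `EagerThresholdRule`, and the choice of a single-threshold rule (a decision stump) as a simple stand-in for rule learning. The lazy learner only stores the training examples and does all its work at query time; the eager learner builds an explicit rule during training and no longer needs the data afterwards.

```python
import math
from collections import Counter

# Hypothetical toy data: (feature vector, label). Not taken from the lecture.
TRAIN = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.5), "A"),
    ((6.0, 6.5), "B"),
    ((7.0, 6.0), "B"),
    ((6.5, 7.5), "B"),
]


class LazyKNN:
    """Lazy learner: training just stores the examples."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []

    def fit(self, examples):
        self.examples = list(examples)  # no generalisation happens here
        return self

    def predict(self, x):
        # Work is deferred to query time: rank stored examples by distance to x.
        ranked = sorted(self.examples, key=lambda ex: math.dist(ex[0], x))
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


class EagerThresholdRule:
    """Eager learner: training builds one explicit rule of the form
    'if x[feature] <= threshold then label_below else label_above'.
    Assumes at least one feature takes two or more distinct values."""

    def fit(self, examples):
        best = None
        for f in range(len(examples[0][0])):
            values = sorted({x[f] for x, _ in examples})
            # Candidate thresholds: midpoints between consecutive distinct values.
            for lo, hi in zip(values, values[1:]):
                t = (lo + hi) / 2
                below = Counter(lbl for x, lbl in examples if x[f] <= t)
                above = Counter(lbl for x, lbl in examples if x[f] > t)
                (lbl_b, n_b), (lbl_a, n_a) = below.most_common(1)[0], above.most_common(1)[0]
                # Keep the first threshold with the most training examples correct.
                if best is None or n_b + n_a > best[0]:
                    best = (n_b + n_a, f, t, lbl_b, lbl_a)
        _, self.feature, self.threshold, self.label_below, self.label_above = best
        return self

    def predict(self, x):
        if x[self.feature] <= self.threshold:
            return self.label_below
        return self.label_above


if __name__ == "__main__":
    knn = LazyKNN(k=3).fit(TRAIN)
    rule = EagerThresholdRule().fit(TRAIN)
    query = (1.2, 1.4)
    print("k-NN:", knn.predict(query))
    print("rule:", rule.predict(query))
    print(f"learned rule: if x[{rule.feature}] <= {rule.threshold} "
          f"then {rule.label_below} else {rule.label_above}")
```

On this toy data both learners agree, but only the eager one produces a model that can be inspected on its own (the printed rule); the k-NN prediction cannot be made without the stored examples.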