Ensemble methods with Data Streams
Jungbeom Lee
CS240B

Outline
• Intro
• Ensemble in Machine learning
• Online ensemble algorithms
• Future work

Intro
• Previous class: Data Streams
• Classifiers
• Ensemble methods
• Online algorithm

Classifiers
• The batch classification problem:
  – Given a finite training set D = {(x, y)}, where y ∈ {y1, y2, …, yk} and |D| = n, find a function y = f(x) that can predict the y value for an unseen instance x
• The data stream classification problem:
  – Given an infinite sequence of pairs of the form (x, y), where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x
• Example applications:
  – Fraud detection in credit card transactions
  – Topic classification in a news aggregation site, e.g. Google News
  – Translator for foreign languages

Motivations
• Online mining differs from static mining
  – Data volume
    ◦ impossible to mine the entire data set at one time
    ◦ can only afford constant memory per data sample
  – Changing data characteristics
    ◦ previously learned models become invalid
  – Cost of learning
    ◦ model updates can be costly
    ◦ can only afford constant time per data sample
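The data stream setting above (one pass over the data, constant memory and constant time per sample) can be illustrated with a small test-then-train loop. This is only a sketch under stated assumptions: the perceptron learner, the synthetic two-feature stream, and the names `OnlinePerceptron`, `synthetic_stream`, and `prequential_accuracy` are hypothetical choices for illustration and do not come from the slides.

```python
import random


class OnlinePerceptron:
    """Binary linear classifier updated one example at a time (illustrative)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features  # memory use is fixed by n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        s = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if s >= 0 else 0

    def learn_one(self, x, y):
        err = y - self.predict(x)  # -1, 0, or +1
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err


def synthetic_stream(n, seed=0):
    """Hypothetical stream: label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        yield x, int(x[0] + x[1] > 0)


def prequential_accuracy(model, stream):
    """Predict each example before training on it, then discard it."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test first
        model.learn_one(x, y)                  # then train
        total += 1
    return correct / total


model = OnlinePerceptron(n_features=2)
acc = prequential_accuracy(model, synthetic_stream(5000))
print(f"prequential accuracy: {acc:.3f}")
```

Each example is seen once and never stored, so the cost per sample does not grow with the length of the stream. A learner that has to cope with changing data characteristics would additionally need some way to forget or replace old models, which this sketch does not attempt.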
Ensemble
• A set of classifiers whose individual decisions are combined in some way to classify new examples
• An ensemble of classifiers can be more accurate than any of its individual members
• One key to success is to use individual classifiers with error rates below 0.5

Reasons

Ensemble methods
• Manipulating the training examples
  ◦ Bagging
  ◦ AdaBoost
• Injecting randomness
  ◦ C4.5 decision tree algorithm

Bagging algorithm

Online bagging algorithm

Online weighted bagging algorithm

AdaBoost algorithm

Adaptive boosting algorithm

Experimental Results

Type of Data

Future work
• Better online algorithm for bagging
• Dealing with multiple data types

References
• http://web.engr.oregonstate.edu/~tgd/publications/mcs-ensembles.pdf
• http://pages.bangor.ac.uk/~mas00a/papers/lkSUEMA2008.pdf
• http://web.cs.ucla.edu/~zaniolo/papers/NBCAJMW77MW0J8CP.pdf
• https://ti.arc.nasa.gov/m/pubarchive/archive/0962.pdf
• https://engineering.purdue.edu/~givan/papers/bp.pdf
• http://hanj.cs.illinois.edu/pdf/kdd03_emsemble.pdf
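The deck lists an online bagging algorithm, but its body did not survive in this transcript. The sketch below therefore assumes the commonly cited formulation in which each base model is trained on every incoming example k times, with k drawn from Poisson(1), and predictions are combined by majority vote. The base learner (`Perceptron`), the class `OnlineBagging`, the helper `poisson`, and the synthetic stream are all hypothetical names and choices made for illustration, not the author's code.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class Perceptron:
    """Minimal online base learner (illustrative stand-in)."""

    def __init__(self, n_features):
        self.w = [0.0] * (n_features + 1)  # last weight acts as the bias

    def predict(self, x):
        s = sum(wi * xi for wi, xi in zip(self.w, list(x) + [1.0]))
        return 1 if s >= 0 else 0

    def learn_one(self, x, y):
        err = y - self.predict(x)
        if err:
            self.w = [wi + err * xi for wi, xi in zip(self.w, list(x) + [1.0])]


class OnlineBagging:
    """Online bagging sketch: Poisson(1) replication replaces bootstrap sampling."""

    def __init__(self, make_learner, n_models=11, seed=0):
        self.rng = random.Random(seed)
        self.models = [make_learner() for _ in range(n_models)]

    def learn_one(self, x, y):
        for m in self.models:
            # Each model sees the example k ~ Poisson(1) times (possibly zero).
            for _ in range(poisson(1.0, self.rng)):
                m.learn_one(x, y)

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.models)
        return votes.most_common(1)[0][0]  # majority vote


# Test-then-train loop on a hypothetical stream with label 1 when x0 > x1.
stream_rng = random.Random(1)
ensemble = OnlineBagging(lambda: Perceptron(2), n_models=11, seed=42)
n, correct = 3000, 0
for _ in range(n):
    x = (stream_rng.uniform(-1, 1), stream_rng.uniform(-1, 1))
    y = int(x[0] - x[1] > 0)
    correct += int(ensemble.predict(x) == y)
    ensemble.learn_one(x, y)
acc = correct / n
print(f"online bagging prequential accuracy: {acc:.3f}")
```

The Poisson draw is what lets the ensemble approximate bootstrap resampling without storing the data: in batch bagging each example appears in a bootstrap sample a roughly Poisson(1)-distributed number of times when the training set is large, so drawing that count per example reproduces the effect one sample at a time.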