Ensemble Methods with Data Streams
Jungbeom Lee
CS240B
Outline
• Intro
• Ensemble in Machine learning
• Online ensemble algorithms
• Future work
Intro
• Previous class: data stream classifiers
• Ensemble methods
• Online algorithm
Classifiers
• The batch classification problem:
– Given a finite training set D = {(x, y)} with |D| = n, where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x
• The data stream classification problem:
– Given an infinite sequence of pairs (x, y), where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x
• Example applications:
– Fraud detection in credit card transactions
– Topic classification on a news aggregation site, e.g. Google News
– Translation between foreign languages
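To make the stream setting concrete, here is a minimal sketch of a test-then-train loop: each arriving pair (x, y) is first used to test the current model and only then to update it. The class `NearestCentroid` and the function `prequential_accuracy` are hypothetical names chosen for illustration, and the nearest-centroid learner is just one simple incremental classifier, not one taken from the slides.

```python
from collections import defaultdict


class NearestCentroid:
    """Incremental classifier that keeps one running mean per class (illustrative only)."""

    def __init__(self):
        self.counts = defaultdict(int)  # number of samples seen per class
        self.means = {}                 # running mean vector per class

    def predict(self, x):
        if not self.means:
            return None  # nothing has been learned yet

        def dist2(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.means[label]))

        return min(self.means, key=dist2)

    def learn_one(self, x, y):
        # Update the running mean of class y in place; the sample itself is not stored.
        self.counts[y] += 1
        n = self.counts[y]
        mean = self.means.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            mean[i] += (xi - mean[i]) / n


def prequential_accuracy(stream, model):
    """Test-then-train: predict each sample before the model is allowed to learn from it."""
    correct = total = 0
    for x, y in stream:
        if model.predict(x) == y:
            correct += 1
        total += 1
        model.learn_one(x, y)
    return correct / total if total else 0.0
```

Unlike the batch case, `stream` can be any iterable, including an unbounded generator, because the model never needs the full data set in memory.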
Motivations
• Online mining is different from static mining
Data Volume
◦ impossible to mine the entire data set at one time
◦ can only afford constant memory per data sample
Changing data characteristics
◦ previously learned models can become invalid
Cost of Learning
◦ model updates can be costly
◦ can only afford constant time per data sample
Ensemble
• A set of classifiers whose individual decisions are combined in some way (for example, by voting) to classify new examples
• An ensemble of classifiers is often more accurate than any of its individual members
• One key to success is to use individual classifiers with error rates below 0.5 whose errors are not strongly correlated
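The role of the 0.5 threshold can be checked with a short calculation. The sketch below assumes an odd number n of classifiers that err independently, each with the same error rate p, and computes the probability that a majority of them are wrong at once. Independence is an idealization, and the function name `majority_vote_error` is hypothetical.

```python
from math import comb


def majority_vote_error(n, p):
    """Probability that more than half of n independent classifiers err,
    each with error rate p (assumes n is odd, so no ties occur)."""
    k_min = n // 2 + 1  # smallest number of wrong votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.6):
        print(p, majority_vote_error(21, p))
```

With p below 0.5 the majority-vote error falls well below the individual error rate, at p = 0.5 voting gives no gain, and above 0.5 voting makes things worse.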
Reasons
Ensemble methods
Manipulating the Training Examples
◦ Bagging
◦ AdaBoost
Injecting Randomness
◦ C4.5 decision tree algorithm
Bagging algorithm
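The slide's own pseudocode is not preserved in this transcript. As a rough sketch of batch bagging in its usual form: each base model is trained on a bootstrap sample (n draws with replacement from the n training examples) and predictions are combined by unweighted majority vote. The base learner `OneNearestNeighbor` and the function names are assumptions made for illustration.

```python
import random
from collections import Counter


class OneNearestNeighbor:
    """Tiny base learner used only to keep the sketch self-contained."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        def dist2(p):
            return sum((a - b) ** 2 for a, b in zip(p, x))

        nearest = min(range(len(self.X)), key=lambda j: dist2(self.X[j]))
        return self.y[nearest]


def bagging_fit(X, y, make_learner, n_models=11, seed=0):
    """Train n_models base learners, each on its own bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        models.append(make_learner().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the base predictions by unweighted majority vote."""
    votes = Counter(m.predict(x) for m in models)
    return votes.most_common(1)[0][0]
```

Note that this version needs the whole training set up front in order to resample it, which is what makes it a batch method.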
Online bagging algorithm
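The slide content is not preserved here, so the following is only a sketch of the commonly described online variant of bagging: since a stream cannot be resampled, each base model is updated with every arriving example k times, where k is drawn from a Poisson(1) distribution, which approximates bootstrap sampling for large n. The names `poisson`, `CentroidLearner`, and `OnlineBagging` are hypothetical, and the base learner is an arbitrary incremental classifier chosen for brevity.

```python
import math
import random
from collections import Counter, defaultdict


def poisson(rng, lam=1.0):
    """Draw one Poisson(lam) sample using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


class CentroidLearner:
    """Incremental nearest-centroid base learner (illustrative choice)."""

    def __init__(self):
        self.n = defaultdict(int)
        self.mean = {}

    def learn_one(self, x, y):
        self.n[y] += 1
        m = self.mean.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            m[i] += (xi - m[i]) / self.n[y]

    def predict(self, x):
        if not self.mean:
            return None
        return min(self.mean,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, self.mean[c])))


class OnlineBagging:
    def __init__(self, make_learner, n_models=10, seed=0):
        self.rng = random.Random(seed)
        self.models = [make_learner() for _ in range(n_models)]

    def learn_one(self, x, y):
        # Each model sees the example k ~ Poisson(1) times (possibly zero times).
        for model in self.models:
            for _ in range(poisson(self.rng, 1.0)):
                model.learn_one(x, y)

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.models)
        votes.pop(None, None)  # ignore models that have not learned anything yet
        return votes.most_common(1)[0][0] if votes else None
```

Each update touches every base model a small, bounded-in-expectation number of times, so the per-sample cost does not grow with the length of the stream.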
Online weighted bagging algorithm
AdaBoost algorithm
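Since the slide's pseudocode did not survive extraction, here is a minimal sketch of discrete AdaBoost for binary labels in {+1, -1}, using decision stumps as weak learners. The stump search and all function names are assumptions for illustration. Each round fits a stump to the weighted data, gives it a vote weight alpha = 0.5 * ln((1 - err) / err), and increases the weights of the examples it misclassified.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol if xi[f] <= t else -pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, pol)
    return best


def stump_predict(stump, x):
    _, f, t, pol = stump
    return pol if x[f] <= t else -pol


def adaboost_fit(X, y, n_rounds=10):
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        raw_err = stump[0]
        if raw_err >= 0.5:  # weak learner no better than chance: stop
            break
        err = max(raw_err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        if raw_err == 0:  # a perfect stump leaves nothing to reweight
            break
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, x):
    """Weighted vote of the stumps; returns +1 or -1."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1
```

The early stop at `raw_err >= 0.5` mirrors the requirement that individual classifiers have error rates below 0.5.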
Adaptive boosting algorithm
Experimental Results
Type of Data
Experimental Results
Future work
• Better online algorithm for Bagging
• Dealing with multiple data types
References
http://web.engr.oregonstate.edu/~tgd/publications/mcs-ensembles.pdf
http://pages.bangor.ac.uk/~mas00a/papers/lkSUEMA2008.pdf
http://web.cs.ucla.edu/~zaniolo/papers/NBCAJMW77MW0J8CP.pdf
https://ti.arc.nasa.gov/m/pubarchive/archive/0962.pdf
https://engineering.purdue.edu/~givan/papers/bp.pdf
http://hanj.cs.illinois.edu/pdf/kdd03_emsemble.pdf