Ensemble methods with Data Streams
Jungbeom Lee
• Ensemble in machine learning
• Online ensemble algorithms
• Future work
Previous class: Data Streams Classifiers
• Ensemble methods
• Online algorithm
The batch classification problem:
– Given a finite training set D = {(x, y)} with |D| = n, where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x
The data stream classification problem:
– Given an infinite sequence of pairs of the form (x, y), where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x
Example applications:
– Fraud detection in credit card transactions
– Topic classification on a news aggregation site, e.g. Google News
– Translator for foreign languages
• Online mining differs from static mining
Data volume
◦ impossible to mine the entire data set at one time
◦ can only afford constant memory per data sample
Changing data characteristics
◦ previously learned models may become invalid
Cost of learning
◦ model updates can be costly
◦ can only afford constant time per data sample
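The constant-memory and constant-time constraints above can be illustrated with a minimal sketch: an online perceptron evaluated test-then-train, where each example is first predicted and then learned from. The perceptron, the synthetic stream, and all names here are assumptions chosen for illustration; they do not come from the slides.

```python
import random


class OnlinePerceptron:
    """Binary linear classifier updated one example at a time."""

    def __init__(self, n_features, lr=1.0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score >= 0 else -1

    def learn_one(self, x, y):
        # Mistake-driven update: O(n_features) time, no examples are stored.
        if self.predict(x) != y:
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y


def synthetic_stream(n, seed=0):
    """Hypothetical stream: the label is +1 when x0 >= x1, else -1."""
    rng = random.Random(seed)
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        yield x, (1 if x[0] - x[1] >= 0 else -1)


def prequential_accuracy(model, stream):
    """Test-then-train: predict each example first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        correct += model.predict(x) == y
        model.learn_one(x, y)
        total += 1
    return correct / total


acc = prequential_accuracy(OnlinePerceptron(2), synthetic_stream(2000))
```

The model state is a fixed-size weight vector, so memory does not grow with the length of the stream, and each example is touched once.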
• Ensemble: a set of classifiers whose individual decisions are combined in some way to classify new examples
• An ensemble of classifiers is often more accurate than any of its individual members
• One key to a successful ensemble is to use individual classifiers with error rates below 0.5
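To see why an error rate below 0.5 matters, the following sketch computes the error of an unweighted majority vote. It rests on two strong assumptions that the slides do not state: the classifiers err independently, and they all share the same error rate. The function name `majority_vote_error` is hypothetical.

```python
from math import comb


def majority_vote_error(n_classifiers, p_error):
    """Probability that more than half of n independent classifiers are wrong."""
    k_min = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k) * p_error**k * (1 - p_error) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


# Compare an ensemble of 21 voters against a single classifier.
for p in (0.3, 0.5, 0.6):
    print(p, majority_vote_error(21, p))
```

Under these assumptions, and for an odd number of three or more classifiers, the vote's error is lower than the individual error rate when that rate is below 0.5 and higher when it is above 0.5.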
Ensemble methods
Manipulating the Training Examples
◦ Bagging
◦ AdaBoost
Injecting Randomness
◦ C4.5 decision tree algorithm
Bagging algorithm
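The algorithm figure for this slide is not available in the text, so here is a minimal sketch of batch bagging as it is usually described: train each base model on a bootstrap replicate of the training set, then combine predictions by majority vote. The threshold "stump" base learner, the toy 1-D data, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(data):
    """Hypothetical base learner: best single-threshold rule on 1-D inputs."""
    best = None
    for thr, _ in data:
        for sign in (1, -1):
            errors = sum((sign if x >= thr else -sign) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, thr, sign)
    _, thr, sign = best
    return lambda x: sign if x >= thr else -sign


def bagging(data, fit, n_models=25, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap replicate: len(data) draws with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit(sample))

    def predict(x):
        # Unweighted majority vote over the base models.
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy training set: label is +1 when x >= 0.5, else -1.
data = [(x / 10, 1 if x >= 5 else -1) for x in range(10)]
predict = bagging(data, fit_stump)
```

Note that this version needs the whole training set in memory to draw each replicate, which is exactly what a data stream does not allow.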
Online bagging algorithm
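The slide's algorithm figure is missing from the text. Assuming it refers to the widely used formulation by Oza and Russell, online bagging replaces bootstrap sampling with a Poisson draw: each arriving example is shown to each base model k times, with k ~ Poisson(1). The sketch below follows that idea; the `NearestMean` base learner and every class and function name are hypothetical.

```python
import math
import random
from collections import Counter


def poisson1(rng):
    """Draw k ~ Poisson(1) using Knuth's multiplication method."""
    limit, k, prod = math.exp(-1.0), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


class NearestMean:
    """Hypothetical incremental base learner for 1-D inputs."""

    def __init__(self):
        self.count = Counter()
        self.total = Counter()

    def learn_one(self, x, y):
        self.count[y] += 1
        self.total[y] += x

    def predict(self, x):
        if not self.count:
            return None  # nothing seen yet
        return min(self.count,
                   key=lambda c: abs(x - self.total[c] / self.count[c]))


class OnlineBagging:
    def __init__(self, make_model, n_models=10, seed=0):
        self.models = [make_model() for _ in range(n_models)]
        self.rng = random.Random(seed)

    def learn_one(self, x, y):
        for model in self.models:
            # Poisson(1) approximates how often an example would appear
            # in a bootstrap replicate of a large training set.
            for _ in range(poisson1(self.rng)):
                model.learn_one(x, y)

    def predict(self, x):
        votes = Counter(m.predict(x) for m in self.models)
        votes.pop(None, None)  # ignore models that have seen no data
        return votes.most_common(1)[0][0] if votes else None


ens = OnlineBagging(NearestMean)
for i in range(200):
    x = (i % 10) / 10
    ens.learn_one(x, 1 if x >= 0.5 else -1)
```

Each example is processed once and then discarded, so the ensemble keeps only the fixed-size state of its base models.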
Online weighted bagging algorithm
AdaBoost algorithm
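The algorithm figure is not present in the text. Assuming the standard binary formulation with labels +1 and -1, AdaBoost can be sketched as follows: fit a weak learner on weighted data, give it a vote proportional to its accuracy, then raise the weights of the examples it got wrong. The weighted threshold stump, the toy data, and all names are assumptions for illustration.

```python
import math


def fit_weighted_stump(data, weights):
    """Hypothetical weak learner: threshold rule with lowest weighted error."""
    best = None
    for thr, _ in data:
        for sign in (1, -1):
            err = sum(w for (x, y), w in zip(data, weights)
                      if (sign if x >= thr else -sign) != y)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best


def adaboost(data, n_rounds=10):
    n = len(data)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, sign)
    for _ in range(n_rounds):
        err, thr, sign = fit_weighted_stump(data, weights)
        if err >= 0.5:
            break  # weak learner is no better than chance
        err = max(err, 1e-10)  # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, sign))
        # Misclassified examples gain weight, correct ones lose weight.
        weights = [w * math.exp(-alpha * y * (sign if x >= thr else -sign))
                   for (x, y), w in zip(data, weights)]
        z = sum(weights)
        weights = [w / z for w in weights]

    def predict(x):
        score = sum(a * (s if x >= t else -s) for a, t, s in ensemble)
        return 1 if score >= 0 else -1

    return predict


# Toy data that no single threshold can separate: + + + - - - + + +
data = [(x, -1 if 3 <= x < 6 else 1) for x in range(9)]
predict = adaboost(data, n_rounds=3)
```

On this toy set no single stump is correct everywhere, but the weighted vote of three stumps classifies every training point correctly.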
Adaptive boosting algorithm
Experimental Results
Type of Data
Future work
• Better online algorithm for bagging
• Dealing with multiple data types