When Efficient Model Averaging
Out-Performs Bagging and Boosting
Ian Davidson, SUNY Albany
Wei Fan, IBM T.J. Watson
Ensemble Techniques
• Techniques such as boosting and bagging
are methods of combining models.
• Used extensively in ML and DM; they seem to
work well in a large variety of situations.
• But model averaging is the “correct”
Bayesian method of using multiple
models.
• Does model averaging have a place in ML
and DM?
What is Model Averaging?
Averaging of class probabilities weighted by posterior
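$$P(y_i \mid x, D) = \int_{\theta \in \Theta} P(y_i \mid x, \theta)\, P(\theta \mid D)\, d\theta$$
Integration over the model space \Theta; class probability P(y_i \mid x, \theta); posterior weighting P(\theta \mid D)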
Removes model uncertainty by averaging
Prohibitive for large model spaces
such as decision trees
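As a rough illustration of posterior-weighted averaging, the sketch below replaces the integral over the model space with a sum over a small, finite set of models. The function name `model_average`, the three toy models, and their weights are hypothetical and serve only to show the arithmetic.

```python
def model_average(class_probs, posteriors):
    """Average per-model class probabilities, weighted by model posterior.

    class_probs: list of dicts, one per model, mapping class -> P(class | x, model)
    posteriors:  list of (possibly unnormalised) weights P(model | data)
    """
    total = sum(posteriors)
    averaged = {}
    for probs, weight in zip(class_probs, posteriors):
        for label, p in probs.items():
            # Each model contributes its class probability scaled by its
            # normalised posterior weight.
            averaged[label] = averaged.get(label, 0.0) + (weight / total) * p
    return averaged


# Three hypothetical models scoring the same test example.
probs = [
    {"red": 0.9, "green": 0.1},
    {"red": 0.6, "green": 0.4},
    {"red": 0.2, "green": 0.8},
]
weights = [0.5, 0.3, 0.2]
result = model_average(probs, weights)
```

With these made-up numbers the averaged probability of "red" is 0.5·0.9 + 0.3·0.6 + 0.2·0.2 = 0.67.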
Efficient Model Averaging:
PBMA and Random DT
• PBMA (Davidson 04): parametric bootstrap
model averaging
– Use a parametric model to generate multiple bootstrap
samples from a single training set.
• Random Decision Tree (Fan et al. 03)
– Construct each tree’s structure randomly
• A categorical feature is used only once in a decision path.
• Random threshold for continuous features.
– Leaf node statistics are estimated from the data.
– Average the class probabilities of multiple trees.
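The Random Decision Tree steps listed above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the dictionary-based tree layout, the function names, the fixed depth limit, and the handling of empty leaves (skipped when averaging) are all assumptions made for this example.

```python
import random
from collections import Counter


def build_random_tree(features, depth, rng, used=frozenset()):
    """Build a tree structure without looking at the training data.

    features: dict name -> ("cat", [values]) or ("num", (low, high))
    """
    candidates = [f for f in features if f not in used]
    if depth == 0 or not candidates:
        return {"leaf": True, "counts": Counter()}
    name = rng.choice(candidates)
    kind, spec = features[name]
    if kind == "cat":
        # A categorical feature is used only once on a decision path.
        children = {v: build_random_tree(features, depth - 1, rng, used | {name})
                    for v in spec}
        return {"leaf": False, "feature": name, "kind": "cat", "children": children}
    # Continuous feature: random threshold; the feature may be reused deeper down.
    threshold = rng.uniform(*spec)
    children = {side: build_random_tree(features, depth - 1, rng, used)
                for side in (True, False)}
    return {"leaf": False, "feature": name, "kind": "num",
            "threshold": threshold, "children": children}


def find_leaf(node, x):
    """Route example x (a dict of feature values) down to its leaf."""
    while not node["leaf"]:
        value = x[node["feature"]]
        key = value if node["kind"] == "cat" else value <= node["threshold"]
        node = node["children"][key]
    return node


def fit_leaves(tree, X, y):
    """Estimate leaf statistics (class counts) from the training data."""
    for x, label in zip(X, y):
        find_leaf(tree, x)["counts"][label] += 1


def predict_proba(trees, x, classes):
    """Average the leaf class probabilities over all trees."""
    totals = {c: 0.0 for c in classes}
    voted = 0
    for tree in trees:
        counts = find_leaf(tree, x)["counts"]
        n = sum(counts.values())
        if n == 0:
            continue  # assumption: trees with an empty leaf are skipped
        voted += 1
        for c in classes:
            totals[c] += counts[c] / n
    if voted == 0:
        return {c: 1.0 / len(classes) for c in classes}
    return {c: totals[c] / voted for c in classes}


# Toy usage on a two-feature problem (red if f2 > f1, else green).
rng = random.Random(0)
features = {"f1": ("num", (0.0, 1.0)), "f2": ("num", (0.0, 1.0))}
X = [{"f1": rng.random(), "f2": rng.random()} for _ in range(500)]
y = ["red" if x["f2"] > x["f1"] else "green" for x in X]
trees = [build_random_tree(features, 6, rng) for _ in range(30)]
for tree in trees:
    fit_leaves(tree, X, y)
p = predict_proba(trees, {"f1": 0.1, "f2": 0.9}, ["red", "green"])
```

Only `fit_leaves` touches the training data; the structure of every tree is fixed beforehand, which is what keeps construction cheap.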
Our Empirical Study
• Idea: When model uncertainty occurs,
model averaging should perform well
• Four specific but common situations when
factoring in model uncertainty is beneficial
– Class label noise
– Many-label problem
– Sample selection bias
– Small data sets
Class Label Noise
• Randomly flip 10% of labels
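One way the label-flipping step could be done is sketched below. The slide does not say how the flipped labels are chosen, so two assumptions are made here: exactly 10% of the examples are selected (rather than each being flipped with probability 0.1), and each selected label is replaced by a different class chosen uniformly at random. The function name `flip_labels` is hypothetical.

```python
import random


def flip_labels(labels, classes, rate=0.10, seed=None):
    """Return a copy of `labels` with a `rate` fraction changed to another class."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(rate * len(noisy))
    # Pick distinct positions so that exactly n_flip labels change.
    for i in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


labels = ["red"] * 50 + ["green"] * 50
noisy = flip_labels(labels, ["red", "green"], rate=0.10, seed=1)
changed = sum(a != b for a, b in zip(labels, noisy))
```

For the 100 toy labels above, `changed` is 10.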
Data Set with Many Classes
Biased Training Sets
• See ICDM 2005 for a formal analysis
• See KDD 2006 for estimating accuracy
• See ICDM 2006 for a case study
Universe of Examples
Two classes:
red and green
red: f2>f1
green: f2<=f1
Unbiased and Biased Samples
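A sketch of how such a universe and the two kinds of training sample might be generated. The labelling rule (red if f2 > f1, green otherwise) comes from the slide; everything else is an assumption: features drawn uniformly from [0, 1), and a hypothetical bias mechanism in which the chance of selecting an example grows with f1. The slides do not state the selection mechanism actually used.

```python
import random


def label(f1, f2):
    """Labelling rule from the slide: red if f2 > f1, otherwise green."""
    return "red" if f2 > f1 else "green"


def make_universe(n, seed=0):
    """Assumption: both features are uniform on [0, 1)."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(f1, f2, label(f1, f2)) for f1, f2 in points]


def unbiased_sample(universe, k, seed=0):
    """Every example is equally likely to be selected."""
    return random.Random(seed).sample(universe, k)


def biased_sample(universe, k, seed=0):
    """Hypothetical bias: selection probability is proportional to f1.

    Sampling is with replacement, so an example can appear more than once.
    """
    rng = random.Random(seed)
    weights = [f1 for f1, _, _ in universe]
    return rng.choices(universe, weights=weights, k=k)


universe = make_universe(2000)
unbiased = unbiased_sample(universe, 500)
biased = biased_sample(universe, 500)
```

Because the bias depends on a feature, the biased sample over-represents the high-f1 region, where green examples are more common.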
Method                  Unbiased    Biased
Single Decision Tree    97.1%       92.1%
Random Decision Tree    96.9%       95.9%
Bagging                 97.82%      93.52%
PBMA                    99.08%      94.55%
Boosting                96.405%     92.7%
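To compare how much each method loses when trained on the biased sample, the reported figures can be subtracted directly. The snippet below only does that arithmetic; it assumes all values are percentages, including the PBMA biased figure, and the variable names are illustrative.

```python
# (unbiased, biased) figures as reported on the slides, in percent.
reported = {
    "Single Decision Tree": (97.1, 92.1),
    "Random Decision Tree": (96.9, 95.9),
    "Bagging": (97.82, 93.52),
    "PBMA": (99.08, 94.55),
    "Boosting": (96.405, 92.7),
}

# Drop in percentage points when moving from the unbiased to the biased sample.
drops = {method: round(u - b, 3) for method, (u, b) in reported.items()}

for method, drop in sorted(drops.items(), key=lambda kv: kv[1]):
    print(f"{method:22s} {drop:.3f}")
```

On these figures the Random Decision Tree shows the smallest drop (1.0 point) and the single decision tree the largest (5.0 points).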
Scope of This Paper
• Identifies conditions where model
averaging should outperform bagging and
boosting.
• Empirically verifies these claims.
• Other questions:
– Why do bagging and boosting perform
badly in these conditions?