Fuzzy-Rough Instance Selection
Richard Jensen
Aberystwyth University, UK
Chris Cornelis
Ghent University, Belgium
Richard Jensen and Chris Cornelis
Outline
• The importance of instance selection
• Rough set theory
• Fuzzy-rough sets
• Fuzzy-rough instance selection
• Experimentation
• Conclusion
Instance selection
• Knowledge discovery
• The problem of too much data
  • Requires storage
  • Intractable for data mining algorithms
• Removing data that is noisy or irrelevant
Rough set theory
• Figure: set A shown with its lower approximation and its upper approximation
• Equivalence class Rx: the set of all points that are indiscernible from point x
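The crisp approximations described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names, the dictionary layout, and the toy objects are all hypothetical. The lower approximation collects every equivalence class Rx that lies entirely inside A, and the upper approximation collects every class that overlaps A.

```python
from collections import defaultdict

def equivalence_classes(objects, features):
    """Group object ids whose values agree on every chosen feature."""
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[f] for f in features)
        classes[key].add(obj_id)
    return list(classes.values())

def approximations(objects, features, target):
    """Return the (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for rx in equivalence_classes(objects, features):
        if rx <= target:   # Rx lies entirely inside A
            lower |= rx
        if rx & target:    # Rx shares at least one object with A
            upper |= rx
    return lower, upper

# Hypothetical toy data: four objects described by two features.
objects = {
    "o1": {"a": 0, "b": 1},
    "o2": {"a": 0, "b": 1},
    "o3": {"a": 1, "b": 0},
    "o4": {"a": 1, "b": 1},
}
A = {"o1", "o3"}
lower, upper = approximations(objects, ["a", "b"], A)
```

Here o1 and o2 are indiscernible, so o1 cannot belong to the lower approximation of A even though it is a member of A.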
Fuzzy-rough sets
• Approximate equality
• Handle real-valued features via fuzzy tolerance
relations instead of crisp equivalence
• Better noise and uncertainty handling
• Focus has been on feature selection, not
instance selection
Fuzzy-rough sets
• Parameterized relation
• Fuzzy-rough definitions:
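For reference, a formulation commonly used in the fuzzy-rough literature is sketched below. It is an assumption that this is the formulation intended here: the implicator I, the t-norm T, the universe U, the feature range l(a), and the granularity parameter alpha in the example relation are illustrative symbols, not content taken from the slide.

```latex
% Lower and upper approximation of a fuzzy set A under a fuzzy relation R
(R \downarrow A)(x) = \inf_{y \in U} \mathcal{I}\bigl(R(x, y),\, A(y)\bigr)
\qquad
(R \uparrow A)(x) = \sup_{y \in U} \mathcal{T}\bigl(R(x, y),\, A(y)\bigr)

% One possible parameterized tolerance relation for a real-valued feature a
% (assumed form; l(a) is the range of a, \alpha >= 0 controls granularity)
R_a^{\alpha}(x, y) = \max\!\left(0,\; 1 - \alpha \, \frac{|a(x) - a(y)|}{l(a)}\right)
```

With a relation of this form, a larger alpha makes two objects count as similar only when their feature values are closer together.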
Instance selection: basic idea
Remove objects that are not needed, while keeping the underlying
approximations unchanged
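One simple way to act on this idea is to keep only the instances whose membership in the fuzzy-rough lower approximation of their own class reaches a threshold. The sketch below illustrates that thresholding scheme only. It is not the FRIS-I, FRIS-II, or FRIS-III algorithm, and the function names, the Lukasiewicz implicator, the parameters `alpha` and `tau`, and the toy data are all assumptions made for illustration.

```python
def similarity(x, y, ranges, alpha=1.0):
    """Fuzzy tolerance between two instances: the minimum over features of
    max(0, 1 - alpha * |x_a - y_a| / range_a)."""
    return min(
        max(0.0, 1.0 - alpha * abs(xa - ya) / r)
        for xa, ya, r in zip(x, y, ranges)
    )

def lower_membership(i, X, labels, ranges, alpha=1.0):
    """Membership of instance i in the lower approximation of its own class,
    using the Lukasiewicz implicator I(a, b) = min(1, 1 - a + b)."""
    worst = 1.0
    for j, y in enumerate(X):
        same_class = 1.0 if labels[j] == labels[i] else 0.0
        r = similarity(X[i], y, ranges, alpha)
        worst = min(worst, min(1.0, 1.0 - r + same_class))
    return worst

def select_instances(X, labels, alpha=1.0, tau=1.0):
    """Return indices of instances whose lower-approximation membership >= tau."""
    # Feature ranges; a constant feature gets range 1.0 to avoid division by zero.
    ranges = [(max(col) - min(col)) or 1.0 for col in zip(*X)]
    return [
        i for i in range(len(X))
        if lower_membership(i, X, labels, ranges, alpha) >= tau
    ]

# Hypothetical one-feature data set with two classes.
X = [[0.0], [0.1], [0.9], [1.0], [0.45]]
labels = ["a", "a", "b", "b", "b"]
kept = select_instances(X, labels, alpha=1.0, tau=0.4)
```

In this toy run, the instances at 0.1 and 0.45 sit closest to the other class, so they fall below the threshold and are dropped, while the remaining three are kept.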
FRIS-I
FRIS-II
FRIS-III
Experimentation: setup
Results: FRIS-I (heart)
• (214 objects, 9 features)
Results: FRIS-II (heart)
Results: FRIS-III (heart)
Conclusion
• Proposed new techniques for instance selection
based on fuzzy-rough sets
• Reduced the number of instances significantly while
retaining classification accuracy
• Future work
  • Many possibilities for novel fuzzy-rough instance selection
  methods
  • Comparisons with non-rough techniques
  • Reducing the computational complexity of FRIS-III
  • Combined instance/feature selection
• WEKA implementations of all fuzzy-rough
methods can be downloaded from: