Data Mining in Soft Computing Framework: A Survey
Data Mining (DM)
[Figure: the KDD process. Huge Raw Data → preprocessing (Data
Cleaning, Dimensionality Reduction, Data Wrapping/Description, Data
Condensation) → Preprocessed Data → Machine Learning (Classification,
Clustering, Rule Generation) → Mathematical Model of Data (Patterns) →
Knowledge Interpretation (Knowledge Extraction, Knowledge Evaluation)
→ Useful Knowledge]
Knowledge Discovery in Database (KDD)
Pattern Recognition, World Scientific, 2001
Data Mining Algorithm Components
• Model : Function of the model (e.g., classification,
clustering, rule generation) and its representational form
(e.g., linear discriminates, neural networks, fuzzy logic,
GAs, rough sets).
• Preference criterion : Basis for preference of one model
or set of parameters over another.
• Search algorithm : Specification of an algorithm for
finding particular patterns of interest (or models and
parameters), given the data, family of models, and
preference criterion.
Why Growth of Interest?
• Falling cost of large storage devices and increasing ease of
collecting data over networks
• Availability of robust/efficient machine learning
algorithms to process data
• Falling cost of computational power, enabling use of
computationally intensive methods for data analysis
Soft Computing
• Computational Theory of Perception
• For human-like decision making
• From: uncertainty, approximate reasoning and partial truth
• To: tractability, robustness, low solution cost, and close
resemblance with human decision making
• To find an approximate solution to an imprecisely
/precisely formulated problem
• ‘soft computing rather than hard computing’ as the
foundation for Artificial Intelligence.
Computational Theory of Perceptions
• Provides capability to compute and reason with
perception based information
• Humans have a remarkable capability to perform a wide
variety of physical and mental tasks without any
measurements or computations
• They use perceptions of time, direction, speed, shape,
possibility, likelihood, truth, and other attributes of
physical and mental objects
Soft Computing
• A collection of methodologies
• Fuzzy Logic : the algorithms for dealing with imprecision and
uncertainty
• Neural Networks: the machinery for learning and curve fitting
• Genetic Algorithms : the algorithms for search and optimization
• Rough Sets : handling uncertainty arising from the granularity in
the domain of discourse
• They are Complementary rather than Competitive
Perceptions are fuzzy (F) – granular
• Boundaries of perceived classes are unsharp
• Values of attributes are granulated
– a clump of indistinguishable points/objects
Example:
Granules in age: very young, young, not so old,…
Granules in direction: slightly left, sharp right
• F-granularity of perceptions puts them well beyond the reach of
traditional methods of analysis (based on predicate logic and
probability theory)
• Is location A in the forest? Defined by membership function u
– Certainly yes: u(A) = 1
– Certainly not: u(A) = 0
– It depends on a subjective (vague) opinion: u(A) = 0.6
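The graded membership idea above can be sketched as a simple membership function. This is a minimal illustration, not from the source: the choice of distance as the underlying variable and the 0–2 km breakpoints are assumptions.

```python
def forest_membership(distance_km):
    """Membership degree u(A) of a location A in 'the forest',
    as a function of its distance beyond the dense tree line.
    The breakpoints (0 km and 2 km) are illustrative assumptions."""
    if distance_km <= 0:   # inside the tree line: certainly in the forest
        return 1.0
    if distance_km >= 2:   # far outside: certainly not in the forest
        return 0.0
    # in between: membership falls off linearly (subjective boundary)
    return 1.0 - distance_km / 2.0

print(forest_membership(0.0))  # certainly yes: 1.0
print(forest_membership(0.8))  # borderline: 0.6
print(forest_membership(3.0))  # certainly not: 0.0
```

Unlike a crisp set, the boundary region (between 0 and 2 km here) admits intermediate degrees such as u(A) = 0.6.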
Role of Fuzzy Sets
• Modeling of imprecise/qualitative
knowledge
• Transmission and handling uncertainties at various
stages
• Supporting, to an extent, human type reasoning in
natural form
Role of Neural Networks
• Machinery for learning and curve fitting (Learns from
examples)
• Resistance to Noise
• Tolerance to Distorted Patterns/Images (Ability to
Generalize)
• Superior Ability to Recognize Overlapping Pattern Classes
or Classes with Highly Nonlinear Boundaries or Partially
Occluded or Degraded Images
Role of Genetic Algorithms
• Many tasks involved in analyzing/identifying a pattern
need Appropriate Parameter Selection and Efficient
Search in complex spaces to obtain Optimal Solutions
• Used more in Prediction (P) than Description (D)
– D: Finding human-interpretable patterns describing the data
– P: Using some variables or attributes in the database to predict
unknown/future values of other variables of interest
Integrated approaches
• Fuzzy Logic + NN
• NN + GA
• Fuzzy Logic + NN + GA
• Fuzzy Logic + NN + GA + Rough Set
• Neuro-fuzzy hybridization is the most visible
integration realized so far.
– Fuzzy Set theoretic models try to mimic human reasoning and
the capability of handling uncertainty
– Neural Network models attempt to emulate architecture and
information representation scheme of human brain
Rough Sets
• Offer mathematical tools to discover hidden patterns in
data
• The fundamental principle of a rough set-based learning
system is to discover redundancies and dependencies
between the given features of the data to be classified
• Approximate a given concept both from below and from
above, using lower and upper approximations
• Rough set learning algorithms can be used to obtain
rules in IF-THEN form from a decision table
• Extract knowledge from the database
– decision table (objects and attributes) → remove undesirable
attributes (knowledge discovery) → analyze data dependency →
minimum subset of attributes (reducts)
Approximations of the set X ⊆ U w.r.t. feature subset B:

B-lower: B̲(X) = {x ∈ U : [x]_B ⊆ X}
(granules definitely belonging to X)

B-upper: B̄(X) = {x ∈ U : [x]_B ∩ X ≠ ∅}
(granules definitely and possibly belonging to X)

If B̲(X) = B̄(X), X is B-exact or B-definable;
otherwise it is roughly definable.

Accuracy of the rough set:
α_B(X) = |B̲(X)| / |B̄(X)|
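The lower and upper approximations and the accuracy measure follow directly from the definitions above. A minimal sketch, where the toy universe and its attribute values are assumed for illustration:

```python
from collections import defaultdict

def approximations(universe, X, B):
    """Lower and upper B-approximations of X.
    universe: dict mapping object -> dict of attribute values.
    B: tuple of attribute names defining indiscernibility."""
    # group objects into granules [x]_B (equal values on all attributes in B)
    granules = defaultdict(set)
    for obj, attrs in universe.items():
        granules[tuple(attrs[a] for a in B)].add(obj)
    lower, upper = set(), set()
    for g in granules.values():
        if g <= X:   # granule entirely inside X: definitely belongs
            lower |= g
        if g & X:    # granule intersects X: possibly belongs
            upper |= g
    return lower, upper

# toy universe (values are illustrative assumptions, not from the survey)
U = {'x1': {'F1': 1, 'F2': 0},
     'x2': {'F1': 1, 'F2': 0},
     'x3': {'F1': 0, 'F2': 1},
     'x4': {'F1': 0, 'F2': 0}}
X = {'x1', 'x3'}

lo, up = approximations(U, X, ('F1', 'F2'))
print(sorted(lo), sorted(up))    # ['x3'] ['x1', 'x2', 'x3']
accuracy = len(lo) / len(up)     # alpha_B(X) = |B-lower| / |B-upper| = 1/3
```

Here x1 and x2 are indiscernible, so their granule straddles the boundary of X; since the lower and upper approximations differ, X is roughly definable with accuracy 1/3.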
Rough Sets
• Uncertainty Handling
– Using lower & upper approximations
• Granular Computing
– Using information granules
– Computation is performed using information granules and not
the data points (objects)
[Figure: feature space spanned by F1 and F2, each axis granulated
into low / medium / high fuzzy sets]
• A rule provides a crude description of the class using granules,
e.g. Rule ← M1 ∧ M2
Issues in the Decision Table
• The same or indiscernible objects may be represented several
times (redundant).
• That is, their removal cannot worsen the classification.
• Keep only those attributes that preserve the indiscernibility
relation and, consequently, set approximation.
• There are usually several such subsets of attributes, and those
which are minimal are called reducts.
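A brute-force reduct search following this idea can be sketched as below. The toy table values are illustrative assumptions; real systems use far more efficient algorithms than enumerating all attribute subsets.

```python
from itertools import combinations

def partition(rows, attrs):
    """Indiscernibility partition of the objects w.r.t. an attribute subset."""
    classes = {}
    for obj, vals in rows.items():
        classes.setdefault(tuple(vals[a] for a in attrs), set()).add(obj)
    return set(frozenset(c) for c in classes.values())

def reducts(rows, all_attrs):
    """Brute force: attribute subsets that preserve the full partition,
    filtered down to the minimal ones (the reducts)."""
    full = partition(rows, all_attrs)
    keep = [set(s) for r in range(1, len(all_attrs) + 1)
            for s in combinations(all_attrs, r)
            if partition(rows, s) == full]
    return [s for s in keep if not any(t < s for t in keep)]  # minimal only

# toy decision table (values are illustrative assumptions)
rows = {'x1': {'F1': 1, 'F2': 0, 'F3': 1},
        'x2': {'F1': 0, 'F2': 0, 'F3': 0},
        'x3': {'F1': 1, 'F2': 1, 'F3': 1}}
rds = reducts(rows, ('F1', 'F2', 'F3'))
print(rds)  # {F1,F2} and {F2,F3} each keep all three objects discernible
```

Note that there are two reducts here, illustrating the slide's point that several minimal subsets usually exist.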
Rough Set Rule Generation
Decision Table:

Object   F1   F2   F3   F4   F5   Decision
x1        1    0    1    0    1   Class 1
x2        0    0    0    0    1   Class 1
x3        1    1    1    1    1   Class 1
x4        0    1    0    1    0   Class 2
x5        1    1    1    0    0   Class 2
Discernibility Matrix (c) for Class 1:

c_ij = {a : a(x_i) ≠ a(x_j)}, 1 ≤ i, j ≤ p

Objects   x1        x2            x3
x1        ∅       {F1, F3}      {F2, F4}
x2                  ∅           {F1, F2, F3, F4}
x3                                ∅

Discernibility function:

f_{x_k} = ∧_j { ∨(c_kj) : 1 ≤ j ≤ p, j ≠ k, c_kj ≠ ∅ }

Discernibility function considering the object x1 belonging to Class 1
= Discernibility of x1 w.r.t. x2 (and) Discernibility of x1 w.r.t. x3
= (F1 ∨ F3) ∧ (F2 ∨ F4)

Similarly, discernibility function considering object x2
= F1 ∨ F2 ∨ F3 ∨ F4

Dependency Rules (AND-OR form): DNF of the discernibility functions
Class 1 ← (F1 ∧ F2) ∨ (F1 ∧ F4) ∨ (F3 ∧ F2) ∨ (F3 ∧ F4)
Class 1 ← F1 ∨ F2 ∨ F3 ∨ F4
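The discernibility computation above reduces to comparing attribute values pairwise. A short sketch, using the Class 1 rows of the example decision table (F5 is omitted here since it is identical across these three rows and so never discerns them):

```python
from itertools import combinations

# Class 1 rows of the decision table, attributes F1-F4
table = {'x1': {'F1': 1, 'F2': 0, 'F3': 1, 'F4': 0},
         'x2': {'F1': 0, 'F2': 0, 'F3': 0, 'F4': 0},
         'x3': {'F1': 1, 'F2': 1, 'F3': 1, 'F4': 1}}

def discernibility(xi, xj):
    """c_ij = set of attributes on which objects xi and xj differ."""
    return {a for a in table[xi] if table[xi][a] != table[xj][a]}

# upper triangle of the discernibility matrix
for xi, xj in combinations(table, 2):
    print(xi, xj, sorted(discernibility(xi, xj)))
# x1 x2 ['F1', 'F3']
# x1 x3 ['F2', 'F4']
# x2 x3 ['F1', 'F2', 'F3', 'F4']

# f_x1: each non-empty c_1j contributes one disjunctive clause,
# and the clauses are conjoined, giving (F1 v F3) ^ (F2 v F4)
f_x1 = [sorted(discernibility('x1', xj)) for xj in ('x2', 'x3')]
```

Converting f_x1 to disjunctive normal form then yields the dependency rule Class 1 ← (F1 ∧ F2) ∨ (F1 ∧ F4) ∨ (F3 ∧ F2) ∨ (F3 ∧ F4) shown above.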
Summary
• Fuzzy sets provide efficient granulation of feature space
(F -granulation)
• Neural networks are suitable in data-rich environments
and are typically used for extracting embedded
knowledge in the form of rules.
• Genetic algorithms provide efficient search algorithms to
select a model based on preference criterion function.
• Rough sets are used for generating information granules.
• They are Complementary for KDD