A fuzzy conceptualization model
for text mining with application in opinion
polarity classification
Presenter: JIAN-REN CHEN
Authors: Sheng-Tun Li, Fu-Ching Tsai
2013, KBS
Intelligent Database Systems Lab
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
Most existing document classification algorithms are easily
affected by ambiguous terms.
A classifier's ability to disambiguate is thus as important as
its ability to classify accurately.
- opinion polarity classification
Objectives
We propose a concept-driven text classification approach based on
Formal Concept Analysis (FCA) to train a classifier using concepts
instead of documents, so as to reduce the inherent ambiguities.
We further utilize fuzzy formal concept analysis (FFCA) to take
uncertain information into consideration.
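As a rough illustration of how uncertain information might enter the picture, the sketch below stores a membership degree in [0, 1] for each review-term pair and keeps only the pairs whose degree reaches a threshold. The degrees, the threshold name `alpha`, and the helper `alpha_cut` are assumptions made for illustration; they are not taken from the paper.

```python
# Hypothetical fuzzy context: membership degree of each term in each review.
# All degrees below are invented for illustration.
fuzzy_context = {
    "Review6": {"Phenomenal": 0.9, "Love": 0.7, "Cover": 0.1},
    "Review7": {"Phenomenal": 0.8, "Love": 0.6},
    "Review5": {"Cover": 0.9, "Love": 0.2},
}


def alpha_cut(context, alpha):
    """Return a crisp context that keeps only pairs with degree >= alpha."""
    return {
        obj: {term for term, degree in terms.items() if degree >= alpha}
        for obj, terms in context.items()
    }


# Assumed threshold; a different data set may call for a different value.
crisp = alpha_cut(fuzzy_context, alpha=0.5)
```

With a threshold of 0.5, weak associations such as "Cover" in Review6 are dropped before any concepts are formed, which is one way low-confidence term occurrences could be kept from influencing the classifier.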
Formal concept analysis
positive class:
"Phenomenal", "Fantastic" and "Love"
{Review1, Review4, Review6 and Review7}
negative class:
"Awful"
{Review2, Review3}
neutral class:
"Cover"
{Review5}
Objects: {Review6, Review7}
Attributes: {Phenomenal, Fantastic, Love}
=> formal concept
Formal concept analysis
positive class:
{Review1, Review4, Review6, Review7}
negative class:
{Review2, Review3}
neutral class:
{Review5}
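The review example can be checked with the two standard FCA derivation operators: the terms shared by a set of reviews, and the reviews containing a set of terms. A pair is a formal concept when each side derives the other. In the minimal sketch below, the incidence table `context` is hypothetical: it is chosen only to be consistent with the classes listed on the slides, since the actual review-term data is not shown.

```python
# Hypothetical incidence table (review -> terms it contains), chosen to be
# consistent with the slides; the real review-term data is not given.
context = {
    "Review1": {"Phenomenal"},
    "Review2": {"Awful"},
    "Review3": {"Awful"},
    "Review4": {"Fantastic", "Love"},
    "Review5": {"Cover"},
    "Review6": {"Phenomenal", "Fantastic", "Love"},
    "Review7": {"Phenomenal", "Fantastic", "Love"},
}

all_terms = set().union(*context.values())


def common_attributes(objects):
    """Terms shared by every review in `objects`."""
    term_sets = [context[o] for o in objects]
    return set.intersection(*term_sets) if term_sets else set(all_terms)


def common_objects(attributes):
    """Reviews that contain every term in `attributes`."""
    return {o for o, terms in context.items() if attributes <= terms}


def is_formal_concept(objects, attributes):
    """A pair is a formal concept when each side derives the other."""
    return (common_attributes(objects) == attributes
            and common_objects(attributes) == objects)


pair_ok = is_formal_concept(
    {"Review6", "Review7"}, {"Phenomenal", "Fantastic", "Love"}
)
```

Under this assumed table, ({Review6, Review7}, {Phenomenal, Fantastic, Love}) passes the check, whereas ({Review1}, {Phenomenal}) does not, because "Phenomenal" also occurs in Review6 and Review7.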
Methodology - Architecture
Methodology
Measures: tf-idf, Inverted Conformity Frequency (ICF), Uniformity (Uni)
Thresholds:
tf-idf > 26
ICF < log(2)
Uni > 0.2
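The thresholds above can be read as a filter over candidate terms. The sketch below assumes the three scores have already been computed for each term (the slide does not give their formulas), that a term must satisfy all three conditions at once, and that log means the natural logarithm. The function `select_terms` and the sample scores are hypothetical.

```python
import math

TFIDF_MIN = 26
ICF_MAX = math.log(2)  # assumption: natural log; the slide does not state the base
UNI_MIN = 0.2


def select_terms(scores):
    """Keep terms whose precomputed (tfidf, icf, uni) pass all three thresholds.

    Combining the conditions with a logical AND is an assumption.
    """
    return [
        term
        for term, (tfidf, icf, uni) in scores.items()
        if tfidf > TFIDF_MIN and icf < ICF_MAX and uni > UNI_MIN
    ]


# Invented scores, for illustration only.
sample_scores = {
    "phenomenal": (31.0, 0.40, 0.35),
    "cover": (40.0, 1.10, 0.50),   # fails the ICF threshold
    "awful": (12.0, 0.20, 0.60),   # fails the tf-idf threshold
}
selected = select_terms(sample_scores)
```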
Experiments - Data set and evaluation
• Data sets:
- Reuters-21578
- movie reviews
- e-book reviews
• Evaluation
Experiments (parameters)
Experiments (conceptualization)
Conclusions
• FFCM successfully reduces the impact of textual ambiguity.
• The experimental results show that FFCM
outperforms other state-of-the-art algorithms on both
Reuters-21578 and the two opinion polarity collections.
Comments
• Advantages
- formal concepts play an important role
• Disadvantages
- the appropriate α may differ across datasets
- focuses only on single-class classification
• Applications
- text mining