Wikification via Link Co-occurrence
Presenter : WU, MIN-CONG
Authors: ZHIYUAN CAI, KAIQI ZHAO, KENNY Q. ZHU, AND HAIXUN WANG
CIKM 2013
Intelligent Database Systems Lab
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Existing approaches relate Wikipedia concepts through the link graph or
link distributions, but link structure and link distributions are often
biased or incomplete by themselves because Wikipedia pages are often
sparsely linked.
Objectives
• We propose an iterative method that enriches sparsely linked articles
by adding more links, and then uses the resulting link co-occurrence
matrix for wikification.
Methodology-framework
Methodology-preprocessing
Methodology-preprocessing
Produced by Algorithm 1
Methodology- co-occurrence matrix generation
Matrix initialization problems
• Computing co-occurrence within the whole article is computationally
demanding.
• An article may cover multiple topics, so concepts that co-occur in it
might not be related at all.
Solution:
• Therefore, we only consider two concepts to co-occur if they are less
than Wc terms apart.
Set:
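The window rule for co-occurrence can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `cooccurrence_counts`, the `(position, concept)` input format, and the example window value are all assumptions made for the sketch. It counts a pair of distinct concepts once for every two mentions that are less than `window` terms apart.

```python
from collections import defaultdict


def cooccurrence_counts(mentions, window):
    """Count concept pairs whose mentions are less than `window` terms apart.

    mentions: list of (term_position, concept) pairs for one article
              (hypothetical input format, assumed for this sketch).
    window:   plays the role of Wc; its value here is illustrative only.
    """
    counts = defaultdict(int)
    ordered = sorted(mentions)  # sort by term position
    for i, (pos_a, concept_a) in enumerate(ordered):
        for pos_b, concept_b in ordered[i + 1:]:
            if pos_b - pos_a >= window:
                break  # every later mention is at least as far away
            if concept_a != concept_b:
                pair = tuple(sorted((concept_a, concept_b)))
                counts[pair] += 1
    return dict(counts)


# Two topics in one article: only nearby concepts are paired.
article = [(0, "Apple_Inc."), (3, "Steve_Jobs"), (40, "Banana"), (42, "Fruit")]
print(cooccurrence_counts(article, window=10))
```

Because the inner loop stops at the first mention outside the window, the cost depends on the number of mentions per window rather than on the square of the article length, which is the point of restricting co-occurrence to nearby terms.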
Methodology- co-occurrence matrix generation
Avoid Scc = 0
No discrimination
Concept 1’s score = 20
Concept 2’s score = 19
Methodology- Wikify New Documents BCC
Next step
Experiment - Parameter Settings
Experiment - Data Preparation
First dataset: Cucerzan
Second dataset: Kulkarni
Third dataset: our own creation, extracted from 25 articles
Experiment - Effects of Wikipedia Corpus Sizes
Increasing the corpus size increases the cost in time and space but does not give a proportional gain.
Experiment - Iteration Results
Accuracy stabilizes above 0.9.
Experiment - End-to-End Wikification Results
Conclusions
• Our evaluation shows that co-occurrence-based wikification can
achieve high accuracy (about 82.58% on F1) efficiently (over 1000
words per second).
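For readers unfamiliar with the F1 measure quoted above, here is a minimal sketch of the standard definition (the harmonic mean of precision and recall) applied to sets of links. The function name, the `(phrase, concept)` pair format, and the toy data are assumptions for illustration; the slides do not show how the authors matched predicted links to gold links.

```python
def precision_recall_f1(predicted, gold):
    """Standard precision, recall and F1 over two sets of links.

    predicted, gold: iterables of (phrase, concept) pairs
                     (hypothetical format, assumed for this sketch).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, not taken from the paper's experiments.
predicted = {("apple", "Apple_Inc."), ("jobs", "Steve_Jobs"), ("fruit", "Fruit")}
gold = {("apple", "Apple_Inc."), ("jobs", "Steve_Jobs"),
        ("mac", "Macintosh"), ("ipod", "IPod")}
print(precision_recall_f1(predicted, gold))
```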
Comments
• Advantages
– High accuracy.
– High efficiency.
• Applications
– Phrase Sense Disambiguation.