Explass: Exploring Associations between Entities
via Top-K Ontological Patterns and Facets
Gong Cheng, Yanan Zhang, and Yuzhong Qu
Contents







Introduction
Association Definition
Overview of Explass
Approach
Evaluation
Conclusion
Next work
Introduction

What are the associations between A and B?
Introduction (Related Work)


Association Discovery and Ranking
Exploratory Association Search
Introduction
Our work:
provides a flat list (top-K, rather than a hierarchy) of clusters for refocusing.
mines all the significant patterns
finds the top-K ones that are as frequent and informative as possible while sharing small overlap with each other.
integrates patterns with facet values.
Association Definition
Association (Path-based)

Given a graph $G = \langle V, A, s, t, l_V, l_A \rangle$, a path $v_0 a_1 \cdots a_n v_n$ from $e_S = l_V(v_0)$ to $e_E = l_V(v_n)$ gives the association $Z = r_1 e_1 \cdots e_{n-1} r_n$, where
for $1 \le i \le n-1$, $e_i = l_V(v_i)$, and
for $1 \le i \le n$, if $s(a_i) = v_{i-1}$, then $r_i = l_A(a_i)$; otherwise, $r_i = \tilde{l}_A(a_i)$.
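The path-based definition can be illustrated with a minimal Python sketch. It assumes a hypothetical toy graph (the `ARCS` list below is invented for illustration), ignores arc direction when walking, and writes the label of an arc traversed against its direction as `~relation`, standing in for the tilde notation. The function name `associations` and the length bound `max_len` are assumptions, not part of the slides.

```python
# Hypothetical toy graph: each arc is (source, relation, target).
ARCS = [
    ("PaperA", "firstAuthor", "Alice"),
    ("PaperA", "secondAuthor", "Bob"),
    ("PaperA", "inProcOf", "ConfA"),
    ("Bob", "chair", "ConfA"),
]


def associations(arcs, start, end, max_len):
    """Return associations r1 e1 ... e(n-1) rn for simple paths with n <= max_len.

    Arcs may be walked in either direction; a backward step is labeled '~relation'.
    """
    adj = {}
    for s, r, t in arcs:
        adj.setdefault(s, []).append((r, t))        # arc followed forwards
        adj.setdefault(t, []).append(("~" + r, s))  # arc followed backwards
    found = []

    def dfs(node, visited, seq, n_arcs):
        if n_arcs == max_len:          # no budget left for another arc
            return
        for rel, nxt in adj.get(node, []):
            if nxt == end:             # path complete: record relation/entity sequence
                found.append(tuple(seq + [rel]))
            elif nxt not in visited:   # keep the path simple (no repeated vertex)
                dfs(nxt, visited | {nxt}, seq + [rel, nxt], n_arcs + 1)

    dfs(start, {start}, [], 0)
    return found


for z in associations(ARCS, "Alice", "Bob", 3):
    print(z)
```

Each returned tuple alternates relations and intermediate entities, mirroring $Z = r_1 e_1 \cdots e_{n-1} r_n$; the two endpoints are left out because they are fixed by the query.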
Ontological association pattern

$P = r'_1 c'_1 \cdots c'_{n-1} r'_n$; an association $Z$ matches $P$, denoted by $Z \in M(P)$, if
for $1 \le i \le n-1$, $e_i \in I(c'_i)$, and
for $1 \le i \le n$, $r_i \sqsubseteq_R r'_i$.
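As a rough sketch of the matching condition $Z \in M(P)$, the Python below checks each position of an association against a pattern. The schema dictionaries (`SUPERCLASS`, `SUPERREL`, `TYPES`) are hypothetical, and the sketch assumes each class or relation has at most one direct parent and that the hierarchies are acyclic.

```python
# Hypothetical schema: child -> direct parent (single inheritance assumed).
SUPERCLASS = {"ConfPaper": "Publication"}
SUPERREL = {
    "firstAuthor": "author", "secondAuthor": "author",
    "~firstAuthor": "~author", "~secondAuthor": "~author",
}
TYPES = {"PaperA": {"ConfPaper"}}  # entity -> directly asserted classes


def ancestors(x, parent):
    """x together with everything reachable through parent links."""
    out = {x}
    while x in parent:
        x = parent[x]
        out.add(x)
    return out


def matches(assoc, pattern):
    """True if assoc = (r1, e1, ..., rn) is an instance of pattern = (r'1, c'1, ..., r'n)."""
    if len(assoc) != len(pattern):
        return False
    for i, (a, p) in enumerate(zip(assoc, pattern)):
        if i % 2 == 0:
            # Relation position: r_i must equal or specialize r'_i.
            if p not in ancestors(a, SUPERREL):
                return False
        else:
            # Entity position: e_i must be an instance of c'_i.
            classes = set()
            for c in TYPES.get(a, ()):
                classes |= ancestors(c, SUPERCLASS)
            if p not in classes:
                return False
    return True


z = ("~firstAuthor", "PaperA", "secondAuthor")
print(matches(z, ("~author", "Publication", "author")))
print(matches(z, ("~author", "Conference", "author")))
```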
Association definition
[Figure: example graph with entities Alice, Bob, PaperA, PaperB, PaperC, PaperD, ArticleA, ConfA, and ConfB, connected by the relations firstAuthor, secondAuthor, inProcOf, cites, extends, chair, and reviewer.]
Overview of Explass
[Figure: Explass interface, annotated with: filters in use; facet values (classes); facet values (relations); "Click to use this pattern as a filter"; associations matching a recommended pattern; associations not matching any recommended pattern.]
Pattern Recommendation
Mining Significant Patterns
[Figure labels: secondAuthor, Author, RELATED, psc(PaperA), ConfPaper, Publication, ENTITY.]
To characterize the relevance of pattern P to the query
context,
2/5
1/5
…
Pattern Recommendation
Mining Significant Patterns

Data mining
Frequent closed itemset mining problem (FCIMP)
Encode the path structure
Association -> transaction
Item: a position-relation pair in $\{1, 3, \ldots, 2n-1\} \times \Sigma_R$ or a position-class pair in $\{2, 4, \ldots, 2n-2\} \times \Sigma_C$
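The encoding above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the `GENERAL` table (each label mapped to itself plus its generalizations, or an entity mapped to its classes) and the sample associations are hypothetical, and the miner is a naive one that relies on the fact that closed itemsets are exactly the intersections of transactions. A real system would use a dedicated closed-itemset algorithm.

```python
# Hypothetical generalizations: relation -> itself and super-relations,
# entity -> its classes and superclasses.
GENERAL = {
    "~firstAuthor": {"~firstAuthor", "~author"},
    "~secondAuthor": {"~secondAuthor", "~author"},
    "firstAuthor": {"firstAuthor", "author"},
    "secondAuthor": {"secondAuthor", "author"},
    "PaperA": {"ConfPaper", "Publication"},
    "ArticleA": {"Article", "Publication"},
}

ASSOCS = [
    ("~firstAuthor", "PaperA", "secondAuthor"),
    ("~secondAuthor", "PaperA", "firstAuthor"),
    ("~firstAuthor", "ArticleA", "secondAuthor"),
]


def encode(assoc):
    """Association -> transaction of (position, label) items.

    Odd positions carry relations, even positions carry classes.
    """
    return frozenset(
        (pos, g)
        for pos, label in enumerate(assoc, start=1)
        for g in GENERAL[label]
    )


def closed_frequent_itemsets(transactions, min_support):
    """Naive miner: close the transaction set under pairwise intersection."""
    closed = set(transactions)
    frontier = set(transactions)
    while frontier:
        new = set()
        for a in frontier:
            for b in closed:
                c = a & b
                if c and c not in closed:
                    new.add(c)
        closed |= new
        frontier = new
    result = {}
    for c in closed:
        support = sum(1 for t in transactions if c <= t)
        if support >= min_support:
            result[c] = support
    return result


txs = [encode(a) for a in ASSOCS]
closed = closed_frequent_itemsets(txs, min_support=2)
for itemset, support in sorted(closed.items(), key=lambda kv: -kv[1]):
    print(support, sorted(itemset))
```

Tagging every item with its position is what preserves the path structure: the same relation at position 1 and at position 3 becomes two different items.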
Finding Frequent, Informative, and Small-Overlapping Patterns

Informativeness

self-information (specific)

entropy
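For reference, the two information-theoretic quantities named here have standard definitions: self-information $-\log_2 p(x)$ and entropy $-\sum_x p(x) \log_2 p(x)$. The slides do not show how Explass combines them into a pattern score, so the Python below only sketches the textbook quantities. The function names and the idea of estimating probabilities from occurrence counts are assumptions.

```python
import math


def self_information(count, total):
    """Self-information in bits of an item seen `count` times out of `total`.

    Rarer (more specific) items carry more information.
    """
    return -math.log2(count / total)


def entropy(counts):
    """Shannon entropy in bits of the distribution given by occurrence counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# A class covering 1 of 8 entities is more informative than one covering 4 of 8.
print(self_information(1, 8), self_information(4, 8))
print(entropy([1, 1]), entropy([4]))
```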
Finding Frequent, Informative, and Small-Overlapping Patterns

Overlap
ontological overlap
contextual overlap

$P = r_1 c_1 \cdots c_{n-1} r_n$
$P' = r'_1 c'_1 \cdots c'_{n-1} r'_n$
hits(P)
hits(P')
Optimization



find up to K ones that are as frequent and informative as
possible while sharing small overlap between each other
Multidimensional 0-1 knapsack problem (MKP)
Greedy heuristic
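A greedy selection of this kind might look like the Python sketch below. It is not the authors' heuristic: the score (frequency times informativeness), the Jaccard overlap on `hits` sets, the threshold `max_overlap`, and all sample values are assumptions chosen only to show the shape of the procedure.

```python
# Hypothetical candidates: pattern -> set of ids of the associations it matches.
HITS = {"P1": {1, 2, 3, 4}, "P2": {1, 2, 3}, "P3": {5, 6}}
# Hypothetical informativeness values.
INFO = {"P1": 1.0, "P2": 2.0, "P3": 1.5}


def score(p):
    """Assumed objective: frequency weighted by informativeness."""
    return len(HITS[p]) * INFO[p]


def overlap(p, q):
    """Assumed overlap measure: Jaccard similarity of the two hit sets."""
    return len(HITS[p] & HITS[q]) / len(HITS[p] | HITS[q])


def greedy_top_k(candidates, k, max_overlap):
    """Pick up to k patterns, best score first, skipping any candidate that
    overlaps an already chosen pattern by more than max_overlap."""
    chosen = []
    for p in sorted(candidates, key=score, reverse=True):
        if len(chosen) == k:
            break
        if all(overlap(p, q) <= max_overlap for q in chosen):
            chosen.append(p)
    return chosen


print(greedy_top_k(HITS, k=2, max_overlap=0.5))
```

In this toy run P1 is skipped even though it scores higher than P3, because it shares most of its hits with the already selected P2.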
Facet Value Recommendation
K classes of entities and K relations
frequency
informativeness
overlap

Evaluation


To investigate how patterns and facets help users
explore associations in practice
Two hypotheses


H1. For association exploration, providing a flat list (top-K) of frequent,
informative, and small-overlapping patterns (as on Explass) is more satisfying
than an inclusive hierarchy of patterns (as on RelClus).
H2. Patterns and facets are notably complementary in terms of usage in
association exploration, and thus providing both of them (as on Explass) is
more satisfying than only one of them (as on RelFinder and RelClus).
Evaluation

Data Sets: DBpedia

Tasks


Derived from the 100 training queries (QALD-3 evaluation campaign)

Related entities that “people search for” by Google Search

26 tasks
Explass vs RelClus vs RF (reproduced RelFinder)
Results

User Experience
Results

User Behavior
User Feedback and Discussion
RelClus
6 subjects (30%): provided a good overview of all the associations and helped refocus on a particular theme
11 subjects (55%): patterns at a high level were often too general to be useful; confused about the deep and complicated hierarchies
RF
5 subjects (25%): recommended classes and relations were useful filters
8 subjects (40%): needed a better overview for summarizing associations
User Feedback and Discussion
Explass
14 subjects (70%): provided a good summary of associations and helped refocus on a particular theme, while recommended facet values helped filter associations
11 subjects (55%): some very large clusters could be divided into smaller ones
As to H1: Explass considered the informativeness of patterns in recommendation.
As to H2: patterns provided an overview that meaningfully summarized significant subsets of associations covering diverse themes to be refocused on, while facets provided useful filters for refining the search.
Conclusion

realized exploratory association search in a new way by
recommending Top-K patterns and facet values, which have
been shown to be notably complementary in terms of usage:
patterns for summarizing and refocusing, and facets for
refining and filtering.
Next work
Discover implicit semantic associations between entities
Other types of associations
Type similarity
<p, l> similarity
…
Named associations with understandable and meaningful labels.
Virtual properties
Semantic metrics
Ranking (pruning)
Thanks