Synonym Analysis for Predicate Expansion
Ziawasch Abedjan and Felix Naumann
Hasso Plattner Institute Potsdam, Germany
ESWC 2013, Montpellier
Linked Open Data
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
Ziawasch Abedjan | Synonym Discovery for Predicate Expansion | ESWC 2013
Linked Data
Using LOD Sources
■ Select ?x where { ?x composer Abba}
subject | predicate | object
Mamma Mia (Movie) | musicComposer | ABBA
Mamma Mia (Movie) | language | English
Mamma Mia (Movie) | country | UK
Knowing Me Knowing You | composer | ABBA
Knowing Me Knowing You | genre | Comedy
Knowing Me Knowing You | origin | UK
■ Select ?x where { ?x musicComposer Abba}
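The effect of the two queries can be reproduced on the six triples from the table. The sketch below is only an illustration: `select_subjects` is a hypothetical helper, not part of any SPARQL library, and the object is written as "ABBA" to match the table rather than the "Abba" spelling in the queries.

```python
# Toy triples copied from the slide's table.
TRIPLES = [
    ("Mamma Mia (Movie)", "musicComposer", "ABBA"),
    ("Mamma Mia (Movie)", "language", "English"),
    ("Mamma Mia (Movie)", "country", "UK"),
    ("Knowing Me Knowing You", "composer", "ABBA"),
    ("Knowing Me Knowing You", "genre", "Comedy"),
    ("Knowing Me Knowing You", "origin", "UK"),
]


def select_subjects(triples, predicates, obj):
    """Return the subjects linked to obj by any of the given predicates."""
    wanted = set(predicates)
    return sorted({s for s, p, o in triples if p in wanted and o == obj})


# Each single-predicate query finds only one of the two works.
only_composer = select_subjects(TRIPLES, ["composer"], "ABBA")
only_music_composer = select_subjects(TRIPLES, ["musicComposer"], "ABBA")

# Treating both predicates as synonyms finds both works.
expanded = select_subjects(TRIPLES, ["composer", "musicComposer"], "ABBA")
```

Either predicate alone misses one of the two works, which is the gap that predicate expansion is meant to close.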
Synonym Discrepancy
■ Domain-specific knowledge is needed
□ What does a property mean?
□ “musicComposer” or “composer”?
■ Inconsistent property usage
□ Data publisher ignores ontology
□ Extracted data
□ Use “weight” or “Person/weight”?
Query Log Analysis: USEWOD 2012
Select ?company where {
……
{?company dbpedia-prop:name "International Business Machines Corporation"@en }
UNION
{?company rdfs:label "International Business Machines Corporation"@en}
…
}
…..
{?place dbpedia-prop:name "Dublin"@en.}
UNION
{?place dbpedia-prop:officialName "Dublin"@en.}
UNION
{?place foaf:name "Dublin"@en.}
UNION: expand the query with synonymously used predicates
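The UNION pattern seen in the query logs can be generated mechanically once a list of synonymously used predicates is known. The following is a minimal sketch under stated assumptions: `expand_with_union` is a hypothetical helper, prefix declarations are omitted, and the literal is inserted without any escaping.

```python
def expand_with_union(variable, predicates, literal, lang="en"):
    """Build a SELECT query with one UNION branch per predicate.

    Assumes prefixed names are declared elsewhere and that the literal
    contains no characters that would need escaping.
    """
    if not predicates:
        raise ValueError("at least one predicate is required")
    branches = [
        '{ ?%s %s "%s"@%s . }' % (variable, predicate, literal, lang)
        for predicate in predicates
    ]
    body = "\n  UNION\n  ".join(branches)
    return "SELECT ?%s WHERE {\n  %s\n}" % (variable, body)


query = expand_with_union(
    "place",
    ["dbpedia-prop:name", "dbpedia-prop:officialName", "foaf:name"],
    "Dublin",
)
print(query)
```

The three predicates are the ones from the Dublin example; in practice the list would come from the synonym analysis rather than being written by hand.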
Dealing with inconsistency through query expansion
■ [Hurtado JoDS 2008]
□ Change query patterns using logical operators
□ E.g., (s,p,ConferenceArticle) → (s,p,Article)
□ But relies on a well-defined ontology [Hurtado JoDS 2008]
■ [Elbassuoni ESWC 2011]
□ Co-occurrence of entities/predicates in “documents”
□ Documents created through RDF facts
□ But, uses external datasets
Our Approach: Synonym Analysis
■ Discovering synonymously used predicates:
{?place dbpedia-prop:officialName "Dublin"@en.}
UNION
{?place foaf:name "Dublin"@en.}
■ Intuition:
□ Predicates should have a similar range [Rahm et al. 2001]
□ Should not co-occur for the same subjects [Cafarella et al. 2008]
■ Methodology:
□ Mining configurations
□ Association rules on statement level
Mining Configurations
■ Association rule mining
□ On statement level
□ One part of the statement (S,P,O) is the context (TID) and another the target (transaction)
Context | Target | Semantics
subject | predicate | Schema analysis
subject | object | Knowledge discovery
predicate | subject | Ontological clustering
predicate | object | Ontological clustering
object | subject | Topical clustering
object | predicate | Schema matching
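A mining configuration can be read as a grouping rule: the context part of each statement acts as the transaction ID, and the target parts collected under it form the transaction. The sketch below illustrates this with the example triples used on the following slides; `build_transactions` and `POSITIONS` are hypothetical names chosen for this illustration.

```python
from collections import defaultdict

# Position of each statement part inside an (S, P, O) tuple.
POSITIONS = {"subject": 0, "predicate": 1, "object": 2}

TRIPLES = [
    ("Mamma Mia", "musicComposer", "ABBA"),
    ("Mamma Mia", "language", "English"),
    ("Mamma Mia", "country", "UK"),
    ("Knowing Me Knowing You", "composer", "ABBA"),
    ("Knowing Me Knowing You", "genre", "Comedy"),
    ("Knowing Me Knowing You", "country", "UK"),
]


def build_transactions(triples, context, target):
    """Group the target values of all statements by their context value."""
    if context == target:
        raise ValueError("context and target must be different parts")
    c, t = POSITIONS[context], POSITIONS[target]
    transactions = defaultdict(set)
    for triple in triples:
        transactions[triple[c]].add(triple[t])
    return dict(transactions)


# Context = object, target = predicate (used for range content filtering).
by_object = build_transactions(TRIPLES, "object", "predicate")
# Context = subject, target = predicate (used for schema analysis).
by_subject = build_transactions(TRIPLES, "subject", "predicate")
```

All six rows of the table correspond to one call of this function with a different (context, target) pair.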
Range Content Filtering
Context | Target
object | predicate

subject | predicate | object
Mamma Mia | musicComposer | ABBA
Mamma Mia | language | English
Mamma Mia | country | UK
Knowing Me Knowing You | composer | ABBA
Knowing Me Knowing You | genre | Comedy
Knowing Me Knowing You | country | UK

object | predicates (transaction)
UK | country, birthPlace...deathPlace
ABBA | composer, musicComposer
Comedy | genre
English | language

Frequent Pattern: {composer, musicComposer}
Much faster than pairwise range matching
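Range content filtering on this example can be sketched as counting how often two predicates appear in the same object transaction. This is a simplified illustration, not the authors' implementation: `frequent_predicate_pairs` is a hypothetical name, only pairs are counted (no general frequent itemset mining), and `min_support` is an absolute count, which is an assumption made here for brevity.

```python
from collections import Counter, defaultdict
from itertools import combinations

TRIPLES = [
    ("Mamma Mia", "musicComposer", "ABBA"),
    ("Mamma Mia", "language", "English"),
    ("Mamma Mia", "country", "UK"),
    ("Knowing Me Knowing You", "composer", "ABBA"),
    ("Knowing Me Knowing You", "genre", "Comedy"),
    ("Knowing Me Knowing You", "country", "UK"),
]


def frequent_predicate_pairs(triples, min_support=1):
    """Count predicate pairs that share objects; keep the frequent ones."""
    # One transaction per object: the set of predicates pointing at it.
    by_object = defaultdict(set)
    for _subject, predicate, obj in triples:
        by_object[obj].add(predicate)

    # Every pair inside a transaction shares that object as a range value.
    counts = Counter()
    for predicates in by_object.values():
        for pair in combinations(sorted(predicates), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


candidates = frequent_predicate_pairs(TRIPLES)
```

On the six example triples the only pair sharing an object is composer and musicComposer, which matches the frequent pattern on the slide.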
Schema Analysis
Context | Target
subject | predicate

subject | predicate | object
Mamma Mia | musicComposer | ABBA
Mamma Mia | language | English
Mamma Mia | country | UK
Knowing Me Knowing You | composer | ABBA
Knowing Me Knowing You | genre | Comedy
Knowing Me Knowing You | country | UK

subject | predicates (transaction)
Mamma Mia | musicComposer, language, country
KMKY | composer, genre, country

Negative association rule:
composer -> NOT musicComposer
musicComposer -> NOT composer
■ Aggregate confidence of both rules:
□ Maximum, minimum, f-measure
□ Reverse correlation coefficient (RCC)
□ Syn [Cafarella VLDB 2008]
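The two negative rules and their simpler aggregations can be worked through on the subject transactions above. The sketch assumes the usual definition conf(A -> NOT B) = 1 - supp(A and B) / supp(A) and takes the f-measure to be the harmonic mean of the two confidences; both are assumptions of this illustration. RCC and Syn are left out because the slides do not define them.

```python
# Subject transactions from the slide (KMKY = Knowing Me Knowing You).
TRANSACTIONS = {
    "Mamma Mia": {"musicComposer", "language", "country"},
    "KMKY": {"composer", "genre", "country"},
}


def negative_rule_confidences(transactions, a, b):
    """Return (conf(a -> NOT b), conf(b -> NOT a)) over subject transactions."""
    with_a = [t for t in transactions.values() if a in t]
    with_b = [t for t in transactions.values() if b in t]
    if not with_a or not with_b:
        raise ValueError("both predicates must occur at least once")
    both = sum(1 for t in with_a if b in t)
    return 1 - both / len(with_a), 1 - both / len(with_b)


def aggregate(conf_1, conf_2):
    """Combine the two rule confidences into single scores."""
    total = conf_1 + conf_2
    f_measure = 0.0 if total == 0 else 2 * conf_1 * conf_2 / total
    return {
        "max": max(conf_1, conf_2),
        "min": min(conf_1, conf_2),
        "f_measure": f_measure,
    }


synonym_like = negative_rule_confidences(
    TRANSACTIONS, "composer", "musicComposer"
)
co_occurring = negative_rule_confidences(TRANSACTIONS, "country", "language")
```

composer and musicComposer never co-occur for the same subject, so both rules hold with full confidence; country and language co-occur on Mamma Mia, so their scores drop.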
Algorithm Workflow
■ Input: RDF graph
1. Range content filtering (RCF)
□ Mine predicates in the context of objects
□ Retrieve frequent candidate pairs
2. Schema Analysis
□ Mine predicates in the context of subjects
□ Keep pairs with high negative correlation
■ Output: candidates for predicate expansion
Precision Evaluation
[Figure: precision (0 to 100) across datasets, with series minConf, maxConf, fMeasure, RCC, Syn, and RCF]
Precision and Recall
■ DBpedia Work
■ Manually classified 9456 pairs of properties
■ 3 computer scientists agreed on 82 candidates
■ Top 5 results:
1. artist, starring
2. artist, musicComposer
3. author, writer
4. creator, writer
5. composer, musicComposer
[Figure: precision vs. recall at RCF support 0.1%, with series minConf, maxConf, fmeasure, RCC, and Syn]
[Figure: precision vs. recall at RCF support 0.01%, with series minConf, maxConf, fmeasure, RCC, and Syn]
Runtime comparison for RCF
■ Our approach: Mine predicates in the context of objects
■ Naïve approach: Look for value overlaps for each predicate pair
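For contrast, the naïve baseline can be sketched as follows: collect the set of object values for every predicate, then intersect the sets for every pair of predicates. This is an illustrative reading of "value overlaps for each predicate pair", not the code that was benchmarked; `naive_overlapping_pairs` and `min_overlap` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations

TRIPLES = [
    ("Mamma Mia", "musicComposer", "ABBA"),
    ("Mamma Mia", "language", "English"),
    ("Mamma Mia", "country", "UK"),
    ("Knowing Me Knowing You", "composer", "ABBA"),
    ("Knowing Me Knowing You", "genre", "Comedy"),
    ("Knowing Me Knowing You", "country", "UK"),
]


def naive_overlapping_pairs(triples, min_overlap=1):
    """Compare the object sets of every predicate pair directly."""
    # Range of each predicate: all object values it is used with.
    ranges = defaultdict(set)
    for _subject, predicate, obj in triples:
        ranges[predicate].add(obj)

    # Every pair of predicates is compared, whether or not they overlap.
    overlaps = {}
    for a, b in combinations(sorted(ranges), 2):
        shared = ranges[a] & ranges[b]
        if len(shared) >= min_overlap:
            overlaps[(a, b)] = len(shared)
    return overlaps


pairs = naive_overlapping_pairs(TRIPLES)
```

With five distinct predicates this sketch already performs ten set intersections to find one overlapping pair, whereas mining in the context of objects only generates pairs that actually share an object.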
Conclusion and Future Work
■ Deal with inconsistent LOD data through query expansion
■ Synonym analysis for query expansion
□ Mining on RDF statement level
□ Works best on domain-specific data
□ Value overlap computation is orders of magnitude faster than the naïve approach
■ Future work:
□ Instead of improving data usability through synonym discovery, improve the data itself