Download Ontology lexica and automatic grammar generation

Document related concepts

Macedonian grammar wikipedia , lookup

Sanskrit grammar wikipedia , lookup

Lithuanian grammar wikipedia , lookup

Zulu grammar wikipedia , lookup

Ojibwe grammar wikipedia , lookup

Esperanto grammar wikipedia , lookup

Probabilistic context-free grammar wikipedia , lookup

Inflection wikipedia , lookup

Old Irish grammar wikipedia , lookup

Arabic grammar wikipedia , lookup

Georgian grammar wikipedia , lookup

Portuguese grammar wikipedia , lookup

Construction grammar wikipedia , lookup

Old Norse morphology wikipedia , lookup

Latin syntax wikipedia , lookup

Modern Greek grammar wikipedia , lookup

Lexical semantics wikipedia , lookup

Japanese grammar wikipedia , lookup

Turkish grammar wikipedia , lookup

Ukrainian grammar wikipedia , lookup

Modern Hebrew grammar wikipedia , lookup

Icelandic grammar wikipedia , lookup

Swedish grammar wikipedia , lookup

French grammar wikipedia , lookup

Spanish grammar wikipedia , lookup

Yiddish grammar wikipedia , lookup

Polish grammar wikipedia , lookup

Russian grammar wikipedia , lookup

Ancient Greek grammar wikipedia , lookup

Scottish Gaelic grammar wikipedia , lookup

Junction Grammar wikipedia , lookup

Old English grammar wikipedia , lookup

Serbo-Croatian grammar wikipedia , lookup

Pipil grammar wikipedia , lookup

Transcript
ESSLLI 2012
Ontology-based Interpretation of Natural Language
August 15, 2012
Ontology lexica and
automatic grammar generation
Philipp Cimiano · Christina Unger
Semantic Computing Group
CITEC, Bielefeld University
1 / 44
ESSLLI 2012
Ontology-based Interpretation of Natural Language
These slides are not the
orginal ones produced by
prof. Philipp Cimiano.
They have been edited by
Manuel Fiorelli
([email protected]).
Edited by Manuel Fiorelli ([email protected])
1 / 44
Motivation
So far we have:
I
Ontologies
I
LTAG/DUDES grammars aligned to the ontology
But…
I
Who is going to write those grammars for new domains?
I
What if we want to produce different grammar formats (e.g.
CG, HPSG, GF, etc.)?
Goal: automatize the task of grammar generation
2 / 44
Today
closed
class
…
GRAMMAR domain
Grammar
Generation
LEXICO
N
temporal
Ontology-based
interpretation
QA System
Reasoner
ONTOLO
GY
Application domain
3 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
4 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
lemon: A model for the lexicon-ontology interface
5 / 44
The lexicon-ontology interface
Ontology
Grammar
In order to generate grammars for a given ontology, we need to enrich
the ontology with lexical and linguistic information.
lemon: A model for the lexicon-ontology interface
5 / 44
The lexicon-ontology interface
Ontology
Lexicon
Grammar
In order to generate grammars for a given ontology, we need to enrich
the ontology with lexical and linguistic information.
An ontology lexicon specifies how ontology concepts correspond to
natural language expressions. This can support:
I Interpretation (knowing how natural language expressions should
be interpreted with respect to a given ontology)
I
Generation (knowing how ontology concepts can be verbalized)
lemon: A model for the lexicon-ontology interface
5 / 44
Semantics by reference
Semantics by reference (McCrae et al. 2012)
Lexicon and ontology are clearly separated. The meaning of lexical entries is specified by pointing to elements in the ontology.
I
That is, the ontology can live without the lexicon, while the
lexicon depends on a given ontology.
I
One ontology can be connected to several lexica (e.g. for different
languages).
lemon: A model for the lexicon-ontology interface
6 / 44
lemon
lemon (Lexicon model for ontologies) is a metamodel for describing lexica with RDF.
Most importantly, lemon is agnostic w.r.t linguistic categories.
I
It has general concepts like `lexical entry'.
I
It does not have specific concepts like `noun', 'intransitive verb',
`singular', etc.
lemon is developed in collaboration between CITEC (Bielefeld), DERI (Galway),
UPM (Madrid), DFKI (Saarbrücken), and TU Delft.
lemon: A model for the lexicon-ontology interface
7 / 44
The lemon model
lemon: A model for the lexicon-ontology interface
8 / 44
Lemon core concepts
I
Lexicon
The lexicon is itself a resource and has one associated language.
I
Lexical Entry
Each entry is a resource and is split into three subclasses: Word,
Phrase, Part.
I
Form
Each entry has a number of forms. Each form has a number of
written representations.
I
Lexical Sense
Each entry has a number of senses. Each sense has a reference
pointing to an ontological symbol (entity, class or relation).
lemon: A model for the lexicon-ontology interface
10 / 44
Lexica and entries
Entries are indicated with the entry property.
1
2
@base <http://www.example.org/lexicon> .
@prefix lemon: <http://www.lexinfo.net/ontology/lemon> .
3
4
5
6
7
8
9
10
:lexicon a lemon:Lexicon ;
lemon:language "en" ;
lemon:entry :team,
:match,
:goal,
:win,
... .
lemon: A model for the lexicon-ontology interface
11 /
44
Forms
1
2
@base <http://www.example.org/lexicon> .
@prefix lemon: <http://www.lexinfo.net/ontology/lemon> .
3
4
5
6
:team a lemon:Word ;
lemon:canonicalForm [ lemon:writtenRep "team"@en ] ;
lemon:otherForm
[ lemon:writtenRep "teams"@en ] .
7
8
9
10
1
1
12
:win
a lemon:Word ;
lemon:canonicalForm [ lemon:writtenRep "win"@en ] ;
lemon:otherForm
lemon:otherForm
lemon:otherForm
[ lemon:writtenRep "wins"@en ] ;
[ lemon:writtenRep "won"@en ] ;
[ lemon:writtenRep "winning"@en ] .
lemon: A model for the lexicon-ontology interface
12 / 44
Forms
: Form
writtenRep = "win"@en
other form
win : Word
: Form
writtenRep = "wins"@en
: Form
writtenRep = "won"@en
: Form
writtenRep = "winning"@en
lemon: A model for the lexicon-ontology interface
13 / 44
Senses and references
1
2
:team lemon:sense [ lemon:reference
<http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#Team> ].
3
4
5
:win lemon:sense [ lemon:reference
<http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner> ].
sense
win : Word
: LexicalSense
reference
<http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner>
lemon: A model for the lexicon-ontology interface
14 / 44
Frames
Syntactic arguments: Each word has a subcategorization frame.
I
intransitive: X won (X = subject)
I
transitive: X won Y (X = subject, Y = direct object)
Semantic arguments: Each ontological relation has a number of
arguments.
I
<subject>
soccer:winner
lemon: A model for the lexicon-ontology interface
<object>
.
15 / 44
Frames
Syntactic arguments: Each word has a subcategorization frame.
I
intransitive: X won (X = subject)
I
transitive: X won Y (X = subject, Y = direct object)
Semantic arguments: Each ontological relation has a number of
arguments.
I
<subject> soccer:winner <object> .
We also need to specify the correspondence between syntactic and
semantics arguments.
I
Some team won some game.
I
<soccer:Match> soccer:winner <soccer:Team> .
lemon: A model for the lexicon-ontology interface
15 / 44
Linguistic ontology
synBehavior
: Frame
win : Word
synArg
sense
subjOfProp
: Argument
: LexicalSense
reference
: Argument
lemon: A model for the lexicon-ontology interface
<soccer:winner>
16 / 44
Linguistic ontology
lemon provides a general model but stays agnostic w.r.t. linguistic
theories. Hence there is no way to indicate the syntactic roles of the
arguments (subject, direct object, indirect object, etc.).
For specifics, lemon thus needs to be extended with a linguistic
ontology.
I
LexInfo
I
ISOcat
I
…
lemon: A model for the lexicon-ontology interface
17 / 44
Linguistic ontology
synBehavior
: Frame
win : Word
lexinfo:directObject
sense
subjOfProp
: Argument
: LexicalSense
reference
: Argument
lemon: A model for the lexicon-ontology interface
<soccer:winner>
18 / 44
Lexical properties
All LexInfo's lexical properties are subproperties of lemon's property.
I
lexinfo:partOfSpeech
lexinfo:noun, lexinfo:verb, lexinfo:adjective,…
I
lexinfo:number
lexinfo:singular, lexinfo:plural
I
lexinfo:person
lexinfo:firstPerson, lexinfo:secondPerson,
lexinfo:thirdPerson
I
lexinfo:tense lexinfo:present, lexinfo:past
…
I
lemon: A model for the lexicon-ontology interface
19 / 44
A simple lexical entry
: Form
writtenRep = "win"@en
tense = present
: Form
synBehavior
: TransitiveFrame
win : Word
partOfSpeech = verb
other form
writtenRep = "wins"@en
tense = present person =
thirdPerson number =
singular
sense
: Form
subject
: Argument
writtenRep = "won"@en
verbFormMood = participle
subjOfProp
: LexicalSense
reference
: Argument
lemon: A model for the lexicon-ontology interface
<soccer:winner>
aspect = perfective
: Form
writtenRep = "winning"@en
verbFormMood = gerundive
20 / 44
A simple lexical entry
1
:win a lemon:LexicalEntry ;
2
3
4
5
6
lexinfo:partOfSpeech lexinfo:verb ;
lemon:synBehavior [ rdf:typelexinfo:TransitiveFrame ;
lexinfo:subject _:arg1 ;
lexinfo:directObject _:arg2 ] ;
7
8
9
10
1
1
12
13
14
15
lemon:sense [ lemon:reference
<http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner> ;
lemon:subjOfProp _:arg2 ;
lemon:objOfProp _:arg1 ] ;
lemon:canonicalForm [ lemon:writtenRep "win"@en ;
lexinfo:tense lexinfo:present] ;
lemon:otherForm [ lemon:writtenRep "wins"@en ; ... ] .
lemon: A model for the lexicon-ontology interface
21 / 44
Most common frames (verbs)
I
Intransitive
I
I
I
Transitive
I
I
I
Arguments: subject, direct object
Example: win (smth)
Intransitive PP
I
I
I
Arguments: subject
Example: win
Arguments: subject, prepositional object
Example: win against (so)
Transitive PP
I
I
Arguments: subject, direct object, prepositional object
Example: win (smth) against (so)
lemon: A model for the lexicon-ontology interface
22 / 44
Creating lexica with lemon
I
lemon source:
http://monnetproject.deri.ie/lemonsource/
I
Java API: http://www.lemon-model.net/api.html
lemon: A model for the lexicon-ontology interface
23 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
Automatic grammar generation
24 / 44
Generating grammars from lexica
For every lexicon entry:
Based on the part of speech and syntactic frame…
I
intransitive verb (with PP)
I
transitive verb (with PP)
I
noun (with PP)
I
adjective
…extract relevant properties (word forms, required arguments,
semantic restrictions, etc.) and build corresponding grammar entries.
Automatic grammar generation
24 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
Automatic grammar generation
Verbs
25 / 44
Example: to win
: Form
writtenRep = "wins"@en
tense = present person =
thirdPerson number =
singular
otherForm
synBehavior
: TransitiveFrame
win : Word
partofSpeech = verb
sense
directObject
subject
: LexicalSense
reference
: Argument
: Argument
Automatic grammar generation
Verbs
hhttp://sc.cit-ec.uni-bielefeld.de/
ontologies/soccer#winner i
25 / 44
Example: to win
: Form
writtenRep = "wins"@en
tense = present person =
thirdPerson number =
singular
otherForm
synBehavior
: TransitiveFrame
S
DP1↓
VP
win : Word
partOfSpeech = verb
V
DP2↓
sense
directObject
subject
: LexicalSense
reference
: Argument
: Argument
Automatic grammar generation
Verbs
wins
hhttp://sc.cit-ec.uni-bielefeld.de/
ontologies/soccer#winner i
25 / 44
Example: to win
S
: Form
writtenRep = "wins"@en
tense = present person =
thirdPerson number =
singular
DP1↓
VP
otherForm
: TransitiveFrame
synBehavior
win : Word
partOfSpeech = verb
sense
directObject
subject
V
DP2↓
wins
: LexicalSense
reference
: Argument
: Argument
Automatic grammar generation
Verbs
hhttp://sc.cit-ec.uni-bielefeld.de/
ontologies/soccer#winner i
soccer:winner (y, x)
(x, DP1 ), (y, DP2 )
25 / 44
Grammar entries for transitive verbs
Present (3P singular, others) and past:
S
DP1↓
VP
V
DP2↓
soccer:winner(y, x)
(x, DP1 ), (y, DP2 )
win
wins
won
Automatic grammar generation
Verbs
26 / 44
Grammar entries for transitive verbs
Passive (using past participle):
S
DP2↓
VP
V
is
are
was
were
VP
V
won
PP
P
soccer:winner(y, x)
(x, DP1 ), (y, DP2 )
DP1↓
by
Automatic grammar generation
Verbs
27 / 44
Grammar entries for transitive verbs
Gerundive:
NP
NP∗
ADJ
ADJ
DP2↓
x
soccer:winner(y, x)
(y, DP2 )
winning
Automatic grammar generation
Verbs
28 / 44
Grammar entries for transitive verbs
Gerundive:
NP
NP∗
ADJ
y
ADJ
won
PP
P
soccer:winner(y, x)
(x, DP1 )
DP1↓
by
Automatic grammar generation
Verbs
29 / 44
Grammar entries for transitive verbs
Relative clauses:
NP
NP∗
S
COMP
VP
that
V
x
soccer:winner(y, x)
(y, DP2 )
DP2↓
win
wins
won
Automatic grammar generation
Verbs
30 / 44
Grammar entries for transitive verbs
Relative clauses:
NP
NP∗
S
COMP
VP
y
that
V
is
are
was
were
V
PP
P
won |
by
Automatic grammar generation
Verbs
soccer:winner(y, x)
(x, DP1 )
VP
DP1↓
31 / 44
Intransitive verbs
Example:
S
DP1 ↓ VP
V
y, t
soccer:loser(y, x)
time : hasEnd(y, t)
t < now
(x, DP1 )
lost
Automatic grammar generation
Verbs
32 / 44
Intransitive and transitive verbs with PP
Examples:
I
win against (so.)
I
score from (smth.)
I
win (smth.) against (so.)
I
substitute (so.) for (so.)
Are prepositional arguments subcategorized or do they adjoin?
Automatic grammar generation
Verbs
33 / 44
Intransitive PP (subcategorized)
S
VP
DP1
z
↓
V
win
soccer:winner(z, x)
soccer:loser(z, y)
PP
P
DP3
(x, DP1 ), (y, DP3 )
↓
against
Automatic grammar generation
Verbs
34 / 44
Intransitive PP (adjoining)
S
VP
DP1↓ VP
VP∗
V
P
win
z z
soccer:winner(z, x)
(x, DP1 )
Automatic grammar generation
Verbs
PP
DP3↓
in
z s
soccer:stadium(z, s)
soccer:city(s, y)
(y, DP3 )
35 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
Automatic grammar generation
Proper names
36 / 44
Proper names
DP
x
x
x = entity
name
In our case, entities are represented by URIs. For example:
DP
x
x
x = soccer:Uruguay
Uruguay
Automatic grammar generation
Proper names
36 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
Automatic grammar generation
Nouns
37 / 44
Nouns
I
semantically simple nouns (usually expressing atomic classes)
I
I
semantically complex nouns (not expressing atomic classes)
I
I
goal, match, team
opener, player, coach
relational nouns (usually expressing a property)
I
winner, wife, distance, capital
Automatic grammar generation
Nouns
37 / 44
Semantically simple nouns
1
2
3
4
5
6
7
:goal a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
lemon:canonicalForm [ lemon:writtenRep "goal"@en ;
lexinfo:number lexinfo:singular] ;
lemon:otherForm [ lemon:writtenRep "goals"@en ;
lexinfo:number lexinfo:plural] ;
lemon:sense [ lemon:reference soccer:Goal ] .
NP
DP
goal(s)
goals
x
x
soccer:Goal(x)
Automatic grammar generation
x
soccer:Goal(x)
Nouns
38 / 44
Semantically complex nouns
1
2
3
4
5
:player a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
...
lemon:sense
[ lemon:subsense [ lemon:reference soccer:Person ] ,
[ lemon:reference soccer:role ;
lemon:propertyRange soccer:PlayerRole ] ] .
6
7
NP
DP
player(s)
players
x
r
x
soccer:Person(x)
soccer:role(x, r)
soccer:PlayerRole(r)
Automatic grammar generation
x, r
soccer:Person(x)
soccer:role(x, r)
soccer:PlayerRole(r)
Nouns
39 / 44
Relational nouns
1
2
3
4
:winner a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
...
lemon:sense [ lemon:reference soccer:winner ] .
NP
DP
winner(s)
x
winners
y
x
soccer:winner(y, x)
Automatic grammar generation
xy
soccer:winner(y, x)
Nouns
40 / 44
Alternative: Define additional classes
1
2
3
4
5
6
7
8
9
10
1
1
12
I
Winner ≡ ∃ winner−1 .⊤
I
Player ≡ Person ⊓ ∃ role.PlayerRole
<Declaration><Class IRI="#Player"/></Declaration>
<SubClassOf>
<Class IRI="#Player"/>
<Class IRI="#Person"/>
</SubClassOf>
<SubClassOf>
<Class IRI="#Player"/>
<ObjectSomeValuesFrom>
<ObjectProperty IRI="#role"/>
<Class IRI="#PlayerRole"/>
</ObjectSomeValuesFrom>
</SubClassOf>
Automatic grammar generation
Nouns
41 / 44
Relational nouns (2)
:capacity a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:noun ;
lemon:canonicalForm [lemon:writtenRep «capacity»@en ;
lexinfo:number lexinfo:singular ];
lemon:synBehavior [ a lexinfo:NounPossessiveFrame ;
lexinfo:copulativeArg :y ;
lexinfo:possessiveAdjunct :x ] ;
lemon:sense [ lemon:reference soccer:capacity ;
lemon:subjOfProp :x ;
lemon:objOfProp :y] ].
NP
N
NP
DP2↓ POSS N
PP
capacity P
‘s
DP2↓
capacity
y
soccer:capacity(x, y)
(x, DP2)
of
Automatic grammar generation
Nouns
41 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
Automatic grammar generation
Prepositions
42 / 44
Prepositions
1
2
3
4
5
:by a lemon:LexicalEntry ;
lexinfo:partOfSpeech lexinfo:preposition ;
lemon:canonicalForm [ lemon:writtenRep "by"@en ] ;
lemon:synBehavior [ lexinfo:complement :y ] ;
lemon:sense [ lemon:reference <soccer:byPlayer> ;
lemon:subjOfProp :x ;
lemon:objOfProp
6
7
:y ] .
NP
NP∗
PP
P
DP↓
x
soccer:byPlayer(x, y)
(y, DP)
by
Automatic grammar generation
Prepositions
42 / 44
Outline
lemon: A model for the lexicon-ontology interface
Automatic grammar generation
Verbs
Proper names
Nouns
Prepositions
Adjectives
Automatic grammar generation
Adjectives
43 / 44
Adjectives
NP
x
ADJ
m
soccer:atMinute(x, m)
m < 10
NP∗
early
Automatic grammar generation
Adjectives
43 / 44
Adjectives
AP
x
ADJ
earlier
PP
P
DP↓
mn
soccer:atMinute(x, m)
soccer:atMinute(y, n)
m < n
(y, DP)
than
Automatic grammar generation
Adjectives
43 / 44
Adjectives
NP
x
ADJ
m
soccer:atMinute(x, m)
min(m)
NP∗
earliest
Automatic grammar generation
Adjectives
43 / 44
Tomorrow
closed
class
…
GRAMMAR domain
Grammar
Generation
LEXICO
N
temporal
Ontology-based
interpretation
QA System
Automatic grammar generation
Adjectives
Reasoner
ONTOLO
GY
Application domain
44 / 44