ESSLLI 2012 Ontology-based Interpretation of Natural Language
August 15, 2012

Ontology lexica and automatic grammar generation

Philipp Cimiano · Christina Unger
Semantic Computing Group, CITEC, Bielefeld University

These slides are not the original ones produced by Prof. Philipp Cimiano. They have been edited by Manuel Fiorelli ([email protected]).

Motivation

So far we have:
- Ontologies
- LTAG/DUDES grammars aligned to the ontology

But…
- Who is going to write those grammars for new domains?
- What if we want to produce different grammar formats (e.g. CG, HPSG, GF, etc.)?

Goal: automate the task of grammar generation.

Today

[Figure: architecture overview. Labels: closed class, …, GRAMMAR, domain, temporal, Grammar Generation, LEXICON, ONTOLOGY, Ontology-based interpretation, Reasoner, QA System, Application domain.]

Outline

- lemon: A model for the lexicon-ontology interface
- Automatic grammar generation
  - Verbs
  - Proper names
  - Nouns
  - Prepositions
  - Adjectives

lemon: A model for the lexicon-ontology interface

The lexicon-ontology interface

[Figure: Ontology, Lexicon, Grammar.]

In order to generate grammars for a given ontology, we need to enrich the ontology with lexical and linguistic information.

An ontology lexicon specifies how ontology concepts correspond to natural language expressions.
This can support:
- Interpretation (knowing how natural language expressions should be interpreted with respect to a given ontology)
- Generation (knowing how ontology concepts can be verbalized)

Semantics by reference

Semantics by reference (McCrae et al. 2012): lexicon and ontology are clearly separated. The meaning of lexical entries is specified by pointing to elements in the ontology.
- That is, the ontology can live without the lexicon, while the lexicon depends on a given ontology.
- One ontology can be connected to several lexica (e.g. for different languages).

lemon

lemon (Lexicon model for ontologies) is a metamodel for describing lexica with RDF.

Most importantly, lemon is agnostic w.r.t. linguistic categories.
- It has general concepts like 'lexical entry'.
- It does not have specific concepts like 'noun', 'intransitive verb', 'singular', etc.

lemon is developed in collaboration between CITEC (Bielefeld), DERI (Galway), UPM (Madrid), DFKI (Saarbrücken), and TU Delft.

The lemon model

[Figure not preserved in this transcript.]

lemon core concepts

- Lexicon: The lexicon is itself a resource and has one associated language.
- Lexical Entry: Each entry is a resource and belongs to one of three subclasses: Word, Phrase, Part.
- Form: Each entry has a number of forms. Each form has a number of written representations.
- Lexical Sense: Each entry has a number of senses. Each sense has a reference pointing to an ontological symbol (entity, class or relation).

Lexica and entries

Entries are indicated with the entry property.

    @base <http://www.example.org/lexicon> .
    # Assumed addition: the slide declares only @base, but names such as
    # :lexicon and :team need an empty prefix to be valid Turtle.
    @prefix : <http://www.example.org/lexicon#> .
    @prefix lemon: <http://www.lexinfo.net/ontology/lemon> .

    :lexicon a lemon:Lexicon ;
        lemon:language "en" ;
        lemon:entry :team, :match, :goal, :win, ... .
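The core concepts listed above (lexicon, lexical entry, form, lexical sense) can be mirrored in a few plain data classes. The following Python sketch is only an illustration of how the pieces relate; it is not the lemon Java API, and all class, field and function names (`Form`, `LexicalEntry`, `entries_for`, ...) are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    written_rep: str              # e.g. "team"
    language: str = "en"


@dataclass
class LexicalSense:
    reference: str                # IRI of an ontology entity, class or relation


@dataclass
class LexicalEntry:
    name: str
    canonical_form: Form
    other_forms: list = field(default_factory=list)
    senses: list = field(default_factory=list)


@dataclass
class Lexicon:
    language: str                 # one associated language per lexicon
    entries: list = field(default_factory=list)

    def entries_for(self, reference):
        """Names of all entries that have a sense pointing at `reference`."""
        return [e.name for e in self.entries
                if any(s.reference == reference for s in e.senses)]


SOCCER = "http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#"

team = LexicalEntry("team", Form("team"), [Form("teams")],
                    [LexicalSense(SOCCER + "Team")])
win = LexicalEntry("win", Form("win"),
                   [Form("wins"), Form("won"), Form("winning")],
                   [LexicalSense(SOCCER + "winner")])
lexicon = Lexicon("en", [team, win])
```

Because senses only point into the ontology, a lookup such as `lexicon.entries_for(SOCCER + "winner")` runs from an ontology symbol to its verbalizations, which is the direction needed for generation.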
Forms

    @base <http://www.example.org/lexicon> .
    # Assumed addition: empty prefix needed by :team and :win.
    @prefix : <http://www.example.org/lexicon#> .
    @prefix lemon: <http://www.lexinfo.net/ontology/lemon> .

    :team a lemon:Word ;
        lemon:canonicalForm [ lemon:writtenRep "team"@en ] ;
        lemon:otherForm [ lemon:writtenRep "teams"@en ] .

    :win a lemon:Word ;
        lemon:canonicalForm [ lemon:writtenRep "win"@en ] ;
        lemon:otherForm [ lemon:writtenRep "wins"@en ] ;
        lemon:otherForm [ lemon:writtenRep "won"@en ] ;
        lemon:otherForm [ lemon:writtenRep "winning"@en ] .

[Figure: the entry win : Word linked to four Form nodes with writtenRep = "win"@en, "wins"@en, "won"@en and "winning"@en; the last three are attached as other forms.]

Senses and references

    :team lemon:sense [ lemon:reference <http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#Team> ] .

    :win lemon:sense [ lemon:reference <http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner> ] .

[Figure: win : Word has a sense, a LexicalSense whose reference is <http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner>.]

Frames

Syntactic arguments: Each word has a subcategorization frame.
- intransitive: X won (X = subject)
- transitive: X won Y (X = subject, Y = direct object)

Semantic arguments: Each ontological relation has a number of arguments.
- <subject> soccer:winner <object> .

We also need to specify the correspondence between syntactic and semantic arguments.
- Some team won some game.
- <soccer:Match> soccer:winner <soccer:Team> .
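The correspondence between syntactic and semantic arguments can be made concrete with a small Python sketch. It assumes that a frame and a sense are given as plain dictionaries and that shared argument identifiers (here the hypothetical `arg1` and `arg2`) tie a syntactic role to a position in the ontology relation; the function name `map_arguments` is likewise invented for this illustration.

```python
def map_arguments(frame, sense):
    """Pair each syntactic role with the position (subject or object of the
    ontology property) that shares its argument identifier."""
    position_of = {
        sense["subjOfProp"]: "subject of property",
        sense["objOfProp"]: "object of property",
    }
    return {role: position_of[arg] for role, arg in frame.items()}


# "Some team won some game."  vs.  <soccer:Match> soccer:winner <soccer:Team> .
win_frame = {"subject": "arg1", "directObject": "arg2"}
win_sense = {
    "reference": "soccer:winner",
    "subjOfProp": "arg2",   # the match, expressed as the direct object
    "objOfProp": "arg1",    # the team, expressed as the syntactic subject
}

mapping = map_arguments(win_frame, win_sense)
```

For "win" the mapping is crossed: the syntactic subject (the team) fills the object position of soccer:winner, and the direct object (the game) fills its subject position.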
Linguistic ontology

[Figure: win : Word has a synBehavior (a Frame) with a synArg (an Argument), and a sense (a LexicalSense) with reference <soccer:winner>; subjOfProp links the sense to an Argument.]

lemon provides a general model but stays agnostic w.r.t. linguistic theories. Hence there is no way to indicate the syntactic roles of the arguments (subject, direct object, indirect object, etc.).

For specifics, lemon thus needs to be extended with a linguistic ontology.
- LexInfo
- ISOcat
- …

[Figure: the same entry, with the frame argument now labelled lexinfo:directObject.]

Lexical properties

All of LexInfo's lexical properties are subproperties of lemon's generic property, lemon:property.
- lexinfo:partOfSpeech: lexinfo:noun, lexinfo:verb, lexinfo:adjective, …
- lexinfo:number: lexinfo:singular, lexinfo:plural
- lexinfo:person: lexinfo:firstPerson, lexinfo:secondPerson, lexinfo:thirdPerson
- lexinfo:tense: lexinfo:present, lexinfo:past
- …

A simple lexical entry

[Figure: the entry win : Word (partOfSpeech = verb) with
- Form "win"@en (tense = present)
- Form "wins"@en (tense = present, person = thirdPerson, number = singular)
- Form "won"@en (verbFormMood = participle, aspect = perfective)
- Form "winning"@en (verbFormMood = gerundive)
- synBehavior: a TransitiveFrame with two Arguments, one labelled subject
- sense: a LexicalSense with reference <soccer:winner>, linked to an Argument via subjOfProp]

    :win a lemon:LexicalEntry ;
        lexinfo:partOfSpeech lexinfo:verb ;
        lemon:synBehavior [ rdf:type lexinfo:TransitiveFrame ;
                            lexinfo:subject _:arg1 ;
                            lexinfo:directObject _:arg2 ] ;
        lemon:sense [ lemon:reference <http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner> ;
                      lemon:subjOfProp _:arg2 ;
                      lemon:objOfProp _:arg1 ] ;
        lemon:canonicalForm [ lemon:writtenRep "win"@en ;
                              lexinfo:tense lexinfo:present ] ;
        lemon:otherForm [ lemon:writtenRep "wins"@en ; ... ] .

Most common frames (verbs)

- Intransitive
  - Arguments: subject
  - Example: win
- Transitive
  - Arguments: subject, direct object
  - Example: win (smth)
- Intransitive PP
  - Arguments: subject, prepositional object
  - Example: win against (so)
- Transitive PP
  - Arguments: subject, direct object, prepositional object
  - Example: win (smth) against (so)

Creating lexica with lemon

- lemon source: http://monnetproject.deri.ie/lemonsource/
- Java API: http://www.lemon-model.net/api.html

Automatic grammar generation

Generating grammars from lexica

For every lexicon entry, based on the part of speech and syntactic frame
- intransitive verb (with PP)
- transitive verb (with PP)
- noun (with PP)
- adjective
extract relevant properties (word forms, required arguments, semantic restrictions, etc.) and build corresponding grammar entries.
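The procedure described above (select a template by part of speech and frame, then fill it with forms and argument mappings) might be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the generator used in the course: lexicon entries are plain dictionaries, trees are written as bracketed strings, only the active transitive template is covered, and every name (`generate_transitive`, `GENERATORS`, `generate_grammar`) is hypothetical.

```python
def generate_transitive(entry):
    """Active-voice grammar entry for a transitive verb."""
    # Substitution node for each syntactic role.
    slot = {"subject": "DP1", "directObject": "DP2"}
    tree = "[S DP1↓ [VP [V {}] DP2↓]]".format(" | ".join(entry["forms"]))
    # y stands for the subject of the property, x for its object.
    semantics = "{}(y, x)".format(entry["reference"])
    selection = [("x", slot[entry["objOfProp"]]),
                 ("y", slot[entry["subjOfProp"]])]
    return {"tree": tree, "semantics": semantics, "selection": selection}


# One template per (part of speech, frame) pair; only one is shown here.
GENERATORS = {("verb", "TransitiveFrame"): generate_transitive}


def generate_grammar(lexicon):
    """Build grammar entries for every lexicon entry that has a template."""
    grammar = []
    for entry in lexicon:
        generator = GENERATORS.get((entry.get("pos"), entry.get("frame")))
        if generator is None:
            continue  # no template for this part of speech / frame
        grammar.append(generator(entry))
    return grammar


win = {
    "pos": "verb",
    "frame": "TransitiveFrame",
    "forms": ["win", "wins", "won"],
    "reference": "soccer:winner",
    "subjOfProp": "directObject",   # the match
    "objOfProp": "subject",         # the team
}

grammar = generate_grammar([win])
```

Keeping the templates in a lookup table keyed by part of speech and frame means that supporting another frame, or another target grammar format, only requires registering a further function.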
Verbs

Example: to win

[Figure: the lexical entry win : Word (partOfSpeech = verb; other form "wins"@en with tense = present, person = thirdPerson, number = singular; synBehavior TransitiveFrame with subject and directObject Arguments; sense with reference <http://sc.cit-ec.uni-bielefeld.de/ontologies/soccer#winner>) is mapped step by step to the following grammar entry.]

    Tree:       [S DP1↓ [VP [V wins] DP2↓]]
    Semantics:  soccer:winner(y, x)
    Arguments:  (x, DP1), (y, DP2)

Grammar entries for transitive verbs

Present (3rd person singular, others) and past:

    Tree:       [S DP1↓ [VP [V win | wins | won] DP2↓]]
    Semantics:  soccer:winner(y, x)
    Arguments:  (x, DP1), (y, DP2)

Passive (using past participle):

    Tree:       [S DP2↓ [VP [V is | are | was | were] [VP [V won] [PP [P by] DP1↓]]]]
    Semantics:  soccer:winner(y, x)
    Arguments:  (x, DP1), (y, DP2)

Gerundive:

    Tree:       [NP NP∗ [ADJ [ADJ winning] DP2↓]]
    Referent:   x
    Semantics:  soccer:winner(y, x)
    Arguments:  (y, DP2)

Past participle:

    Tree:       [NP NP∗ [ADJ [ADJ won] [PP [P by] DP1↓]]]
    Referent:   y
    Semantics:  soccer:winner(y, x)
    Arguments:  (x, DP1)

Relative clauses (active):

    Tree:       [NP NP∗ [S [COMP that] [VP [V win | wins | won] DP2↓]]]
    Referent:   x
    Semantics:  soccer:winner(y, x)
    Arguments:  (y, DP2)

Relative clauses (passive):

    Tree:       [NP NP∗ [S [COMP that] [VP [V is | are | was | were] [VP [V won] [PP [P by] DP1↓]]]]]
    Referent:   y
    Semantics:  soccer:winner(y, x)
    Arguments:  (x, DP1)

Intransitive verbs

Example:

    Tree:       [S DP1↓ [VP [V lost]]]
    Referents:  y, t
    Semantics:  soccer:loser(y, x), time:hasEnd(y, t), t < now
    Arguments:  (x, DP1)

Intransitive and transitive verbs with PP

Examples:
- win against (so.)
- score from (smth.)
- win (smth.) against (so.)
- substitute (so.) for (so.)

Are prepositional arguments subcategorized or do they adjoin?

Intransitive PP (subcategorized)

    Tree:       [S DP1↓ [VP [V win] [PP [P against] DP3↓]]]
    Referent:   z
    Semantics:  soccer:winner(z, x), soccer:loser(z, y)
    Arguments:  (x, DP1), (y, DP3)

Intransitive PP (adjoining)

    Tree:       [S DP1↓ [VP [V win]]]
    Referent:   z
    Semantics:  soccer:winner(z, x)
    Arguments:  (x, DP1)

    Tree:       [VP VP∗ [PP [P in] DP3↓]]
    Referents:  z, s
    Semantics:  soccer:stadium(z, s), soccer:city(s, y)
    Arguments:  (y, DP3)

Proper names

    Tree:       [DP name]
    Referent:   x
    Semantics:  x = entity

In our case, entities are represented by URIs.
For example:

    Tree:       [DP Uruguay]
    Referent:   x
    Semantics:  x = soccer:Uruguay

Nouns

- semantically simple nouns (usually expressing atomic classes): goal, match, team
- semantically complex nouns (not expressing atomic classes): opener, player, coach
- relational nouns (usually expressing a property): winner, wife, distance, capital

Semantically simple nouns

    :goal a lemon:LexicalEntry ;
        lexinfo:partOfSpeech lexinfo:noun ;
        lemon:canonicalForm [ lemon:writtenRep "goal"@en ;
                              lexinfo:number lexinfo:singular ] ;
        lemon:otherForm [ lemon:writtenRep "goals"@en ;
                          lexinfo:number lexinfo:plural ] ;
        lemon:sense [ lemon:reference soccer:Goal ] .

    Tree:       [NP goal(s)]
    Referent:   x
    Semantics:  soccer:Goal(x)

    Tree:       [DP goals]
    Referent:   x
    Semantics:  soccer:Goal(x)

Semantically complex nouns

    :player a lemon:LexicalEntry ;
        lexinfo:partOfSpeech lexinfo:noun ;
        ...
        lemon:sense [ lemon:subsense [ lemon:reference soccer:Person ] ,
                                     [ lemon:reference soccer:role ;
                                       lemon:propertyRange soccer:PlayerRole ] ] .

    Tree:       [NP player(s)]
    Referents:  x, r
    Semantics:  soccer:Person(x), soccer:role(x, r), soccer:PlayerRole(r)

    Tree:       [DP players]
    Referents:  x, r
    Semantics:  soccer:Person(x), soccer:role(x, r), soccer:PlayerRole(r)

Relational nouns

    :winner a lemon:LexicalEntry ;
        lexinfo:partOfSpeech lexinfo:noun ;
        ...
        lemon:sense [ lemon:reference soccer:winner ] .
    Tree:       [NP winner(s)]
    Referents:  x, y
    Semantics:  soccer:winner(y, x)

    Tree:       [DP winners]
    Referents:  x, y
    Semantics:  soccer:winner(y, x)

Alternative: Define additional classes

- Winner ≡ ∃ winner⁻¹.⊤
- Player ≡ Person ⊓ ∃ role.PlayerRole

    <Declaration><Class IRI="#Player"/></Declaration>
    <SubClassOf>
        <Class IRI="#Player"/>
        <Class IRI="#Person"/>
    </SubClassOf>
    <SubClassOf>
        <Class IRI="#Player"/>
        <ObjectSomeValuesFrom>
            <ObjectProperty IRI="#role"/>
            <Class IRI="#PlayerRole"/>
        </ObjectSomeValuesFrom>
    </SubClassOf>

Relational nouns (2)

    :capacity a lemon:LexicalEntry ;
        lexinfo:partOfSpeech lexinfo:noun ;
        lemon:canonicalForm [ lemon:writtenRep "capacity"@en ;
                              lexinfo:number lexinfo:singular ] ;
        lemon:synBehavior [ a lexinfo:NounPossessiveFrame ;
                            lexinfo:copulativeArg :y ;
                            lexinfo:possessiveAdjunct :x ] ;
        lemon:sense [ lemon:reference soccer:capacity ;
                      lemon:subjOfProp :x ;
                      lemon:objOfProp :y ] .

    Tree:       [NP [N capacity] [PP [P of] DP2↓]]
    Tree:       [NP DP2↓ [POSS 's] [N capacity]]
    Referent:   y
    Semantics:  soccer:capacity(x, y)
    Arguments:  (x, DP2)

Prepositions

    :by a lemon:LexicalEntry ;
        lexinfo:partOfSpeech lexinfo:preposition ;
        lemon:canonicalForm [ lemon:writtenRep "by"@en ] ;
        lemon:synBehavior [ lexinfo:complement :y ] ;
        lemon:sense [ lemon:reference soccer:byPlayer ;
                      lemon:subjOfProp :x ;
                      lemon:objOfProp :y ] .
    Tree:       [NP NP∗ [PP [P by] DP↓]]
    Referent:   x
    Semantics:  soccer:byPlayer(x, y)
    Arguments:  (y, DP)

Adjectives

Positive (early):

    Tree:       [NP [ADJ early] NP∗]
    Referents:  x, m
    Semantics:  soccer:atMinute(x, m), m < 10

Comparative (earlier than):

    Tree:       [AP [ADJ earlier] [PP [P than] DP↓]]
    Referents:  x, m, n
    Semantics:  soccer:atMinute(x, m), soccer:atMinute(y, n), m < n
    Arguments:  (y, DP)

Superlative (earliest):

    Tree:       [NP [ADJ earliest] NP∗]
    Referents:  x, m
    Semantics:  soccer:atMinute(x, m), min(m)

Tomorrow

[Figure: architecture overview. Labels: closed class, …, GRAMMAR, domain, temporal, Grammar Generation, LEXICON, ONTOLOGY, Ontology-based interpretation, Reasoner, QA System, Application domain.]
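The three adjective meanings (m < 10 for "early", m < n for "earlier than", min(m) for "earliest") can be read as filters over soccer:atMinute values. The Python sketch below evaluates them over a small, invented set of goals; the goal identifiers and minutes are hypothetical data, and the function names are assumptions made for this illustration.

```python
# Hypothetical soccer:atMinute facts: goal identifier -> minute.
at_minute = {"goal1": 4, "goal2": 37, "goal3": 88}


def early(facts, threshold=10):
    """x such that atMinute(x, m) and m < 10 (threshold taken from the slide)."""
    return {x for x, m in facts.items() if m < threshold}


def earlier_than(facts, y):
    """x such that atMinute(x, m), atMinute(y, n) and m < n."""
    n = facts[y]
    return {x for x, m in facts.items() if m < n}


def earliest(facts):
    """x such that atMinute(x, m) and m is minimal."""
    smallest = min(facts.values())
    return {x for x, m in facts.items() if m == smallest}
```

With the data above, `early(at_minute)` and `earliest(at_minute)` both select goal1, while `earlier_than(at_minute, "goal3")` selects goal1 and goal2.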