Computational Semantics
http://www.coli.uni-sb.de/cl/projects/milca/esslli
Day II: A Modular Architecture
Aljoscha Burchardt, Alexander Koller, Stephan Walter
Universität des Saarlandes, Saarbrücken, Germany
ESSLLI 2004, Nancy, France

Computing Semantic Representations
• Yesterday:
  – λ-calculus is a nice tool for systematic meaning construction.
  – We saw a first, sketchy implementation.
  – Some things were still left to be done.
• Today:
  – Let's fix the problems.
  – Let's build nice software.

Yesterday: λ-Calculus
• Semantic representations are constructed along the syntax tree. How to get there? By using functional application.
• λs help to guide arguments into the right place on β-reduction:
  λx.love(x,mary)@john  ⇒  love(john,mary)

Yesterday's disappointment
Our first idea for NPs with determiners didn't work out:
  "A man" ~> ∃z.man(z)
  "A man loves Mary" ~> *love(∃z.man(z),mary)
But what was the idea after all? Nothing! ∃z.man(z) just isn't the meaning of "a man". If anything, it translates the complete sentence "There is a man".
Let's try again, systematically…

A solution
What we want is:
  "A man loves Mary" ~> ∃z(man(z) ∧ love(z,mary))
What we have is:
  "man" ~> λy.man(y)
  "loves Mary" ~> λx.love(x,mary)
How about:
  ∃z(man(z) ∧ love(z,mary))
  = ∃z(λy.man(y)(z) ∧ λx.love(x,mary)(z))
Remember: we can use λ-bound variables for any kind of term. So next:
  λP.λQ.∃z(P(z) ∧ Q(z)) @ λy.man(y) @ λx.love(x,mary)
and therefore:
  "a" ~> λP.λQ.∃z(P(z) ∧ Q(z))

But…
"A man … loves Mary":
  λP.λQ.∃z(P(z) ∧ Q(z)) @ λy.man(y) @ λx.love(x,mary)
  ⇒ λQ.∃z(man(z) ∧ Q(z)) @ λx.love(x,mary)
  ⇒ ∃z(man(z) ∧ λx.love(x,mary)(z))
  ⇒ ∃z(man(z) ∧ love(z,mary))
Here the subject NP is applied to the VP.
"John … loves Mary":
  λx.love(x,mary) @ john   – not systematic! (here the VP is applied to the NP)
  john @ λx.love(x,mary)   – not reducible!
  λP.P(john) @ λx.love(x,mary)   – better!
  ⇒ λx.love(x,mary)@john
  ⇒ love(john,mary)
So: fine!
  "John" ~> λP.P(john)

Transitive Verbs
What about transitive verbs (like "love")?
  "loves" ~> λy.λx.love(x,y) won't do:
  "Mary" ~> λQ.Q(mary)
  "loves Mary" ~> λy.λx.love(x,y) @ λQ.Q(mary) ⇒ λx.love(x,λQ.Q(mary)) ???
How about something a little more complicated:
  "loves" ~> λR.λx.(R@λy.love(x,y))
The only way to understand this is to see it in action...

"John loves Mary" again...
  John ~> λP.P(john)    loves ~> λR.λx.(R@λy.love(x,y))    Mary ~> λQ.Q(mary)
"loves Mary":
  λR.λx.(R@λy.love(x,y)) @ λQ.Q(mary)
  ⇒ λx.(λQ.Q(mary)@λy.love(x,y))
  ⇒ λx.(λy.love(x,y)(mary))
  ⇒ λx.love(x,mary)
"John loves Mary":
  λP.P(john) @ λx.love(x,mary)
  ⇒ λx.love(x,mary)(john)
  ⇒ love(john,mary)

Summing up
• nouns:              "man"   ~> λx.man(x)
• intransitive verbs: "smoke" ~> λx.smoke(x)
• determiner:         "a"     ~> λP.λQ.∃z(P(z) ∧ Q(z))
• proper names:       "mary"  ~> λP.P(mary)
• transitive verbs:   "love"  ~> λR.λx.(R@λy.love(x,y))

Today's first success
What we can do now (and could not do yesterday):
• Complex NPs (with determiners)
• Transitive verbs
… and all in the same way. Key ideas:
• Extra λs for NPs
• Variables for predicates
• Apply the subject NP to the VP

Yesterday's implementation
  s(VP@NP) --> np(NP),vp(VP).
  np(john) --> [john].
  np(mary) --> [mary].
  tv(lambda(X,lambda(Y,love(Y,X)))) --> [loves], {vars2atoms(X),vars2atoms(Y)}.
  iv(lambda(X,smoke(X))) --> [smokes], {vars2atoms(X)}.
  iv(lambda(X,snore(X))) --> [snorts], {vars2atoms(X)}.
  vp(TV@NP) --> tv(TV),np(NP).
  vp(IV) --> iv(IV).
  % This doesn't work!
  np(exists(X,man(X))) --> [a,man], {vars2atoms(X)}.
Was this a good implementation?

A Nice Implementation
What is a nice implementation? It should be:
– Scalable: if it works with five examples, upgrading to 5000 shouldn't be a great problem (e.g. new constructions in the grammar, more words...).
– Re-usable: small changes in our ideas about the system shouldn't lead to complex changes in the implementation (e.g. a new representation language).
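In fact, the λ-terms just derived already suggest how yesterday's broken np rule could be repaired within the old, monolithic style. The following is only an illustrative sketch, not the course code: the rule names det and noun, and the use of exists and & on the Prolog side, are assumptions made here for the example.

  % Sketch only: the determiner contributes its own lambda term, the noun
  % supplies just the restriction, and the sentence rule now applies the
  % subject NP to the VP (instead of the other way around).
  s(NP@VP)  --> np(NP), vp(VP).
  np(DET@N) --> det(DET), noun(N).
  np(lambda(P,P@john)) --> [john].
  det(lambda(P,lambda(Q,exists(Z,(P@Z)&(Q@Z))))) --> [a], {vars2atoms(Z)}.
  noun(lambda(X,man(X))) --> [man], {vars2atoms(X)}.

β-converting the term built for "a man smokes" would then give exists(z,man(z)&smoke(z)), mirroring the derivation above. But hard-coding every lexical entry as a separate DCG rule clearly does not scale, which is where the next point comes in.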
Solution: Modularity
• Think about your problem in terms of interacting conceptual components.
• Encapsulate these components into modules of your implementation, with clean, abstract, pre-defined interfaces to each other.
• Extend or change modules to scale or adapt the implementation.

Another look at yesterday's implementation
• Okay, because it was small.
• Not modular at all: all linguistic functionality in one file, packed inside the DCG.
• E.g. scalability of the lexicon: we always have to write new rules, like:
  tv(lambda(X,lambda(Y,visit(Y,X)))) --> [visit], {vars2atoms(X),vars2atoms(Y)}.
• Changing parts for adaptation? Change every single rule!
Let's modularize!

Semantic Construction: Conceptual Components
  "John smokes"  →  [Black Box]  →  smoke(j)

Semantic Construction: Inside the Black Box
                           Syntax    Semantics
  Phrases (combinatorial)  DCG       combine-rules
  Words (lexical)          DCG       lexicon-facts

The DCG-rules
The DCG-rules tell us (mainly) what phrases are acceptable. Their basic structure is:
  s(...)  --> np(...), vp(...), {...}.
  np(...) --> det(...), noun(...), {...}.
  np(...) --> pn(...), {...}.
  vp(...) --> tv(...), np(...), {...}.
  vp(...) --> iv(...), {...}.
(The gaps will be filled later on.)

combine-rules
The combine-rules encode the actual semantic construction process. That is, they glue representations together using @:
  combine(s:(NP@VP),[np:NP,vp:VP]).
  combine(np:(DET@N),[det:DET,n:N]).
  combine(np:PN,[pn:PN]).
  combine(vp:IV,[iv:IV]).
  combine(vp:(TV@NP),[tv:TV,np:NP]).

Lexicon
The lexicon-facts hold the elementary information connected to words:
  lexicon(noun,bird,[bird]).
  lexicon(pn,anna,[anna]).
  lexicon(iv,purr,[purrs]).
  lexicon(tv,eat,[eats]).
Their slots contain:
1. the syntactic category,
2. the constant / relation symbol ("core" semantics),
3. the surface form of the word.

Interfaces
The same picture, now with the interfaces drawn in: combine-calls connect the DCG to the combine-rules, while lexicon-calls and semantic macros connect the DCG to the lexicon-facts.

Interfaces in the DCG
Information is transported between the three components of our system by additional calls and variables in the DCG:
• Lexical rules are now fully abstract. We have one for each category (iv, tv, n, ...). The DCG uses lexicon-calls and semantic macros like this:
  iv(IV) --> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
  pn(PN) --> {lexicon(pn,Sym,Word),pnSem(Sym,PN)}, Word.
• In the combinatorial rules, it uses combine-calls like this:
  vp(VP) --> iv(IV), {combine(vp:VP,[iv:IV])}.
  s(S)   --> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.

Interfaces: How they work
  iv(IV) --> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
When this rule applies, the syntactic analysis component:
• looks up the Word found in the string (e.g. "smokes"), ...
• ... checks that its category is iv, ...
• ... and retrieves the relation symbol Sym to be used in the semantic construction.
So, with lexicon(iv,smoke,[smokes]), we have:
  Word = [smokes]
  Sym = smoke

Interfaces: How they work II
  iv(IV) --> {lexicon(iv,Sym,Word),ivSem(Sym,IV)}, Word.
Then, the semantic construction component:
• takes Sym (Sym = smoke) ...
• ... and uses the semantic macro ivSem, i.e. the call ivSem(smoke,IV), ...
• ... to turn it into a full semantic representation for an intransitive verb:
  IV = lambda(X,smoke(X))
The DCG-rule is now fully instantiated and looks like this:
  iv(lambda(X,smoke(X))) -->
    {lexicon(iv,smoke,[smokes]), ivSem(smoke,lambda(X,smoke(X)))}, [smokes].
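To see both steps at once, the lexicon-call and the macro-call can be posed together as a goal at the Prolog prompt. This is only an illustrative query; the answer shown is what follows from the fact and macro above (the fresh variable name is normalized for readability):

  ?- lexicon(iv, Sym, [smokes]), ivSem(Sym, IV).
  Sym = smoke,
  IV = lambda(X, smoke(X)).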
What's inside a semantic macro?
Semantic macros simply specify how to make a valid semantic representation out of a naked symbol. The one we've just seen in action for the verb "smokes" was:
  ivSem(Sym,lambda(X,Formula)) :- compose(Formula,Sym,[X]).
compose builds a first-order formula out of Sym and a new variable X:
  Formula = smoke(X)
This is then embedded into a λ-abstraction over the same X:
  lambda(X,smoke(X))
Another one, without compose:
  pnSem(Sym,lambda(P,P@Sym)).
  e.g. john  ⇒  lambda(P,P@john)

Here is how the pieces interact for "John smokes":
Phrases (combinatorial):
  s(S) --> np(NP), vp(VP), {combine(s:S,[np:NP,vp:VP])}.
  np(NP) --> ..., pn(PN), ...        vp(VP) --> ..., iv(IV), ...
Words (lexical):
  pn(PN) --> ..., [john]:    lexicon(pn,john,[john]).   pnSem(Sym,PN), Sym = john
                             PN = NP = lambda(P,P@john)
  iv(IV) --> ..., [smokes]:  lexicon(iv,smoke,[smokes]). ivSem(Sym,IV), Sym = smoke
                             IV = VP = lambda(X,smoke(X))

A look at combine
  combine(s:NP@VP,[np:NP,vp:VP]).
  S  = NP@VP
  NP = lambda(P,P@john)
  VP = lambda(X,smoke(X))
So:
  S = lambda(P,P@john)@lambda(X,smoke(X))

That's almost all, folks…
  betaConvert(lambda(P,P@john)@lambda(X,smoke(X)), Converted)
  Converted = smoke(john)

Little Cheats
A few "special words" are dealt with in a somewhat different manner:
Determiners ("every man"):
• No semantic Sym in the lexicon: lexicon(det,_,[every],uni).
• The semantic representation is generated by the macro alone:
  detSem(uni,lambda(P,lambda(Q,forall(X,(P@X)>(Q@X))))).
Negation ("does not walk") – same thing:
• No semantic Sym in the lexicon: lexicon(mod,_,[does,not],neg).
• Representation solely from the macro:
  modSem(neg,lambda(P,lambda(X,~(P@X)))).

The code that's online (http://www.coli.uni-sb.de/cl/projects/milca/esslli)
• lexicon-facts have a fourth argument for any kind of additional information (e.g. fin/inf, gender):
  lexicon(tv,eat,[eats],fin).
• iv/tv have an additional argument for infinitive/finite forms (e.g. "eat" vs. "eats"):
  iv(I,IV) --> {lexicon(iv,Sym,Word,I),…}, Word.
• limited coordination, hence doubled categories (e.g. "talks and walks"):
  vp2(VP2) --> vp1(VP1A), coord(C), vp1(VP1B),
               {combine(vp2:VP2,[vp1:VP1A,coord:C,vp1:VP1B])}.
  vp1(VP1) --> v2(fin,V2), {combine(vp1:VP1,[v2:V2])}.

A demo
  lambda :-
     readLine(Sentence),
     parse(Sentence,Formula),
     resetVars,
     vars2atoms(Formula),
     betaConvert(Formula,Converted),
     printRepresentations([Converted]).

Evaluation
Our new program has become much bigger, but it's…
• Modular: everything is in its right place:
  – Syntax in englishGrammar.pl
  – Semantics (macros + combine) in lambda.pl
  – Lexicon in lexicon.pl
• Scalable: e.g. extend the lexicon by adding facts to lexicon.pl.
• Re-usable: e.g. to change the semantic construction method (say, to CLLS on Thursday), change only lambda.pl and keep the rest.

What we've done today
• Complex NPs, PNs and TVs in λ-based semantic construction
• A clean semantic construction framework in Prolog
• Its instantiation for λ-based semantic construction

Ambiguity
• Some sentences have more than one reading, i.e. more than one semantic representation.
• Standard example: "Every man loves a woman":
  – Reading 1: the women may be different:
    ∀x(man(x) → ∃y(woman(y) ∧ love(x,y)))
  – Reading 2: there is one particular woman:
    ∃y(woman(y) ∧ ∀x(man(x) → love(x,y)))
• What does our system do?

Excursion: lambda, variables and atoms
• Question yesterday: Why don't we use Prolog variables for first-order variables?
• Advantage (at first sight): β-reduction as unification:
  betaReduce(lambda(X,F)@X, F).
Now, for "John walks":
  betaReduce(lambda(X,walk(X))@john, F)
  unification instantiates X = john and F = walk(X), so the clause head becomes
  betaReduce(lambda(john,walk(john))@john, walk(john))
  and we get F = walk(john).
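As a self-contained check, the unification idea can be tried directly at the Prolog prompt. The following is a minimal sketch; the operator declaration for @ is an assumption made here so that the snippet runs on its own (the course code declares @ in its own files):

  :- op(950, yfx, @).            % assumed: make @ a left-associative operator

  % Beta-reduction by unification: applying lambda(X,F) to an argument
  % simply unifies X with that argument, leaving F as the result.
  betaReduce(lambda(X,F)@X, F).

  % ?- betaReduce(lambda(X, walk(X))@john, Result).
  % X = john,
  % Result = walk(john).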
Nice, but…

Problem: Coordination
"John and Mary":
  (λX.λY.λP.((X@P) ∧ (Y@P)) @ λQ.Q(john)) @ λR.R(mary)
  ⇒ λP.((λQ.Q(john)@P) ∧ (λR.R(mary)@P))
  ⇒ λP.(P(john) ∧ P(mary))
"John and Mary walk":
  λP.(P(john) ∧ P(mary)) @ λx.walk(x)
  ⇒ λx.walk(x)@john ∧ λx.walk(x)@mary
In Prolog terms:
  lambda(X,walk(X))@john & lambda(X,walk(X))@mary
β-reduction as unification now requires
  X = john   and   X = mary
at the same time – but the same Prolog variable cannot be bound to both.
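A hedged sketch of this failure, reusing the betaReduce clause and the assumed @ operator declaration from the sketch above. The point is that both occurrences of P were instantiated by unification to one and the same term, so the two conjuncts share a single Prolog variable X:

  % Both conjuncts contain the very same lambda(X, walk(X)), hence the same X.
  ?- F = lambda(X, walk(X)),
     betaReduce(F@john, R1),   % succeeds: X = john, R1 = walk(john)
     betaReduce(F@mary, R2).   % F is now lambda(john,walk(john)): needs john = mary
  false.

This is exactly the X = john / X = mary clash noted above.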