* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download John ate the cake
Udmurt grammar wikipedia , lookup
Navajo grammar wikipedia , lookup
Lithuanian grammar wikipedia , lookup
Chinese grammar wikipedia , lookup
Old Norse morphology wikipedia , lookup
Old English grammar wikipedia , lookup
Old Irish grammar wikipedia , lookup
Compound (linguistics) wikipedia , lookup
Kannada grammar wikipedia , lookup
Portuguese grammar wikipedia , lookup
Modern Hebrew grammar wikipedia , lookup
Malay grammar wikipedia , lookup
Latin syntax wikipedia , lookup
Modern Greek grammar wikipedia , lookup
Spanish grammar wikipedia , lookup
PRO (linguistics) wikipedia , lookup
Icelandic grammar wikipedia , lookup
Latvian declension wikipedia , lookup
Zulu grammar wikipedia , lookup
Arabic nouns and adjectives wikipedia , lookup
Swedish grammar wikipedia , lookup
Proper noun wikipedia , lookup
Serbo-Croatian grammar wikipedia , lookup
Romanian grammar wikipedia , lookup
Esperanto grammar wikipedia , lookup
Vietnamese grammar wikipedia , lookup
French grammar wikipedia , lookup
Arabic grammar wikipedia , lookup
Turkish grammar wikipedia , lookup
Ancient Greek grammar wikipedia , lookup
Polish grammar wikipedia , lookup
Pipil grammar wikipedia , lookup
Determiner phrase wikipedia , lookup
Yiddish grammar wikipedia , lookup
Romanian nouns wikipedia , lookup
NLP: Parsing and more Grammar Topics: DCG recognisor • two implementations Parsing • Parse tree • DCG recognisor with grammar features A Simple Grammar S NP VP VP V NP NP Proper_N NP det N Proper_N John Proper_N Mary N cake V loves V ate det the Sentences in this language: “John loves Mary” “John ate the cake” “John loves the cake” Definite Clause Grammars (DCGs) The above grammar can be simply implemented in DCG notation as follows: s --> np, vp. vp --> v, np. np --> proper_n. np --> det, n. proper_n --> [john]. proper_n --> [mary]. n --> [cake]. v --> [loves]. v --> [ate]. det --> [the]. Translating DCG Consider the rule s --> np, vp. Prolog translates this as: s(Ws1,Ws2) :- np(Ws1,Ws),vp(Ws,Ws2). This says that after taking an s off the start of Ws1, Ws2 remains The rule proper_n --> [john]. is translated as proper_n([john|Ws],Ws). Query • s([john, ate, the cake],[]). • Yes • s([ate, john, cake, the],[]). • No A Second Implementation A more efficient implementation: s --> np, vp. vp --> [V],{verb(V)},np. np --> [N],{proper_n(N)}. np --> [Det],{det(Det)}, [N],{noun(N)}. proper_n(john). proper_n(mary). noun(cake). verb(loves). verb(ate). det(the). Notes: 1. The {} allow Prolog code to be embedded in DCG rules 2. The cost of processing is now less dependent on lexicon size (given good indexing) 3. So far we have implemented only a DCG recogniser, not a parser. Parsing Consider the goal s([john,ate,the,cake],[]) It will have the following parse/proof tree s np vp proper_n v [john] [ate] np det n [the] [cake] How to represent a tree in Prolog? A DCG Parser We can include extra arguments on nonterminals in our grammar, these allow us to record the parse tree: s(s(NP,VP)) --> np(NP), vp(VP). vp(vp(verb(V), NP)) --> [V], {verb(V)}, np(NP). np(np(proper_n(N))) --> [N], {proper_n(N)}. np(np(det(Det), noun(N))) --> [Det], {det(Det)}, [N], {noun(N)}. A DCG Parser (ctd.) Arguments to DCG non-terminals are expanded by Prolog like this: s(s(NP,VP),Ws1,Ws2) :np(NP,Ws1,Ws),vp(VP,Ws,Ws2). Here, arguments are used to build up a parse tree: ?- s(Parse, [john,ate,the,cake], []). Parse = s(np(proper_n(john)), vp(verb(ate), np(det(the), noun(cake)))) Eliminating Left Recursion Example with left recursion: S S, PP S NP, VP Will get into infinite loop! S S NP PP VP Example with left recursion eliminated: S S S1, S-PP S-PP S1 S-PP S-PP PP, S-PP NP VP PP S-PP S1 NP, VP Here's how to get the same parse tree as in the 1st grammar, but using the 2nd grammar: s(S) --> s1(S1), s_pp(S1,S). s_pp(S,S) --> []. s_pp(S0,S) --> pp(PP), s_pp(s(S0,PP),S). s1(s(NP,VP)) --> np(NP), vp(VP). Grammatical vs Ungrammatical sentences ungrammatical (syntactically incorrect) sentences – – – – – The John eats the hamburger. John eat the hamburger. John died Mary. A hostages ate the hamburgers. Him ate him. syntactically correct sentences: – – – – – John eats the hamburger. The people eat the hamburgers. Hostages eat hamburgers. Hamburgers killed Jim. Cholesterol killed Jim. Some Grammatical Features. We can use the idea of grammatical features to solve this problem: • Number: singular, plural: hamburger, hamburgers • Person: 1st, 2nd, or 3rd: eat, eats • Proper noun: true or false: John, hamburger • Mass noun: true (bread) or false (bun): compare “a bread” & “a bun” • Transitivity: transitive or intransitive: give, die • Case: subjective (I, he, she, they), objective (me, him, her, them) Modifying a DCG to handle case subjective objective I me you you he him she her it it they them Examples He saw her. She saw him. You gave it to me. I gave them to you. They gave it to them. They gave them it. s --> np_sub, vp. vp --> v, np_obj. vp --> v. pp --> prep, np_obj. np_obj -> pro_obj. np_obj -> det, n. np_sub -> pro_sub. np_sub -> det, n. pro_sub --> [i]. pro_sub --> [you]. pro_sub --> [he]. pro_sub --> [she]. pro_obj --> [me]. pro_obj --> [you]. pro_obj --> [him]. pro_obj --> [her]. Augmenting a DCG to handle case There are other grammatical features we need to match, and if we keep making more specialized grammar rules, we will end up with exponential number of rules. Alternative is to augment grammar. s --> np(sub), vp. vp --> v, np(obj). vp --> v. pp --> prep, np(obj). np(Case) -> pro(Case). np(_) --> det, n. pro(Case) -> [Pro],{pro(Pro,Case)}. % Lexical entries: pro(i, sub). pro(you, sub). pro(he, sub). pro(she, sub). pro(me, obj). pro(you, obj). pro(him, obj). pro(her, obj). Subject-verb agreement In English, number and person of the subject must agree with the verb: “I am” / “she is” / “they are” Person Number Pronoun Verb examples… –––––––––––––––––––––––––––––––––––––––––– 1st sing I eat am have 2nd – you eat are have 3rd sing he/she/it eats is has 1st plur we eat are have 3rd plur they eat are have Implementing subject-verb agreement Can combine checking subject-verb agreement in person/number with case checking: % subject must agree with verb s --> np(Per, Num, sub), vp(Per, Num). % person and number of object doesn’t matter vp(Per, Num) --> v(Per, Num), np(_, _, obj). vp(Per, Num) --> v(Per, Num). % look up V, retrieve its person and number v(Per, Num) --> [V], {v(V, Per, Num)}. % person, number and case comes from pronoun np(Per, Num, Case) --> pro(Per, Num, Case). % look up person, number and case of pronoun pro(Per, Num, Case) --> [Pro], {pro(Pro, Per, Num, Case)}. % lexical entries pro(she, second, sing, obj). v(eats, third, sing). Agreement for nouns & determiners Articles (a type of determiner) have restrictions based on number and whether the noun is mass or count Article Number Def/Indef Example ––––––––––––––––––––––––––––––––––––– a/an singular indefinite a man saw … plural indefinite men saw… the singular definite the man saw … the plural definite the men saw… Mass nouns (bread, money, water) take no article when used in the indefinite, and go with a verb in the singular Determiners Determiners can be quite complicated: can involve numbers (& many other things) Definite Indefinite ––––––––––––––––––––––––––– the dog a dog the red dogs blue dogs the three dogs three dogs my uncle Bob’s friend’s dog every dog every pink large dog all but three of the red dogs Proper nouns take no determiners Noun/determiner agreement Mass nouns Count nouns –––––––––––––––––––––––––––––––––––– the bread tastes good. the loaf tastes good. *a bread taste good. (don’t use indef article with mass nouns) bread tastes good. a loaf tastes good. *bread taste good loaves taste good. (mass nouns go with sing. verb) *the three bread … the three loaves cost $4 *three bread … three loaves cost $4 (can’t use numbers with mass nouns) the money is heavy. money buys power. the coins are heavy. the coin is heavy. a coin buys … coins buy icecreams DCG rules for noun/determiner agreement % Proper nouns are noun-phrases np(Number, _Person, _Case) --> proper_n(Number, _). % Mass nouns need no determiners np(sing, _Person, _Case) --> n1(sing, mass). % Determiner-Noun (can’t have indef % determiner + mass noun) np(Number, _, _) --> det(Number, Definite), n1(Number, Mass), {\+ (Mass=mass, Definite=indef)}. n1(Number, Mass) --> adj, n1(Number, Mass). n1(Number, Mass) --> n(Number, Mass). DCG rules for noun/ determiner agreement (ctd) n(Number, Mass) --> [N], {noun(N, Number, Mass)}. proper_n(Number, Mass) --> [N],{proper_noun(N, Number, Mass)}. det(Number, Definite) --> [], {det([], Number, Definite)}. det(Number, Definite) --> [Det], {det(Det, Number, Definite)}. adj --> [Adj], {adj(Adj)}. proper_noun(john, sing, count). noun(loaf, sing, count). noun(loaves, plur, count). noun(bread, sing, mass). det(a, sing, indef). det([], plur, indef). det(the, _, def). adj(three). adj(large).