Context-Free Grammars (CFGs)
L545
Dept. of Linguistics, Indiana University
Spring 2013

Parsing: Assigning Structure to Sentences

Parsing: take in an input sentence & assign a structure
- Input: The man left the room.
- Output: (S (NP (DT The) (NN man)) (VP (VBD left) (NP (DT the) (NN room))))

Why this sort of representation?
- Why do we group words as we do?
- What are these categories & what do they mean?

Today: linguistic motivation for CFGs
- Later: formal properties

Syntax

Syntax = the study of the way that sentences are constructed from smaller units.

No "dictionary" for sentences → infinite number of possible sentences.
- The house is large.
- John believes that the house is large.
- Mary says that John believes that the house is large.

Some basic principles of sentence organization:
- Linear order
- Hierarchical structure (Constituency)
- Subcategorization & Grammatical relations

Linear order

Linear order = the order of words in a sentence.

A sentence has different meanings based on its linear order.
- John loves Mary.
- Mary loves John.

Linear order is a guiding principle for organizing words into meaningful sentences
- Languages vary as to what extent this is true

Constituency

We can't only use linear order to determine sentence organization
- I eat at really fancy restaurants.
- Many executives eat at really fancy restaurants.

What are the "meaningful units" of the sentence Many executives eat at really fancy restaurants?
- Many executives
- really fancy
- really fancy restaurants
- at really fancy restaurants
- eat at really fancy restaurants

We refer to these meaningful groupings as constituents

Constituency tests

There are many "tests" to determine what a constituent is (though they are prone to error)
- Preposed/Postposed constructions, i.e., can you move the grouping around?
  (1) a. On September seventeenth, I'd like to fly from Atlanta to Denver.
      b. I'd like to fly on September seventeenth from Atlanta to Denver.
      c. I'd like to fly from Atlanta to Denver on September seventeenth.
- Pro-form substitution
  (2) John has some very heavy books, but he didn't want them.
  (3) I want to go home, and John wants to do so, too.

Hierarchical structure

Note that constituents appear within other constituents. We can represent this in a bracket form or in a syntactic tree

Bracket form: [[Many executives] [eat [at [[really fancy] restaurants]]]]

Syntactic tree is on the next page ...

Syntactic tree (first pass)

[Tree diagram of the bracketed sentence above, with its nodes labeled a-f rather than with category names.]

Categories

Goal: be able to say that:
- Many executives and really fancy restaurants are the same type of grouping, or constituent
- at really fancy restaurants is something else

For this, we will talk about different categories
- Lexical
- Phrasal

Lexical categories

Lexical categories are simply word classes, or parts of speech. The main ones are:
- verbs: eat, drink, sleep, ...
- nouns: gas, food, lodging, ...
- adjectives: quick, happy, brown, ...
- adverbs: quickly, happily, well, westward
- prepositions: on, in, at, to, into, of, ...
- determiners/articles: a, an, the, this, these, some, much, ...
- conjunctions: and, but, or, since, while, ...

Determining lexical categories

How do we determine which category a word belongs to?
- Distribution: where these words can appear in a sentence
  - e.g., Nouns like mouse can appear after articles ("determiners") like the, while a verb like eat cannot.
- Morphology: what kinds of prefixes/suffixes a word can take
  - e.g., Verbs like walk can take an -ed ending to mark them as past tense. A noun like mouse cannot.

Closed & Open classes

Open classes: new words can be easily added (tend to carry meaning):
- verbs
- nouns
- adjectives
- adverbs

Closed classes: new words cannot be easily added (tend to be function words):
- prepositions
- determiners
- conjunctions

Phrasal categories

Examining the distribution of phrases, some behave in the same way
- The joggers ran through the park.

Other phrases which can be put in place of The joggers:
- Susan
- students
- you
- most dogs
- some children
- a huge, lovable bear
- my friends from Brazil
- the people that we interviewed

Since all of these contain nouns, we consider these to be noun phrases (NPs).

Syntactic tree

(S (NP many executives) (VP eat (PP at (NP (AP really fancy) restaurants))))

Phrases: Noun Phrases

Noun phrases, like other kinds of phrases, are headed: there is a designated item (the noun) which determines the properties of the whole phrase
- Before the noun, you can have determiners (and pre-determiners) and adjective phrases
- After the noun, you can have prepositional phrases, gerunds (and other verbal clauses), and relative clauses
- You can also have noun-noun compounds

⇒ General rule: The category of the head word percolates up to the phrase level

Phrases: Determiner Phrases?

It's not entirely clear that these phrases should be NPs; maybe they should be DPs
- There generally must be a noun in an NP, but often there must also be a determiner; in fact, determiners can sometimes appear alone.
  (4) {*Student/The student} laughed.
  (5) {These/These students} think a lot.
- The determiner actually scopes over the noun semantically
  (6) All/Some/No students are happy.
- For some theories, a DP is more uniform with other parts of the syntax

Phrases: Verb Phrases: Subcategorization

Verbs tend to drive the analysis of a sentence because they subcategorize for elements

We can say that verbs have subcategorization frames
- sleep: subject
- find: subject, object
- show: subject, object, second object
- want: subject, object, infinitive verb phrase
- think: subject, sentential complement

Phrases: Grammatical relations

Grammatical relations are the basic relations between words in a sentence

(7) She eats a mammoth breakfast.
- In this sentence, She is the subject, while a mammoth breakfast is the object
- In English, the subject must agree in person and number with the verb.
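The subcategorization frames listed above can be written down as a small lookup table. The sketch below is only an illustration: the frame labels are copied from the slide, while the names `FRAMES` and `satisfies_frame` are hypothetical helpers, and requiring an exact match is a simplifying assumption (real verbs often allow several frames).

```python
# Subcategorization frames as listed on the slide.
# FRAMES and satisfies_frame are hypothetical names used for illustration only.
FRAMES = {
    "sleep": ("subject",),
    "find": ("subject", "object"),
    "show": ("subject", "object", "second object"),
    "want": ("subject", "object", "infinitive verb phrase"),
    "think": ("subject", "sentential complement"),
}


def satisfies_frame(verb, arguments):
    """Return True if the argument labels match the verb's frame exactly.

    Simplifying assumption: each verb has exactly one frame, and verbs
    missing from FRAMES never match.
    """
    return tuple(arguments) == FRAMES.get(verb)


if __name__ == "__main__":
    print(satisfies_frame("find", ["subject", "object"]))   # frame is met
    print(satisfies_frame("sleep", ["subject", "object"]))  # extra object
```

A parser could consult such a table to reject a verb phrase whose verb has too many or too few complements.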
Phrase Structure Rules (PSRs)

Rules for building these phrases
- Phrase structure rules (PSRs) build larger constituents from smaller ones.
  e.g., S → NP VP
  - A sentence (S) constituent is composed of a noun phrase (NP) constituent and a verb phrase (VP) constituent. (hierarchy)
  - The NP must precede the VP. (linear order)
- Put PSRs together, and you have a context-free grammar (CFG)

Important Properties of Phrase Structure Rules

- recursive = a rule can be reapplied (within its hierarchical structure).
  - NP → NP PP
  - PP → P NP
  The property of recursion means that the set of potential sentences in a language is infinite.
- potentially (structurally) ambiguous = have more than one analysis
  (8) I [VP saw [NP [NP the man] [PP with the telescope]]]
  (9) I [VP saw [NP the man] [PP with the telescope]]

Formal definition of CFGs

1. N: a set of non-terminal (phrasal) symbols, e.g., NP, VP, etc.
2. Σ: a set of terminal (lexical) symbols
   N and Σ are disjoint
3. P: a set of productions (rules) of the form A → α, where A is a non-terminal and α is a sequence of terminals and non-terminals
4. S: a designated start symbol

Other constructions to capture

- Coordination
- Active & Passive Constructions
- Raising & Control Constructions
- Unbounded Dependency Constructions (UDCs)

Question: Are CFGs capable of covering language?
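To make the formal definition and the ambiguity in (8) and (9) concrete, here is a minimal sketch that stores a toy CFG as the four components (N, Σ, P, S) and enumerates every analysis of a sentence. The grammar, the category names (`Pro`, `Det`, `Noun`, `V`, `Prep`), and the exhaustive span-splitting search are all assumptions chosen for illustration; this is not an efficient parser and not a method taken from the slides.

```python
from functools import lru_cache
from itertools import product

# A toy CFG as the 4-tuple (N, Sigma, P, S). Everything here is illustrative.
NONTERMINALS = {"S", "NP", "VP", "PP", "Pro", "Det", "Noun", "V", "Prep"}
TERMINALS = {"I", "saw", "the", "man", "with", "telescope"}
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Pro",), ("Det", "Noun"), ("NP", "PP")],  # NP -> NP PP is recursive
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("Prep", "NP")],
    "Pro": [("I",)],
    "Det": [("the",)],
    "Noun": [("man",), ("telescope",)],
    "V": [("saw",)],
    "Prep": [("with",)],
}
START = "S"


def splits(i, j, parts):
    """Yield every way to cut span [i, j) into `parts` non-empty pieces."""
    if parts == 1:
        yield [(i, j)]
        return
    for k in range(i + 1, j - parts + 2):
        for rest in splits(k, j, parts - 1):
            yield [(i, k)] + rest


def parses(words):
    """Return all bracketed analyses of `words` rooted in START."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # A terminal covers exactly one matching word.
        if symbol in TERMINALS:
            return (symbol,) if j - i == 1 and words[i] == symbol else ()
        trees = []
        for rhs in RULES[symbol]:
            for cut in splits(i, j, len(rhs)):
                options = [derive(c, a, b) for c, (a, b) in zip(rhs, cut)]
                for combo in product(*options):
                    trees.append("[" + symbol + " " + " ".join(combo) + "]")
        return tuple(trees)

    return derive(START, 0, len(words))


if __name__ == "__main__":
    for tree in parses("I saw the man with the telescope".split()):
        print(tree)
```

Under this toy grammar the sentence receives two bracketings: one attaches the PP inside the object NP, as in (8), and the other attaches it directly to the VP, as in (9). The recursive rule NP → NP PP terminates here only because every piece of a split must be non-empty, so each recursive call covers a strictly shorter span.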
Coordination

One type of phrase we have not mentioned yet is the coordinate phrase, for example John and Mary
- Coordination can generally apply to any kinds of (identical) phrases
- This makes it ambiguous and causes problems for parsing

(10) I saw John and Mary left early.

⇒ At some point, a parser has to decide whether "and" is joining NPs or joining Ss.

Difficulties with coordination

Coordination turns out to have particularly difficult properties for linguistic analysis
- The conjunction of two elements does not obey the same properties as each element.
  (11) a. *Me went to the store.
       b. Me and John went to the store.
- Coordination can be with "unlike" constituents
  (12) Robin is [NP a Republican] and [ADJP proud of it]
- Coordination can be with non-constituents
  (13) John gave me the bread and Mary the sugar.

Active & Passive Constructions

It is well-established that sentences occur in both active and passive forms:

(14) a. Sandy saw Kim.
     b. Kim was seen by Sandy.

- CFGs can clearly handle such sentences, along the lines of:
  - VP → Vfin NP
  - VP → Vbe VPpass
  - VPpass → Vpass (PPby)

Relating active and passive constructions

Even if a CFG can license such constructions, questions remain:
- How many rules will it take to capture every relevant grammatical distinction?
- How are the active and passive forms related?
  - Through movement? Through lexical rules? They're not related?

Raising & Control Constructions

Some verbs look similar in some syntactic contexts, but behave quite differently in others

(15) a. John seems to be happy.
     b. It seems to be raining.
     c. John tries to be happy.
     d. *It tries to be raining.

Generalization:
- In both cases, it seems like we have the pattern NP V VPinf
- Raising verbs (e.g., seem): the subject of the higher clause is the "same" as the subject of the lower clause
- Control (or equi) verbs (e.g., try): the subject of the higher clause "controls" the subject of the lower clause, but has certain restrictions on it.

Capturing the raising/control generalizations

How do we distinguish raising and control verbs in CFGs?

Solutions seem to require one or more of the following:
- An empty subject in the lower clause
- Sharing of subjects (or subject properties) between upper and lower verbs, perhaps involving new features
- A closer connection to sentence semantics

Unbounded Dependency Constructions (UDCs)

An unbounded dependency construction has an element realized non-locally and:
- involves constituents with different functions
- involves constituents of different categories
- is in principle unbounded

Example: Wh-elements

Wh-elements can have different functions (__ marks the gap):

(16) a. Who did Hobbs see __?                      Object of verb
     b. Who do you think __ saw the man?           Subject of verb
     c. Who did Hobbs give the book to __?         Object of prep
     d. Who did Hobbs consider __ to be a fool?    Object of obj-control verb

Wh-elements can also occur in subordinate clauses:

(17) a. I asked who the man saw __.
     b. I asked who the man considered __ to be a fool.
     c. I asked who Hobbs gave the book to __.
     d. I asked who you thought __ saw Hobbs.

Wh-elements (cont.)

Different categories can be extracted:

(18) a. Which man did you talk to __?              NP
     b. [To [which man]] did you talk __?          PP
     c. [How ill] has the man been __?             AdjP
     d. [How frequently] did you see the man __?   AdvP

This sometimes provides multiple options for a constituent:

(19) a. Who does he rely [on __]?
     b. [On whom] does he rely __?

Unboundedness:

(20) a. Who do you think Hobbs saw __?
     b. Who do you think Hobbs said he saw __?
     c. Who do you think Hobbs said he imagined that he saw __?

Accounting for UDCs

How does one account for UDCs?
- Invoke a notion of movement during an analysis
- Include features which "pass" information about the non-local element
- Use some formalism more powerful than a CFG (e.g., Tree-Adjoining Grammar)
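One way to picture the feature-passing option is to give each phrasal category a variant that records "a gap of category X is somewhere inside me". The sketch below derives such gap-carrying rules mechanically from a few ordinary rules. The `A/NP` naming, the toy rule list, and the helper `slashed_rules` are assumptions made for illustration (the notation is borrowed from slash-category approaches); the slides do not commit to any particular mechanism.

```python
# Toy phrase structure rules; the rule set and category names are illustrative.
PHRASAL = {"S", "NP", "VP"}
RULES = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("VP", ("V", "S")),
]


def slashed_rules(rules, gap="NP"):
    """Derive rules that pass a gap of category `gap` down to one child.

    For each rule A -> ... B ..., add A/gap -> ... B/gap ... for every
    phrasal child B, plus gap/gap -> (empty), which realizes the gap itself.
    """
    out = []
    for lhs, rhs in rules:
        for idx, child in enumerate(rhs):
            if child in PHRASAL:
                passed = rhs[:idx] + (f"{child}/{gap}",) + rhs[idx + 1:]
                out.append((f"{lhs}/{gap}", passed))
    out.append((f"{gap}/{gap}", ()))  # the gap: an NP with nothing in it
    return out


if __name__ == "__main__":
    for lhs, rhs in slashed_rules(RULES):
        print(lhs, "->", " ".join(rhs) if rhs else "(empty)")
```

Because the derived rules include both VP/NP → V S/NP and S/NP → NP VP/NP, the gap information can be handed down through any number of embedded clauses, which mirrors the unboundedness pattern in (20) while staying within context-free rules.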