Context-Free Grammars (CFGs)

L545
Dept. of Linguistics, Indiana University
Spring 2013

Outline: Syntax, Linear order, Constituency, Categories, Phrases, CFGs, Other constructions

Parsing: Assigning Structure to Sentences

Parsing: take in an input sentence & assign a structure

- Input: The man left the room.
- Output: (S (NP (DT The) (NN man)) (VP (VBD left) (NP (DT the) (NN room))))

Why this sort of representation?

- Why do we group words as we do?
- What are these categories & what do they mean?

Today: linguistic motivation for CFGs

- Later: formal properties

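To make the Input/Output pair above concrete, here is a minimal sketch that reads the bracketed output back into a nested structure and recovers the input words from it. The names `read_tree` and `leaves` are hypothetical helpers written for this illustration, not part of any library or of the course materials.

```python
def read_tree(text):
    """Parse a bracketed string such as "(S (NP ...) (VP ...))" into
    nested (label, children) tuples. Words are kept as plain strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]          # category label, e.g. S, NP, DT
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())      # a sub-constituent
            else:
                children.append(tokens[pos])  # a word
                pos += 1
        pos += 1                     # consume the closing bracket
        return (label, children)

    return parse()


def leaves(tree):
    """Return the words of the tree in linear order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


OUTPUT = "(S (NP (DT The) (NN man)) (VP (VBD left) (NP (DT the) (NN room))))"
tree = read_tree(OUTPUT)
print(tree[0])                 # S
print(" ".join(leaves(tree)))  # The man left the room
```

The structure keeps both kinds of information the slides go on to discuss: the order of the leaves is the linear order, and the nesting is the constituency.
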
Syntax

Syntax = the study of the way that sentences are constructed from smaller units.

No “dictionary” for sentences → infinite number of possible sentences.

- The house is large.
- John believes that the house is large.
- Mary says that John believes that the house is large.

Some basic principles of sentence organization:

- Linear order
- Hierarchical structure (Constituency)
- Subcategorization & Grammatical relations

Linear order

Linear order = the order of words in a sentence.

A sentence can have different meanings depending on its linear order.

- John loves Mary.
- Mary loves John.

Linear order is a guiding principle for organizing words into meaningful sentences

- Languages vary as to what extent this is true

Constituency

We can’t only use linear order to determine sentence organization

- I eat at really fancy restaurants.
- Many executives eat at really fancy restaurants.

What are the “meaningful units” of the sentence Many executives eat at really fancy restaurants?

- Many executives
- really fancy
- really fancy restaurants
- at really fancy restaurants
- eat at really fancy restaurants

We refer to these meaningful groupings as constituents

Constituency tests

There are many “tests” to determine what a constituent is (though they are prone to error)

- Preposed/Postposed constructions, i.e., can you move the grouping around?
  (1) a. On September seventeenth, I’d like to fly from Atlanta to Denver.
      b. I’d like to fly on September seventeenth from Atlanta to Denver.
      c. I’d like to fly from Atlanta to Denver on September seventeenth.
- Pro-form substitution
  (2) John has some very heavy books, but he didn’t want them.
  (3) I want to go home, and John wants to do so, too.

Hierarchical structure

Note that constituents appear within other constituents. We can represent this in a bracket form or in a syntactic tree

Bracket form:
[[Many executives] [eat [at [[really fancy] restaurants]]]]

The syntactic tree is on the next slide.

Syntactic tree (first pass)

[Tree diagram: the bracket form drawn as a tree, with its six constituent nodes labeled a through f and the words many executives, eat, at, really fancy, restaurants as leaves]

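As an illustration of how the bracket form encodes constituency, the following sketch lists every bracketed grouping in the example. The function name `constituents` is a hypothetical helper for this illustration; it assumes well-balanced square brackets and whitespace-separated words.

```python
def constituents(bracketed):
    """List the word string covered by each [...] pair, innermost first."""
    tokens = bracketed.replace("[", " [ ").replace("]", " ] ").split()
    open_starts = []   # word index at which each still-open bracket began
    words = []         # words seen so far, in linear order
    spans = []
    for tok in tokens:
        if tok == "[":
            open_starts.append(len(words))
        elif tok == "]":
            start = open_starts.pop()
            # the most recently opened bracket closes here, so it covers
            # every word from its start up to the current position
            spans.append(" ".join(words[start:]))
        else:
            words.append(tok)
    return spans


BRACKETS = "[[Many executives] [eat [at [[really fancy] restaurants]]]]"
for span in constituents(BRACKETS):
    print(span)
```

Apart from the full sentence itself, the groupings printed are the same five “meaningful units” listed on the Constituency slide.
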
Categories

Goal: be able to say that:

- Many executives and really fancy restaurants are the same type of grouping, or constituent
- at really fancy restaurants is something else

For this, we will talk about different categories

- Lexical
- Phrasal

Lexical categories

Lexical categories are simply word classes, or parts of speech. The main ones are:

- verbs: eat, drink, sleep, ...
- nouns: gas, food, lodging, ...
- adjectives: quick, happy, brown, ...
- adverbs: quickly, happily, well, westward
- prepositions: on, in, at, to, into, of, ...
- determiners/articles: a, an, the, this, these, some, much, ...
- conjunctions: and, but, or, since, while, ...

Determining lexical categories

How do we determine which category a word belongs to?

- Distribution: where these words can appear in a sentence
  - e.g., Nouns like mouse can appear after articles (“determiners”) like the, while a verb like eat cannot.
- Morphology: what kinds of prefixes/suffixes a word can take
  - e.g., Verbs like walk can take an -ed ending to mark them as past tense. A noun like mouse cannot.

Closed & Open classes

Open classes: new words can be easily added (tend to carry meaning):

- verbs
- nouns
- adjectives
- adverbs

Closed classes: new words cannot be easily added (tend to be function words):

- prepositions
- determiners
- conjunctions

Phrasal categories

Examining the distribution of phrases, some behave in the same way

- The joggers ran through the park.

Other phrases which can be put in place of The joggers:

- Susan
- students
- you
- most dogs
- some children
- a huge, lovable bear
- my friends from Brazil
- the people that we interviewed

Since all of these contain nouns, we consider these to be noun phrases (NPs).

Syntactic tree

[Tree diagram, given here in labeled bracket form:]
[S [NP many executives] [VP eat [PP at [NP [AP really fancy] restaurants]]]]

Phrases
Noun Phrases

Noun phrases, like other kinds of phrases, are headed: there is a designated item (the noun) which determines the properties of the whole phrase

- Before the noun, you can have determiners (and pre-determiners) and adjective phrases
- After the noun, you can have prepositional phrases, gerunds (and other verbal clauses), and relative clauses
- You can also have noun-noun compounds

⇒ General rule: The category of the head word percolates up to the phrase level

Phrases
Determiner Phrases?

It’s not entirely clear that these phrases should be NPs; maybe they should be DPs

- There generally must be a noun in an NP, but often there must also be a determiner; in fact, determiners can sometimes appear alone.
  (4) {*Student/The student} laughed.
  (5) {These/These students} think a lot.
- The determiner actually scopes over the noun semantically
  (6) All/Some/No students are happy.
- For some theories, a DP is more uniform with other parts of the syntax

Phrases
Verb Phrases: Subcategorization

Verbs tend to drive the analysis of a sentence because they subcategorize for elements

We can say that verbs have subcategorization frames

- sleep: subject
- find: subject, object
- show: subject, object, second object
- want: subject, object, infinitive verb phrase
- think: subject, sentential complement

Phrases
Grammatical relations

Grammatical relations are the basic relations between words in a sentence

(7) She eats a mammoth breakfast.

- In this sentence, She is the subject, while a mammoth breakfast is the object
- In English, the subject must agree in person and number with the verb.

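The subcategorization frames listed above can be treated as a small lookup table. The sketch below is only an illustration under that assumption: the dictionary `FRAMES` copies the five frames from the slide, and `satisfies_frame` is a hypothetical helper that checks whether the grammatical relations found in a clause match a verb's frame exactly.

```python
# Frames copied from the slide; the data layout itself is an assumption.
FRAMES = {
    "sleep": ["subject"],
    "find": ["subject", "object"],
    "show": ["subject", "object", "second object"],
    "want": ["subject", "object", "infinitive verb phrase"],
    "think": ["subject", "sentential complement"],
}


def satisfies_frame(verb, relations):
    """True if `relations` is exactly the frame recorded for `verb`.
    Verbs missing from FRAMES never match."""
    return FRAMES.get(verb) == list(relations)


print(satisfies_frame("find", ["subject", "object"]))   # True
print(satisfies_frame("sleep", ["subject", "object"]))  # False
```

A real lexicon would need several frames per verb (for example, eat with or without an object); a single exact-match frame is used here only to keep the idea visible.
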
Phrase Structure Rules (PSRs)

Rules for building these phrases

- Phrase structure rules (PSRs) build larger constituents from smaller ones.
  e.g., S → NP VP
  - A sentence (S) constituent is composed of a noun phrase (NP) constituent and a verb phrase (VP) constituent. (hierarchy)
  - The NP must precede the VP. (linear order)
- Put PSRs together, and you have a context-free grammar (CFG)

Important Properties of Phrase Structure Rules

- recursive = a rule can be reapplied (within its hierarchical structure).
  - NP → NP PP
  - PP → P NP
  The property of recursion means that the set of potential sentences in a language is infinite.
- potentially (structurally) ambiguous = have more than one analysis
  (8) I [VP saw [NP [NP the man] [PP with the telescope]]]
  (9) I [VP saw [NP the man] [PP with the telescope]]

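The two properties above can be checked mechanically on a toy grammar. The sketch below counts how many analyses a small set of PSRs assigns to a word sequence. The rules S → NP VP, NP → NP PP and PP → P NP come from the slides; the remaining rules, the category names Pro, Det, N, V, P, the lexicon, and the function `count_parses` are assumptions added so that examples (8) and (9) can both be derived.

```python
from functools import lru_cache

# Toy phrase structure rules: left-hand side -> list of right-hand sides.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Pro",), ("Det", "N"), ("NP", "PP")],   # NP -> NP PP is recursive
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("P", "NP")],
}
# Assumed lexicon: one lexical category per word.
LEXICON = {"I": "Pro", "saw": "V", "the": "Det",
           "man": "N", "telescope": "N", "with": "P"}


def count_parses(words, start="S"):
    """Count the distinct analyses of `words` rooted in `start`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # number of ways `symbol` can cover words[i:j]
        total = 0
        if j - i == 1 and LEXICON.get(words[i]) == symbol:
            total += 1
        for rhs in RULES.get(symbol, []):
            total += count_seq(rhs, i, j)
        return total

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # number of ways the symbol sequence `rhs` can cover words[i:j]
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        total = 0
        # the first symbol ends at k; every later symbol needs >= 1 word
        for k in range(i + 1, j - len(rhs) + 2):
            left = count(rhs[0], i, k)
            if left:
                total += left * count_seq(rhs[1:], k, j)
        return total

    return count(start, 0, len(words))


print(count_parses("I saw the man".split()))                       # 1
print(count_parses("I saw the man with the telescope".split()))    # 2
```

The two analyses counted for the second sentence correspond to (8), where the PP sits inside the NP, and (9), where it attaches to the VP. Because NP → NP PP can reapply, adding further PPs keeps producing grammatical (and increasingly ambiguous) strings.
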
Formal definition of CFGs

1. N: a set of non-terminal (phrasal) symbols, e.g., NP, VP, etc.
2. Σ: a set of terminal (lexical) symbols
   N and Σ are disjoint
3. P: a set of productions (rules) of the form A → α, where A is a non-terminal and α is a sequence of terminals and non-terminals
4. S: a designated start symbol (a member of N)

Other constructions to capture

- Coordination
- Active & Passive Constructions
- Raising & Control Constructions
- Unbounded Dependency Constructions (UDCs)

Question: Are CFGs capable of covering language?

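The four-part definition can be written down directly as a data structure. The following is a minimal sketch, not a standard API: the class name `CFG` and its field names are chosen for this illustration, and the checks in `__post_init__` simply restate the conditions of the definition (N and Σ disjoint, the start symbol drawn from N, each production rewriting a single non-terminal).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset   # N
    terminals: frozenset      # Σ
    productions: tuple        # P: pairs (A, alpha), alpha a tuple of symbols
    start: str                # S

    def __post_init__(self):
        if self.nonterminals & self.terminals:
            raise ValueError("N and Σ must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("the start symbol must be a non-terminal")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            # context-free: the left-hand side is one non-terminal
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not a non-terminal")
            for sym in rhs:
                if sym not in symbols:
                    raise ValueError(f"unknown symbol {sym!r}")


# A toy grammar for "the man left" (the lexical rules are assumptions).
toy = CFG(
    nonterminals=frozenset({"S", "NP", "VP", "DT", "NN", "VBD"}),
    terminals=frozenset({"the", "man", "left"}),
    productions=(
        ("S", ("NP", "VP")),
        ("NP", ("DT", "NN")),
        ("VP", ("VBD",)),
        ("DT", ("the",)),
        ("NN", ("man",)),
        ("VBD", ("left",)),
    ),
    start="S",
)
print(len(toy.productions))  # 6
```
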
Coordination

One type of phrase we have not mentioned yet is the coordinate phrase, for example John and Mary

- Coordination can generally apply to any kinds of (identical) phrases
- This makes it ambiguous and causes problems for parsing

(10) I saw John and Mary left early.

⇒ At some point, a parser has to decide whether “and” joins NPs or joins Ss.

Difficulties with coordination

Coordination turns out to have particularly difficult properties for linguistic analysis

- The conjunction of two elements does not obey the same properties as each element.
  (11) a. *Me went to the store.
       b. Me and John went to the store.
- Coordination can be with “unlike” constituents
  (12) Robin is [NP a Republican] and [ADJP proud of it]
- Coordination can be with non-constituents
  (13) John gave me the bread and Mary the sugar.

Active & Passive Constructions

It is well-established that sentences occur in both active and passive forms:

(14) a. Sandy saw Kim.
     b. Kim was seen by Sandy.

CFGs can clearly handle such sentences, along the lines of:

- VP → V_fin NP
- VP → V_be VP_pass
- VP_pass → V_pass (PP_by)

Relating active and passive constructions

Even if a CFG can license such constructions, questions remain:

- How many rules will it take to capture every relevant grammatical distinction?
- How are the active and passive forms related?
  - Through movement?
  - Through lexical rules?
  - They’re not related?

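To see what the three VP rules license, the sketch below spells them out in a toy grammar and enumerates every sentence it generates. Only the VP rules come from the slide; the rule S → NP VP, the rule for the by-phrase, the two-name lexicon, and the helper `expand` are assumptions added to make the grammar complete. The optional (PP_by) is written as two separate rules.

```python
from itertools import product

GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["Vfin", "NP"], ["Vbe", "VPpass"]],
    "VPpass": [["Vpass"], ["Vpass", "PPby"]],   # optional by-phrase
    "PPby": [["by", "NP"]],
    "NP": [["Sandy"], ["Kim"]],
    "Vfin": [["saw"]],
    "Vbe": [["was"]],
    "Vpass": [["seen"]],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol`.
    Terminates because this toy grammar has no recursive rule."""
    if symbol not in GRAMMAR:
        return [[symbol]]            # a word
    results = []
    for rhs in GRAMMAR[symbol]:
        # combine every expansion of each right-hand-side symbol
        for parts in product(*(expand(s) for s in rhs)):
            results.append([w for part in parts for w in part])
    return results


sentences = sorted(" ".join(words) for words in expand("S"))
for s in sentences:
    print(s)
```

Both (14a) and (14b) are among the generated sentences, but nothing in the rules says that one is the passive of the other, which is the point of the question raised on the slide about how the two forms are related.
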
Raising & Control Constructions

Some verbs look similar in some syntactic contexts, but behave quite differently in others

(15) a. John seems to be happy.
     b. It seems to be raining.
     c. John tries to be happy.
     d. *It tries to be raining.

Generalization:

- Raising verbs (e.g., seem): the subject of the higher clause is the “same” as the subject of the lower clause
- Control (or equi) verbs (e.g., try): the subject of the higher clause “controls” the subject of the lower clause, but has certain restrictions on it.

Capturing the raising/control generalizations

How do we distinguish raising and control verbs in CFGs?

- In both cases, it seems like we have the pattern NP V VP_inf

Solutions seem to require one or more of the following:

- An empty subject in the lower clause
- Sharing of subjects (or subject properties) between upper and lower verbs, perhaps involving new features
- A closer connection to sentence semantics

Unbounded Dependency Constructions (UDCs)

An unbounded dependency construction has an element realized non-locally and:

- involves constituents with different functions
- involves constituents of different categories
- is in principle unbounded

Example: Wh-elements

Wh-elements can have different functions (__ marks the position the wh-element is understood in):

(16) a. Who did Hobbs see __?                      Object of verb
     b. Who do you think __ saw the man?          Subject of verb
     c. Who did Hobbs give the book to __?        Object of prep
     d. Who did Hobbs consider __ to be a fool?   Object of obj-control verb

Wh-elements can also occur in subordinate clauses:

(17) a. I asked who the man saw __.
     b. I asked who the man considered __ to be a fool.
     c. I asked who Hobbs gave the book to __.
     d. I asked who you thought __ saw Hobbs.

Wh-elements (cont.)

Different categories can be extracted (__ marks the extraction site):

(18) a. Which man did you talk to __?             NP
     b. [To [which man]] did you talk __?         PP
     c. [How ill] has the man been __?            AdjP
     d. [How frequently] did you see the man __?  AdvP

This sometimes provides multiple options for a constituent:

(19) a. Who does he rely [on __]?
     b. [On whom] does he rely __?

Unboundedness:

(20) a. Who do you think Hobbs saw __?
     b. Who do you think Hobbs said he saw __?
     c. Who do you think Hobbs said he imagined that he saw __?

Accounting for UDCs

How does one account for UDCs?

- Invoke a notion of movement during an analysis
- Include features which “pass” information about the non-local element
- Use some formalism more powerful than a CFG (e.g., Tree-Adjoining Grammar)

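One way to picture the second option, passing information about the missing element down the tree, is with “slash” categories, where S/NP means a clause missing one NP. This notation and the functions below are assumptions chosen for illustration, not something the slides specify. The sketch expands S/NP to a chosen depth and shows that the wh-word and its gap can be arbitrarily far apart.

```python
def s_slash_np(depth):
    """Expand S/NP, a clause with one NP missing somewhere inside it.

    depth == 0:  S/NP -> he saw __          (the gap is realized here)
    depth  > 0:  S/NP -> he said S/NP       (the gap is passed further down)
    """
    if depth == 0:
        return ["he", "saw", "__"]
    return ["he", "said"] + s_slash_np(depth - 1)


def question(depth):
    """Build a wh-question whose gap lies `depth` clauses below 'think'."""
    return ["Who", "do", "you", "think"] + s_slash_np(depth)


for depth in range(3):
    words = question(depth)
    # the index of the gap equals its distance in words from "Who"
    print(" ".join(words) + "?", "| gap distance:", words.index("__"))
```

Each extra level of embedding adds two words between Who and the gap, so no fixed bound on that distance holds, in line with the pattern in (20).
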