The Chomsky Hierarchy
Sentences
The sentence as a string of words
E.g.
I saw the lady with the binoculars
string = a b c d e b f
The relations of parts of a string to
each other may be different
I saw the lady with the binoculars
is structurally ambiguous
Who has the binoculars?
[ I ] saw the lady [ with the binoculars ]
= [a] b c d [e b f]
I saw [ the lady with the binoculars]
= a b [c d e b f]
How can we represent the difference?
By assigning them different structures.
We can represent structures with 'trees'.
[Figure: tree diagram for "I read the book"]
a. I saw the lady with the binoculars
[S [NP I] [VP [V saw] [NP [NP the lady] [PP with the binoculars]]]]
I saw [the lady with the binoculars]
b. I saw the lady with the binoculars
[S [NP I] [VP [VP saw the lady] [PP with the binoculars]]]
I [saw the lady] with the binoculars
birds fly
[S [NP [N birds]] [VP [V fly]]]
Syntactic rules
S → NP VP
NP → N
VP → V
Graphs and trees
[S [NP birds] [VP fly]]
birds = a, fly = b
= string ab
[S [A a] [B b]]
= string ab
S → A B
A → a
B → b
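The three rewrite rules S → A B, A → a, B → b can be applied mechanically to derive the string ab. The following is a minimal sketch, not part of the original slides; the names `RULES` and `derive` are illustrative, and it assumes each auxiliary symbol has exactly one rule, as in this toy grammar.

```python
# Rewrite rules from the slide: S -> A B, A -> a, B -> b.
# Assumption: exactly one rule per auxiliary symbol (true for this toy grammar).
RULES = {"S": ["A", "B"], "A": ["a"], "B": ["b"]}


def derive(start="S"):
    """Return every step of the leftmost derivation starting from `start`."""
    symbols = [start]
    steps = [" ".join(symbols)]
    while any(sym in RULES for sym in symbols):
        # Rewrite the leftmost auxiliary symbol.
        i = next(k for k, sym in enumerate(symbols) if sym in RULES)
        symbols = symbols[:i] + RULES[symbols[i]] + symbols[i + 1:]
        steps.append(" ".join(symbols))
    return steps


print(derive())
```

Each step replaces one auxiliary node by the right-hand side of its rule, so the list of steps corresponds to reading the tree from the root down to the terminal string.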
Rules
Assumption:
natural language grammars are rule-based systems
What kind of grammars describe natural language
phenomena?
What are the formal properties of grammatical
rules?
Chomsky, N. (1957) Syntactic Structures. The Hague: Mouton
Chomsky, N. and G.A. Miller (1958) Finite state languages. Information and Control 1, 91-112
Chomsky, N. (1959) On certain formal properties of grammars. Information and Control 2, 137-167
Rules in Linguistics
1. PHONOLOGY
/s/ → [θ] / V ___ V
Rewrite /s/ as [θ] when /s/ occurs in the context V ___ V
With:
V = auxiliary node
s, θ = terminal nodes
Rules in Linguistics
2. SYNTAX
S → NP VP
VP → V
NP → N
Rewrite S as NP VP in any context
With:
S, NP, VP = auxiliary nodes
V, N = terminal nodes
PHONOLOGY (sound system)
Maltese – Word-final devoicing
Orthography (spelling)    Pronunciation (sound)
Sabet                     [sa-bet]
sab                       [sap]
Ħobża                     [hob-za]
ħobż                      [hops]
Vjaġġi                    [vjağ-ği]
vjaġġ                     [vjačč]
voiced [+vd]: [b, z, ğ]
voiceless [-vd]: [p, s, č]
[+vd] → [-vd] / ____ #   (for # = end of word)
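The devoicing rule [+vd] → [-vd] / ____ # can be sketched as a small string rewrite. This is an illustration only: the function name `devoice_final` is hypothetical, the input is the broad transcription used on the slide rather than Maltese spelling, and the code assumes that a whole word-final run of voiced obstruents devoices together, as the transcriptions [hops] and [vjačč] suggest.

```python
# Voiced/voiceless pairs listed on the slide: [b, z, ğ] -> [p, s, č].
DEVOICE = {"b": "p", "z": "s", "ğ": "č"}


def devoice_final(transcription):
    """Devoice the run of voiced obstruents at the end of the word.

    Assumption: the whole final cluster devoices, not only the last segment.
    """
    segments = list(transcription)
    i = len(segments) - 1
    while i >= 0 and segments[i] in DEVOICE:
        segments[i] = DEVOICE[segments[i]]
        i -= 1
    return "".join(segments)


for form in ["sabet", "sab", "hobza", "hobz", "vjağği", "vjağğ"]:
    print(form, "->", devoice_final(form))
```

Forms ending in a vowel or a voiceless consonant pass through unchanged, because the rule only applies in the context before the word boundary #.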
MORPHOLOGY (word formation)
Maltese – Progressive assimilation in 3fsg imperfective (present)
Marker for verb in 3rd person feminine singular imperfective: t- (3fsg impf = she)
e.g. she breaks = t-kisser
     I break = n-kisser
t-kisser     3fsg-break    she breaks
t-ressaq     3fsg-move     she moves
s-sakkar     3fsg-lock     she locks
d-dur        3fsg-turn     she turns
*t-sakkar
*t-dur
t → s, d, etc. / ____ [s, d, etc.]
where t is the morpheme μ marking [3fsg] and the following consonant is [+cor]
(with μ = morpheme, C = consonant, cor = coronal)
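The assimilation of the 3fsg prefix can be sketched as a function from a verb stem to the prefixed form. This is a hedged illustration: `third_fem_sg` and `TRIGGERS` are hypothetical names, and the trigger set contains only the two consonants attested in the examples (s and d); the slide's "etc." is deliberately left unfilled.

```python
# Consonants shown on the slide as triggering assimilation of t-.
# Assumption: only the attested triggers are listed; the slide's "etc." is not expanded.
TRIGGERS = {"s", "d"}


def third_fem_sg(stem):
    """Attach the 3fsg imperfective prefix, assimilating t- to a trigger consonant."""
    first = stem[0]
    prefix = first if first in TRIGGERS else "t"
    return f"{prefix}-{stem}"


for stem in ["kisser", "ressaq", "sakkar", "dur"]:
    print(third_fem_sg(stem))
```

The forms marked with * in the examples (t-sakkar, t-dur) are exactly the ones this function never produces.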
SYNTAX (phrase/sentence formation)
SENTENCE: The boy kissed the girl
SUBJECT: The boy          PREDICATE: kissed the girl
NOUN PHRASE               VERB PHRASE
ART + NOUN                VERB + NOUN PHRASE
S → NP VP
VP → V NP
NP → ART N
SEMANTICS (meaning)
The lion attacks the hunter
ATTACK (a, b)
   a
   λy [ATTACK (y, b)]
      λz λy [ATTACK (y, z)]
      b
(with a = the lion, b = the hunter)
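The step-by-step composition of ATTACK (a, b) from λz λy [ATTACK (y, z)] can be mirrored with curried functions. This is only a sketch: representing a proposition as a tuple and the variable names are assumptions made for illustration.

```python
# λz λy [ATTACK (y, z)]: takes the object first, then the subject.
# Assumption: a proposition is represented as a plain tuple.
attack = lambda z: lambda y: ("ATTACK", y, z)

a = "the lion"
b = "the hunter"

attacks_the_hunter = attack(b)        # λy [ATTACK (y, b)]
proposition = attacks_the_hunter(a)   # ATTACK (a, b)

print(proposition)
```

Applying the verb meaning to the object first and the subject second follows the syntactic structure, where the verb combines with its object to form the VP before the VP combines with the subject.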
Chomsky Hierarchy
0. Type 0 (recursively enumerable) languages
Only restriction on rules: left-hand side cannot be the empty string (* Ø → …)
1. Context-Sensitive languages - Context-Sensitive (CS) rules
2. Context-Free languages - Context-Free (CF) rules
3. Regular languages - Regular (right-linear or left-linear) rules
0 ⊃ 1, 1 ⊃ 2, 2 ⊃ 3
a ⊃ b meaning a properly includes b (a is a proper superset of b), i.e. b is a proper subset of a
Generative power
0. Type 0 (recursively enumerable) languages
Only restriction on rules: left-hand side cannot be the empty string (* Ø → …)
is the most powerful system
3. Type 3 (regular languages)
is the least powerful
Superset/subset relation
[Figure: diagram of two sets, S1 and S2, illustrating the superset/subset relation]
Rule Type – 3
Name: Regular
Example: Finite State Automata (Markov-process Grammar)
Rule type:
a) right-linear
A → xB or
A → x
with:
A, B = auxiliary nodes and
x = terminal node
b) or left-linear
A → Bx or
A → x
Generates: a^m b^n with m, n ≥ 1
Cannot guarantee that there are as many a's as b's; no embedding
A regular grammar for natural language sentences
S → the A
A → cat B
A → mouse B
A → duck B
B → bites C
B → sees C
B → eats C
C → the D
D → boy
D → girl
D → monkey
the cat bites the boy
the mouse eats the monkey
the duck sees the girl
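A regular grammar of this kind (S → the A; A → cat B, mouse B, duck B; B → bites C, sees C, eats C; C → the D; D → boy, girl, monkey) can be enumerated exhaustively, since every rule emits one word and moves to at most one further auxiliary node. The sketch below is illustrative; the names `RULES` and `generate` are not from the slides.

```python
# Right-linear rules: each entry is (terminal word, next auxiliary node or None).
RULES = {
    "S": [("the", "A")],
    "A": [("cat", "B"), ("mouse", "B"), ("duck", "B")],
    "B": [("bites", "C"), ("sees", "C"), ("eats", "C")],
    "C": [("the", "D")],
    "D": [("boy", None), ("girl", None), ("monkey", None)],
}


def generate(symbol="S"):
    """Yield every word sequence derivable from `symbol`."""
    for word, next_symbol in RULES[symbol]:
        if next_symbol is None:
            yield [word]
        else:
            for rest in generate(next_symbol):
                yield [word] + rest


sentences = [" ".join(words) for words in generate()]
print(len(sentences))
print(sentences[0])
```

The grammar generates 3 × 3 × 3 = 27 sentences, including the three examples given; it has no way to make a later choice depend on an earlier one beyond the current auxiliary node.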
Regular grammars
Grammar 1:
A→a
A→aB
B→bA
Grammar 2:
A→a
A→Ba
B→Ab
Grammar 3:
A→a
A→aB
B→b
B→bA
Grammar 4:
A→a
A→Ba
B→b
B→Ab
Grammar 5:
S → aA
S → bB
A → aS
B → bbS
S → ε
Grammar 6:
A→Aa
A→Ba
B→b
B→Ab
A→a
Grammars
Grammar 6:
S→ A B
S → bB
A→ aS
B → bbS
S → ε
Grammar 7:
A→a
A→Ba
B→b
B→bA
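Whether a grammar such as those listed above is regular in form can be checked rule by rule: all rules must be right-linear, or all must be left-linear. The following is a rough sketch under stated assumptions: upper-case letters are auxiliary nodes, lower-case letters are terminals, the empty string stands for ε, and x may be any string of terminals (slightly more liberal than a single terminal). The function name `classify` is hypothetical.

```python
import re

# Right-linear right-hand side: terminals, then at most one auxiliary node.
RIGHT_LINEAR = re.compile(r"[a-z]*[A-Z]?")
# Left-linear right-hand side: at most one auxiliary node, then terminals.
LEFT_LINEAR = re.compile(r"[A-Z]?[a-z]*")


def classify(rules):
    """Classify a list of (lhs, rhs) rules by their form."""
    right_sides = [rhs for _, rhs in rules]
    if all(RIGHT_LINEAR.fullmatch(rhs) for rhs in right_sides):
        return "right-linear"
    if all(LEFT_LINEAR.fullmatch(rhs) for rhs in right_sides):
        return "left-linear"
    return "not regular in form"


grammar_1 = [("A", "a"), ("A", "aB"), ("B", "bA")]
grammar_2 = [("A", "a"), ("A", "Ba"), ("B", "Ab")]
grammar_7 = [("A", "a"), ("A", "Ba"), ("B", "b"), ("B", "bA")]

for name, grammar in [("1", grammar_1), ("2", grammar_2), ("7", grammar_7)]:
    print("Grammar", name, "is", classify(grammar))
```

A grammar that mixes right-linear and left-linear rules, such as Grammar 7, is not a regular grammar in form, even though each rule taken alone looks linear. The check is purely about the shape of the rules and says nothing about whether the generated language happens to be regular.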
Finite-State Automaton
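[Figure: finite-state automaton with states NP, NP1 and NP2; arc from NP to NP1 labelled "article", loop on NP1 labelled "adjective", arc from NP1 to NP2 labelled "noun"]
NP → article NP1
NP1 → adjective NP1
NP1 → noun NP2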
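The noun-phrase automaton (NP → article NP1, NP1 → adjective NP1, NP1 → noun NP2) can be simulated with a transition table. This is a minimal sketch; it assumes that NP is the start state and NP2 the only accepting state, and the names `TRANSITIONS` and `accepts` are illustrative.

```python
# Transition table: (current state, word category) -> next state.
TRANSITIONS = {
    ("NP", "article"): "NP1",
    ("NP1", "adjective"): "NP1",
    ("NP1", "noun"): "NP2",
}
START = "NP"
FINAL = {"NP2"}  # assumption: NP2 is the accepting state


def accepts(categories):
    """Return True if the sequence of word categories is a well-formed NP."""
    state = START
    for category in categories:
        state = TRANSITIONS.get((state, category))
        if state is None:  # no arc for this category: reject
            return False
    return state in FINAL


print(accepts(["article", "noun"]))
print(accepts(["article", "adjective", "adjective", "noun"]))
print(accepts(["adjective", "noun"]))
```

The loop on NP1 allows any number of adjectives, but the automaton keeps no count of them: its only memory is the current state.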
A parse tree
[Figure: parse tree built from S → NP VP, NP → N, VP → V NP, NP → DET N, with labels marking the root node (S), the interior nodes and the terminal nodes]
Rule Type – 2
Name: Context Free
Example:
Phrase Structure Grammars / Push-Down Automata
Rule type:
A → γ
with:
A = auxiliary node
γ = any number of terminal or auxiliary nodes
Recursiveness allowed:
A → αAβ (with α, β = any number of terminal or auxiliary nodes)
CF Grammar
A Context Free grammar consists of:
a) a finite terminal vocabulary VT
b) a finite auxiliary vocabulary VA
c) an axiom S ∈ VA
d) a finite number of context free rules of form A → γ, where A ∈ VA and γ ∈ {VA ∪ VT}*
In natural language syntax S is interpreted as the start symbol for
sentence, as in S → NP VP
CF Grammars
The following languages cannot be generated by a regular
grammar
Language 1: a^n b^n
ab
aabb
Language 2: mirror image
abaaba
abbaabba
Context-Free rules:
Language 1: A → aAb, A → ab
Language 2: A → aAa, A → bAb (with a terminating rule such as A → aa, A → bb)
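Both languages can be recognised by following recursive context-free rules directly. The sketch below assumes the rules A → aAb and A → ab for a^n b^n, and A → aAa and A → bAb with terminating rules A → aa and A → bb for the mirror-image language; the function names are illustrative.

```python
def is_anbn(s):
    """Recognise a^n b^n (n >= 1) using A -> ab | aAb."""
    if s == "ab":
        return True
    # Peel one 'a' from the left and one 'b' from the right, then recurse.
    return len(s) >= 4 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])


def is_mirror(s):
    """Recognise even-length mirror-image strings over {a, b}.

    Assumed rules: A -> aAa | bAb | aa | bb.
    """
    if s in ("aa", "bb"):
        return True
    return len(s) >= 4 and s[0] == s[-1] and s[0] in "ab" and is_mirror(s[1:-1])


print(is_anbn("aabb"), is_anbn("aab"))
print(is_mirror("abaaba"), is_mirror("abbaabba"), is_mirror("abab"))
```

Each recursive call corresponds to one application of a self-embedding rule, which is exactly what a regular grammar cannot express: the left and right ends of the string have to be matched against each other.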
Natural language
Is English regular or CF?
If centre embedding is required, then it cannot be regular
Centre Embedding:
1. [The cat] [likes tuna fish]
   a b = ab
2. The cat the dog chased likes tuna fish
   a a b b = aabb
3. The cat the dog the rat bit chased likes tuna fish
   a a a b b b = aaabbb
4. The cat the dog the rat the elephant admired bit chased likes tuna fish
   a a a a b b b b = aaaabbbb
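The four centre-embedded sentences follow one pattern: n noun phrases followed by the n matching verb phrases in reverse order. A small sketch, using only the words from the examples (the names `NOUN_PHRASES`, `VERB_PHRASES`, `centre_embed` and `pattern` are illustrative):

```python
# Noun phrases (a) and their verb phrases (b), paired by position.
NOUN_PHRASES = ["the cat", "the dog", "the rat", "the elephant"]
VERB_PHRASES = ["likes tuna fish", "chased", "bit", "admired"]


def centre_embed(depth):
    """Build the sentence with `depth` noun phrases and `depth` verb phrases."""
    if not 1 <= depth <= len(NOUN_PHRASES):
        raise ValueError("depth must be between 1 and 4")
    nouns = NOUN_PHRASES[:depth]
    # The innermost noun phrase is completed first, so the verbs come out reversed.
    verbs = VERB_PHRASES[:depth][::-1]
    return " ".join(nouns + verbs)


def pattern(depth):
    """Return the corresponding a^n b^n string."""
    return "a" * depth + "b" * depth


for n in range(1, 5):
    print(pattern(n), "=", centre_embed(n))
```

Because each noun phrase is paired with a verb phrase further to the right, the number of a's has to equal the number of b's, which is the a^n b^n property.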
Centre embedding
[Figure: three parse trees with successively deeper centre embedding, each embedded S sitting inside the subject NP: "the cat likes tuna" = ab; "the cat the dog chased likes tuna" = aabb; "the cat the dog the rat bit chased likes tuna" = aaabbb]
Natural language
Is English regular or CF?
If centre embedding is required, then it cannot be regular
Centre Embedding
1. [The cat] [likes tuna fish]
   a b = ab
2. [The cat] [the dog] [chased] [likes tuna fish]
   a a b b = aabb
3. [The cat] [the dog] [the rat] [bit] [chased] [likes ...]
   a a a b b b = aaabbb
4. [The cat] [the dog] [the rat] [the elephant] [admired] [bit] [chased] [likes ...]
   a a a a b b b b = aaaabbbb
Natural language 2
More Centre Embedding:
1. If S1, then S2 (if = a, then = a)
2. Either S3, or S4 (either = b, or = b)
3. The man who said S5 is arriving today
4. The man who said S6 is arriving the day after
Sentence with embedding:
If either the man who said S5 is arriving today or the man who said S5 is arriving tomorrow, then the man who said S6 is arriving the day after
= abba
Natural language 2
Sentence with embedding:
If either the man is arriving today or the woman is arriving tomorrow, then the child is arriving the day after.
a = [if
b = [either the man is arriving today]
b = [or the woman is arriving tomorrow]]
a = [then the child is arriving the day after]
= abba
CS languages
The following languages cannot be generated by a CF grammar (by
pumping lemma):
a^n b^m c^n d^m
Swiss German:
A string of dative nouns (e.g. aa), followed by a string of accusative nouns (e.g. bbb), followed by a string of dative-taking verbs (e.g. cc), followed by a string of accusative-taking verbs (e.g. ddd)
= aabbbccddd
= a^n b^m c^n d^m
Swiss German:
Jan sait das (Jan says that) …
mer    em Hans     es Huus          hälfed    aastriiche
we     Hans/DAT    the house/ACC    helped    paint
"we helped Hans paint the house"
em Hans = a, es Huus = b, hälfed = c, aastriiche = d: abcd
NPdat NPdat NPacc NPacc Vdat Vdat Vacc Vacc
a     a     b     b     c    c    d    d
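Membership in a^n b^m c^n d^m requires two crossing length checks: the a's against the c's, and the b's against the d's. The sketch below is an illustration only; the function name `is_cross_serial` is hypothetical. It uses a regular expression to split the string into its four blocks and then compares the block lengths, a step that goes beyond what the regular expression itself can express.

```python
import re

# Four non-empty blocks in the order a, b, c, d.
BLOCKS = re.compile(r"(a+)(b+)(c+)(d+)")


def is_cross_serial(s):
    """Recognise a^n b^m c^n d^m with n, m >= 1."""
    match = BLOCKS.fullmatch(s)
    if match is None:
        return False
    a_block, b_block, c_block, d_block = match.groups()
    # Crossing dependencies: a's pair with c's, b's pair with d's.
    return len(a_block) == len(c_block) and len(b_block) == len(d_block)


print(is_cross_serial("aabbbccddd"))
print(is_cross_serial("abcd"))
print(is_cross_serial("aabbbcddd"))
```

The two dependencies cross each other rather than nesting, which is why a single stack, and hence a context-free grammar, is not enough.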
Natural language 3
Inadequacy of phrase structure rules (CF rules)
Transformations:
Passive:
NP1 – Aux – V – NP2 → NP2 – Aux + be – V – by + NP1
Transformations are Turing powerful, i.e. can do anything to
anything: inversion, deletion
Developments in syntax
a) do away with or severely constrain:
1. Phrase Structure rules
2. Transformational Rules
b) move away from:
derivational/procedural models
to:
constraint-based/declarative models
Head-Driven Phrase Structure Grammar (HPSG) and Optimality Theory (OT)
c) development of context-free rules with non-terminals structured as sets of
features and values
E.g.
N = [+N, -V]
V = [-N, +V]
sleeps = [-N, +V, -PST, AGR: [+N, -V, +3, -PLU]]
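Feature bundles of this kind map naturally onto nested dictionaries, and an agreement check then becomes a comparison of feature values. The sketch below is an illustration under assumptions: the entry for sleeps follows the slide, while the entries `GIRL` and `GIRLS` and the function `agrees` are hypothetical and added only to show how the AGR value might be used.

```python
# sleeps = [-N, +V, -PST, AGR: [+N, -V, +3, -PLU]] as on the slide.
SLEEPS = {
    "N": False,
    "V": True,
    "PST": False,
    "AGR": {"N": True, "V": False, "3": True, "PLU": False},
}

# Hypothetical subject entries, not taken from the slides.
GIRL = {"N": True, "V": False, "3": True, "PLU": False}
GIRLS = {"N": True, "V": False, "3": True, "PLU": True}


def agrees(verb, subject):
    """True if the subject carries every feature value the verb's AGR requires."""
    return all(subject.get(feature) == value
               for feature, value in verb["AGR"].items())


print(agrees(SLEEPS, GIRL))
print(agrees(SLEEPS, GIRLS))
```

The rules themselves stay context-free; the extra information is carried inside the structured non-terminals.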
Rules in Linguistics
Traditional syntactic rules
VP → V NP
NP → DET N
PP → P NP
etc.
X-bar syntax
(' = bar/level one, '' = bar/level two)
N'' → DET N'
N' → N
V'' → ADV V'
V' → V
A'' → DEG A'
A' → A
P'' → DET P'
P' → P
X-bar rule schema
X’’ → X’
X’ → X
X’’
│
X’
│
X
X-bar Syntax
X'' → (SPEC) X'
X'' → X'' MODIFIER
X' → X' MODIFIER
X' → X (COMPLEMENT)
[Figure: X-bar tree schema in which X'' dominates (SPEC) and X', X'' and X' can each recur with a MODIFIER, and X' dominates X and (COMPLEMENT)]
the girl often plays the violin
[S [N'' [Det the] [N' [N girl]]] [V'' [ADV often] [V' [V plays] [N'' [Det the] [N' [N violin]]]]]]