Parsing in functional unification grammar
MARTIN KAY
Language is a system for encoding and transmitting ideas. A theory that
seeks to explain linguistic phenomena in terms of this fact is a functional
theory. One that does not misses the point. In particular, a theory that
shows how the sentences of a language are all generable by rules of a
particular formal system, however restricted that system may be, does
not explain anything. It may be suggestive, to be sure, because it may
point to the existence of an encoding device whose structure that formal
system reflects. But, if it points to no such device, it simply constitutes
a gratuitous and wholly unwelcome addition to the set of phenomena to
be explained.
A formal system that is decorated with informal footnotes and amendments explains even less. If I ask why some phenomenon, say relativization from within the subject of a sentence, does not take place in English and am told that it is because it does not take place in any language, I go away justifiably more perplexed than I came. The theory that attempts to explain things in this way is not functional. It tells me only that the source of my perplexity is more widespread than I had thought. The putative explanation makes no reference to the only assertion that is sufficiently self-evident to provide a basis for linguistic theory, namely that language is a system for encoding and transmitting ideas.
But, surely there is more to functionalism than this. To fill their role
as systems for encoding and transmitting ideas, languages must first be
learnable. Learnability is a functional property and language learning
needs to be explained. But what is involved here is a derivative notion
of function. A satisfactory linguistic theory will at least make it plausible
that children could learn to use language as a system for encoding and
transmitting ideas. It will not show how a child might learn to distinguish
sentences from nonsentences, a skill with little survival value and one for
which evolution probably furnished no special equipment.
It follows that any reasonable linguistic theory will be functional. To use the word to characterize one particular theory, as I shall shortly do, must therefore be accounted pretentious. However, while it is true that the label has been used before, it is manifestly untrue that linguistic theories have typically been functional. Just recently, there has been a partial and grudging retreat from the view that to formalize is to explain. This has not been because of any widespread realization of the essential vacuity of purely formal explanations, but for two other reasons. The first is the failure of the formalists to produce workable criteria on which to distinguish competing theories, and the second is the apparent impossibility of constructing theories whose most cogent remarks are made within the formalism rather than in footnotes and amendments. The search for sources of constraint to impose on formal grammar has led to an uneasy alliance with the psychologists and a belated rekindling of interest in parsing and other performance issues (for some discussion, see Gazdar, 1982; Kaplan, 1972, 1973, 1978; Kay, 1973, 1977, 1979).
My aim in this polemic is not to belittle the value of formalisms. Without them linguistics, like most other scientific enterprises, would be impotent.
It is only to discredit them as an ultimate basis for the explanation of the
contingent matters that are the stuff of science. What I shall describe
under the banner of functional unification grammar is indeed a formalism,
but one that has been designed to accommodate functionally revealing,
and therefore explanatorily satisfying, grammars.
7.1. Functional unification grammar
A functionally adequate grammar must either be a particular transducer, or some kind of data structure that more general kinds of transducer - generators and parsers - can interpret. It must show not only what strings can be associated with what meanings, but how a given meaning can be expressed and a given utterance understood. Furthermore, it cannot take logic as the measure of meaning, abjuring any responsibility for distinctions that are not readily reflected in the predicate calculus. The semantic side of the transducer must traffic in quantifiers and connectives, to be sure, but also in topic and emphasis and given and new. Functional grammar will, therefore, at the very least, adopt some form of functional sentence perspective.
In practice, I take it that the factors that govern the production of a sentence typically come from a great variety of different sources: logical, textual, interpersonal, and so forth. In general, each of these, taken by itself, underdetermines what comes out. When they jointly overdetermine it, there must be priorities enabling a choice to be made among the demands of the different sources. When they jointly underdetermine the outcome, the theory must provide defaults and unmarked cases. The point is that we must be prepared to take seriously the claim that language in general, and individual utterances in particular, fill many different functions and that these all affect the theory, even at the syntactic level.
I have outlined a broad program for linguistic theory, and the possibility of carrying it through rests on our being able to design clean, simple formalisms. This applies to the formal descriptions by which words, phrases, sentences, grammars, and languages are known and also to the operations that the theory allows to be performed on these descriptions. Much of the character of the theory I shall outline comes from the set-theoretic properties that are imputed to descriptions of all kinds in everyday life. These properties do not, in fact, carry over to the descriptions provided for in most linguistic formalisms. In this theory, there is no limit on how detailed a description can be and no requirement that everything in it should serve some grammatical end. Generally speaking, to add more detail to a description is to narrow the class of objects described, and to remove material from a description is to widen its coverage. In fact, descriptions and the sets of things they refer to are mathematical duals of one another with respect to the operations of set theory. In other words, the intersection of a pair of descriptions describes the union of the sets of objects that they describe separately, and the union of a pair of descriptions describes the intersection of the corresponding pairs of sets. These are properties we are entitled to look for in anything to which the term "description" is seriously applied.
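This duality is easy to see in miniature. The sketch below, a modern illustration in Python whose toy universe and function names are mine and not the chapter's, treats a description as a set of attribute-value pairs and computes the set of objects it describes:

```python
# A toy illustration of the duality between descriptions and the sets
# they describe: adding descriptors narrows the extension, and merging
# two descriptions describes the intersection of their extensions.

def describes(description, obj):
    """True if obj carries every attribute-value pair in the description."""
    return all(obj.get(a) == v for a, v in description.items())

def extension(description, universe):
    """The set of objects in the universe that the description describes."""
    return {name for name, obj in universe.items() if describes(description, obj)}

universe = {
    "he":   {"CAT": "PRON", "GENDER": "MASC", "NUMBER": "SING"},
    "she":  {"CAT": "PRON", "GENDER": "FEM",  "NUMBER": "SING"},
    "they": {"CAT": "PRON", "NUMBER": "PLUR"},
}

d1 = {"CAT": "PRON"}
d2 = {"CAT": "PRON", "NUMBER": "SING"}    # d1 plus one more descriptor

# More detail narrows the class of objects described:
assert extension(d2, universe) <= extension(d1, universe)

# The union of two descriptions describes the intersection of their extensions:
d3 = {"NUMBER": "SING"}
merged = {**d1, **d3}
assert extension(merged, universe) == extension(d1, universe) & extension(d3, universe)
```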
Descriptions are sets of descriptors, and prominent among the kinds of object that go to make up the sets are pairs consisting of an attribute, like number, with an associated value, like singular. An important subclass of attributes consists of grammatical functions like Subject, Modifier, and Connective.
The claim that this theory makes on the word "functional" in its title is therefore supported in three ways. First, it gives primary status to those aspects of language that have often been called functional; logical aspects are not privileged. Second, it describes linguistic structures in terms of the function that a part fills in a whole, rather than in terms of parts of speech and ordering relations. Third, and most important for this paper, it requires its grammars to function; that is, they must support the practical enterprises of language generation and analysis.
7.1.1. Compilation
The view that a grammar is a data structure that can be interpreted by one or more transducers has some attractions over the one that would have it actually be a transducer. On the one hand, while the grammar itself remains the proper repository of linguistic knowledge, and the formalism an encapsulation of formal linguistic universals, the language-independent transducer becomes the embodiment of a theory of performance. But there is no reason to suppose that the same transducer should be reversible, taking responsibility for both generation and parsing. While the strongest hypothesis would indeed provide only one device, a more conservative position would separate these functions. This paper takes the conservative position. Given that generation and parsing are done by different transducers, a strong hypothesis would have both transducers interpret the same grammar; a more conservative position would associate a different formalism with each transducer and provide some mechanism for ensuring that the same facts were represented in each. This paper takes the conservative position. Specifically, the position is that the generator operates directly on the canonical form of the grammar - the competence grammar - and that the parser operates on a translation of this grammar into a different formalism. This paper will concentrate on how this translation is actually carried out; it will, in short, be about machine translation between grammatical formalisms.
The kind of translation to be explored here is known in computer science as compilation, and the computer program that does it is called a compiler. Typically, compilers are used to translate programs from so-called high-level programming languages, like FORTRAN or ALGOL, into the low-level languages that directly control computers. But compilers have also proved very useful in other situations.
Whatever the specific application, the term "compilation" almost always refers to a process that translates a text produced by a human into a text that is functionally equivalent, but not intended for human consumption. There are at least two reasons for doing this. One is that those properties of a language that make for perspicuity and ease of expression are very different from those that make for simple, cheap, reliable computing. Put simply, computers are not easy to talk to, and it is sometimes easier to use an interpreter. It is economical and efficient for man and machine to talk different languages and to interact through an intermediary. The second reason has to do with flexibility and is therefore also economic. By compilation, the programming enterprise is kept, in large measure, independent of particular machines. Through different compilers, programs can be made to run on a variety of computers, existing or yet to be invented. All in all, compilers facilitate the business of writing programs for computers to execute. The kind of compiler to be described here confers practical benefits on the linguist by facilitating the business of obtaining parsing grammars. It does this by making the enterprise essentially indistinguishable from writing competence grammars.
It is possible to say things in the native language of a computer that never should be said, because they are either redundant or meaningless. Adding zero to a number, or calculating a value that is never used, is redundant and therefore wasteful. Attempting to divide a number by zero makes no sense, and an endless cycle of operations prevents a result ever being reached. These things can almost always be assumed to be errors on the part of the programmer. Some of these errors are difficult or impossible to detect simply by inspecting the program and emerge only in the course of the actual computation. But some errors can be dealt with by simply not providing any way of committing them in the higher-level language. Thus, an expression n/0, meaning the result of dividing n by 0, would be deemed outside the language, and the compiler would not be able to translate a program that contained it. The general point is this: compiling can confer additional benefits when it is used to translate one language into a subset of another. In the case of programming languages, the idea is to exclude from the subset useless constructions.
This benefit is easy to appreciate in the linguistic case. As I have said, competence grammars are written in a formal language designed to enshrine restrictions that have been found characteristic of the human linguistic faculty - in short, linguistic universals. The guarantee that performance grammars reflect these same universals can be provided by embedding them in the language of those grammars. But it can also be provided by deriving the performance grammars from the competence grammars by means of a process that is guaranteed to preserve all important properties. The language of the performance grammar is relatively unconstrained and would allow the expression of theoretically unmotivated things, but the translation procedure ensures that only a properly motivated subset is used. This strategy clearly makes for a stronger theory by positing a close connection between competence and performance. It has the added advantage of freeing the designer of the performance formalism of any concern for matters already accounted for in the competence system.
7.1.2. Attributes and values
As I have already pointed out, functional unification grammar knows things by their functional descriptions (FDs). A simple FD is a set of descriptors, and a descriptor is a constituent set, a pattern, or an attribute with an associated value. I shall come to the form and function of constituent sets and patterns shortly. For the moment, we consider only attribute-value pairs.
The list of descriptors that make up an FD is written in square brackets, no significance attaching to the order. The attributes in an FD must be distinct from one another, so that if an FD F contains the attribute a, it is always possible to use the phrase "the a of F" to refer unambiguously to a value.
An attribute is a symbol, that is, a string of letters. A value is either a symbol or another FD. The equal sign, "=", is used to separate an attribute from its value, so that, in α = β, α is the attribute and β the value. Thus, for example, (1) might be an FD, albeit a simple one, of the sentence He saw her.

(1)  [CAT = S
      SUBJ = [CAT = PRON
              GENDER = MASC
              NUMBER = SING
              PERSON = 3]
      DOBJ = [CAT = PRON
              GENDER = FEM
              NUMBER = SING
              PERSON = 3]
      VERB = SEE
      TENSE = PAST
      VOICE = ACTIVE]

(2)  [CAT = S
      PROT = [CAT = PRON
              GENDER = MASC
              NUMBER = SING
              PERSON = 3]
      GOAL = [CAT = PRON
              GENDER = FEM
              NUMBER = SING
              PERSON = 3]
      VERB = SEE]
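Rendered in a modern notation, an FD such as (1) can be sketched as a nested Python dict. This rendering is mine, not part of the formalism; it merely shows why the distinctness of attributes makes "the a of F" an unambiguous lookup:

```python
# The FD for "He saw her" as a nested dict: attributes are strings,
# values are symbols (strings) or embedded FDs (further dicts).

fd1 = {
    "CAT": "S",
    "SUBJ": {"CAT": "PRON", "GENDER": "MASC", "NUMBER": "SING", "PERSON": "3"},
    "DOBJ": {"CAT": "PRON", "GENDER": "FEM",  "NUMBER": "SING", "PERSON": "3"},
    "VERB": "SEE",
    "TENSE": "PAST",
    "VOICE": "ACTIVE",
}

def value_of(fd, attribute):
    """Because attributes in an FD are distinct, 'the a of F' is unambiguous."""
    return fd[attribute]

assert value_of(fd1, "VERB") == "SEE"
assert value_of(value_of(fd1, "SUBJ"), "GENDER") == "MASC"
```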
I f the values of SUBJ and DOBJ are reversed in (I). and the value of VOICE
changed to PASSIVE, it becomes an FD for the sentence She was seen by
him. However, in both this and the original sentence, he is the protagonist
(PROT), or logical subject. and she the goal (GOAL) of the action, or logical
direct object. In other words. both sentences are equally well described
by (2). In the sense of transformational grammar ( 2 ) shows a deeper struc-
P
G
U
G
OD
(3)
-
CAT
=s
CAT
= PRON
GENDER = MASC
(4)
-
= PRON
GENDER = FEM
NUMBER = SING
PERSON = 3
I
= SEE
TENSE
LVOICE
= PAST
= ACTIVE
I
CAT
= S
SUBJ
= he
DOBJ
=
CAT = NP
HEAD = books
CAT = PRESP
= [LEX = WRITE
VERB
= LIKE
TENSE
= PRES
VOICE
= ACTIVE
11
-
J
ture than (1). However, in functional unification grammar, if a given linguistic entity has two different FDs, a single F D containing the information in both can be constructed by the process of unification, which
we shall examine in detail shortly. The FD (3) results from unifying ( I )
and (2).
A pair of FDs is said to be incompatible if they have a common attribute with different symbols, or incompatible FDs, as values. Grammatically ambiguous sentences have two or more incompatible FDs. Thus, for example, the sentence He likes writing books might be described by (4) or (5). Incompatible simple FDs F1 ... Fk can be combined into a single complex FD {F1 ... Fk} which describes the union of the sets of objects that its components describe. The notation allows common parts of components to be factored in the obvious way, so that (6) describes all those objects that are described by either (4) or (5).

(4)  [CAT = S
      SUBJ = he
      DOBJ = [CAT = NP
              HEAD = books
              MODIFIER = [CAT = PRESP
                          LEX = WRITE]]
      VERB = LIKE
      TENSE = PRES
      VOICE = ACTIVE]

(5)  [CAT = S
      SUBJ = he
      DOBJ = [CAT = PRESP
              LEX = WRITE
              DOBJ = books]
      VERB = LIKE
      TENSE = PRES
      VOICE = ACTIVE]

(6)  [CAT = S
      SUBJ = he
      DOBJ = {[CAT = NP
               HEAD = books
               MODIFIER = [CAT = PRESP
                           LEX = WRITE]]
              [CAT = PRESP
               LEX = WRITE
               DOBJ = books]}
      VERB = LIKE
      TENSE = PRES
      VOICE = ACTIVE]

The use of braces to indicate alternation between incompatible FDs or sub-FDs provides a compact way of describing large classes of disparate objects. In fact, as we shall see, given a few extra conventions, it makes it possible to claim that the grammar of a language is nothing more than a single complex FD.
7.1.3. Unification
A string of atoms enclosed in angle brackets constitutes a path, and there is at least one that identifies every value in an FD. The path ⟨a1 a2 ... ak⟩ identifies the value of the attribute ak in the FD that is the value of ⟨a1 a2 ... ak-1⟩. Paths are always interpreted as beginning in the largest FD that encloses them. Attributes are otherwise taken as belonging to the smallest enclosing FD.
A pair consisting of a path in an FD and the value that the path leads to is a feature of the object described. If the value is a symbol, the pair is a basic feature of the FD. Any FD can be represented as a list of basic features. For example, (8) can be represented by the list (9).

(8)  [CAT = S
      SUBJ = PROT = [CAT = PRON
                     GENDER = MASC
                     CASE = NOM
                     NUMBER = SING
                     PERSON = 3]
      DOBJ = GOAL = [CAT = PRON
                     GENDER = FEM
                     CASE = ACC
                     NUMBER = SING
                     PERSON = 3]
      VERB = [CAT = VERB
              WORD = SEE]
      TENSE = PAST
      VOICE = ACTIVE
      ASPECT = [PERFECT = -
                PROGRESSIVE = -]]

(9)  ⟨CAT⟩ = S
     ⟨SUBJ CAT⟩ = PRON
     ⟨SUBJ GENDER⟩ = MASC
     ⟨SUBJ CASE⟩ = NOM
     ⟨SUBJ NUMBER⟩ = SING
     ⟨SUBJ PERSON⟩ = 3
     ⟨PROT CAT⟩ = PRON
     ⟨PROT GENDER⟩ = MASC
     ⟨PROT CASE⟩ = NOM
     ⟨PROT NUMBER⟩ = SING
     ⟨PROT PERSON⟩ = 3
     ⟨DOBJ CAT⟩ = PRON
     ⟨DOBJ GENDER⟩ = FEM
     ⟨DOBJ CASE⟩ = ACC
     ⟨DOBJ NUMBER⟩ = SING
     ⟨DOBJ PERSON⟩ = 3
     ⟨GOAL CAT⟩ = PRON
     ⟨GOAL GENDER⟩ = FEM
     ⟨GOAL CASE⟩ = ACC
     ⟨GOAL NUMBER⟩ = SING
     ⟨GOAL PERSON⟩ = 3
     ⟨VERB CAT⟩ = VERB
     ⟨VERB WORD⟩ = SEE
     ⟨TENSE⟩ = PAST
     ⟨VOICE⟩ = ACTIVE
     ⟨ASPECT PERFECT⟩ = -
     ⟨ASPECT PROGRESSIVE⟩ = -

It is in the nature of FDs that they blur the usual distinction between features and structures. Example (8) shows FDs embedded in other FDs, thus stressing their structural properties. Rewriting (8) as (9) stresses the componential nature of FDs.
It is the possibility of viewing FDs as unstructured sets of features that makes them subject to the standard operations of set theory. However, it is also a crucial property of FDs that they are not closed under set-theoretic operations. Specifically, the union of a pair of FDs is not, in general, a well-formed FD. The reason is this: the requirement that a given attribute appear only once in an FD implies a similar constraint on the set of features corresponding to an FD. A path must uniquely identify a value. But, if the FD F1 has the basic feature ⟨a⟩ = x and the FD F2 has the basic feature ⟨a⟩ = y, then either x = y or F1 and F2 are incompatible and their union is not a well-formed FD. So, for example, if F1 describes a sentence with a singular subject and F2 describes a sentence with a plural subject, then S1 ∪ S2, where S1 and S2 are the corresponding sets of basic features, is not well formed because it would contain ⟨SUBJ NUMBER⟩ = SINGULAR and ⟨SUBJ NUMBER⟩ = PLURAL.
When two or more simple FDs are compatible, they can be combined into one simple FD describing those things that they both describe, by the process of unification. Unification is the same as set union except that it yields the null set when applied to incompatible arguments. The = sign is used for unification, so that α = β denotes the result of unifying α and β. Examples (10)-(12) show the results of unification in some simple cases.
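The operation just described can be sketched as a recursive merge over FDs represented as nested dicts. This is an illustrative reconstruction in Python, not Kay's own implementation; FAIL stands in for the null result:

```python
FAIL = None  # stands in for the null result of unifying incompatible FDs

def unify(f1, f2):
    """Unify two simple FDs (nested dicts); return FAIL if incompatible."""
    if isinstance(f1, dict) and isinstance(f2, dict):
        result = dict(f1)
        for attr, v2 in f2.items():
            if attr in result:
                merged = unify(result[attr], v2)
                if merged is FAIL:
                    return FAIL          # common attribute, incompatible values
                result[attr] = merged
            else:
                result[attr] = v2        # detail present in only one argument
        return result
    return f1 if f1 == f2 else FAIL      # symbols must match exactly

# Two partial FDs for "He saw her" unify into one FD carrying both:
fd1 = {"CAT": "S", "SUBJ": {"CAT": "PRON", "GENDER": "MASC"}, "VERB": "SEE"}
fd2 = {"CAT": "S", "SUBJ": {"NUMBER": "SING"}, "TENSE": "PAST"}
merged = unify(fd1, fd2)
assert merged["SUBJ"] == {"CAT": "PRON", "GENDER": "MASC", "NUMBER": "SING"}

# Incompatible FDs (same attribute, different symbols) yield the null result:
assert unify({"NUMBER": "SING"}, {"NUMBER": "PLUR"}) is FAIL
```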
The result of unifying a pair of complex FDs is, in general, a complex FD with one term for each compatible pair of terms in the original FDs. Thus {a1 ... am} = {b1 ... bn} becomes an FD of the form {c1 ... ck} in which each ch (1 ≤ h ≤ k) is the result of unifying a compatible pair ai = bj (1 ≤ i ≤ m, 1 ≤ j ≤ n). This is exemplified in (13).
Unification is the fundamental operation underlying the analysis and synthesis of sentences using functional unification grammar.
It is important to understand the difference between saying that a pair of FDs are equal and saying that they are unified. Clearly, when a pair of FDs has been unified, they are equal; the inverse is not generally true. To say that a pair of FDs A and B are unified is to say that A and B are two names for one and the same description. Consequently, if A is unified with C, the effect is to unify A, B, and C. On the other hand, if A and B, though possibly equal, are not unified, then unifying A and C does not affect B and, indeed, if A and C are not equal, a result of the unification will be to make A and B different. A crucial consequence of this is that the result of unifying various pairs of FDs is independent of the order in which the operations are carried out. This, in its turn, makes for a loose coupling between the grammatical formalism as a whole and the algorithms that use it.
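In programming terms, the distinction between equal and unified FDs is the distinction between equal values and shared references. A small Python sketch of the analogy, mine rather than the chapter's machinery:

```python
# Unified FDs are two names for one object, so a later unification
# affects both names; equal-but-distinct FDs can drift apart.

a = {"CAT": "NP"}
b = a                  # "unified": a and b name the same description
c = {"NUMBER": "SING"}

a.update(c)            # adding c's information to a ...
assert b == {"CAT": "NP", "NUMBER": "SING"}   # ... also shows up under the name b

d = {"CAT": "NP"}
e = dict(d)            # equal, but not unified: a separate copy
d.update({"NUMBER": "PLUR"})
assert e == {"CAT": "NP"}                     # e is untouched; d and e now differ
```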
7.1.4. Patterns and constituent sets
We come now to the question of recursion in the grammar and how constituency is represented. I have already remarked that functional unification grammar deliberately blurs the distinction between structures and sets of features. It is clear from the examples we have considered so far that some parts of the FD of a phrase typically belong to the phrase as a whole, whereas others belong to its constituents. For example, in (8), the value of SUBJ is the FD of a constituent of the sentence, whereas the value of ASPECT is not. The purpose of constituent sets and patterns is to identify constituents and to state constraints on the order of their occurrence. Example (14) is a version of (8) that specifies the order. (SUBJ VERB DOBJ) is a pattern stating that the values of the attributes SUBJ, VERB, and DOBJ are FDs of constituents and that they occur in that order. As the example illustrates, patterns appear in FDs as the values of attributes. In general, the value is a list of patterns, and not just one as in (14). The attribute Pattern is distinguished, its value being the one used to determine the order of the immediate constituents of the item being described. The attribute C-set is also distinguished, its value being a single list of paths identifying the immediate constituents of the current item, but imposing no order on them. Here, as elsewhere in the formalism, one-step paths can be, and usually are, written without enclosing angle brackets.

(14)  [CAT = S
       Pattern = (SUBJ VERB DOBJ)
       SUBJ = PROT = [CAT = PRON
                      GENDER = MASC
                      NUMBER = SING
                      PERSON = 3]
       DOBJ = GOAL = [CAT = PRON
                      GENDER = FEM
                      NUMBER = SING
                      PERSON = 3]
       VERB = [CAT = VERB
               WORD = SEE]
       TENSE = PAST
       VOICE = ACTIVE
       ASPECT = [PERFECT = -
                 PROGRESSIVE = -]]
The patterns are templates that the string of immediate constituents must match. Each pattern is a list whose members can be:
1. A path. The path may have as its value
   a. An FD. As in the case of the constituent set, the FD describes a constituent.
   b. A pattern. The pattern is inserted into the current one at this point.
2. A string of dots. This matches any number of constituents.
3. The symbol #. This matches any one constituent.
4. An FD. This will match any constituent whose description is unifiable with it. The unification is made with a copy of the FD in the pattern, rather than with the FD itself, because the intention is to impute its properties to the constituent, but not to unify all the constituents that match this part of the pattern.
5. An expression of the form (* fd), where fd is an FD. This matches zero or more constituents, provided they can all be unified with a copy of fd.
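A matcher for such patterns can be sketched in a few lines of Python. The encoding below is mine: dots are the string "...", # matches one constituent, FDs are dicts, and compatibility is simplified to agreement on shared attributes, where the formalism proper unifies against a copy of the pattern's FD:

```python
# A sketch of matching a pattern against a string of constituents.

def compatible(pattern_fd, constituent_fd):
    """Simplified stand-in for unification with a copy of the pattern's FD."""
    return all(constituent_fd.get(a, v) == v for a, v in pattern_fd.items())

def matches(pattern, constituents):
    if not pattern:
        return not constituents
    head, rest = pattern[0], pattern[1:]
    if head == "...":                      # dots: zero or more constituents
        return any(matches(rest, constituents[i:])
                   for i in range(len(constituents) + 1))
    if not constituents:
        return False
    if head == "#":                        # hash: exactly one constituent, any FD
        return matches(rest, constituents[1:])
    return compatible(head, constituents[0]) and matches(rest, constituents[1:])

np, verb = {"CAT": "NP"}, {"CAT": "VERB"}
sentence = [np, verb, np]

assert matches(["...", {"CAT": "VERB"}, "..."], sentence)   # a VERB somewhere
assert matches([{"CAT": "NP"}, "#", "..."], sentence)       # an NP comes first
assert not matches([{"CAT": "VERB"}, "..."], sentence)      # a VERB does not
```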
The ordering constraints in (14) could have been represented by many other sets of patterns, for example, those in (15).

(15)  (SUBJ VERB ...) (... VERB DOBJ)
      (SUBJ ...) (... VERB ...) (... DOBJ)
      (... SUBJ ... VERB ... DOBJ)
      (SUBJ VERB ...) (... DOBJ)

The pattern (16) requires exactly one constituent to have the property [TRACE = NP]; all others must have the property [TRACE = NONE].

(16)  ((* [TRACE = NONE]) [TRACE = NP] (* [TRACE = NONE]))

Clearly, patterns, like attribute-value pairs, can be incompatible, thus preventing the unification of FDs. This is the case in the examples in (17).

(17)  (SUBJ ...) (# SUBJ ...)
      (SUBJ VERB ...) (SUBJ DOBJ ...)

The last of these could be a compatible pair just in case VERB and DOBJ were unified, but the names suggest that the first would have the feature [CAT = VERB] and the other [CAT = NP], ruling out this possibility.
The value of the C-set attribute covers all constituents. If two FDs are unified, and both have C-set attributes, the C-set attribute of the result is the intersection of these. If a member of the pair has no such attribute, the value is taken from the other member. In other words, the universal set of constituents can be written by simply omitting the attribute altogether. If there are constituents, but no patterns, then there are no constraints on the order in which the constituents can occur. If there are patterns, then each of the constituents must be assimilated to all of them.
7.1.5. Grammar
A functional unification grammar is a single FD. A sentence is well formed if a constituent-structure tree can be assigned to it, each of whose nodes is labeled with an FD that is compatible with the grammar. The immediate descendants of each node must be properly specified by constituent sets and patterns.
Generally speaking, grammars will take the form of alternations, each clause of which describes a major category; that is, they will have the form exhibited in (18).

(18)  {[CAT = S
        ...]
       [CAT = NP
        ...]
       [CAT = VERB
        ...]}

Example (19) shows a simple grammar, corresponding to a context-free grammar containing the single rule (20).

(19)  {[CAT = S
        SUBJ = [CAT = NP]
        VERB = [CAT = VERB]
        {[SCOMP = NONE
          Pattern = (SUBJ VERB)]
         [SCOMP = [CAT = S]
          Pattern = (SUBJ VERB SCOMP)]}]
       [CAT = NP]
       [CAT = VERB]}

(20)  S → NP VERB (S)

FD (19) describes either sentences or verbs or noun phrases. Nothing is said about the constituency of the verbs or noun phrases described - they are treated as terminal constituents. The sentences have either two or three constituents, depending on the choice made in the embedded alternation. All constituents must match the FD (19). Since the first constituent has the feature [CAT = NP], it can only match the second term in the main alternation. Likewise, the second constituent can only match the third term. If there is a third constituent, it must match the first term in the alternation, because it has the feature [CAT = S]. It must therefore also have two or three constituents, themselves described by (19). If (19) consisted only of the first term in the outer alternation, it would have a null extension because the first constituent, for example, would be required to have the incompatible features [CAT = NP] and [CAT = S]. On the other hand, if the inner alternation were replaced by its second term, so that [SCOMP = NONE] were no longer an option, then the FD would correspond to the rule (21), whose derivations do not terminate.

(21)  S → NP VERB S
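The way such a grammar licenses two- and three-constituent sentences, with any third constituent recursively a sentence, can be sketched as follows. Category labels stand in for full FDs, and the function name is mine:

```python
# A sketch of how the toy grammar for S -> NP VERB (S) licenses
# constituent trees; embedded sentences are nested lists.

def well_formed_s(constituents):
    """Check a list of constituent category labels against S -> NP VERB (S)."""
    if len(constituents) == 2:
        first, second = constituents
        return first == "NP" and second == "VERB"
    if len(constituents) == 3:
        first, second, third = constituents
        # the third constituent must itself be a well-formed sentence
        return (first == "NP" and second == "VERB"
                and isinstance(third, list) and well_formed_s(third))
    return False

assert well_formed_s(["NP", "VERB"])                       # two constituents
assert well_formed_s(["NP", "VERB", ["NP", "VERB"]])       # embedded S
assert not well_formed_s(["VERB", "NP"])                   # wrong order
```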
7.2. The parser
7.2.1. The General Syntactic Processor
As I said at the outset, the transducer that is used for generating sentences operates on grammars in essentially the form in which they have been given here. The process is quite straightforward. The input is an FD that constitutes the specification of a sentence to be uttered. The more detail it contains, the more closely it constrains the sentence that will be produced. The transducer attempts to unify this FD with the grammar. If this cannot be done - that is, if it produces a null result because of incompatibilities between the two descriptions - there is no sentence in the language that meets the specification. If the unification is successful, the result will, in general, be to add detail to the FD originally provided. If the FD that results from this step has a constituent set, the process is repeated, unifying each constituent in turn with the grammar. A constituent that has no constituents of its own is a terminal that must match some entry in the lexicon.
Parsing is by no means so straightforward. Grammars, as I have characterized them, do not, for example, enable one to discern, in any immediate way, what properties the first or last word of a sentence might have. It is mainly for this reason that a compiler is required to convert the grammar into a new form. Compilation will result in a set of procedures, or miniature programs, designed to be embedded in a general parsing program that will call upon them at the appropriate time.
The general program, a version of the General Syntactic Processor, has been described in several places. The basic idea on which it is based is this. There are two principal data structures, the chart and the agenda. The chart is a directed graph each of whose edges maps onto a substring of the sentence being analyzed. The chart therefore contains a vertex for each of the possible end points of such substrings, k + 1 vertices for a sentence of k words. Edges are directed from left to right. The only loops that can occur consist of single edges incident from and to the same vertex.
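The chart's initial state for a k-word sentence can be sketched as follows; the lexicon and the tuple encoding of edges are mine, for illustration:

```python
# A sketch of the initial chart: k + 1 vertices, and one edge per lexical
# reading of each word, running from vertex i-1 to vertex i.

lexicon = {
    "he":  [{"CAT": "PRON", "CASE": "NOM"}],
    "saw": [{"CAT": "VERB", "TENSE": "PAST"},
            {"CAT": "NOUN", "NUMBER": "SING"}],   # lexically ambiguous
    "her": [{"CAT": "PRON", "CASE": "ACC"}],
}

def initial_chart(words):
    """Return the word edges as (start_vertex, end_vertex, fd) triples."""
    edges = []
    for i, word in enumerate(words):
        for fd in lexicon[word]:          # one edge per reading of the word
            edges.append((i, i + 1, fd))
    return edges

chart = initial_chart(["he", "saw", "her"])
assert len(chart) == 4                    # "saw" contributes two edges
vertices = {v for edge in chart for v in edge[:2]}
assert vertices == {0, 1, 2, 3}           # k + 1 = 4 vertices for 3 words
```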
Each word in the sentence to be parsed is represented by an edge labeled with an FD obtained by looking that word up in the lexicon. If the word is ambiguous, that is, if it has more than one FD, it is represented by more than one edge. All the edges for the ith word clearly go from vertex i - 1 to vertex i. As the parser identifies higher-order constituents, edges with the appropriate FDs are added to the chart. A particular analysis is complete when an edge is added from the first to the last vertex and labeled with a suitable FD, say one with the feature [CAT = S].
The edges just alluded to are all inactive; they represent words and
phrases. Active edges each represent a step in the recognition process.
Suppose a phrase with c constituents is recognized. Since the only knowledge the parser has of words and phrases is recorded in the chart, it must
be the case that the c constituent edges are entered in the chart before the
process of recognizing the phrase is complete, though the recognition
process can clearly get underway before they are all there. If the phrase
begins at vertex v1, the chart will contain c edges (possibly among others) beginning at v1, each ending at a vertex where one of the constituents
ends. The one that ends where the final constituent ends is the one that
represents the phrase itself. That one is inactive. The remainder are active
edges, and each records what is known about a phrase when only the first
so many of its constituents have been seen, and also what action must
be taken to incorporate the next constituent to the right. The label on an
active edge therefore has two parts: an FD describing what is known about
the putative phrase, and a procedure that will carry the recognition of the
phrase one step further forward. It is these procedures that constitute the
parsing grammar and that the compiler is responsible for constructing.
Parsing proceeds in a series of steps, in each of which the procedure
on an active edge is applied to a pair of FDs, one coming from that same
active edge, and the other from an inactive edge that leaves the vertex
where the active edge ends. In other words, if a and i are an active and
an inactive edge respectively, a being incident to the vertex that i is incident from, the step consists in evaluating Pa(fa, fi), where fa and fi
are the FDs on a and i, and Pa is the procedure. The step is successful
if fi meets the requirements that Pa makes of it, one of which is to
unify it with the value of some path in a copy of fa. This copy then
becomes part of the label on a new edge beginning where a begins and
ending where i ends. The requirements imposed on fi, the path in the
copy of fa with which it is unified, and the procedure to be incorporated
in the label of the new edge, are all built in to Pa. The last step completes
the recognition of a phrase, and the new edge that is produced is inactive
and therefore has no procedure in its label.
This process is carried out for every pair consisting of an active followed
by an inactive edge that comes to be part of the chart. Each successful
step leads to the introduction of one new edge, but this edge may result
in several new pairs. Each new pair produced therefore becomes a new
item on the agenda, which serves as a queue of pairs waiting to be processed.
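The stepping regime described above can be sketched as follows, again in Python rather than Interlisp. The unify function is deliberately crude (flat dictionaries only), and np_after_det is an invented stand-in for a compiled grammar procedure; neither belongs to Kay's actual system.

```python
from collections import deque

def unify(fd1, fd2):
    """Very crude unification for flat dicts: fail on conflicting values."""
    out = dict(fd1)
    for k, v in fd2.items():
        if k in out and out[k] != v:
            return None
        out[k] = v
    return out

class Edge:
    def __init__(self, start, end, fd, proc=None):
        self.start, self.end, self.fd, self.proc = start, end, fd, proc

def np_after_det(active, inactive):
    """Hypothetical specialist for the constituent after a determiner:
    a noun completes the NP."""
    if inactive.fd.get("cat") != "N":
        return None                       # step fails
    head = unify(active.fd.get("head", {}), inactive.fd)
    if head is None:
        return None
    fd = dict(active.fd, cat="NP", head=head)
    return Edge(active.start, inactive.end, fd)   # no proc: inactive

det = Edge(0, 1, {"cat": "DET"})
noun = Edge(1, 2, {"cat": "N"})
active = Edge(0, 1, {"det": det.fd}, proc=np_after_det)

# the agenda is a queue of (active, inactive) pairs
agenda = deque([(active, noun)])
complete = []
while agenda:
    a, i = agenda.popleft()
    new = a.proc(a, i)              # evaluate Pa(fa, fi)
    if new is not None and new.proc is None:
        complete.append(new)

print(complete[0].fd["cat"], complete[0].start, complete[0].end)  # NP 0 2
```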
The initial step in the recognition of a phrase is one in which the active
member of the pair is incident from and to the same vertex, the same
vertex from which the inactive edge is also incident. This is reasonable
because active edges are like a snapshot of the process of recognizing a
phrase when it is still incomplete. The initial active edge is therefore a
snapshot of the process before any work has been done. These edges
constitute the only cycles in the chart. Their labels clearly contain a process, but no FD. The question that remains to be answered is how these
initial active edges find their way into the chart.
Suppose the first step in the recognition of all phrases, of whatever
type, is carried out by a single procedure. I leave open, for the moment,
the question of whether this is true of the output of the functional grammar
compiler. This process will be the only one that is ever found in the label
of initial, looping, active edges. If a looping edge labeled with this process
is introduced at every vertex in the chart, then the strategy I have
outlined will clearly cause all substrings of the string that are phrases to
be analyzed as such, and all the analyses of the string as a whole will be
among them. This technique corresponds to what is sometimes called
undirected, bottom-up parsing.1 If there is a number of different processes
that can begin the recognition of a phrase, then a loop must be introduced
for each of them at each vertex. Undirected top-down parsing is modeled
by a strategy in which the only loops introduced initially are at the first
vertex and the procedures in their labels are those that can initiate the
search for a sentence. Others are introduced as follows: when the FD
that a given procedure is looking for on an inactive edge corresponds to
a phrase rather than a lexical item, and there is no loop at the relevant
vertex that would initiate the search for a phrase of the required type,
one is created.
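The two seeding strategies might be sketched like this. The helper names are hypothetical, and a looping edge is represented simply as a (start, end, procedure) triple with start equal to end.

```python
# Undirected bottom-up parsing seeds an empty looping active edge at
# every vertex; undirected top-down parsing seeds loops only at the
# first vertex, one per procedure that can start a sentence.

def seed_bottom_up(k, initial_proc):
    # one loop (start == end) per vertex, labeled with the initial
    # procedure and no FD
    return [(v, v, initial_proc) for v in range(k + 1)]

def seed_top_down(initial_procs):
    # loops only at vertex 0
    return [(0, 0, p) for p in initial_procs]

print(len(seed_bottom_up(3, "P0")))   # 4 loops for a 3-word sentence
print(seed_top_down(["S"]))           # [(0, 0, 'S')]
```

Under the top-down strategy, further loops would then be added on demand, as the text describes, whenever a procedure looks for a phrasal constituent at a vertex that has no loop of the required type.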
7.2.2. The parsing grammar
The parsing grammar, as we have seen, takes the form of a set of procedures, each of which operates on a pair of FDs. One of these FDs, the
matrix FD, is a partial description of a phrase, and the other, the constituent FD, is as complete a description as the parser will ever have of
a candidate for inclusion as a constituent of that phrase. There may be
various ways in which the constituent FD can be incorporated. Suppose,
for example, that the matrix is a partial FD of a noun phrase and that it
already contains the description of a determiner. Suppose, further, that
the edge representing the determiner ends where the current active edge
ends. The current procedure will therefore be a specialist in noun-phrase
constituents that follow initial determiners. Offered a constituent FD that
describes an adjective, it will incorporate it into the matrix FD as a modifier and create a new active edge whose label may contain itself, on the
theory that what can follow a determiner can also follow a determiner
and an adjective. But, if the constituent FD describes a noun, that must
be incorporated into the matrix as its head. At least two new edges will
be produced in this case, an inactive edge representing a complete noun
phrase and an active edge labeled with a procedure that specializes in
incorporating prepositional phrases and relative clauses.
We have seen that the process of recognizing a phrase is initiated by
an active edge that loops back to its starting vertex, and which has a null
FD in its label. For the case of functional grammar, all such loops will
be labeled with the same procedure, one capable of recognizing initial
members of all constituents and building a matrix FD that incorporates
them appropriately. Thus, for example, if this procedure is given a constituent FD that is a determiner description, it will put a new active edge
in the chart whose FD is a partial NP description incorporating that determiner FD.
The picture that emerges, then, is of a network of procedures, akin in
many ways to an augmented transition network (ATN) (Woods, 1969). The
initial procedure corresponds to the initial state and, in general, there is
an arc connecting a procedure to the procedures that it can use to label
the new active edges it introduces. The arc that is followed in a particular
case depends on the properties that the procedure finds in the constituent
FD it is applied to.
Consider the situation that obtains in a simple English sentence in the
active voice after a noun phrase, a verb, and a second noun phrase have
been seen. Three different grammar procedures are involved in getting
this far. Let us now consider what would face a fourth procedure. If the
verb is of the appropriate kind, the question of whether the second noun
phrase is the direct or the indirect object is still open. If the current
constituent FD is a noun phrase, then one possible way to continue the
analysis will accept that noun phrase as the direct object and the preceding
one as the indirect object. But, regardless of what the constituent FD is,
another possibility that must be explored is that the preceding noun phrase
was in fact the direct object. Let us assume that what can follow is the
same in both cases; in ATN terms, a transition is made to the same state.
A grammar procedure that does this is given below in (22).
(22)
[1]  (PROG ((OLDMATRIX MATRIX)
[2]         S1)
[3]    (SELECTQ (PATH 4 CAT)
[4]      (NP)
[5]      (NIL (SETQ S1 T))
[6]      (GO L1))
[7]    (AND S1 (NEWPATH 4 CAT
[8]      (QUOTE NP)))
[9]    (ASSIGN 3 IOBJ)
[10]   (ASSIGN 4 DOBJ)
[11]   (TO S/DOBJ)
[12] L1 (SETQ MATRIX OLDMATRIX)
[13]   (ASSIGN 3 DOBJ)
[14]   (TO S/DOBJ)
[15] OUT)
The procedure is written in Interlisp, a general programming language,
but only a very small subset of the language is used, and the salient points
about the example will be readily understood with the help of the following
remarks. The major idiosyncrasy of the language for present expository
purposes is that the name of a function or procedure, instead of being
written before a parenthesized list of arguments, is written as the first
item within the parentheses. So what would normally be written as F(x)
is written (F x).
The heart of the procedure is in lines 3 through 6, which constitute a
call on the function SELECTQ. SELECTQ examines the value of the CAT attribute of the constituent FD, as specified by the first argument (PATH 4
CAT). This returns the value of the (CAT) path of the fourth, that is, the
current constituent.2 The SELECTQ function can have any number of arguments, and the remaining ones give the action to be taken for various
values of the first one. All but the last give the action for specific values,
or sets of values, and the last says what is to be done in all other cases.
In (22), the specific values are NP (line 4) and NIL (line 5). The nonspecific
case is given in line 6: (GO L1). Now let us examine these cases more
closely.
The last case is the most straightforward. If the constituent
FD has some value other than NP or NIL for its CAT attribute, then it must
describe something other than a noun phrase. NP is the value it would
have if it did describe a noun phrase, and NIL is a purely conventional
value that the procedure finds when the attribute is absent altogether.
The only other possibility is that the attribute has some other substantive
value, like VP. If this happens, the instruction (GO L1) is executed, causing
processing to resume at line 12. The first thing that is done here, (SETQ
MATRIX OLDMATRIX), is of minor interest, and we will return to it shortly.
After this, only two instructions remain, on lines 13 and 14. The effect
of the first of these, (ASSIGN 3 DOBJ), is to unify the description of the third
constituent of the phrase, the noun phrase immediately preceding the
one that this procedure examines, with the value of the path DOBJ.
In short, it implements the hypothesis that that constituent was the direct
object. The instruction (TO S/DOBJ), intentionally reminiscent of the language of ATNs, causes a new active edge to be created, labeled with the
procedure S/DOBJ.
In the remaining cases, the constituent either has, or could be given,
the feature [CAT = NP]. These are the cases in which the current constituent could be the direct, and the preceding constituent the indirect,
object. However, if the value of the CAT attribute of the current constituent
is NIL, it must be assigned the value NP before the hypothesis is acted
upon, and the procedure sets the local variable S1 to T (Interlisp's own
name for "true") in line 5 to remind it to do this. It is actually done
by the instruction on lines 7 and 8; if S1 has a nonnull value, the instruction
(NEWPATH 4 CAT (QUOTE NP)) is carried out, causing NP to become the new
value of the CAT attribute of the constituent FD. Setting and testing the
local variable S1 is an efficiency measure, enabling the cost of the NEWPATH
operation to be avoided when the required value is already in place.
The instructions on lines 9 and 10 are similar to the one on line 13.
They cause the third and fourth constituents to be assigned the functions
IOBJ and DOBJ, respectively. (TO S/DOBJ) then causes a new active edge to
be created as before. The procedure then goes on to do what it would
have done in the default case described above. But it must first restore
the FDs to the condition they were in when the procedure was entered,
and this is the purpose of the instruction (SETQ MATRIX OLDMATRIX) on line
12.3 The original matrix FD was preserved as the value of the temporary
variable OLDMATRIX in line 1, and the constituent FD is embedded in the
same data structure by a subterfuge that is best left unexamined.
If, in the case we have just examined, the grammar made no provision
for constituents following the direct object, the instruction (TO S/DOBJ)
appearing on lines 11 and 14 would have been replaced by (DONE). This
has the effect of creating an inactive edge labeled with the current matrix
FD. More realistically, the procedure should probably have both instructions in both places, on the theory that material can follow the direct
object, but it is optional.
Before going on to discuss how procedures of this kind can be compiled
from functional grammars, a few words are in order on why procedures
like (22) have the particular structure they have. Why, in particular, is
the local variable S1 used in the way it is, rather than simply embedding
the (NEWPATH 4 CAT (QUOTE NP)) in the SELECTQ instruction? The example
at (23) will help make this clear. The procedure succeeds, and a transition
to the state NEXT is made, just in case the current constituent has the
features [CAT = NP] and [ANIMATE = NONE]. If the attribute CAT in the constituent FD has the value NIL, then, as we have seen, the function NEWPATH
must be called to assign the value NP to this attribute. However, there is
no point in doing this if the other requirements of the constituent are not
also met. Accordingly, in this case, the procedure verifies the feature
[ANIMATE = NONE] before making any reassignments of values.
(23)
(PROG ((OLDMATRIX MATRIX)
       S2 S1)
  (SELECTQ (PATH 1 CAT)
    (NP)
    (NIL (SETQ S1 T))
    (GO OUT))
  (SELECTQ (PATH 1 ANIMATE)
    (NONE)
    (NIL (SETQ S2 T))
    (GO OUT))
  (AND S2 (NEWPATH 1 ANIMATE (QUOTE NONE)))
  (AND S1 (NEWPATH 1 CAT (QUOTE NP)))
  (TO NEXT)
OUT)
A second point that may require clarification involves instructions like
(ASSIGN 3 DOBJ), which can cause the assignment of constituents other than
the current one to grammatical functions, that is, attributes, in the matrix FD.
In principle, all such assignments could be made when the constituent in
question is current, in which case no special instruction would be required
and it would not be necessary to use numbers to identify particular constituents. These devices are the cost of some considerable gains in efficiency. A situation that illustrates this is now classic in discussions of
ATN grammars. Suppose that the first noun phrase in a sentence is to be
assigned to the SUBJ attribute if the sentence is active, and to the DOBJ
attribute if it is passive. It is not known, at the time the noun
phrase is encountered, what the voice of the sentence is. A perfectly viable
solution to the problem this raises is to make both assignments, in different
active edges, and to unify the corresponding value with that of the VOICE
attribute. When the voice is eventually determined, only the active edge
with the proper assignment for the first noun phrase will be able to continue. But, in the meantime, parallel computations will have been pursued
redundantly. The numbers that appear in the ASSIGN instructions serve as
a temporary label and enable us to defer the decision as to the proper
role for a constituent.
7.3. The compiler
The compiler has two major sections. The first part is a straightforward
application of the generation program to put the grammar, effectively,
into disjunctive normal form. The second is concerned with actually building the procedures.
A grammar is a complex FD, typically involving a great many alternations. As with any algebraic formula involving and and or, it can be
restructured so as to bring all the alternations to the top level. In other
words, if F is a grammar, or indeed any complex FD, it is always possible
to recast it in the form F1 ∨ F2 ∨ ... ∨ Fn, where the Fi (1 ≤ i ≤ n) each
contain no alternations. This remains true even when account is taken of
the alternations implicit in patterns, for if the patterns in an FD do not
impose a unique ordering on the constituents, then each allowable order
is a different alternative.
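The restructuring into disjunctive normal form can be illustrated as follows. This is a flat Python sketch in which all alternations are multiplied out at once, whereas the procedure described below does it one alternation at a time; the term representation ("or" tuples over atomic strings) is invented for the example.

```python
from itertools import product

# An FD is modeled here as a conjunction (list) of terms, each term
# either atomic or an alternation ("or", alt1, alt2, ...).

def dnf(conjuncts):
    """Return the list F1 ... Fn of alternation-free conjunctions."""
    choice_sets = []
    for term in conjuncts:
        if isinstance(term, tuple) and term[0] == "or":
            choice_sets.append(list(term[1:]))   # one branch must be chosen
        else:
            choice_sets.append([term])           # atomic: no choice
    return [list(combo) for combo in product(*choice_sets)]

grammar = ["a", ("or", "b1", "b2"), ("or", "c1", "c2")]
for alt in dnf(grammar):
    print(alt)
# four alternation-free conjuncts: a and b1 and c1, a and b1 and c2, ...
```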
Now, the process of generation from a particular FD, f, effectively
selects those members of F1 ... Fn that can be unified with f, and then
repeats this procedure recursively for each constituent. But F is, in general, a conjunct containing some atomic terms and some alternations.
Ignoring patterns for the moment, the procedure is as follows:
1. Unify the atomic terms of F with f. If this fails, the procedure as a whole
fails. Some number of alternations now remain to be considered. In other
words, that part of F that remains to be unified with f is an expression
F′ of the form (a1,1 ∨ a1,2 ∨ ... ∨ a1,k1) ∧ (a2,1 ∨ a2,2 ∨ ... ∨ a2,k2) ∧ ...
∧ (am,1 ∨ am,2 ∨ ... ∨ am,km).
2. Rewrite as an alternation by multiplying out the terms of an arbitrary
alternation in F′, say the first one. This gives an expression F″ of the
form (a1,1 ∧ (a2,1 ∨ a2,2 ∨ ... ∨ a2,k2) ∧ ... ∧ (am,1 ∨ am,2 ∨ ... ∨ am,km))
∨ (a1,2 ∧ (a2,1 ∨ a2,2 ∨ ... ∨ a2,k2) ∧ ... ∧ (am,1 ∨ am,2 ∨ ... ∨ am,km))
∨ ... ∨ (a1,k1 ∧ (a2,1 ∨ a2,2 ∨ ... ∨ a2,k2) ∧ ... ∧ (am,1 ∨ am,2 ∨ ... ∨ am,km)).
3. Apply the whole procedure (steps 1-3) separately to each conjunct of
F″.
If f is null, then there are no constraints on what is generated, and the
effect is to enumerate all the sentences in the language, a process that
presumably only terminates in trivial cases. Unconstrained generation
does terminate, however, if it is applied only to a single level in the constituency hierarchy and, indeed, the effect is to generate precisely the
disjunctive normal form of the grammar required by the compiler.
It remains to spell out the alternatives that are implicit in the patterns.
This is quite straightforward and can be done in a separate part of the
procedure, applied to each of the FDs that the foregoing procedure delivers. The basic idea is to generate all permutations of the constituent set
of the FD and to eliminate those that do not match all the patterns. This
process, like the one just described, can be considerably streamlined.
Permutations are generated in an analog of alphabetical order, so that all
those that share a given prefix are produced together. But, if any such
prefix itself violates the requirements of the patterns, the process is truncated and the corresponding suffixes are never generated.
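The prefix-pruned enumeration of constituent orders can be sketched as follows. Here prefix_ok is a stand-in for matching a prefix against the grammar's patterns, and the toy pattern (the determiner must precede the noun) is invented for the example.

```python
# Generate constituent orders in "alphabetical" order, truncating any
# branch whose prefix already violates the patterns.

def orders(items, prefix_ok, prefix=()):
    if not prefix_ok(prefix):
        return                      # truncate: no suffixes generated
    if len(prefix) == len(items):
        yield prefix
        return
    for x in items:
        if x not in prefix:
            yield from orders(items, prefix_ok, prefix + (x,))

def det_before_n(prefix):
    # toy pattern: DET must precede N in any acceptable order
    return not ("N" in prefix and "DET" in prefix
                and prefix.index("N") < prefix.index("DET"))

result = list(orders(["DET", "ADJ", "N"], det_before_n))
print(result)   # only the orders in which DET precedes N survive
```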
The result of this phase of the compilation is a list of simple FDs,
containing no alternations, and having either no pattern or a single pattern
that specifies the order of constituents uniquely. Those that have no pattern become lexical entries, and they are of no further interest to the
compiler. It seems likely that realistic grammars will give rise to lists
which, though quite long, are entirely manageable. If a little care is exercised and list-processing techniques are used in a thoroughgoing manner, a great deal of structure turns out to be shared among members of
the list.
When FDs are simplified in this way, an analogy between them and
phrase structure rules begins to emerge. In fact, a simple FD F of this
kind could readily be recast in the form F → F1 ... Fk, where F1 ... Fk
are the subdescriptions identified by the terms in the pattern, taken in the
order given there. The analogy is not complete because the items that
make up the right-hand side of such a rule cannot be matched directly
against the left-hand sides of other rules. A rule of this kind can be used
to expand a given item, not if its left-hand side is that item, but if it can
be unified with it.
The second phase of the compiler centers around a procedure which,
given a list of simple FDs and an integer n, attempts to find an attribute,
or path, on the basis of which the nth constituent of those FDs can be
distinguished. In the general case, the result of this process is (1) a path
A, (2) a set of values for A, each associated with the subset of the list of
FDs whose nth constituent has that value of A, and (3) a residual subset
of the list consisting of FDs whose nth constituent has no value for the
attribute A. The procedure attempts to find a path that minimizes the size
of the largest of these sets. The residual set cannot be distinguished from
the other sets on the basis of the chosen path; in other words, for each
member R of the residual set, a new member can be added to each of the
other sets by adding a path with the appropriate value to a copy of R.
The same procedure is applied to each of the resulting sets, with the same
constituent number, until the resulting sets cannot be discriminated further. The overall result of this is to construct a discrimination tree that
can be applied to an FD to determine in which of the members of the list
it could be incorporated as the nth constituent.
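One discrimination step might be sketched like this. In the sketch, FDs are flattened to Python dicts and only top-level attributes are tried, whereas the procedure described above considers arbitrary paths; the residual FDs are simply returned rather than copied into the other sets.

```python
# Given a list of simple FDs (dicts standing in for nth constituents),
# find the attribute whose value-indexed split minimizes the size of
# the largest resulting set. Returns (attribute, groups, residue).

def discriminate(fds):
    best = None
    for attr in {a for fd in fds for a in fd}:
        groups, residue = {}, []
        for fd in fds:
            if attr in fd:
                groups.setdefault(fd[attr], []).append(fd)
            else:
                residue.append(fd)          # no value for this attribute
        largest = max(len(g) for g in groups.values())
        if best is None or largest < best[3]:
            best = (attr, groups, residue, largest)
    return best[:3]

fds = [{"cat": "N", "num": "SG"}, {"cat": "V", "num": "SG"}, {"num": "PL"}]
attr, groups, residue = discriminate(fds)
print(attr, sorted(groups), len(residue))  # cat ['N', 'V'] 1
```

Splitting on "num" would leave a set of size two, so "cat" is preferred even though it leaves one FD in the residue.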
The discrimination procedure also has the responsibility of detecting
features that are shared by all the FDs in the list provided to it. For this
purpose, it examines the whole FD and not just the nth constituent.
Clearly, if the list is the result of a previous discrimination step, there
will always be at least one such feature, namely the path and value on
the basis of which that previous discrimination was made. The discrimination procedure thus collects almost all the information required to construct a grammar procedure. All that is lacking is information about the
other procedures that a given one must nominate to label the active chart
edges that it produces. This, in its turn, is bound up with the problem of
reducing equivalent pairs of procedures to one. Before going on to these
problems, we shall do well to examine one further example of a grammar
procedure to see how the operation of the discrimination process is reflected in its structure.
The procedure below, (24), shows the reflexes of all the important parts
of the process. It is apparent that two sets of FDs have been discriminated, and that the discriminating path was CAT. If the value of
the path is VERB, a further discrimination is made on the basis of the
TRANSITIVE attribute of the third constituent. If the constituent is a transitive verb, the next active arc will be labeled with the procedure V-TRANS,
and if intransitive, with V-INTRANS. If the category is AUX, the next procedure will be AUX-STATE. This discrimination is reflected in lines 5-23.
Lines 3 and 4 reflect the fact that all the FDs involved have the subject
in first position, and that the subject is singular. The instruction (OR (UNIFY
(SUBJ) (1)) (GO OUT)), for example, says that either the value of the SUBJ
attribute can be unified with the first constituent or the procedure terminates without further ado.
(24)
[1]  (PROG ((OLDFEATURES FEATURES)
[2]         S1 S2)
[3]    (OR (UNIFY (SUBJ) (1)) (GO OUT))
[4]    (OR (UNIFY SING (SUBJ NUM)) (GO OUT))
[5]    (SELECTQ (PATH (3 CAT))
[6]      (VERB (GO L2))
[7]      (AUX (GO L1))
[8]      (NIL (SETQ S1 T))
[9]      (GO OUT))
[10]   (NEWPATH 3 CAT (QUOTE VERB))
[11] L2 (SELECTQ (PATH (3 TRANSITIVE))
[12]     (PLUS (GO L4))
[13]     (MINUS (GO L3))
[14]     (NIL (SETQ S2 T))
[15]     (GO OUT))
[16]   (NEWPATH 3 TRANSITIVE (QUOTE PLUS))
[17] L4 (TO V-TRANS)
[18] L3 (SETQ FEATURES OLDFEATURES)
[19]   (AND S2 (NEWPATH 3 TRANSITIVE (QUOTE MINUS)))
[20]   (TO V-INTRANS)
[21]   (OR S1 (GO OUT))
[22]   (NEWPATH 3 CAT (QUOTE AUX))
[23] L1 (TO AUX-STATE)
[24] OUT)
It remains to discuss how a procedure is put in touch with its successor
procedures. In the examples I have shown, instructions like (TO V-INTRANS) provide more or less mnemonic names for the procedures that
will be used to label new active edges, but, in doing so, I was exercising
expository license. When the time comes to insert a name like this into
the procedure, the discrimination process is invoked recursively to compile the procedure in question. Having done this, it compares the result
with a list of previously compiled procedures. If, as frequently happens,
it finds that the same procedure has been compiled before, it uses the old
name and throws away the result of the compilation just completed; otherwise it assigns a new name. The effect of this is to substantially reduce
the total number of procedures required. In ATN terms, the effect is to
conflate similar tails of the transition networks. If this technique were not
used, the network would in fact have the form of a tree.
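The reuse of previously compiled procedures can be sketched as a simple interning table. The naming scheme and the string representation of procedure bodies here are invented; the point is only the discipline of comparing a freshly compiled result with earlier ones and keeping a single copy.

```python
# Conflating identical "tails": after compiling a successor procedure,
# reuse the old name if an identical body was compiled before.

compiled = {}          # body -> name

def intern_procedure(body):
    if body in compiled:
        return compiled[body]          # throw the new copy away
    name = f"P{len(compiled)}"         # hypothetical naming scheme
    compiled[body] = name
    return name

a = intern_procedure("(TO V-TRANS)")
b = intern_procedure("(TO V-INTRANS)")
c = intern_procedure("(TO V-TRANS)")   # same body compiled again
print(a, b, c, a == c)                 # P0 P1 P0 True
```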
7.4. Conclusion
The compilation scheme just outlined is based on the view, expressed at
the outset, that the competence grammar is written in a formalism with
theoretical status. This is the seat of linguistic universals. Since the parsing grammar is tightly coupled to this by the compiler, the language that
it is written in is of only minor theoretical interest. The obvious choice
is therefore a standard programming language. It is not only relatively
straightforward to compile the grammar into such a language, as I hope
to have shown, but the result can then be further compiled into the language of a particular computer. The result is therefore an exceedingly
efficient parser. But there is a concomitant inefficiency in the research
cycle involved in developing the grammar. This comes from the fact that
the compilation itself is a long and very time-consuming enterprise.
Two things can be said to mitigate this to some extent. First, the parsing
and generation grammars do indeed describe exactly the same languages,
so that much of the work involved in testing prototype grammars can be
done with a generator that works directly and efficiently off the competence grammar. The second point is this: it is not in fact necessary to
compile the whole grammar in order to obtain an entirely satisfactory
parser. Suppose, for example, that every constituent is known to have a
value for the attribute CAT. Some such assumption is almost always in
order. Suppose, further, that a parsing grammar is compiled ignoring completely all other attributes; in other words, the compiler behaves as though
any attribute-value pair in the grammar that did not mention CAT was not
there at all. The resulting set of parsing procedures clearly recognizes at
least all the sentences of the language intended, though possibly others
in addition. However, the results of the parsing process can be unified
with the original grammar to eliminate false analyses.
Notes
1. See Kay (1980) for a fuller discussion of these terms.
2. The functions PATH, NEWPATH, and ASSIGN all take an indefinite number of
arguments which, since the functions are all FEXPRs, are unevaluated. The first
argument is a constituent number and the remainder constitute a path.
3. Argumentative LISP programmers may find this construction unconvincing on
the grounds that OLDMATRIX in fact preserves only a pointer to the data structure
and not the data structure itself. They should subside, however, on hearing that
the data structure is an association list and the only changes made to it involve
consing new material onto the beginning, thus masking old versions.
References
Gazdar, Gerald. 1982. Phrase structure grammar. In: Pauline Jacobson and Geoffrey K.
Pullum (eds.), The nature of syntactic representation. Dordrecht, Holland: D. Reidel, pp.
131-86.
Kaplan, Ronald M. 1972. Augmented transition networks as psychological models of sentence comprehension. Artificial Intelligence 3:77-100.
Kaplan, Ronald M. 1973. A General Syntactic Processor. In: Rustin (ed.), Natural language processing.
Englewood Cliffs, N.J.: Prentice-Hall.
Kaplan, Ronald M. 1978. Computational resources and linguistic theory. Theoretical Issues in Natural Language Processing 2.
Kay, Martin. 1973. The MIND system. In: Rustin (ed.), Natural language processing. Englewood Cliffs, N.J.: Prentice-Hall.
Kay, Martin. 1977. Morphological and syntactic analysis. In: A. Zampolli (ed.), Syntactic structures
processing. Amsterdam: North Holland.
Kay, Martin. 1979. Functional grammar. Proceedings of the Fifth Annual Meeting of the Berkeley
Linguistics Society.
Kay, Martin. 1980. Algorithm schemata and data structures in syntactic processing. CSL-80-12. Xerox
Palo Alto Research Center.
Woods, William A. 1969. Transition network grammars for natural language analysis. Communications of the Association for Computing Machinery 13:591-602.