Download Nouns and Noun Phrases: Grammatical Variation and Language

The Typology of Noun Phrase Structure from a Processing Perspective
JOHN A. HAWKINS
Abstract
This paper examines cross-linguistic variation patterns in the syntax and morpho-syntax of
Noun Phrases. The variation is surprising and not readily explainable in grammatical terms
alone, but many of these patterns can be motivated in terms of on-line processing demands.
Two processing hypotheses are proposed: anything that is an NP must be recognized as
such, i.e. every NP must be "constructable"; and all the items that belong to NP must be
"attachable" to it, and the amount of syntactic, morpho-syntactic or lexical encoding of
attachment will be in proportion to complexity and efficiency in processing. Some
predictions following from these hypotheses are defined. Typological generalizations and
cross-linguistic data provide prima facie evidence for them, suggesting that processing has
played a significant role in shaping grammars in this area.
Keywords: agreement, case copying, classifiers, definite articles, fixed word order, head
of phrase, lexical differentiation, nominalizing particles, noun phrase,
parsing, possessive phrases, processing typology, projection
1. Introduction
Noun phrases (NPs) exhibit a surprising amount of cross-linguistic variation in the morpho-syntactic and syntactic devices that define their structure, and puzzling restrictions in the
occurrence versus non-occurrence of these devices in different environments. Some
languages have definite or indefinite articles, some have classifiers, some make extensive use
of nominalizing particles, case marking is found in some, case copying throughout the noun
phrase in a subset of these, other kinds of agreement patterns can be found on certain
modifiers, "linkers" exist in some languages for NP-internal constituents, a "construct state"
attaches NP to a sister category in others, and so on. The positioning of these items within
the NP also exhibits considerable variation. The grammatical rules generating them must
sometimes guarantee their presence, sometimes their absence, in ways that require numerous
formal stipulations and complications.
My goal is to examine these variation patterns in NPs from a processing perspective.
I will not present a detailed typology of the noun phrase. That has already been done, cf. e.g.
Rijkhoff (2002) and Plank (2003). There have also been detailed studies of specific morpho-syntactic devices characteristic of NPs, such as case marking and Suffixaufnahme (Plank
1995) and classifiers (Aikhenvald 2003). Rather, my goal will be to show that we can
understand the variation better, and shed light on some of the puzzles, if we look at grammars
in terms of processing. Predictions can be made for the existence of certain structural
devices, and for their presence versus absence, on the basis of general principles that are
supported, ultimately, by experimental and corpus findings from language performance.
Looking at cross-linguistic variation in this interdisciplinary way is not
straightforward. Linguists do not, in general, do so. They either construct a formal
grammatical model for the patterns in question or conduct a typological survey.
Psychologists do not do so either. They focus on performance data, generally from a
restricted set of European languages. This methodological divide is unfortunate since I have
argued (in Hawkins 1990, 1994, 2004), and a growing number of others have also argued
(see e.g. Aissen 1999, Bresnan et al. 2001, Bybee & Hopper 2001, Dryer 1992, Haspelmath
1999, Kirby 1999, Newmeyer 2005), that many grammatical properties are correlated with
what are ultimately processing and usage-based considerations including complexity,
efficiency and frequency. If this is so, then we need to initiate a more systematic dialogue
between linguists and psychologists so that we can better understand how processing
works and how it has impacted grammars and led to typological variation. The purpose of
this paper is to initiate such a dialogue with respect to the typology of the noun phrase.
I shall make use of two simple and intuitive processing ideas that need to be
incorporated in any model of comprehension (e.g. Fodor et al. 1974) or production (e.g.
Levelt 1989). First, every phrase that is an NP has to be recognized as such in language use,
i.e. it has to be "constructable" as an NP. Second, all the words and immediate constituents
that are intended to belong to a given NP must be correctly recognized as belonging to it, i.e.
they must be "attachable" to this NP, rather than to some other phrase.
Noun phrases pose two challenges in this respect for any parser. First, NPs do not
always contain nouns (Ns), i.e. the head category that "projects" to a mother NP, making
it recognizable (cf. Jackendoff 1977, Pollard & Sag 1994). An NP must therefore be
"constructable" from a variety of other terminal categories that are dominated by NP, the
precise nature of which can vary across languages. Mandarin de and Lahu ve can
nominalize non-nominal categories or phrases. Definite articles, in languages that have
them, often have a similar nominalizing function, constructing NP over a non-nominal
part of speech or phrase.
Second, it must be made clear in performance which terminal categories are to be
"attached" to a given NP, as opposed to some other NP or to other phrases. Nominal
agreement patterns on the immediate constituents of NP, as in Latin, signal such attachments.
So does "case copying" in languages that have it. Such devices are particularly useful when
attachments are difficult, for example when the immediate constituents of NP are not
adjacent to one another.
These challenges for parsing must also be reflected, to a considerable extent at least,
in a corresponding model for production.
This paper defines two general hypotheses for "construction" and "attachment" in
noun phrase processing, in sections 3 and 4, and it formulates some predictions that follow
from these hypotheses. A range of typological patterns will be presented together with
grammatical details from diverse languages that are relevant to these predictions, that test
them and that appear to support them. They provide prima facie evidence for a general
hypothesis regarding the performance-grammar interface that I defined in Hawkins (2004):
(1)
Performance-Grammar Correspondence Hypothesis (PGCH)
Grammars have conventionalized syntactic structures in proportion to their degree
of preference in performance, as evidenced by patterns of selection in corpora and
by ease of processing in psycholinguistic experiments.
The PGCH accounts for many universal and distributional regularities, it motivates many
exceptions to current universals (Newmeyer 2005, Hawkins 2004), and it makes correct
predictions for many variation patterns across grammars that are not currently predicted by
grammatical considerations alone. The PGCH provides the ultimate motivation for the noun
phrase predictions to be formulated here.
The paper begins, in section 2, with a brief enumeration, primarily for the benefit of
readers who are not typologists, of some major syntactic and morpho-syntactic devices
across languages that appear to be relevant to any discussion of NP construction and NP
attachment.
2. NP Construction and Attachment to NP
2.1 Construction
Several categories construct NP:
• Nouns (i.e. lexical items specialized for the category N) like student and professor
  in English
• Pronouns (personal, demonstrative, interrogative, etc): he/she, this/that, who, and
  their counterparts in other languages, cf. Bhat (2004)
• Various determiners including the definite article (in theories in which
  Determiner Phrase and NP are not distinguished, cf. Hawkins 1993, 1994, Payne
  1993) [1]
• Nominalizing particles like Lahu ve (Matisoff 1972), Mandarin de (Li & Thompson
  1981) and Cantonese ge (Matthews & Yip 1994:113) can combine with a non-noun or
  pronoun to construct a mother NP, as in the examples of (2), cf. C. Lehmann
  (1984:61-66):
(2) a. np[chu ve]                       (Lahu)
       fat NOMINALIZER
       'one that/who is fat'
    b. np[vp[chī hūn] de]               (Mandarin)
       eat meat NOMINALIZER
       'one who eats meat'
    c. np[vp[heui hōi-wúi] ge]          (Cantonese)
       go have-meeting NOMINALIZER
       'those who are going to the meeting'
• Classifiers in many languages perform syntactic functions that include the
  construction of NP (Aikhenvald 2003:87-90), which permits omission of nouns from
  NP, resulting in pronoun-like uses for classifiers, as in the following example from
  Jacaltec (Craig 1977:149):
(3) xal  naj pel   chubil chuluj    naj   hecal
    said CL  Peter that   will-come CL/he tomorrow
    'Peter said that he will come tomorrow'
• Case particles or suffixes construct a case-labelled mother or grandmother NP
  respectively, cf. Hawkins (1994:ch.6) for detailed discussion, e.g. in Japanese,
  German, Russian:
(4) a. npAcc[tegami o]                  (Japanese)
       letter ACC
    b. npAcc[den Tisch]                 (German)
       the-ACC-SG-MASC table
    c. npAcc[lip-u]                     (Russian)
       lime tree-ACC-SG-II
2.2 Attachment
Various (morpho-)syntactic devices signal the attachment of sister categories to a given NP:
• Adjective agreement is a clear instance, e.g. Latin adjectives agree in case, number
  and gender features with some np[N], see Vincent (1988), permitting separation of
  noun phrase constituents as in (5b):
(5) a. np[illarum bonarum feminarum]
       that-GEN-PL-FEM good-GEN-PL-FEM woman-GEN-PL-FEM
       'of those good women'
    b. pp[npi[magno] cum npi[periculo]]
       great-ABL-SG-NEUT with danger-ABL-SG-NEUT
       'with great danger'
• Case copying in "word-marking" Australian languages like Kalkatungu (Blake 1987,
  Plank [ed.] 1995) also signals attachment (to a similarly case-marked np[N]),
  permitting separation of NP constituents as in (6b):
(6) a. npi[thuku-yu yaun-tu] npj[yanyi] itya-mi          (Kalkatungu)
       dog-ERG big-ERG white-man bite-FUT
       'The big dog will bite the white man'
    b. npi[thuku-yu] npj[yanyi] itya-mi npi[yaun-tu]
       dog-ERG white-man bite-FUT big-ERG
These case suffixes also construct a case-marked mother or grandmother NP, as in
(4). I.e. case markers can serve both to construct the dominating (case-labeled) NP
and to attach the respective daughters with the same case to it.
• Mandarin de similarly performs both an attachment and a construction function,
  attaching NP-dominated constituents together and constructing the mother NP, cf. the
  discussion in C. Lehmann (1984:63-66) from which the following examples are
  taken:
(7) a. np[shuìjiòu de rén]                               (Mandarin)
       sleep NOMLZ/ATTACH person
       'sleeping person'
    b. np[[bù hăo] de lái-wăng]
       not good NOMLZ/ATTACH come-go
       'undesirable contact'
    c. np[s[wŏ lái] de dìfáng]
       I come NOMLZ/ATTACH place
       'place from which I am coming'
    d. np[s[wŏ vp[jiăn zhĭ]] de jiăndao]
       I cut paper NOMLZ/ATTACH scissors
       'scissors with which I cut paper'
• Classifiers also attach NP-sisters to the NP that they construct, as in the following
  examples from Cantonese in which the classifier attaches a possessor to its head noun
  (8a) and a (preposed) relative clause to its head noun (8b), cf. Matthews & Yip
  (1994:107-12):
(8) a. lóuhbáan ga chē                                   (Cantonese)
       boss CL car
       'the boss's car'
    b. ngóhdeih hái Faatgwok sihk dī yéh
       we in France eat CL food
       'the food we ate in France'
The repeated classifier -ma in the following example from Tariana functions like
agreement in Latin (5) and case copying in Kalkatungu (6) to signal co-constituency
between adjective and noun within NP (Aikhenvald 2003:94-5): nu-kapi-da-ma
hanu-ma (1SG-hand-CL:ROUND-CL:SIDE big-CL:SIDE), 'the big side of my
finger'.
• Linkers such as na in Tagalog attach ulól ('foolish') and unggó ('monkey') into a
  single NP in ulól na unggó ('foolish monkey'), cf. Hengeveld et al. (2004:553).
• The construct state in Berber signals co-constituency between nouns (/NPs) in the
  construct state and a preceding noun (9b), quantity word (9c), preposition (9d),
  intransitive verb (9e), and transitive verb (9f) (Keenan 1988):
(9)                                                      (Berber)
    a. Free form:        aryaz 'man'    arba 'boy'    tarbatt 'girl'
       Construct form:   uryaz          urba          terbatt
    b. np[axam np[uryaz]]
       tent man-CONSTR
       'tent of the man/the man's tent'
    c. np[yun uryaz]
       one man-CONSTR
       'one man'
    d. pp[tama (n) np[uryaz]]
       near man-CONSTR
       'near the man'
    e. s[lla vp[t-alla np[terbatt]]]
       IMPF she-cry girl-CONSTR
       'The girl is crying'
    f. s[vp[i-annay np[urba] np[tarbatt]]]
       he-saw boy-CONSTR girl
       'The boy saw the girl'
The construct state signals attachment of these immediate constituents but does not
unambiguously construct any particular mother or grandmother phrase. The mother
most immediately dominating np[N] in the construct state can be NP, PP, or VP, etc.
• A possessive (/genitive) -s in English (and similar forms in other languages) signals
  the attachment of PossP to the head N, and also the construction of a grandmother (or
  mother) NP (NPi in (10)):
(10) npi[possp[npj[the king of England]-s] daughter]
3. The Constructability Hypothesis, its Predictions and Typological Patterns
I begin with the following hypothesis, which is motivated by general considerations of
parsing and parsability, by consideration of a large range of phrasal types in many languages,
and by performance data from diverse language types, as summarized in Hawkins (1994,
2004):
(11)
The Constructability Hypothesis (cf. Hawkins 1994:379)
For each phrasal node P there will be at least one word of category C dominated by P
that can construct P on each occasion of use.
It appears that there is always some category C that enables the parser to recognize that C is
dominated by a phrase of a particular type, NP, PP, or VP, etc, generally as a daughter or as a
granddaughter. Building hierarchical phrase structure trees in syntactic representations on
the basis of terminal elements is a key part of grammatical processing. If a given P cannot be
properly recognized (or "constructed"), its integration into the syntactic tree, and its semantic
interpretation are at risk. So construction of mother nodes, or of grandmothers, is a vital part
of successful parsing and production, and this is reflected in the cross-linguistic fact that
grammars appear to have systematic devices for constructing each of their phrasal nodes.[2]
More generally, I have argued that the hypothesis in (11) motivates a lot of the grammatical
properties of heads of phrases, both lexical and functional, and that it provides a processing
explanation for this universal and for many related properties that involve head-like
projection.[3]
3.1 NP Construction
The Constructability Hypothesis in (11) leads to a prediction for the structure of NPs:
(12)
Prediction 1: NP Construction
Any phrase that is of type NP must contain either (i) a lexical head N or pronoun
(personal or demonstrative, etc) or proper name, or (ii) some other functional
category that can construct NP on each occasion of use in the absence of N or Pro or
Name.
We expect NPs to contain either some lexical and inherent head category, therefore, like a
noun or pronoun or name, on the basis of which NP can always be recognized; or
alternatively we expect to find categories that project uniquely to NP being especially
productive, and indeed obligatory, in the absence of nouns, pronouns and names. Examples
are given in (13):
(13) a. Lahu, Mandarin and Cantonese nominalizers, as in (2).
b. Jacaltec classifiers, as in (3).
c. English permits omission of nouns with certain restrictive adjectives plus the definite
article as a constructor of NP, the rich, the poor, the good, the bad. We say I envy the
rich, etc, not *I envy rich.
d. Spanish has expanded this option (lo difícil 'the difficult thing') to other categories
such as infinitival VPs in el hacer esto fue fácil (DEF to-do this was easy) 'doing this
was easy' (Lyons 1999:60).
e. Malagasy has expanded it to locative adverbs, as in ny eto (DEF here), meaning 'the
one(s) who is/are here' (Anderson & Keenan 1985:294).
f. Case-marking on adjectives in e.g. Latin and German permits them to function as
referential NPs, Latin bonī (good-Nom-Masc-Pl) 'the good ones', German Gutes
(good-Nom-Neut-Sg) 'good stuff'.
g. In numerous languages the definite article signals a nominalization of some kind, e.g.
Lakhota ktepi kį wąyake (kill DEF he-saw) 'he saw the killing' (Lyons 1999:60), or
the construction of a subordinate clause in noun phrase position, e.g. as subject or
object, in Huixtan Tzotzil and Quileute (Lyons 1999:60-61).
h. Head-internal relatives are structurally clauses that function as NPs and that are
regularly marked as such by definiteness markers and/or case particles and
adpositions, as in Diegueno (Gorbet 1976, Basilico 1996).
i. Free relatives can also consist of a clause functioning as an NP that is
constructed by a nominalizing particle, e.g. in Cantonese léih mh ngoi ge (you not
want Nominalizer) 'what you don't want' (Matthews & Yip 1994:113).
The values of C constructing NP can vary in these np{C, X} structures, as can the
values of X. There are language-particular conventions for the precise set of constructing
categories (nominalizing particles, classifiers, definite articles, etc) and for the different
values of X (adjective, adverb, infinitival VP, S, etc) that can combine with the relevant C to
yield a noun phrase. But the very possibility and cross-linguistic productivity of omitting the
noun/pronoun/name and of still having the phrase recognized as NP, in so-called
'nominalizations' and in the other structures illustrated here, follows from the Constructability
Hypothesis in (11).
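Prediction 1 can be read as a simple check on the daughters of a candidate NP. The following is a minimal sketch in Python and is not part of the original analysis: the category labels and the set `FUNCTIONAL_CONSTRUCTORS` are illustrative assumptions standing in for the language-particular conventions just mentioned.

```python
# Toy check of Prediction 1 in (12). All category labels are illustrative assumptions.
LEXICAL_CONSTRUCTORS = {"N", "Pro", "Name"}
# Hypothetical, language-particular set of functional constructors of NP.
FUNCTIONAL_CONSTRUCTORS = {"Det", "Nominalizer", "Classifier", "Case"}


def np_constructor(daughters):
    """Return the first daughter category that can construct NP, or None."""
    for category in daughters:
        if category in LEXICAL_CONSTRUCTORS or category in FUNCTIONAL_CONSTRUCTORS:
            return category
    return None


print(np_constructor(["Det", "Adj"]))         # 'the rich': the article constructs NP
print(np_constructor(["Adj"]))                # '*rich': nothing constructs NP
print(np_constructor(["VP", "Nominalizer"]))  # Mandarin 'chī hūn de' as in (2b)
```

On this toy view, a string such as *I envy rich fails because no daughter of the intended NP is able to project to NP.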
A further prediction made by (11) is relevant for those languages whose lexical items
are highly ambiguous with respect to syntactic category, even for the major parts of speech
like noun and verb. The Polynesian languages are often discussed in this context (see e.g.
Broschart 1997, Hengeveld et al. 2004). So is Cayuga (Sasse 1988). English has a large
number of words that are ambiguous between noun and verb and there are many minimal
pairs such as they want to run/they want the run and to play is fun/the play is fun. The article
constructs NP here and disambiguates between N and V.
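The disambiguating role of the article in these minimal pairs can be sketched as a lookup on the preceding word. This is only a toy illustration under stated assumptions: the two-word lexicon `AMBIGUOUS` and the cue table `CUES` are invented for the example and make no claim about how a real parser works.

```python
# Toy illustration: a preceding particle fixes the category of an ambiguous word.
# The tiny lexicon and the two cue words are assumptions made for this example only.
AMBIGUOUS = {"run", "play"}
CUES = {"the": "N", "to": "V"}  # the article constructs NP; infinitival 'to' signals a verb


def categorize(words):
    """Label each ambiguous word as N or V from the word immediately before it."""
    labels = []
    for previous, word in zip([None] + words[:-1], words):
        if word in AMBIGUOUS:
            # Without a cue the word stays ambiguous between noun and verb.
            labels.append((word, CUES.get(previous, "N/V")))
    return labels


print(categorize("they want to run".split()))
print(categorize("they want the run".split()))
```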
Languages without a unique class of nouns do not have lexical categories that can
unambiguously construct NP on each occasion of use. If lexical predicates are vague as to
syntactic category, then projection to NP is not guaranteed by lexical entries and the
Constructability Hypothesis (11) is not satisfied.
(14)
Prediction 2: Lexical Differentiation
Languages in which nouns are differentiated in the lexicon from other categories
(verbs, adjectives or adverbs) can construct NP from nouns alone. Languages
without a unique class of nouns in the lexicon will make use of constructing particles
in order to construct NP and disambiguate the head noun from other categories; such
particles are not required (though they are not ruled out) in languages with lexically
differentiated nouns.
Relevant data come from the Polynesian languages, which make extensive and obligatory use
of NP-constructing particles such as "definite" articles, extending their meanings into the
arena of indefiniteness, see Lyons (1999:57-60). Samoan le, Maori te and Tongan e appear
to be best analyzed as general NP constructors: they convert vague/ambiguous predicates
into nouns within the NP constructed. Other (tense and aspect) particles construct a clause
(IP) or VP and convert ambiguous lexical predicates into verbs (Broschart 1997). We have
here a plausible motivation for the expanded grammaticalization of definite articles and other
particles in these languages (see Hawkins 2004:82-92 for detailed discussion).[4]
3.2 VO versus OV Asymmetries
VO languages have predominantly head-initial phrases that permit early construction of these
phrases in parsing, by projection from the respective heads (V projects to VP, N to NP, P to
PP, etc). OV languages have predominantly head-final phrases that favor late construction. I
have argued (Hawkins 1994, 2001, 2004) that consistent head ordering minimizes processing
domains for phrase structure recognition by shortening the distances between heads and that
this provides an explanation for the existence and productivity of these two major language
types across the world, head-initial and head-final.[5] There is, however, an interesting
asymmetry between them that can be seen in so-called non-lexical or functional head
categories that can be linked to considerations of processing.
Consider first the combination of a verb with a PP sister within VP, i.e. phrases such
as vp[went pp[to the movies]] in English. There are four logical possibilities for the ordering
of V, the lexical head of VP, and P, the lexical head of PP, shown in (15):
(15) a. vp[went pp[to the movies]]
           ----------
     b. [[the movies to]pp went]vp
                     ----------
     c. vp[went [the movies to]pp]
           -------------------
     d. [pp[to the movies] went]vp
            -------------------
(15a) is the English order, (15b) is the Japanese order, and these two sequences with adjacent
lexical heads (V and P) guarantee the smallest possible strings of words for the recognition of
VP and its immediate constituents (see the underlinings). They are also highly preferred by
approximately 94% to a combined 6% for the inconsistently ordered heads of (15c) and (d) in
the Hawkins (1983, 1994, 2004) and Dryer (1992) samples.
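The difference between the four orders in (15) can be made concrete by counting the words that separate the two heads. The sketch below is a deliberate simplification, assuming that the relevant domain is just the word string from V to P inclusive; the function name `domain_size` is hypothetical and the count is not the full domain metric of Hawkins (1994, 2004).

```python
def domain_size(words, verb, adposition):
    """Number of words from the first head to the second, inclusive."""
    i, j = words.index(verb), words.index(adposition)
    return abs(i - j) + 1


# The four logical orderings of (15), using the English words for all of them.
orders = {
    "15a": "went to the movies",
    "15b": "the movies to went",
    "15c": "went the movies to",
    "15d": "to the movies went",
}
sizes = {label: domain_size(s.split(), "went", "to") for label, s in orders.items()}
print(sizes)  # {'15a': 2, '15b': 2, '15c': 4, '15d': 4}
```

The adjacent-head orders (15a) and (15b) need two words, while the inconsistent orders (15c) and (15d) need four, and the gap grows with the length of the NP inside the PP.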
An additional non-lexical category C within NP that can construct NP, in addition to
N, can be efficient in VO languages. Either np[N ...] or np[C ...] orders can construct NP
immediately on its left periphery and provide minimal “phrasal combination domains” and
“lexical domains” linking e.g. V and NP within a VP (cf. note 5).
(16) vp[V np[N ...]
        ------
     vp[V np[C ... N ...]
        ------
We expect additional constructor categories C to be productive in VO and head-initial
languages, therefore, and to be especially favored when N itself is not initial in NP, e.g. in
np[C AdjP N]. The determiner position of English exemplifies this, with left-peripheral
articles constructing NP in advance of N. A German structure like die von dem Bauer
geschlachtete Kuh (the by the farmer slaughtered cow, i.e. 'the cow slaughtered by the
farmer') provides an extreme example of a left-peripheral constructor of NP in advance of N,
since the intervening participial phrase can be quite complex (Weber 1970).
Additional constructing categories in OV and head-final languages, on the other hand,
do not have comparable benefits. They lengthen phrasal combination domains and other
processing domains linking NP to V when NP precedes, whether the additional constructor
precedes or follows N:
(16) [[... N ... C]np V]vp
           ------------
     [[... C ... N]np V]vp
           ------------
Additional constructors of NP can be inefficient in OV orders, therefore, and are predicted to
be significantly less productive than their head-initial counterparts as a consequence.[6]
(17)
Prediction 3: VO versus OV asymmetries
Constructors of NP other than N, Pro and Name, such as articles, are efficient for NP
construction in VO languages and should occur frequently; they are not efficient for
this purpose in OV languages and should occur less frequently.
We can test this using the World Atlas of Language Structures (Haspelmath et al.
2005). WALS provides data on languages that have definite articles as a separate category
from demonstrative determiners (from which definite articles have generally evolved
historically, cf. Himmelmann 1997, Lyons 1999). If, as argued in Hawkins (2004:82-93), it
is processing efficiency that drives the grammaticalization of definite articles out of
demonstratives, then we expect to see a skewing in the distribution of definite articles in
favor of head-initial languages. The figures in (18) show that VO languages do indeed have
significantly more definite articles than OV languages. We also expect that non-rigid OVX
languages should have more definite articles than OV languages with rigid verb-final orders,
since OVX languages have more head-initial phrases in their grammars, including head-initial NPs (Hawkins 1983), in which early construction of NP can be an advantage. This
prediction is also borne out. The figures in parentheses refer to Dryer's "genera".[7]
(18)                 Def word distinct from Dem    No definite article
     Rigid OV        19% (6)                       81% (26)
     VO              58% (62)                      42% (44)
     Non-rigid OVX   54% (7)                       46% (6)
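The percentages in (18) follow directly from the genera counts given in parentheses. As a quick check, here is a short Python sketch that recomputes them; the variable and function names are my own and the rounding to whole percentages is an assumption about how the figures were reported.

```python
# Genera counts from (18): (distinct definite word, no definite article).
genera = {
    "Rigid OV": (6, 26),
    "VO": (62, 44),
    "Non-rigid OVX": (7, 6),
}


def shares(with_article, without_article):
    """Return both counts as whole-number percentages of their row total."""
    total = with_article + without_article
    return round(100 * with_article / total), round(100 * without_article / total)


table = {order: shares(*counts) for order, counts in genera.items()}
for order, (yes, no) in table.items():
    print(f"{order}: {yes}% with a distinct definite word, {no}% without")
```

The recomputed shares match the reported 19%, 58% and 54%, so the ranking rigid OV < non-rigid OVX < VO rests on 32, 13 and 106 genera respectively.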
By similar reasoning separate words for constructing subordinate clauses, such as
complementizers, should be more frequent in VO than in OV languages. Verbs construct
VP, and finite or agreement-marked verbs project to a clause or IP (Hawkins 1994:ch.6).
Complementizers can precede the verb in VO languages, shortening phrasal combination
domains for matrix sentence (S and VP) processing in sentences such as I believe [that John
ate the sandwich] (the words relevant for matrix clause processing are shown in bold). But
whether the complementizer in an OV language occurs initially (in sentences corresponding
to I [that John the sandwich ate] believe and [that John the sandwich ate] I believe) or
finally (I [John the sandwich ate that] believe and [John the sandwich ate that] I believe)
there will be longer phrasal combination domains compared with VO languages. When the
complementizer is initial it constructs the subordinate clause before the verb, and when it is
final the verb will generally construct the subordinate clause before the complementizer.
Either way there will be longer matrix processing domains. According to Dryer (1992, 2007)
separate complementizer words are indeed significantly more common in VO languages than
in OV, and their initial positioning in VO languages is supported exceptionlessly, as shown
in (19). (Again, genera are given in parentheses.)
(19) Complementizer Words (% of total lgs with complementizers)
              Final       Initial
     OV       14% (27)    12% (22)
     VO       0% (0)      74% (140)
4. The Attachability Hypothesis, its Predictions and Typological Patterns
Corresponding to the Constructability Hypothesis in (11) I propose (20):
(20)
The Attachability Hypothesis
For each phrasal node P, all daughter categories {A, B, C, ...} must be attachable to P.
The degree of syntactic, morpho-syntactic or lexical encoding that facilitates
attachability will be in proportion to the processing complexity and/or efficiency of
making the attachment.
In other words, all daughters must be attachable, and the more difficult the attachment is, the
more grammatical or lexical information is required to bring it about. The use of explicit
attachment devices under conditions of difficulty, and their possible omission when
processing is easy, is efficient: activation of processing resources and greater effort are
reserved for conditions under which they are most useful. This is supported by a large range
of grammatical and performance data that motivate the principle of Minimize Forms in
Hawkins (2004:38-48): Form minimizations apply in proportion to the ease with which a
given property P can be assigned in processing to a given form F. Rohdenburg's (1996,
1999) complexity principle provides further supporting data from English corpora: "In the
case of more or less explicit grammatical options, the more explicit one(s) will be preferred
in cognitively more complex environments" (Rohdenburg 1999:101).
For attachments to NP, (20) leads to the hypothesis in (21):
(21)
NP Attachment Hypothesis
Any daughters {A, B, C, ...} of NP must be attachable to it on each occasion of use,
through syntactic, morpho-syntactic or lexical encoding on one or more daughters,
whose explicitness and differentiation are in proportion to the processing complexity
and/or efficiency of making the attachment.
In sections 4.1 - 4.5 I define and test some predictions made by (21).
4.1 Separation of NP Sisters
One clear factor that increases the difficulty of attaching constituents together as sisters is
separation from one another.
(22)
Prediction 4: Separation of Sisters
Morpho-syntactic encoding of NP Attachment will be in proportion to the degree of
separation between sisters: the more distance, the more encoding.
Consider first some performance data from English involving relative clauses with explicit
relativizers (who, whom, which, and that) versus zero. The relativizers construct a relative
clause. Their presence can also help to attach the relative to the head, especially when there
is animacy agreement between relativizer and head noun (the professor who ..., etc), but also
in the absence of agreement in animacy (since relatives are known to attach to head nouns by
Phrase Structure rules). Empirically, it turns out that the presence of the relativizer and the
avoidance of zero is directly proportional to the distance between the relative clause and the
head noun. The figures in (23) are taken from Quirk's (1957) corpus of spoken British
English. They show that the use of explicit relativizers increases significantly, from 60% to
94%, when there is any separation between nominal head and relative.
(23) a. Restrictive (non-subject) relatives adjacent to the head noun
explicit relativizer = 60% (327)
zero = 40% (222)
b. Restrictive (non-subject) relatives separated from the head noun
explicit relativizer = 94% (58)
zero = 6% (4)
The figures in (24) measure the impact on relativizer retention resulting from larger versus
smaller structural separations and are taken from the Brown corpus (collected by Barbara
Lohse, cf. Lohse 2000).
(24) a. Separated relatives in NP-internal position
which/that = 72% (142)
zero = 28% (54)
b. Separated relatives in NP-external position (i.e. extraposed)
which/that = 94% (17)
zero = 6% (1)
Relatives in (24b) have been completely extracted out of NP (in structures corresponding to
buildings will never fall down which we have constructed). In (24a) they remain NP-internal
but still separated (e.g. by an intervening PP, buildings in New York which we have
constructed). There is a significant increase from 72% to 94% in relativizer retention when
the separated relatives are extraposed. These data support prediction 4.
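The retention rates in (23) and (24) can be recomputed from the raw counts, and the adjacent versus separated contrast in (23) can be checked with a standard test. The sketch below is illustrative only: the source does not say which significance test was used, so the Pearson chi-square here (without continuity correction) is an assumption, as are the function names.

```python
def retention_rate(explicit, zero):
    """Share of relatives with an explicit relativizer, as a whole percentage."""
    return round(100 * explicit / (explicit + zero))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts (explicit relativizer, zero) as reported in (23) and (24).
adjacent, separated = (327, 222), (58, 4)
np_internal, extraposed = (142, 54), (17, 1)

rates = [retention_rate(*cell) for cell in (adjacent, separated, np_internal, extraposed)]
print(rates)  # [60, 94, 72, 94]

# Adjacent versus separated relatives in (23), one degree of freedom.
chi2_23 = chi_square_2x2(*adjacent, *separated)
print(chi2_23 > 3.84)  # True: above the conventional .05 threshold for df = 1
```

The extraposed cell in (24b) contains a single zero relative, so a chi-square approximation is less trustworthy there and an exact test would be the safer choice for that comparison.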
Consider now some data from grammars involving explicit case marking. In
languages that employ case copying as an attachment strategy (see section 2.2) we predict a
possible asymmetry whereby explicit case marking can be retained on separated, but not on
adjacent, sisters. Warlpiri exemplifies this (Blake 1987). Contrast the Warlpiri pair (25)
with Kalkatungu (6), repeated here:
(25) a. np[tyarntu wiri-ngki]+tyu yarlki-rnu                (Warlpiri)
        dog big-ERG+me bite-PAST
     b. npi[tyarntu-ngku]+tyu yarlku-rnu npi[wiri-ngki]
        dog-ERG+me bite-PAST big-ERG
        'The big dog bit me.'
(6)  a. npi[thuku-yu yaun-tu] npj[yanyi] itya-mi            (Kalkatungu)
        dog-ERG big-ERG white-man bite-FUT
        'The big dog will bite the white man'
     b. npi[thuku-yu] npj[yanyi] itya-mi npi[yaun-tu]
        dog-ERG white-man bite-FUT big-ERG
Case copying in Kalkatungu occurs on every word of the NP, whether adjacent or not. Warlpiri
case copying occurs only when NP sisters are separated (25b). When NP constituents are
adjacent (25a) the ergative case marking occurs just once in the NP and is not copied. This pair
of Australian languages illustrates the asymmetry underlying Moravcsik's (1995:471)
implicational universal in (26):
(26) Moravcsik's Universal
If agreement through case copying applies to NP constituents that are adjacent, it
applies to those that are non-adjacent.
In other words, agreement is found under adjacency and non-adjacency, but it can be absent
from adjacency at the same time that it occurs in non-adjacent environments. What is ruled
out is the opposite asymmetry: agreement when adjacent and not when non-adjacent. Since
agreement is a type of attachment marking we see, correspondingly, that the explicit
encoding of attachment in performance and grammars is found under both adjacency ((23a)
and (6a)) and non-adjacency ((23b), (25b) and (6b)). Zero coding is preferred when there is
adjacency and is increasingly dispreferred when there is not (compare (23a) with (23b) in
performance and (25a) with (25b) in grammars). What is not found is the opposite of the
English relativizer pattern and of Warlpiri case coding: explicit attachment coding under
adjacency and zero coding for separated items.
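The implicational logic of (26) can be made concrete with a small Python sketch. The language profiles below are taken from the discussion of (6) and (25); the function name and the third, unattested profile are hypothetical and serve only to illustrate the excluded type.

```python
def consistent_with_moravcsik(copies_when_adjacent, copies_when_separated):
    """(26) as a material implication: case copying under adjacency
    entails case copying under non-adjacency."""
    return (not copies_when_adjacent) or copies_when_separated

# (copies when adjacent, copies when separated)
languages = {
    "Kalkatungu": (True, True),                        # (6a), (6b)
    "Warlpiri": (False, True),                         # (25a), (25b)
    "unattested type (hypothetical)": (True, False),   # ruled out by (26)
}

for name, (adjacent, separated) in languages.items():
    print(name, consistent_with_moravcsik(adjacent, separated))
```

Only the third profile, copying under adjacency but not under separation, fails the check, which is the asymmetry the universal excludes.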
An example of case copying in a nominative-accusative language comes from
Huallaga Quechua, as discussed in Plank (1995:43) and Koptjevskaja-Tamm (2003:645).
When a possessor phrase is separated from its possessed head, as in (27), the accusative case
marker -ta appropriate for the whole NP is added to genitive case-marked Hwan-pa.
(27) Hipash-nin-ta      kuya-: Hwan-pa-ta
     daughter-3POSS-ACC love-1 Juan-GEN-ACC
     'I love Juan's daughter'
4.2 Ambiguous attachment sites
A second factor that adds to the processing load of making a correct NP attachment is the
availability of alternative sites for attachment and of structural ambiguity:
(28) Prediction 5: Ambiguous Attachments
Morpho-syntactic encoding of NP Attachment will be in proportion to the availability
of alternative containing phrasal nodes for the attachment of NP constituents: the
more attachment sites, the more encoding.
Consider some data from Modern Greek. The adjective normally precedes the noun in
Greek, but it can also follow. When it does so a second definite article is required in order to
signal attachment of the adjective to NP, as shown in (29a). (29b), without the second
article, has the syntax and semantics of a predicate adjective within a secondary predication
(Joseph & Philippaki-Warburton 1987:51):
(29) a. Mu     arésun     i   fústes        i   kondés
        Me-GEN please-3PL DEF skirts-FEM+PL DEF short-FEM+PL
        'I like the short skirts'
     b. Mu arésun i fústes kondés
        'I like the skirts short' (i.e. 'I like them to be short')
This minimal pair in Greek reveals the role that definiteness marking can play in signaling
syntactic attachment to NP, in addition to its construction role (cf. section 2.1), and in
addition to its pragmatic functions (Lyons 1999). It also gives us an important glimpse into
the origins of agreement affixation in languages like Arabic, where the affixes are derived
from definiteness markers and are copied on noun phrase daughters. Illustrative examples
from Arabic and other languages are given in (30):
(30) a. Arabic has a definite prefix on adjectives in definite NPs: al-bustān-u l-kabīr-u
(DEF-garden-NOM DEF-big-NOM), 'the big garden' (Lyons 1999:91).
b. Agreement affixes on Albanian adjectives derive historically from the definite article:
i mir-i djalë (DEF good-AGR boy), 'the good boy' (Lyons 1999:79).
c. Lithuanian and Latvian have definite adjective declensions that are the product of an
affixed demonstrative stem with 'j', while Serbo-Croat and Slovenian show relics of a
system of similar origin (Lyons 1999:82-84).
A further example of definiteness marking for attachment purposes can be seen in languages
like Rumanian. NPs in combination with a preposition often omit the (affixed) definite
article when the NP consists of N alone, but keep it when NP consists of two or more ICs,
e.g. N Adj, N Rel, and N Adj Rel (Mallinson 1986:55, Himmelmann 1998):
(31) a. intrat  în casă   [ă = schwa]
        entered in house, i.e. 'entered the house'
     b. intrat  în casa      mare
        entered in house-DEF big, i.e. 'entered the big house'
4.3 Frequency of NP Attachments
A further factor that facilitates attachment of a category to an NP that has been constructed in
on-line processing is the frequency with which that category normally occurs NP-attached.
(32) Prediction 6: Frequency of NP Attachments
Morpho-syntactic encoding of the attachment of X to an NP that has already been
constructed by some Y in on-line processing will be in inverse proportion to the frequency
with which X occurs NP-attached: the less frequent X's attachment, the more
encoding, and vice versa.
Haspelmath (1999:234-5) has discussed relevant data involving the occurrence versus
non-occurrence of definite articles in NPs containing possessives. Grammatical rules are
sensitive to the relative ordering in which the possessor and the possessed head noun occur,
and this ordering can be linked in the account given here to the frequency with which the
category that follows (X) occurs attached to the one that precedes (Y).
In several languages (including English, Spanish, Swedish, Rumanian and Albanian)
a preposed possessor rules out an accompanying article, whether the article normally comes
before the noun (33a) or after it (33b):
(33) a. English our dog's (*the) kennel; Spanish (*el) mi libro, (DEF) my book
b. Swedish min vän(*nen), my friend(DEF); Albanian im vëlla(*-i), my brother(DEF)
But a postposed possessor in these languages does not rule out a definite article:
(34) a. English the kennel of our dog; Spanish el libro mío, the book my
b. Swedish vänn-en min, friend-DEF my; Albanian vëlla-i im, brother-DEF my
The asymmetry here is between Poss N structures, in which articles are dispreferred, versus
Art N Poss and N Art Poss.
Frequency of NP Attachments provides a possible explanation. In the Poss N order
either a possessive determiner such as my or a possessive phrase morpheme such as -s
constructs NP (by Mother Node Construction or Grandmother Node Construction, Hawkins
1994:ch.6). A following N can be readily linked to the NP so constructed because N requires
a mother NP and co-occurs frequently with determiners and other modifiers within NP. But
in the reverse order, N Poss, either N or Art will construct NP on the left periphery and a
following possessor will not be so readily attachable to this NP.
For example, Swedish min is a pronoun as well as a so-called possessive adjective,
and pronouns typically construct NPs on their own and do not normally co-occur with
accompanying modifiers. Hence, a min following vänn can benefit from explicit attachment
marking in the form of a definite article (vänn-en min) in order to signal that a normally freestanding pronominal element which constitutes NP on its own is actually an immediate
constituent of the NP already constructed by the noun vänn. A similar account can be given
for the Spanish mío in postnominal position (el libro mío). For phrasal possessors such as
our dog's kennel we might appeal to the fact that an N following PossP can readily attach to
the NP constructed by -s (making the article unnecessary, our dog's (*the) kennel), whereas
prepositional phrase possessors in post-nominal position (the kennel of our dog) are not
unique to NP (being productive complements of adjectives, aware of our dog, and verbs,
think of our dog, etc) and can therefore benefit from explicit attachment marking in the form
of the definite article.
The basic generalization here is that a following N can be readily incorporated into an
NP that has already been constructed on-line, whereas pronouns and other categories
following N can benefit from attachment marking. For pronouns this is plausibly because
they resist attachment to an NP other than the one they themselves construct. I.e. they are
only rarely attached to an NP that has already been constructed. PPs and certain other
categories sometimes attach to NP and sometimes attach to other phrasal nodes. A noun, by
contrast, is always NP-dominated and (in contrast to pronouns) combines highly frequently
with other NP-dominated constituents. The occurrence of the definite article in these data is
in inverse proportion to the frequency with which a following category X occurs within an
NP already constructed by Y in on-line processing: less frequency results in more explicit
encoding of attachment by the definite article.
What differentiates them is the order of N relative to the possessor. Why should
articles be dispreferred in the Poss N order? I suggest that attachability to NP provides an
explanation. In the Poss N order either a possessive determiner such as my or a possessive
phrase morpheme such as -s constructs NP (by Mother Node Construction or Grandmother
Node Construction respectively, cf. Hawkins 1994:ch.6). A following N can be readily
linked to the NP so constructed because N requires a mother NP and co-occurs productively
with determiners and other modifiers within NP, as formulated in the Phrase Structure rules
of the grammar. But in the reverse order, N Poss, either N or Art will construct NP on the
left periphery and a following possessor will not be so readily attachable to this NP.[8]
This line of explanation receives further support from a parallel asymmetry involving
definite articles and demonstratives, which Haspelmath (1999) points out in his note 9
(p.235). There are several languages in which the order Dem N rules out an accompanying
article, whereas N Dem permits one. Compare Spanish este libro ('this book') with el libro
este ('the book this'), cf. Brizuela (1999). Once again, the noun libro is readily attachable to
an NP that has already been constructed by a preceding este, which is a pronoun and also a
demonstrative determiner. But when este follows libro its attachment to the NP constructed
by this latter is facilitated by explicit attachment marking in the form of the definite article el.
Yet further support comes from an altogether different language and family, Maltese.
Adjectives normally follow nouns in Maltese, and when they do so the definiteness prefix is
often repeated in its appropriate allomorphic form (Plank & Moravcsik 1996:187-9): il-mara
(t-)twila (DEF-woman (DEF-)tall, i.e. 'the tall woman'). When the adjective precedes the
noun, however, only the adjective takes definiteness marking: il-famuz (*is-)strajk
(DEF-famous (DEF-)strike, i.e. 'the famous strike'). In other words, N Adj order permits repeated
definiteness marking on Adj, whereas Adj N order does not permit repeated marking on N. I suggest
that a noun following an adjective can be readily attached to an NP that has been constructed
(through Grandmother Node Construction) by a definiteness prefix on the adjective (cf.
il-famuz). But an adjective following a noun can benefit from this additional definiteness
marking on the adjective itself (il-mara t-twila) in order to signal the attachment to NP.
4.4 Lexical Differentiation and Word Order
Languages with lexical specialization for parts of speech, such as adjective, can provide clear
attachments to NP in many grammatical environments when Phrase Structure rules plus the
lexicon are accessed in parsing. I.e. lexical specialization plus a grammar can often
guarantee unambiguous attachment to NP. Languages without such specialization must rely
either on morpho-syntactic devices (see section 4.1) or on fixed word order. In (35)
I formulate a prediction that can be made for fixed word order on the basis of the NP
Attachment Hypothesis in (21).
(35) Prediction 7: Lexical Differentiation and Word Order
Lexical specialization for the category Adjective will permit the relevant languages to
order the Adjective before or after the Noun, attachment to NP being guaranteed by
this part of speech plus Phrase Structure rules; lack of such specialization will favor
fixed word order, with consistent attachments to a leftward nominal head (VO lgs) or
to a rightward nominal head (OV lgs).
Hengeveld et al. (2004) have shown that a lack of lexical differentiation for basic parts of
speech correlates with more fixed and typologically consistent word orders across phrasal
categories. The NP Attachment Hypothesis provides a reason why. They distinguish, inter
alia, between languages that have a separate category of NP-dominated adjectives, which can
unambiguously attach to NP, versus those that do not. The former (exemplified by English
and Wambon, types 4-5 on the Hengeveld et al. Part of Speech hierarchy) all have separate
nouns, verbs and adjectives. The latter (exemplified by Samoan, Warao and Ngiti, types 1-3
for Hengeveld et al.) have no separate adjectives. They may or may not have separate nouns
and verbs either, depending on their position on the Hengeveld et al. hierarchy.
Consider Basque, an OV language which has a separate class of adjectives ordered
inconsistently after the noun, while adverbial modifiers of the verb precede the verb
(Saltarelli 1988). In the absence of lexical (or morpho-syntactic) differentiation between Adj
and Adv, a non-head modifier that followed N and preceded V would be regularly
ambiguous as to its attachment site, to the left or to the right:
(36) N Adj/Adv V    [Head-Modifier in NP within SOV]
Using English morphemes, a grammatical distinction between loud (adjective) and loudly
(adverb) can make clear the intended attachments in music loud played (attach loud to music
within NP) versus music loudly played (attach loudly to played within VP). Alternatively, a
consistently ordered adjective in an SOV language would avoid this attachment ambiguity:
(36') Adj N Adv V    [Modifier-Head in NP within SOV]
Conversely, a VO language with Modifier-Head ordering in the NP but with post-verbal
adverbial modification would invite similar attachment ambiguities in the absence of
lexical differentiation between play loud music and play loudly music:
(37) V Adv/Adj N    [Modifier-Head in NP within VO]
A consistently ordered adjective would avoid them:
(37') V Adv N Adj    [Head-Modifier in NP within VO]
In VO (head-initial) languages, non-head categories can be consistently attached to heads on
their left, therefore, and in OV (head-final) languages they can be consistently attached to the
right. Languages with lexically differentiated categories, on the other hand, can tolerate
ordering inconsistency when the lexicon supplies a clear class of adjectives that are known to
attach to NP in relevant environments.
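The attachment ambiguity in (36) and (37), and the two ways of avoiding it, can be illustrated with a toy Python model. The model is an assumption of this sketch, not part of the NP Attachment Hypothesis itself: a modifier may attach only to an immediately adjacent head, "Mod" stands for a lexically undifferentiated modifier, and the names HEADS_FOR and attachment_sites are hypothetical.

```python
# Which head categories each modifier type may attach to.
# "Mod" is an undifferentiated modifier (no Adj/Adv distinction in the lexicon).
HEADS_FOR = {"Adj": {"N"}, "Adv": {"V"}, "Mod": {"N", "V"}}

def attachment_sites(sequence, index, direction=None):
    """Return the adjacent heads the modifier at `index` could attach to.

    direction: None (order not fixed), "left" (head precedes the modifier),
    or "right" (head follows the modifier).
    """
    modifier = sequence[index]
    sites = []
    if direction in (None, "left") and index > 0:
        if sequence[index - 1] in HEADS_FOR[modifier]:
            sites.append(("left", sequence[index - 1]))
    if direction in (None, "right") and index + 1 < len(sequence):
        if sequence[index + 1] in HEADS_FOR[modifier]:
            sites.append(("right", sequence[index + 1]))
    return sites

# (36): no lexical differentiation, order not fixed: two candidate sites.
print(attachment_sites(["N", "Mod", "V"], 1))
# Lexical differentiation: music loud played vs. music loudly played.
print(attachment_sites(["N", "Adj", "V"], 1))
print(attachment_sites(["N", "Adv", "V"], 1))
# (36'): consistent Modifier-Head order, heads always to the right.
print(attachment_sites(["Mod", "N", "Mod", "V"], 0, direction="right"))
print(attachment_sites(["Mod", "N", "Mod", "V"], 2, direction="right"))
```

In this model either a differentiated lexicon or a fixed attachment direction reduces the candidate sites to one, which is the trade-off the prediction in (35) relies on.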
The precise predictions that we can make on the basis of (35) are complicated by the
fact that category differentiation in the lexicon plus word order is only one grammatical
means for solving attachment problems, morpho-syntactic linking of different kinds being
another (see section 3.2), and also (in spoken language) considerations of prosody and
intonation (see e.g. Kiaer 2007). Nonetheless, we expect any one attachment device to be
more productive in certain language types and structures than in others, namely in those for
which the absence of any attachment device at all would be problematic for NP Attachment
(21). For lexical differentiation we can reasonably make the following predictions, therefore:
(38) a. Lexically undifferentiated languages without a productive Adj category (Hengeveld
et al.'s types 1-3) and with basic SOV will favor consistent MH order in NP.
b. Such languages with basic VO (VSO or SVO) will favor consistent HM order in NP.
(39) a. Lexically differentiated languages with a productive Adj category (Hengeveld et al.'s
types 4-5) and with basic SOV will permit inconsistent HM order in NP, either as a
basic order or as a frequent variant.
b. Such languages with basic VO (VSO or SVO) will permit inconsistent MH order in
NP, either as a basic order or as a frequent variant.
The relevant language quantities, taken from Hengeveld et al.'s Table 4 (p.549), are as
follows:
(38') a. 5/6 SOV languages described in (38a) have consistent MH order in NP (Mundari)[9]
b. 3/4 VO languages described in (38b) have consistent HM order in NP (Samoan)[10]
(39') a. 10/13 SOV languages described in (39a) permit inconsistent HM orders in NP
(Basque)[11]
b. 4/8 VO languages described in (39b) permit inconsistent MH orders in NP
(Arapesh)[12]
In other words, the languages predicted to be consistent generally are so, and a significant
number of the languages predicted to permit inconsistency do so.
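The proportions behind this summary can be tabulated directly from the counts in (38') and (39'). The short Python sketch below is only a convenience for reading the figures as percentages; the labels and the helper name share are illustrative.

```python
# (hits, total) as reported in (38') and (39'), from Hengeveld et al.'s Table 4.
COUNTS = {
    "(38'a) SOV, no Adj class, consistent MH": (5, 6),
    "(38'b) VO, no Adj class, consistent HM": (3, 4),
    "(39'a) SOV, Adj class, permits inconsistent HM": (10, 13),
    "(39'b) VO, Adj class, permits inconsistent MH": (4, 8),
}

def share(hits, total):
    """Percentage of languages in the group that behave as described."""
    return 100 * hits / total

for label, (hits, total) in COUNTS.items():
    print(f"{label}: {hits}/{total} = {share(hits, total):.0f}%")
```

The groups are small (between 4 and 13 languages each), so the percentages are best read as indicative rather than as precise estimates.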
4.5 Minimize NP Attachment Encoding
A further prediction that can be made on the basis of the NP Attachment Hypothesis is (40):
(40) Prediction 8: Minimize NP Attachment Encoding
The explicit encoding of attachment to NP will be in inverse proportion to the
availability of other (morpho-syntactic, syntactic and semantic-pragmatic) cues to
attachment: the more such cues, the less encoding.
In other words, we predict less explicit attachment marking when there are other cues to
attachment. Consider in this regard Haspelmath's (1999:235) universal regarding the
omissibility of definite articles in NPs with possessors depending on the type of possession
relation, rather than on the ordering of the head noun (as discussed in section 4.3):
(41) Haspelmath's Universal
If the definite article occurs with a noun that is inherently related to an accompanying
possessor, such as a kinship term, then it occurs with nouns that are not so inherently
related.
I suggest that this universal can be seen as a consequence of the attachment function of the
definite article, linking a possessor to a head noun. Kinship involves necessary and
inalienable relations between referents, which makes explicit signaling of the attachment less
necessary with nouns of this subtype. The definite article can attach a possessor to a head
noun in Bulgarian, Nkore-Kiga and Italian (42a), but not when the head noun + possessor
describes a kinship relation like 'my mother' (42b), cf. Haspelmath (1999:236) and
Koptjevskaja-Tamm (2003):
(42) a. Bulgarian kola-ta mi (car-DEF my); Nkore-Kiga e-kitabo kyangye (DEF-book my);
        Italian la mia casa (DEF my house)
     b. Bulgarian majka(*-ta) mi (mother(-DEF) my); Nkore-Kiga (*o-)mukuru wangye ((DEF-)sister my);
        Italian (*la) mia madre ((DEF) my mother)
Support for this attachment explanation comes from the fact that other attachment
devices (see section 2.2) show a parallel sensitivity to inalienable possession, suggesting that
omissibility is not a consequence of the semantics and pragmatics of definiteness as such in
combination with inalienable possession, as in Haspelmath's account. The Cantonese
nominalizer/attachment marker ge can be omitted as an explicit signal of attachment for
possessor + noun when there is an inalienable bond between them, like kinship, and
especially when the possessor is a pronoun. Contrast ngóh sailóu (I younger-brother, i.e.
'my younger brother') with gaausauh ge baahngūngsāt (professor NOMLZ/ATTACH office,
i.e. 'the professor's office'), cf. Matthews & Yip (1994:107).
A particularly subtle test of the basic idea behind prediction 8 (40) has been made on
Zoogocho Zapotec data by Sonnenschein (2005:98-110). There are different formal means
for marking possession in this language, by simple adjacency of nouns (43a), by a possessive
prefix (43b) and by a postnominal possessor phrase headed by che (of) (43c):
(43) a. tao lalo                                  (Zoogocho Zapotec)
grandmother Lalo, i.e. 'Lalo's grandmother'
b. x-kuzh-a'
POSS-pig-1SG, i.e. 'my pig'
c. tigr che-be'
tiger of-3INFORMAL, i.e. 'her tiger'
Sonnenschein tests the idea that there is a continuum from inalienable possession at the one
end ('my head', etc) through frequently possessed items (like 'her pig') to not very frequently
possessed items (like 'her tiger'). He shows on the basis of a corpus study that the amount of
formal marking for possession correlates inversely with the frequency with which the
relevant head nouns are in a semantic possession relation. Possession signaled by simple
adjacency (43a) is used for head nouns that are always possessed (like kinship terms and
body parts). Possession signaled syntactically by a postnominal possessor phrase (43c) is
used with head nouns that are generally unpossessed. And NPs that show either
morphological x- (43b) or syntactic encoding (43c) are more variably possessed. This
intermediate group also shows a preference for the morphological variant when the
possession is more inherent, and for the syntactic variant when the possession is less
inherent, for example when a possessed house is under construction and the owners are not
yet living in it.
Sonnenschein's quantification of the degree and frequency of possession correlating
inversely with both the presence versus absence of possession marking and with its amount
and complexity supports the role of additional semantic-pragmatic cues in signaling the
attachment of possessor to possessed, resulting in form minimization.
Consider finally in this section a detail from Maltese. Plank & Moravcsik (1996:192)
point out that the definite article is not used with nouns in the construct state, despite the
definiteness of the relevant NPs. In other words, it is not used when nouns are followed by
"a nominal attributive not marked by a preposition", as in leħen Manwel (voice Manwel, i.e.
'Manwel's voice'); compare leħen l-avukat (voice DEF-advocate, i.e. 'the advocate's voice'). I
mentioned in section 2.2 that the construct state in Semitic is an attachment device that links
sisters together. If this applies to Maltese as well, there would be no need for a second
attachment marker in addition to the construct state, according to (40), which is what we get.
4.6 Significant Attachments
We have seen in sections 4.1 - 4.5 that structural attachments to NP that would be hard to
process can be facilitated by various morpho-syntactic and syntactic devices, by fixed word
order and by lexical differentiation. Notice now that explicit attachment devices are also
preferred in structures in which the attachment is communicatively significant, for example
those in which nominal modifiers are vital for referent identification. The following
generalization does not follow from the NP Attachment Hypothesis as formulated in (21), but
it appears to be a robust pattern that is further characteristic of attachment markers. Further
research should explore more data of this sort to see whether a higher processing
generalization can capture both the patterns predicted by (21) and these "significant"
attachments at the same time:
(44) Significant Attachments
The explicit encoding of attachment is in proportion to the significance of making the
attachment for referent identification purposes.
For example, with proper names in English an accompanying relative clause takes a definite
article when it is semantically restrictive, and not when it is appositive: the John who likes
me versus John, who likes me. The former distinguishes one person named John from
another, the latter does not, and it is the former that takes the definite article, which I would
analyze here as an explicit marker for attaching the relative to the head. The explicit marking
reflects the fact that this attachment is significant for referent identification. The same
applies to adjective modifiers. Compare the nice John (as opposed to the less nice one) with
nice John (only one person named John is under consideration).
Similarly in Maltese, Plank & Moravcsik (1996:191) point out that proper names that
do not normally take the definite article will take one when accompanied by a restrictive
relative: il-Manwel li naf jien (DEF-Manwel who I-know I, 'The Manwel who I know'). For
adjectives they point out (pp.187-8):
"Where there is variation [in spoken and journalistic Maltese, JAH] it is essentially
restrictive adjectives, making a significant contribution to the identifiability of the
noun phrase's referent, which deserve a definite article of their own. Thus, in il-mara
t-twila [DEF-woman DEF-tall, JAH] a contrast is likely to be implied to a woman
that is not tall, while in il-mara twila the tallness of the woman is likely to be part of
the addressee's advance knowledge."
When nouns in English are accompanied by sentential complements, as in the fact that
Arctic ice has disappeared at an alarming rate or the rumor that the Prime Minister is going
to resign, we also find definite articles, obligatory in the former case (*a fact that Arctic ice
has disappeared ... is ungrammatical), optional in the latter (a rumor that the Prime Minister
is going to resign is grammatical). Definiteness can be used in these structures in violation
of the normal pragmatic rules governing familiarity and appropriate usage (cf. Hawkins
1978, 1991). The fact and the rumor in question may be quite unknown to the hearer and
uninferrable prior to these utterances, and yet the definite article is appropriate. I suggest that
the definite article signals an important attachment to fact and rumor respectively: the hearer needs to
consult the sentential complement in order to identify which fact and which rumor the
speaker intends. We might extend this explanation to other cases in which there is a
predicational relation between modifier and head, such as the name Algernon. Hearers may
not know that "Algernon is a name" (just as they may not know the fact that Arctic ice has
disappeared ...), cf. also the color ruby (ruby is a color), the number two (two is a number),
etc. In all these cases the following name, noun or numeral serves to identify the referent of
the head noun, limiting it to a particular instance of the set of things that are names, colors
and numbers, just as the following complement clause identifies the relevant fact and rumor.
English also exhibits special "cataphoric" uses of the demonstrative determiner those
with a following noun and relative clause: those students who pass the foreign language
exam can gain admission to university. The students in question are only identifiable by
reference to the content of the relative clause and the determiner draws explicit attention to
the need to consult the attached clause. Simple definite articles can also perform this
cataphoric function with relative clauses in NPs whose referents can be pragmatically
unfamiliar, like Joe's fed up with the book he just got for his birthday (Hawkins 1978:131-8).
The determiners signal the significance of the relative clause for referent identification in
these examples, and they do so in structures for which normal pragmatic rules of appropriate
usage do not, or need not, apply.
Attachment marking needs to be recognized as a component in the description of the
definite article in English, in addition to its pragmatic functions. Definite articles also
construct English NPs on their left periphery, regularly minimizing syntactic processing
domains. Syntactic and semantic processing functions are an important part of the analysis
of articles and determiners in English, therefore, even though pragmatic functions appear to
be widespread and to subsume most uses. In many other languages, like Polynesian and
Maltese, the pragmatic functions of articles are less clear because their construction and
attachment functions are more extensive and cover a larger range of uses than in English.
Construction and attachment are, nonetheless, an important part of the grammar of English
determiners.
5. Conclusions
I have argued that a whole myriad of grammatical details across languages can be profitably
viewed from a processing perspective. Two hypotheses have been proposed involving
Constructability (11) and Attachability (21), from which eight predictions for typological
patterns have been derived: for NP Construction in (12), Lexical Differentiation (14), VO
versus OV Asymmetries (17), Separation of Sisters (22), Ambiguous Attachments (28),
Frequency of NP Attachments (32), Lexical Differentiation and Word Order (35), and
Minimize NP Attachment Encoding (40). These predictions have been illustrated with
sample languages and with quantitative data where these are available.
The variation patterns we have seen, the presence versus absence of morpho-syntactic
devices in different environments, the existence of different degrees of lexical differentiation
for parts of speech correlating with word order patterns, the exceptions to normal
grammatical or semantic/pragmatic rules in certain structures, and the asymmetries between
different language types, are largely mysterious when viewed from a grammatical
perspective alone. In order to describe the facts, we need the best models of syntax available
and precise processing operations defined in terms of them. But it is these latter that help
clarify why and how grammars make use of the various devices summarized in section 2 and
why different languages exhibit the cross-linguistic variation patterns we have seen in
sections 3-4. They all follow from two basic processing needs: anything that is an NP must
be recognized as such, i.e. NP has to be constructable; and all the items that belong to NP must
be attachable to it. I have tried to show that these two simple ideas have numerous
grammatical consequences and can make sense of a lot of the variation patterns that
typologists and formal grammarians have puzzled over. Grammars appear to have
conventionalized the preferences of performance that are evident in languages with structural
choices between e.g. the presence or absence of a relative pronoun, of an article or a
classifier. This Performance-Grammar Correspondence Hypothesis (1) motivates the
"processing typology" research program of Hawkins (1994) and (2004). For linguists the
inclusion of processing in the very core of their theorizing about grammars can provide
benefits of the kind I have illustrated here. For psychologists, exposure to more linguistic
diversity can lead to refinements in processing ideas derived from the more familiar
languages. Data from less familiar languages sometimes challenge assumptions that are
inherent in current models.[13]
There are countless cross-linguistic details that I could not mention in this paper, for
reasons of space, some of which can be readily linked to the predictions made here, others of
which remain mysterious to me. My purpose in writing the paper is to encourage others to
consider grammars from this point of view. Even very familiar words in the best described
languages, like the definite article in English, can benefit, I would argue, from a reanalysis
along the lines suggested here.
UC Davis and Cambridge University
Correspondence addresses: Department of Linguistics, UC Davis, 208 Sproul Hall, One
Shields Avenue, Davis, CA 95616, USA; and Research Centre for English and Applied
Linguistics, University of Cambridge, 9 West Road, Cambridge, CB3 9DP, UK.
Acknowledgements:
Abbreviations: ABL = ablative case; ACC = accusative case; ADJ = adjective; ADV =
adverb; AGR = agreement marker; ART = article; ATTACH = attachment marker; CL =
classifier; CONSTR = construct state; DEF = definite article or particle; DEM =
demonstrative determiner or pronoun; EIC = Early Immediate Constituents; ERG =
ergative case; FEM = feminine; FUT = future; GEN = genitive case; HM = Head-Modifier
order; IC = immediate constituent; II = second declension; IP = inflection phrase
(or clause); M = mother node; MH = Modifier-Head order; MiD = Minimize Domains; N
= noun; NEUT = neuter; NOM = nominative case; NOMLZ = nominalizing particle; NP
= noun phrase; OV = object verb order; P = preposition or postposition; PART = particle;
PCD = Phrasal Combination Domain; PGCH = Performance-Grammar Correspondence
Hypothesis; PL = plural; PoS = part of speech; POSS = possessor; PossP = Possessor
Phrase; PP = prepositional or postpositional phrase; REL = relative clause; S = sentence;
SG = singular; SOV = subject object verb order; SVO = subject verb object order; V =
verb; VO = verb object order; VP = verb phrase; VSO = verb subject object order; WALS
= World Atlas of Language Structures.
Footnotes
1.
For theories in which Determiner Phrase and NP are distinguished the
present paper can be viewed as providing a processing perspective on NP and DP
structure. A number of the details will differ from the account proposed here, regarding
which of these maximal projections is actually constructed by particular daughters and
regarding the attachments to each, but the same processing logic can carry over to
structural analyses incorporating DPs.
2.
Some other imaginable possibilities for constructing phrasal nodes, such
as from a sister node, are discussed in Hawkins (1994:360-1).
3.
There are numerous differences between different formal models of
grammar with respect to the precise set of heads they define, and numerous
disagreements exist with respect to particular categories, cf. Dryer (1992) and Corbett et
al., eds. (1993) for detailed summaries and discussion. Hawkins (1993, 1994) argues that
the disputed categories generally have a "construction" function in parsing (whence the
plausibility of considering them heads at all), and that it is this that ultimately motivates
the whole notion of "head of phrase" and its correlating properties. Individual
grammatical models differ, essentially, in their stipulations over how closely they want
grammars to be aligned with processing functions. So, for example, it is not required for
processing that there should be a bi-unique relationship between a constructor C and the
phrase P that it constructs. It is sufficient that C alone should guarantee P uniquely,
whether or not some other category C' can also construct P. Putting this in grammatical
terms: can a phrase have more than one head? Many grammatical models insist on
bi-uniqueness, whereas others (those that are more closely aligned with parsing on this
occasion) do not.
4.
One way to test the proposed link between NP-constructing particles and
lexical differentiation would be to compare languages with and without lexically unique
nouns by selecting various subsets of lexical predicates, quantifying numbers of
category-ambiguous items (i.e. predicates like run and play in English, as opposed to student and
professor, which are uniquely nouns), numbers of syntactic environments that require the
definite article or other NP constructor, and corpus frequencies for these constructors.
Hengeveld et al. (2004) provide a useful typology of lexical differentiation across languages
and a language sample and I shall make use of their data in section 4.4 below when testing a
related prediction involving word order and attachment.
5.
The basic efficiency principle to which I appeal in this section is Minimize
Domains (Hawkins 2004:31), defined as follows:
(i) Minimize Domains (MiD)
The human processor prefers to minimize the connected sequences of linguistic
forms and their conventionally associated syntactic and semantic properties in
which relations of combination and/or dependency are processed. The degree of
this preference is proportional to the number of relations whose domains can be
minimized in competing sequences or structures, and to the extent of the
minimization difference in each domain.
MiD predicts that "phrasal combination domains" should be as short as possible, and that
the degree of this preference should be proportional to the minimization difference
between competing orderings. This principle (a particular instance of Minimize
Domains) was called Early Immediate Constituents (EIC) in Hawkins (1994):
(ii) Phrasal Combination Domain (PCD) [Hawkins 2004:107]
The PCD for a mother node M and its I(mmediate) C(onstituent)s consists of the
smallest string of terminal elements (plus all M-dominated non-terminals over the
terminals) on the basis of which the processor can construct M and its ICs.
(iii) Early Immediate Constituents (EIC) [Hawkins 1994:69-83]
The human processor prefers linear orders that minimize PCDs (by maximizing
their IC-to-word ratios), in proportion to the minimization difference between
competing orders.
Empirical support for EIC and for MiD is summarized in Hawkins (1994, 2004) using both
corpora from numerous language types and psycholinguistic experiments. Additional
corpus and experimental results providing broad support for EIC's/MiD's predictions are
presented for English in Wasow (1997, 2002), Stallings (1998), Stallings et al. (1998) and
Lohse et al. (2004), for Japanese in Yamashita (2002) and Yamashita & Chang (2001), for
Cantonese in Matthews & Yeung (2001), and for German in Uszkoreit et al. (1998).
Hawkins' (2004) MiD is a more general version of the EIC principle for word order and of
the structural complexity predictions for filler-gap structures defined in Hawkins (1994,
1999), applied now to all grammatical relations of combination and dependency. Gibson's
(1998, 2000) "locality" principle is fundamentally similar in spirit to MiD and the
considerable experimental support that Gibson offers for it carries over to MiD.
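The IC-to-word ratio in (iii) can be made concrete with a small calculation. The following is a minimal sketch, not the metric as implemented in Hawkins (1994, 2004): it assumes that each IC is constructed by a single designated word, and that the PCD runs from the constructing word of the first IC through the constructing word of the last IC. The names `pcd_length` and `ic_to_word_ratio`, and the English verb phrase used as data, are illustrative assumptions.

```python
from typing import List, Tuple

# An IC is modeled as (words, constructor_index): the words it dominates and the
# position of the word assumed to construct it (e.g. P for a PP, V for a bare verb).
# This representation is a simplifying assumption made for the sketch.
IC = Tuple[List[str], int]


def pcd_length(ics: List[IC]) -> int:
    """Count the words from the constructor of the first IC through the
    constructor of the last IC, inclusive."""
    if not ics:
        raise ValueError("a mother node needs at least one IC")
    if len(ics) == 1:
        return 1  # only the constructing word itself is needed
    first_words, first_ctor = ics[0]
    _, last_ctor = ics[-1]
    tail_of_first = len(first_words) - first_ctor   # constructor to end of first IC
    middle = sum(len(words) for words, _ in ics[1:-1])  # intervening ICs in full
    head_of_last = last_ctor + 1                     # start of last IC to its constructor
    return tail_of_first + middle + head_of_last


def ic_to_word_ratio(ics: List[IC]) -> float:
    """Number of ICs divided by the number of words in the PCD."""
    return len(ics) / pcd_length(ics)


# Illustrative VP with three ICs: V, a short PP and a long PP.
verb: IC = (["waited"], 0)
short_pp: IC = (["for", "his", "son"], 0)
long_pp: IC = (["in", "the", "cold", "but", "not", "unpleasant", "wind"], 0)

short_before_long = [verb, short_pp, long_pp]
long_before_short = [verb, long_pp, short_pp]

print(pcd_length(short_before_long), ic_to_word_ratio(short_before_long))
print(pcd_length(long_before_short), ic_to_word_ratio(long_before_short))
```

Under these assumptions the short-before-long order yields the smaller PCD and the higher ratio, which is the ordering that EIC predicts to be preferred in a head-initial structure of this kind.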
6.
As shown in section 2.2 and as illustrated in greater detail in section 4,
constructing categories like articles and classifiers can also signal the attachment of NP
daughters to one and the same NP, and this could motivate the presence of C in an order
such as np[... C ... N] even though phrasal combination domains were lengthened as a
result. An order such as np[N ... C] could also signal the rightmost boundary of an NP in
an inconsistent OV language with noun-initial NPs. Such additional processing
motivations may explain why articles occur at all in (a minority of) rigid OV languages,
as in the data of (18).
7.
A genus for Dryer (1992) is a genetic grouping of languages comparable
in time depth to the subfamilies of Indo-European.
8.
The account proposed here is in the same processing spirit as
Haspelmath's (1999) more pragmatically-based account, but differs from it in its
emphasis on syntactic co-occurrence frequencies for the categories X and Y in their
respective orders.
9.
The five SOV languages with basic and fixed MH order in NP are:
Mundari, Hurrian, Quechua, Turkish and Ket (Warao being basic and fixed HM).
10.
The three VO languages with basic and fixed HM in NP are: Samoan,
Miao and Tidore (Tagalog having no basic order, i.e. MHM).
11.
The ten SOV languages described in (39a) with inconsistent HM orders in
NP are: Abkhaz, Alamblak, Bambara, Basque, Hittite, Koasati, Nasioi, Sumerian,
Oromo and Wambon. Of these, four have maximum inconsistency with basic and fixed
HM (Bambara, Koasati, Nasioi and Sumerian), four have basic HM with frequent
departures in favor of the typologically consistent MH (Abkhaz, Basque, Oromo and
Wambon), while two have no basic order (Hittite and Alamblak). The three SOV
languages described in (39a) with consistent MH orders in NP are: Burushaski, Japanese
and Nama. MH is both basic and fixed in the latter two, while MH is basic and free in
Burushaski.
12.
The four VO languages described in (39b) with inconsistent MH orders in
NP are: Arapesh, Berbice Dutch, Pipil and Polish. Berbice Dutch and Pipil have basic
MH (with fixed and free NP word orders respectively). Arapesh and Polish are classified
as having no basic NP order (MHM).
13.
See Hawkins (2004:265-70) and (2007) for illustrations of this last point.
References
Aikhenvald, Alexandra Y. 2003. Classifiers: A typology of noun categorization devices.
Oxford: Oxford University Press.
Aissen, Judith. 1999. Markedness and subject choice in Optimality Theory. Natural
Language and Linguistic Theory 17. 673-711.
Anderson, Stephen R., and Edward L. Keenan. 1985. Deixis. In Timothy Shopen (ed.),
Language typology and syntactic description, Vol.3: Grammatical categories and the
lexicon, 259-308. Cambridge: Cambridge University Press.
Aranovich, Raúl. 2007. Incorporation, pronominal arguments, and configurationality in
Fijian. MS, Department of Linguistics, University of California, Davis.
Basilico, David. 1996. Head position and internally headed relative clauses. Language
72. 498-532.
Bauer, Winifred. 1993. Maori. London: Routledge.
Bhat, D.N.S. 2004. Pronouns. Oxford: Oxford University Press.
Blake, Barry. 1987. Australian aboriginal grammar. London: Croom Helm.
Bresnan, J., S. Dingare, and C.D. Manning. 2001. Soft constraints mirror hard constraints:
Voice and person in English and Lummi. In M. Butt and T.H. King (eds.), Proceedings
of the LFG 01 Conference. Stanford, California: CSLI Publications.
Brizuela, Maquela. 1999. Definiteness types in Spanish: A study of natural discourse. PhD
thesis, University of Southern California.
Broschart, Jürgen. 1997. Why Tongan does it differently: Categorial distinctions in a language
without nouns and verbs. Linguistic Typology 1. 123-65.
Bybee, Joan and Paul Hopper, eds. 2001. Frequency and the Emergence of Linguistic
Structure. Amsterdam: Benjamins.
Clancy, P.M., H. Lee and M. Zoh. 1986. Processing strategies in the acquisition of
relative clauses. Cognition 24. 225-62.
Corbett, Greville G., Norman M. Fraser, and Scott McGlashan, eds. 1993. Heads in
grammatical theory. Cambridge: Cambridge University Press.
Craig, Colette Grinevald. 1977. The structure of Jacaltec. Austin: University of Texas
Press.
Dixon, R.M.W. 1977. A grammar of Yidiny. Cambridge: Cambridge University Press.
Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language 68.
81-138.
Dryer, Matthew S. 2005. Determining dominant word order. In Martin Haspelmath,
Matthew S. Dryer, David Gil, and Bernard Comrie (eds.), The world atlas of
language structures, 371. Oxford: Oxford University Press.
Dryer, Matthew S. 2007. The branching direction theory of word order correlations
revisited. MS, Dept of Linguistics, SUNY, Buffalo.
Fodor, Jerry A., Thomas G. Bever, and Merrill F. Garrett. 1974. The Psychology of
Language. New York: McGraw-Hill.
Gibson, Edward. 1998. Linguistic complexity: Locality of syntactic dependencies.
Cognition 68. 1-76.
Gibson, Edward. 2000. The dependency locality theory: A distance-based theory of
linguistic complexity. In A. Marantz, Y. Miyashita, and W. O'Neil (eds.), Image,
language, brain, 95-126. Cambridge, MA: MIT Press.
Gorbet, L. 1976. A grammar of Diegueno nominals. New York: Garland.
Haiman, John. 1983. Iconic and economic motivation. Language 59. 781-819.
Haiman, John. 1985. Natural syntax. Cambridge: Cambridge University Press.
Haspelmath, Martin. 1999. Explaining article-possessor complementarity: Economic
motivation in noun phrase syntax. Language 75. 227-43.
Haspelmath, Martin, Matthew S. Dryer, David Gil, and Bernard Comrie, eds. 2005. The
world atlas of language structures (WALS). Oxford: Oxford University Press.
Hawkins, John A. 1978. Definiteness and indefiniteness: A study in reference and
grammaticality prediction. New Jersey: Humanities Press, and London: Croom
Helm.
Hawkins, John A. 1983. Word order universals. New York: Academic Press.
Hawkins, John A. 1986. A comparative typology of English and German: Unifying the
contrasts. Austin: University of Texas Press.
Hawkins, John A. 1990. A parsing theory of word order universals. Linguistic Inquiry 21.
223-61.
Hawkins, John A. 1991. On (in)definite articles: Implicatures and (un)grammaticality
prediction. Journal of Linguistics 27. 405-42.
Hawkins, John A. 1993. Heads, parsing, and word order universals. In Greville G. Corbett,
Norman M. Fraser, and Scott McGlashan (eds.), Heads in grammatical theory,
231-65. Cambridge: Cambridge University Press.
Hawkins, John A. 1994. A performance theory of order and constituency. Cambridge:
Cambridge University Press.
Hawkins, John A. 1999. Processing complexity and filler-gap dependencies. Language
75. 244-285.
Hawkins, John A. 2001. Why are categories adjacent? Journal of Linguistics 37. 1-34.
Hawkins, John A. 2004. Efficiency and complexity in grammars. Oxford: Oxford
University Press.
Hawkins, John A. 2007. Processing typology and why psychologists need to know about
it. New Ideas in Psychology 25. 87-107.
Hengeveld, Kees, Jan Rijkhoff, and Anna Siewierska. 2004. Parts-of-speech systems and
word order. Journal of Linguistics 40. 527-70.
Himmelmann, Nikolaus P. 1997. Deiktikon, Artikel, Nominalphrase: Zur Emergenz
syntaktischer Struktur. Tübingen: Niemeyer.
Himmelmann, Nikolaus P. 1998. Regularity in irregularity: Article use in adpositional
phrases. Linguistic Typology 2. 315-53.
Jackendoff, Ray. 1977. X-bar syntax: A study of phrase structure. Cambridge, Mass.: MIT
Press.
Joseph, Brian, and Irene Philippaki-Warburton. 1987. Modern Greek. London and New
York: Routledge.
Keenan, Edward L. 1988. On semantics and the binding theory. In John A. Hawkins (ed.),
Explaining language universals, 105-44. Oxford: Basil Blackwell.
Kiaer, Jieun. 2007. Processing and interfaces in syntactic theory: the case of Korean.
PhD thesis, King's College London.
Kirby, Simon. 1999. Function, selection and innateness. Oxford: Oxford University
Press.
Koptevskaja-Tamm, Maria. 2003. Possessive noun phrases in the languages of Europe. In
Frans Plank (ed.), Noun phrase structure in the languages of Europe, 621-722.
Berlin: Mouton de Gruyter.
Lehmann, Christian. 1984. Der Relativsatz. Tübingen: Narr.
Levelt, Willem J.M. 1989. Speaking: From intention to articulation. Cambridge, Mass.:
MIT Press.
Li, Charles N. and Sandra A. Thompson. 1981. A functional reference grammar of
Mandarin Chinese. Berkeley: University of California Press.
Lohse, Barbara. 2000. Zero versus explicit marking in relative clauses. MS, Department of
Linguistics, University of Southern California.
Lohse, Barbara, John A. Hawkins, and Tom Wasow. 2004. Domain minimization in English
verb-particle constructions. Language 80. 238-261.
Lyons, Christopher. 1999. Definiteness. Cambridge: Cambridge University Press.
MacDonald, Maryellen C., Neil J. Pearlmutter, and Mark S. Seidenberg. 1994. The
lexical nature of syntactic ambiguity resolution. Psychological Review 101.
672-703.
Mallinson, Graham. 1986. Rumanian. London: Croom Helm.
Matisoff, James A. 1972. Lahu nominalization, relativization, and genitivization. In John
Kimball (ed.), Syntax and semantics, Vol.1, 237-57. New York/London:
Academic Press.
Matthews, Stephen, and L.Y.Y. Yeung. 2001. Processing motivations for topicalization
in Cantonese. In Kaoru Horie and S. Sato (eds.), Cognitive-functional linguistics in an
East Asian context, 81-102. Tokyo: Kurosio.
Matthews, Stephen, and Virginia Yip. 1994. Cantonese: A comprehensive grammar.
London and New York: Routledge.
Moravcsik, Edith. 1995. Summing up Suffixaufnahme. In Frans Plank (ed.), Double case:
Agreement by Suffixaufnahme, 451-84. Oxford: Oxford University Press.
Newmeyer, Frederick J. 2005. Possible and probable languages: A generative perspective
on linguistic typology. Oxford: Oxford University Press.
Payne, John. 1993. The headedness of noun phrases: Slaying the nominal hydra. In
Greville G. Corbett, Norman M. Fraser, and Scott McGlashan (eds.), Heads in
grammatical theory, 114-39. Cambridge: Cambridge University Press.
Plank, Frans. ed. 1995. Double case: Agreement by Suffixaufnahme. Oxford: Oxford
University Press.
Plank, Frans. ed. 2003. Noun phrase structure in the languages of Europe. Berlin: Mouton
de Gruyter.
Plank, Frans, and Edith Moravcsik. 1996. The Maltese article: Language-particulars and
universals. Rivista di Linguistica 8. 183-212.
Pollard, C. and I.A. Sag. 1994. Head-driven Phrase Structure Grammar. Chicago: University of
Chicago Press.
Quirk, Randolph. 1957. Relative clauses in educated spoken English. English Studies 38.
97-109.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. 1985. A
comprehensive grammar of the English language. London: Longman.
Rijkhoff, Jan. 2002. The noun phrase. Oxford: Oxford University Press.
Rohdenburg, Günter. 1996. Cognitive complexity and grammatical explicitness in English.
Cognitive linguistics 7. 149-82.
Rohdenburg, Günter. 1999. Clausal complementation and cognitive complexity in English. In
F.-W. Neumann and S. Schülting (eds.), Anglistentag 1998: Erfurt, 101-12. Trier:
Wissenschaftlicher Verlag.
Saltarelli, Mario. 1988. Basque. London and New York: Routledge.
Sasse, Hans-Jürgen. 1988. Der irokesische Sprachtyp. Zeitschrift für Sprachwissenschaft 7.
173-213.
Sonnenschein, Aaron Huey. 2005. A descriptive grammar of San Bartolomé Zoogocho
Zapotec. Muenchen: Lincom Europa.
Stallings, Lynne M. 1998. Evaluating heaviness: Relative weight in the spoken production of
Heavy-NP Shift. PhD thesis, University of Southern California.
Stallings, Lynne M., Maryellen C. MacDonald, and Patrick O'Seaghdha. 1998. Phrasal
ordering constraints in sentence production: Phrase length and verb disposition in
Heavy-NP Shift. Journal of Memory and Language 39. 392-417.
Uszkoreit, H., Th. Brants, D. Duchier, B. Krenn, L. Konieczny, S. Oepen, and
W. Skut. 1998. Studien zur performanzorientierten Linguistik: Aspekte der
Relativsatzextraposition im Deutschen. Kognitionswissenschaft 7. 129-133.
Vincent, Nigel B. 1988. Latin. In Martin B. Harris and Nigel B. Vincent (eds.), The Romance
languages, 26-78. Oxford: Oxford University Press.
Wasow, Tom. 1997. Remarks on grammatical weight. Language Variation and Change 9.
81-105.
Wasow, Tom. 2002. Postverbal behavior. Stanford: CSLI Publications, Stanford University.
Weber, Heinrich. 1970. Das erweiterte Adjektiv- und Partizipialattribut im Deutschen.
München: Hueber Verlag.
Yamashita, H. 2002. Scrambled sentences in Japanese: Linguistic properties and motivation
for production. Text 22. 597-633.
Yamashita, H., and Franklin Chang. 2001. "Long before short" preference in the production
of a head-final language. Cognition 81. B45-B55.