Cognitive linguistics and language structure
Cognitive linguists agree that language is handled mentally by general cognitive
structures and processes rather than by a dedicated mental module. However, in spite
of remarkable progress in some areas, cognitive linguists have generally paid little
attention to the possible implications of this ‘cognitive assumption’ for the theory of
language structure – how language is organised and what structures we assign to
utterances. Cognitive Grammar has avoided formalization, and the various versions of
construction grammar have adopted rather conservative views on grammatical
structure. The exception is Word Grammar, which offers a radical alternative view of
language structure. This paper defends the structural claims of Word Grammar on the
grounds that most of them follow logically from the cognitive assumption (though a
few need to be revised). The paper starts by breaking this assumption into a number of
more specific tenets relating to learning, network structures, ‘recycling’, inheritance,
relations, activation and chunking. It then shows how these tenets support various
claims that distinguish Word Grammar in cognitive linguistics. According to this
argument, morphology and syntax are distinct levels, so language cannot consist of
nothing but ‘symbols’ or ‘constructions’. Moreover, in syntax the main units – and
possibly the only units – must be words, not phrases, so the basic relation of syntax is
the dependency between two words, not the relation between a phrase and its part. In
other words, sentence structure is dependency structure, not phrase structure – a
network, as expected in cognition, not a tree. One of the benefits of this analytical
machinery is in the treatment of the various patterns that have been called
‘constructions’, which benefit from the flexibility of a network structure. This kind of
structure is also very appropriate for semantic analysis, illustrated here by two
examples: the distinction between universal and existential quantifiers, and a detailed
structural analysis of the meaning of the how about X? construction, covering its
illocutionary force as well as deictic binding. Finally, the paper discusses a formal
property, the order of words, morphs and so on, arguing that the constraints on order
are expressed in terms of the ‘landmark’ relation of Cognitive Grammar, while the
actual ordering requires the more primitive relation found in any ordered string, here
called ‘next’. The paper explains how landmark relations can be derived from word-word dependencies in both simple and complex syntactic patterns, and why the words
in a phrase normally stay together.
Now that cognitive linguistics (CL) has established itself as a valid and productive
approach to the study of language, it is reasonable to ask what progress it has made on
one of the traditional questions of linguistic theory: how language is structured. In
particular, how is the answer to this question affected, if at all, by what we might call
the Cognitive Assumption in (1)?
(1) The only cognitive skills used in language are domain-general skills which are
also used outside language.
This unifying belief of all cognitive linguists has been expressed more or less pithily
by others: ‘knowledge of language is knowledge’ (Goldberg 1995:5), we should
‘derive language from non-language’ (Lindblom and others 1984:187, quoted in
Bybee 2010:12), ‘language is not an autonomous cognitive faculty’ (Croft and Cruse
2004:1). What difference does the Cognitive Assumption make to our ideas about
how language is organised, compared with the alternative views in which language is
seen either as having nothing to do with cognition, or as a separate module of
cognition?
The natural place to look for an answer is in the theoretical packages that
address this question directly. The Oxford Handbook of Cognitive Linguistics lists
three ‘models of grammar’ (Geeraerts and Cuyckens 2007): Cognitive Grammar,
construction grammar (without capitals) and Word Grammar.
Cognitive Grammar has not yet been developed into a sufficiently formal
system because ‘the cost of the requisite simplifications and distortions would greatly
outweigh any putative benefits’ (Langacker 2007: 423). Whatever the merits of this
strategic decision, it means that we cannot expect a precise account of how language
is organised, comparable to the accounts that we find in non-cognitive theoretical
packages.
When written without capitals, ‘construction grammar’ has sometimes been
identified simply with ‘the cognitive linguistics approach to syntax’ (Croft and Cruse
2004:225). In his 2007 survey, Croft divides construction grammar into four versions
(including Cognitive Grammar). The Fillmore/Kay ‘Construction Grammar’ (with
capitals) is formally explicit, but makes very similar claims about language structure
to the non-cognitive model Head-Driven Phrase Structure Grammar (HPSG; Pollard
and Sag 1994). The Lakoff/Goldberg version is much less formally explicit, but offers
syntactic analyses that are very similar to those of Construction Grammar (Croft
2007:486). Finally, Croft’s own Radical Construction Grammar does comprise
original claims about language structure, but the arguments for these claims are only
loosely related to the Cognitive Assumption, and indeed, I shall suggest in section 5
that they are incompatible with it.
Construction grammarians agree in rejecting the distinction between grammar
and lexicon. This is certainly an innovation relative to the recent American tradition
(though Systemic Functional Grammar has recognised the grammar-lexicon
continuum for decades under the term ‘lexicogrammar’ - Halliday 1961, Halliday
1966). Otherwise, however, Cognitive Grammar and the other versions of
construction grammar make assumptions about language structure which are
surprisingly conservative considering their radical criticisms of ‘main-stream’
linguistic theories. It would probably be fair to describe the assumed model of syntax
as little more sophisticated than Zwicky’s ‘plain vanilla syntax’ (Zwicky 1985), and
more generally the assumed grammatical model is little different from the American
structuralism of the 1950s. The aim of this paper is to question some of these
assumptions on the grounds that they conflict with the Cognitive Assumption. (There
are also good ‘linguistic’ reasons for questioning them, but these arguments will be
incidental to the main thrust of the paper.)
The third ‘model of grammar’ recognised by the Oxford Handbook is Word
Grammar. Not all cognitive linguists recognise Word Grammar as a part of cognitive
linguistics; for instance, neither of the articles about the other two models mentions
Word Grammar, nor is it mentioned at all in the 800 pages of Cognitive Linguistics:
An introduction (Evans and Green 2006), and although Croft and Cruse 2004 mention
it once, they do not regard it as an example of cognitive linguistics. However, if the
Cognitive Assumption is critical, then Word Grammar definitely belongs to cognitive
linguistics. Since the theory’s earliest days its publications endorse this assumption in
passages such as the following:
‘... we should assume that there is no difference between linguistic and nonlinguistic knowledge, beyond the fact that one is to do with words and the
other is not’ (Hudson 1984:36-7)
To reinforce the link to early cognitive linguistics, this is coupled with an approving
reference to Lakoff’s view:
For me, the most interesting results in linguistics would be those showing how
language is related to other aspects of our being human. (Lakoff 1977)
By 1990, cognitive linguistics existed as such and is cited with approval in the next
book about Word Grammar (Hudson 1990:8) in connection with ‘cognitivism’, one of
the theory’s main tenets. By 2007 it was possible to write of cognitive linguistics that
Word Grammar ‘fits very comfortably in this new tradition’ (Hudson 2007:2), and in
2010: ‘Like other ‘cognitive linguists’, I believe that language is very similar to other
kinds of thinking’ (Hudson 2010:1).
On the one hand, then, Word Grammar incorporates the same Cognitive
Assumption as other cognitive theories. Moreover, it has been heavily influenced by
the work of other cognitive linguists, such as Lakoff’s work (mentioned earlier) on
prototypes, Langacker’s analyses of construal in languages such as Cora (Casad and
Langacker 1985), Fillmore’s analyses of English lexical fields such as commercial
transactions and risk (Fillmore 1982, Fillmore and Atkins 1992) and his joint work on
constructions (Fillmore and others 1988, Kay and Fillmore 1999), and Bybee’s work
on learning (Bybee and Slobin 1982). On the other hand, one of the distinctive
characteristics of Word Grammar is its focus on questions of language structure –
‘formal’ questions about the ‘formal’ properties of language. Unfortunately,
‘formalism’ is associated in the literature of cognitive linguistics with Chomskyan
linguistics (Taylor 2007:572), and as noted earlier, Cognitive Grammar has positively
resisted formalisation as a dangerous distraction. Measured solely in terms of insights
into the formal structure of language, Chomsky has a point when he claims that
cognitive linguistics (as he understands it) ‘has achieved almost no results’ (Chomsky
2011). But there is no inherent incompatibility between the Cognitive Assumption and
formalisation. After all, an interest in formal cognitive structures is the hallmark of
Artificial Intelligence, and we have already noticed the similarity between
Construction Grammar and the very formal theoretical work in HPSG. As in other
kinds of linguistics, work is needed on both formal and informal lines, and progress
depends on fruitful interaction between the two.
This paper focuses on general formal questions about language structure, to
argue that the Cognitive Assumption actually leads to quite specific answers which
are different from the ‘plain vanilla syntax’ which is generally taken for granted, and
that these answers generally coincide with the claims of Word Grammar (though
some revision is needed). The next section analyses the Cognitive Assumption into a
number of more specific tenets that are relevant to language, and the following
sections apply these assumptions to a number of questions about language structure:
the nature of linguistic units, the relations between morphology and syntax, the status
of ‘constructions’ and their relation to dependencies in syntax, the nature of meaning
and the ordering of words. The last section draws some general conclusions.
The Cognitive Assumption unpacked
If cognition for language shares the properties of the general-purpose cognition that
we apply in other domains, the first question for cognitive linguistics is what we know
about the latter. Of course, cognitive scientists know a great deal about cognition, so
the immediate question is what they know that might have consequences for the
theory of language structure. Most of the following tenets are recognised in any
undergraduate course on cognitive psychology or AI as reasonably ‘mainstream’
views, even if they are also disputed; so I support them by reference to textbooks on
cognitive psychology (Reisberg 2007) and AI (Luger and Stubblefield 1993). These
tenets are also very familiar to any reader of this journal, so little explanation or
justification is needed.
The first relevant tenet of cognitive psychology might be called ‘the learning
tenet’ and consists of a truism:
(2) The learning tenet: We learn concepts from individual experiences, or
exemplars.
One conclusion that can be drawn from experimental results is that we learn by
building ‘prototype’ schemas on the remembered exemplars, but without deleting the
latter from memory (Reisberg 2007:321); and another is that schemas have statistical
properties that reflect the quantity and quality of the experiences on which they are
based. Thus it is not just language learning, but all learning, that is ‘usage-based’.
The ‘network tenet’, also called the ‘network notion’ (Reisberg 2007:252), is
this claim:
(3) The network tenet: The properties of one concept consist of nothing but links to
other concepts, rather than ‘features’ drawn from some separate vocabulary.
In this view, concepts are atoms, not bundles or boxes with an internal structure. A
concept is simply the meeting point of a collection of links to other concepts.
Consequently, the best way to present an analysis of some area of conceptual
structure is by drawing a network for it. Moreover, since a concept’s content is
carried entirely by its links to other concepts, labels carry no additional information
and are simply mnemonics to help researchers to keep track of the network models
they build; so in principle, we could remove all the labels from a network without
losing any information (Lamb 1998:59).
An important corollary of the network tenet is that when we learn a new
concept, we define it as far as we can in terms of existing concepts. This is such an
important idea that we can treat it as a separate tenet:
(4) The recycling tenet: Existing concepts are ‘recycled’ wherever possible as
properties of other concepts.
The recycling tenet explains the often observed fact that social and conceptual
networks tend to be ‘scale-free’, meaning that they have more densely-linked nodes
than would be expected if links were distributed randomly (Barabasi 2009). It also
has important implications for the formal structure of language which we shall
consider in relation to morphology (section 3) and semantics (section 6).
Next, we have the ‘inheritance tenet’:
(5) The inheritance tenet: We build taxonomies of concepts linked to one another
by the special relation ‘isa’, which allow generalisations to spread down the
taxonomy from more general to more specific concepts.
The ‘isa’ relation is widely recognised as a basic relation (Reisberg 2007:270), and
at least in AI, this process of generalisation is called ‘inheritance’ (e.g. Luger and
Stubblefield 1993: 386), so that more specific concepts can ‘inherit’ properties from
more general ones. Many AI researchers accept the need for inheritance to allow
exceptions, so that properties are inherited only by default. This logic is variously
called ‘default inheritance’, ‘normal inheritance’ or ‘normal mode inheritance’, in
contrast with ‘complete inheritance’ which forbids exceptions. If our basic logic is
default inheritance, this explains one part of the learning tenet above: how we can
accommodate both prototypical and exceptional members in the same category.
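The logic of default inheritance can be made concrete with a minimal sketch. The Python below is purely illustrative: the `Concept` class, its `inherit` method and the bird/penguin taxonomy are assumptions introduced here, not part of any particular AI system or of Word Grammar's own formalism.

```python
class Concept:
    """An atomic node whose content is carried by its links to other concepts."""

    def __init__(self, name, isa=None, **properties):
        self.name = name              # the label is only a mnemonic
        self.isa = isa                # link to the more general concept, if any
        self.properties = properties  # locally stored relation -> value links

    def inherit(self, relation):
        """Return the value for `relation`, searching up the isa chain.

        The nearest (most specific) value wins, so a locally stored
        exception overrides the default of a more general concept.
        """
        node = self
        while node is not None:
            if relation in node.properties:
                return node.properties[relation]
            node = node.isa
        return None


# Hypothetical taxonomy, used only for illustration.
bird = Concept("bird", locomotion="flying", covering="feathers")
penguin = Concept("penguin", isa=bird, locomotion="swimming")  # exception
robin = Concept("robin", isa=bird)

print(robin.inherit("locomotion"))    # default inherited from 'bird'
print(penguin.inherit("locomotion"))  # local exception overrides the default
print(penguin.inherit("covering"))    # other defaults are still inherited
```

Under ‘complete inheritance’ the penguin's exceptional property would be a contradiction; here it simply blocks the default, which is how a single category can hold both prototypical and exceptional members.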
The fifth relevant claim of cognitive psychology and AI concerns the
classification of relations, so we can call it the ‘relation tenet’:
(6) The relation tenet: Most relations are themselves concepts which we learn from
experience.
On the one hand it is obvious that the links in a network are of different types; for
instance, the links from the concept ‘dog’ to ‘animal’, ‘tail’ and ‘bark’ are
fundamentally different from each other. On the other hand, we cannot assume that
these different types are all ‘built in’, part of our inherited conceptual apparatus
(Reisberg 2007: 270). A few of them must be built in, because they underlie our most
basic logical operations, the clearest example being the ‘isa’ relation mentioned
above; but most of them must be learned from experience just like ordinary concepts.
One solution to this dilemma is to recognise these learned relations as a sub-type of
concept: ‘relation’, contrasted with ‘entity’.
The sixth tenet that is relevant to language structure is the ‘activation tenet’:
(7) The activation tenet: Retrieval is guided by activation spreading from node to
node.
Indeed, the strongest evidence for the network tenet is the evidence for spreading
activation, notably the evidence from priming experiments and from speech errors
(Reisberg 2007:254). In any search, the winner is the most active relevant node, and,
all being well, this will also turn out to be the target node. A node’s activation level is
influenced in part by previous experience – hence the statistical differences between
schemas noted earlier – but partly by the immediately current situation to which the
person concerned is paying attention.
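A toy model may help to show how the two influences on activation combine in retrieval. The sketch below is an assumption-laden illustration rather than a model taken from the psychological literature: the function names, the decay rate, the number of steps and the miniature network are all invented here. Resting levels stand in for previous experience, and activation spread from attended nodes stands in for the current situation.

```python
def make_network(pairs):
    """Build an undirected network from (node, node) pairs."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def spread_activation(links, attended, decay=0.5, steps=2):
    """Spread activation outwards from the attended nodes.

    At each step every node on the frontier passes a decayed share of
    its activation to each of its neighbours.
    """
    activation = dict(attended)
    frontier = dict(attended)
    for _ in range(steps):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbour in links.get(node, ()):
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + level * decay
        for node, level in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + level
        frontier = next_frontier
    return activation


def retrieve(links, resting, attended, candidates):
    """Return the most active candidate: resting level plus spread activation."""
    spread = spread_activation(links, attended)
    return max(candidates, key=lambda n: resting.get(n, 0.0) + spread.get(n, 0.0))


# Hypothetical network and resting levels (e.g. reflecting frequency).
links = make_network([("doctor", "nurse"), ("doctor", "hospital"),
                      ("nurse", "hospital"), ("bread", "butter")])
resting = {"nurse": 0.1, "butter": 0.3}

print(retrieve(links, resting, {"doctor": 1.0}, ["nurse", "butter"]))  # primed by 'doctor'
print(retrieve(links, resting, {}, ["nurse", "butter"]))               # resting levels only
```

With ‘doctor’ in the focus of attention the related node wins the search, as in a priming experiment; without it, the node with the higher resting level wins.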
Finally, we note the ‘chunking tenet’:
(8) The chunking tenet: We understand experience by recognising ‘chunks’ and
storing these in memory as distinct concepts (Reisberg 2007:173).
For present purposes, the most important part of this claim is that we create new
concepts for new experiences; so given a series of digits to memorize, we create a
new concept for each digit-token as well as for each ‘chunk’ of digits. This is
evidence for node-creation on a massive scale, though of course most of the nodes
created in this way have a very short life. Those that survive in our memories
constitute the exemplars of the learning tenet, and once in memory they are used in
future experience as a guide to further node-creation.
These seven elementary tenets of cognitive science are closely interconnected.
Take, for example, the temporary exemplar nodes that are created under the chunking
tenet. What the network tenet predicts is that these temporary nodes must be part of
the conceptual network, since that is all there is in conceptual structure; so (by the
recycling tenet) the only way in which we can understand an experience is by linking
it to pre-existing nodes of the conceptual network. The activation tenet predicts that
these nodes are highly active because they are the focus of attention. Moreover, the
inheritance tenet predicts an ‘isa’ relation between each temporary node and some
pre-existing node in the network from which it can inherit properties; this follows
because the aim of classifying exemplar nodes is to inherit unobservable properties,
and ‘isa’ is the relation that allows inheritance. As for the exemplar’s observable
properties, these must also consist of links to pre-existing concepts; but according to
the relation tenet, these links must themselves be classified, so we distinguish the very
general ‘isa’ relation from much more specific relational concepts.
To make these rather abstract ideas more concrete before we turn to
specifically linguistic structures, consider a scene in which you see an object and
recognise it as a cat. Although recognition is almost instantaneous, it must consist of a
series of interconnected steps, with considerable feedback from logically later steps to
earlier ones:
• node-creation: you create highly active temporary nodes for all the objects you
can see, including not only the cat’s parts but also the ‘chunk’ that will
eventually turn out to be a cat.
• classification: you link each temporary node to the most active pre-existing
node in the network which classifies it as a paw, a tail, a cat and so on.
• enrichment: you enrich your knowledge of the cat by inheriting from ‘cat’
through highly active relation nodes; for example, if you want to know
whether to stroke it, this is how you guess an answer.
Each of these processes is covered by some of the seven tenets: node-creation by the
learning, activation and chunking tenets; classification by the learning, recycling,
network, relation and activation tenets, and enrichment by the learning and
inheritance tenets.
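The three steps can be sketched in a few lines of Python. Everything in the sketch is a simplifying assumption made for illustration: the stored concepts, the property names, and above all the use of overlap in observable properties as a crude stand-in for ‘most active pre-existing node’. It also omits the feedback from later steps to earlier ones.

```python
# Hypothetical permanent network: each concept has an isa link plus other links.
stored = {
    "animal": {"isa": None, "observable": set(), "alive": True},
    "cat": {"isa": "animal", "observable": {"fur", "whiskers", "tail"}, "strokeable": True},
    "dog": {"isa": "animal", "observable": {"fur", "tail", "bark"}, "strokeable": True},
}


def create_node(observed):
    """Node-creation: a temporary, as yet unclassified node for an exemplar."""
    return {"isa": None, "observable": set(observed)}


def classify(token, network):
    """Classification: link the token by isa to the best-matching stored concept.

    Overlap of observable properties is used here as a proxy for activation.
    """
    best = max(network, key=lambda name: len(network[name]["observable"] & token["observable"]))
    token["isa"] = best
    return best


def enrich(token, network, relation):
    """Enrichment: inherit an unobservable property up the isa chain."""
    name = token["isa"]
    while name is not None:
        concept = network[name]
        if relation in concept:
            return concept[relation]
        name = concept["isa"]
    return None


token = create_node(["fur", "whiskers", "tail"])
print(classify(token, stored))              # the stored concept chosen for the token
print(enrich(token, stored, "strokeable"))  # inherited from that concept
print(enrich(token, stored, "alive"))       # inherited from further up the taxonomy
```

The point of the sketch is only that classification is what licenses enrichment: once the temporary node has an ‘isa’ link, properties that were never observed become available by inheritance.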
The rest of the paper will explore the consequences of these rather elementary
ideas about general cognition for the theory of language structure, following much the
same logical route as Langacker’s (Langacker 1987) but starting from general
cognitive psychology rather than Gestalt psychology and ending up with more
specific claims about language structure. Whether or not similar bridges can be built
from psychology to other theories of language structure I leave to the experts in those
theories.
Morphology and syntax
One of the issues that divides grammatical theories concerns the fundamental question
of how the patterning of language can be divided into ‘levels’ or ‘strata’, such as
phonology, grammar and semantics. The basic idea of recognising distinct levels is
uncontroversial: to take an extreme case, everyone accepts that a phonological
analysis is different from a semantic analysis. Each level is autonomous in the sense
that it has its own units – consonants, vowels, syllables and so on for phonology, and
people, events and so on for semantics – and its own organisation; but of course they
are also closely related so that variation on one level can be related in detail to
variation on the other by ‘correspondence’ or ‘realisation’ rules.
A much more controversial question concerns the number of levels that should
be distinguished between phonology and semantics. Two answers dominate cognitive
linguistics: none (Cognitive Grammar), and one (construction grammar). Cognitive
Grammar recognises only phonology, semantics and the ‘symbolic’ links between
them (Langacker 2007:427), while construction grammar recognises only
‘constructions’, which constitute a single continuum which includes all the units of
syntax, the lexicon and morphology (Evans and Green 2006:753, Croft and Cruse
2004:255). In contrast, many theoretical linguists distinguish two intervening levels:
morphology and syntax (Aronoff 1994, Sadock 1991, Stump 2001); and even within
cognitive linguistics this view is represented by both Neurocognitive Linguistics
(Lamb 1998) and Word Grammar. I shall now argue that the Cognitive Assumption in
(1) supports the distinction between morphology and syntax.
Consider first the chunking tenet (8). This effectively rules out reductionist
analyses which reduce the units of analysis too far; so a reductionist analysis of the
sequence ‘19452012’ recognises nothing but the individual digits, ignoring the
significant dates embedded in it (1945, 2012). Psychological memory experiments
have shown that human subjects are looking for significant ‘higher’ units and it is
these units, rather than the objective string, that they recognise in memory
experiments. The higher units are the ‘chunks’ which we actively seek, and which
create a ‘higher’ level of representation by ‘representational redescription’
(Karmiloff-Smith 1992, Karmiloff-Smith 1994). A chunk based on a collection of
‘lower’ units is not evidence against the cognitive reality of these units, but coexists
with them in a hierarchy of increasingly abstract levels of representation; so the
individual digits of ‘1945’ are still part of the analysis, but the sequence is not
‘merely’ a sequence of digits – more than the sum of its parts.
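The coexistence of chunks and their parts can be illustrated with a minimal sketch. The greedy longest-match strategy, the `chunk` function and the tiny store of known dates are assumptions made for this example; nothing here is claimed about how human chunking actually proceeds.

```python
def chunk(sequence, known_chunks):
    """Greedy left-to-right chunking of a digit string.

    Each token records the chunk, its classification and its parts, so the
    lower-level digits remain part of the analysis alongside the chunk.
    """
    tokens = []
    i = 0
    while i < len(sequence):
        match = None
        # Try the longest candidate first; a chunk needs at least two digits.
        for size in range(len(sequence) - i, 1, -1):
            candidate = sequence[i:i + size]
            if candidate in known_chunks:
                match = candidate
                break
        if match is None:
            match = sequence[i]  # no known chunk: fall back to a single digit
        tokens.append({
            "chunk": match,
            "isa": known_chunks.get(match, "digit"),
            "parts": list(match),
        })
        i += len(match)
    return tokens


# Hypothetical store of previously learned chunks.
known = {"1945": "date", "2012": "date"}

for token in chunk("19452012", known):
    print(token["chunk"], token["isa"], token["parts"])
```

A purely reductionist analysis would return eight digit tokens; here the same string yields two nodes classified as dates, each still linked to its four digits.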
Chunking is really just one consequence of the recycling tenet (4), which says
that a concept’s properties are represented mentally by links between that concept and
other existing concepts. Rather obviously, this is only possible if the concepts
concerned are represented by single mental nodes (even if these nodes are themselves
represented at a more concrete level by large neural networks). To pursue the previous
example, I must have a single mental node for ‘1945’ because otherwise I couldn’t
link it to the concept ‘date’ or to events such as the end of WWII. But once again it is
important to bear in mind that the existence of the concept ‘1945’ does not undermine
the mental reality of the concepts ‘1’, ‘9’ and so on, which are included among its
properties.
Bearing chunking in mind, consider now Langacker’s claim that language
consists of nothing but phonology, semantics and the links between them:
..., every linguistic unit has both a semantic and phonological pole... Semantic
units are those that only have a semantic pole ... phonological units ... have
only a phonological pole. A symbolic unit has both a semantic and a
phonological pole, consisting in the symbolic linkage between the two. These
three types of unit are the minimum needed for language to fulfill its symbolic
function. A central claim ... is that only these are necessary. Cognitive
Grammar maintains that a language is fully describable in terms of semantic
structures, phonological structures, and symbolic links between them.
(Langacker 2007:427)
This claim ignores the effects of chunking. Take a simple example such as cat,
pronounced /kat/. True, this is phonology, but it is not mere phonology. The
‘symbolic unit’ needs a single phonological pole, and in this simple case we might
assume this is the syllable /kat/ - a single phonological unit – but meaning-bearing
units need not correspond to single phonological units.
Apart from the obvious challenge of polysyllabic words, the literature of
linguistics is full of examples where phonological boundaries are at odds with word
boundaries, so that (unlike the case of cat) words cannot be identified as a single
phonological unit. An extreme case of this mismatch arises in cliticization; for
example, the sentence You’re late clearly contains a verb, written as ’re, and has the
same meaning and syntax as You are late. And yet, in a non-rhotic pronunciation
there is no single phonological unit that could be identified as the ‘phonological pole’
of the symbolic unit corresponding to are. The words you’re sound different from
you, but, given the absence of /r/, the difference lies entirely in the quality of the
vowel. No doubt the phonological analysis could be manipulated to provide a single
unit, such as an abstract feature, but this would be to miss the point, which is that
phonological analysis pays attention to completely different qualities from
grammatical analysis.
In short, chunking creates mental nodes which are only indirectly related to
phonological structure; so we must recognise units corresponding not only to syllables
such as /kat/, but also to sequences of syllables such as /katǝgri/ (category) or parts of
syllables such as the vowel quality of /jɔ:/ (you’re). These more abstract units, like the
dates discussed earlier, are mapped onto more concrete units by properties; so just as
‘1945’ is mapped onto ‘1’ and so on, symbolic units can be mapped onto
phonological structures such as /kat/, /katǝgri/ or /jɔ:/. But this is very different from
saying that the chunk is a phonological structure.
In fact, this view is just the same as the traditional view that words mediate
between phonology and meaning. The chunk is the word, and it exists precisely
because it serves this mediating function: it brings together a recognisable stretch of
phonology and a recognisable bit of meaning which systematically cooccur. And it
brings sound and meaning together by having both of them as its properties; so in
traditional terms, the word cat is pronounced /kat/ and means ‘cat’. But of course once
we have recognised the category of ‘words’, and allowed them to have two properties
– a pronunciation and a meaning – there is nothing to stop us from treating them just
like any other type of concept, with multiple properties. So just as linguists have been
saying for centuries, words can be categorized (as nouns, verbs and so on), words
have a language (such as English or French), words have a spelling, they can be
classified as naughty or high-fallutin’, and they even have a history and an etymology.
This view of words as collecting points for multiple properties is an automatic
consequence of chunking combined with the network principle.
None of this is possible if a symbolic unit is simply a link between a semantic
pole and a phonological pole. A link is a property, but it cannot itself have properties;
so even if there is a link between /kat/ and ‘cat’, it cannot be extended to include, say,
a spelling link; the only way to accommodate spelling in this model would be to add
another symbolic link between ‘cat’ and the spelling <cat>, as though /kat/ and <cat>
were mere synonyms. In contrast, recognising words allows us to treat spellings and
pronunciations, correctly, as alternative realisations of the same unit. Moreover, if
meanings and pronunciations belong to a much larger set of properties, we can expect
other properties to be more important in defining some words or word classes. Rather
surprisingly, perhaps, Langacker himself accepts that some grammatical classes may
have no semantic pole:
At the other end of the scale [from basic grammatical classes] are idiosyncratic
classes reflecting a single language-specific phenomenon (e.g. the class of
verbs instantiating a particular minor pattern of past-tense formation).
Semantically the members of such a class may be totally arbitrary. (Langacker
Presumably such a class is a symbolic unit (albeit a schematic one), but how can a
symbolic unit have no semantic pole, given that a unit without a semantic pole is by
definition a phonological unit (ibid: 427)?
My conclusion, therefore, is that Langacker’s ‘symbolic units’ are much more
than a mere link between meaning and sound: they are units with properties –
concepts, defined like all other concepts by their links to other concepts. Following
the network tenet that structures can be represented as networks of atomic nodes, this
conclusion can be presented as a rejection of the first diagram in Figure 1 in favour of
the second:
Figure 1: Symbolic units as concepts rather than links
Suppose, then, that symbolic units are like the unit labelled ‘CAT’. If so, they
are indistinguishable from ‘constructions’ as defined by construction grammar. This
brings us to the second issue of this section: how well does the notion of
‘construction’ accommodate the traditional distinction between morphology and
syntax? As mentioned earlier, one of the claims of construction grammar is that
constructions comprise a single continuum which includes all the units of syntax, the
lexicon and morphology. Croft describes this as ‘one of the fundamental hypotheses
of construction grammar: there is a uniform representation of all grammatical
knowledge in the speaker’s mind in the form of generalized constructions’ (Croft
2007:471). According to Croft, the ‘syntax-lexicon continuum’ includes syntax,
idioms, morphology, syntactic categories and words. This generalisation applies to
Cognitive Grammar as well as construction grammar: ‘the only difference between
morphology and syntax resides in whether the composite phonological structure ... is
smaller or larger than a word.’ (Langacker 1990:293).
Like other cognitive linguists I fully accept the idea of a continuum of
generality from very general ‘grammar’ to a very specific ‘lexicon’; but the claim that
syntax and morphology are part of the same continuum is a different matter. To make
it clear what the issue is, here is a diagram showing an analysis of the sentence Cats
miaow in the spirit of construction grammar. The diagram uses the standard ‘box’
notation of construction grammar, but departs from it trivially in showing meaning
above form, rather than the other way round.
Figure 2: Cats miaow: a construction analysis in box notation
We must be careful not to give too much importance to matters of mere
notation. According to the network tenet (3), all conceptual knowledge can be
represented as a network consisting of nothing but links and atomic nodes; this rules
out networks whose nodes are boxes with internal structure, such as the one in Figure
2. However, a box is just a notation for part-whole relations, so it will be helpful to
remind ourselves of this by translating this diagram into pure-network notation, with
straight arrows pointing from wholes to their parts. Bearing this convention in mind,
the next figure presents exactly the same analysis as Figure 2:
Figure 3: Cats miaow: a construction analysis in network notation
As before, the vertical lines show the ‘symbolic’ link between each form and
its meaning, and the diagram illustrates Croft’s generalisation about constructions
providing a single homogeneous analysis for the whole of grammar, including the
morphology inside cats as well as the syntax that links cats to miaow. His claim fits
well with the ‘plain vanilla’ American Structuralist tradition in which morphology is
simply syntax within the word – a tradition represented not only by pre-Chomskyans
(Harris 1951, Hockett 1958) but also by Chomsky himself in his famous analysis of
English auxiliaries as two morphemes (e.g. be+ing), and by Distributed Morphology
(Halle and Marantz 1993). But how well does it mesh with the Cognitive Assumption
(and indeed, with the linguistic facts)? I shall now present an objection to it based on
the prevalence of homonymy.
The argument from homonymy is based on the recycling tenet (4) and goes
like this. When we first meet a homonym of a word that we already know, we don’t
treat it as a completely unfamiliar word because we do know its form, even though we
don’t know its meaning. For instance, if we already know the adjective ROUND (as
in a round table), its form is already stored as a ‘morph’ – a form which is on a higher
abstraction level than phonology; so when we hear Go round the corner, we recognise
this form, but find that the expected meaning doesn’t fit the context. As a result, when
creating a new word-concept for the preposition ROUND we are not starting from
scratch. All we have to do is to link the existing form to a new word. But that means
that the existing form must be conceptually distinct from the word – in other words,
the morph {round} is different from the words that we might write as ROUNDadj and
ROUNDprep. Wherever homonymy occurs, the same argument must apply: the normal
processes of learning force us to start by recognising a familiar form, which we must
then map onto a new word, thereby reinforcing a structural distinction between the
two levels of morphology (for forms) and syntax (for words), both of which are
different from phonology and semantics. The proposed analysis for the homonymy of
round is sketched in Figure 4.
Figure 4: Homonymy and the levels of language
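The two-level analysis of round can also be stated as a small data structure. The following Python sketch is only an illustration and is not part of Word Grammar itself: the class Network, the relation labels, the meaning glosses ‘circular’ and ‘around’, and the phonological transcription are all assumptions introduced here. It stores the morph {round} once and links it to two distinct words, so that the morph has no direct link to either meaning.

```python
from collections import defaultdict


class Network:
    """A network of atomic nodes joined by labelled, directed links."""

    def __init__(self):
        # (source node, relation label) -> set of target nodes
        self.links = defaultdict(set)

    def add(self, source, relation, target):
        self.links[(source, relation)].add(target)

    def get(self, source, relation):
        """Targets reached from `source` by `relation` (empty set if none)."""
        return set(self.links.get((source, relation), set()))

    def sources(self, relation, target):
        """All nodes whose `relation` link points at `target`."""
        return {s for (s, r), ts in self.links.items()
                if r == relation and target in ts}


net = Network()

# Syntax: two distinct words, each with its own word-class and meaning.
# The glosses are hypothetical labels chosen for this illustration.
net.add("ROUND_adj", "word-class", "adjective")
net.add("ROUND_adj", "meaning", "'circular'")
net.add("ROUND_prep", "word-class", "preposition")
net.add("ROUND_prep", "meaning", "'around'")

# Morphology: one stored form realizes both words.
net.add("ROUND_adj", "realization", "{round}")
net.add("ROUND_prep", "realization", "{round}")

# Phonology: the morph is itself realized by sounds (assumed transcription).
net.add("{round}", "realization", "/raʊnd/")

# Hearing the familiar form leads to both words ...
print(sorted(net.sources("realization", "{round}")))
# ... but the form has no meaning link of its own.
print(net.get("{round}", "meaning"))
```

Learning the preposition on this model adds one new word node and one new ‘realization’ link to a form that is already stored, which is the sense in which the learner is not starting from scratch.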
It might be thought that this conclusion could be avoided simply by linking the
form {round} directly to its two different meanings. But this won’t do for purely
linguistic reasons: because the different meanings are also associated with very
different syntactic properties, so distinct concepts are needed to show these
correlations. The problem is especially acute in the case of bilingual individuals, who
may link the same meaning to words from different languages where the syntactic and
morphological differences are even greater. The fact is that round is not mere
phonology, because it is a listed, and recognised, ‘English word’ – i.e. a word-form.
But this word-form is actually shared by (at least) two distinct words, each with its
different syntactic properties and each distinct from the meaning that it expresses. In
short, we need to distinguish the forms of morphology from the words of syntax.
Moreover, the relation between a word and its form is different from the part-whole
relations that are recognised within both morphology and syntax. It makes no sense to
say that {round} is a part of ROUND for the simple reason that, if we can measure
‘size’ at all, they are both the same size at least in the sense that they both map onto
the same number of phonological segments. Rather, the relation between form and
word is the traditional relation called ‘realization’, where the form ‘realizes’ the word
(by making it more ‘real’, or more concrete).
The psychological reality of morphological form is evident in another area of
language-learning: ‘folk etymology’, where we try to force a new form into an
existing pattern, regardless of evidence to the contrary. For instance, our word
bridegroom developed out of Old English bryd-guma when the word guma ‘man’ fell
out of use, leaving the guma form isolated. The form groom had the wrong meaning
as well as the wrong phonology, but it had the great virtue of familiarity – i.e. it
already existed in everybody’s mind as an independent concept – so it was pressed
into service (‘recycled’) for lack of a better alternative. The fact that folk etymology
has affected so much of our vocabulary, and continues to do so, is clear evidence of
our enthusiasm for recycling existing forms so that every word consists of familiar
forms, regardless of whether this analysis helps to explain their meaning.
This conclusion about morphology and syntax undermines the main distinctive
claim of construction grammar. According to the construction-based analysis of Cats
miaow in Figure 3, the relation of the morphs cat and s to the word cats is just the
same as the latter’s relation to the entire sentence, namely a part-whole relation. But if
the argument from homonymy is correct, the morphs {cat} and {s} exist on a different
level of analysis from the word cats (which we might write, to avoid ambiguity, as
CAT:plural); and the morphs’ relation to the word is different from the word’s
relation to the rest of the sentence. In short, grammatical structure within the word is
not merely a downwards extension of grammatical structure above the word; as has
often been pointed out by morphologists since Robins (Robins 2001), morphology is
not just syntax within the word.
It could be objected that languages vary much more than we might expect (if
morphology and syntax are fundamentally different) in the sheer amount of
morphology that they have, with ‘analytical’ languages such as Vietnamese showing
virtually none. How can a language have no inflectional morphology at all if
morphology is a logical necessity in any language? This objection misses the point of
the argument, where there was no mention of the morphological processes or patterns
that we associate with inflectional morphology (or, for that matter, with derivational
morphology). The only claim is that if a language has homonyms, then it will also
distinguish words from the forms that realize them, however simple or complex those
realization relations may be.
We can push the argument a step further by questioning the constructional
claim that every unit of grammar has a meaning. This claim is explicit in Figure 3,
where the form s means ‘plural’. In contrast, the two-level analysis of Figure 4 has no
direct link between the morph {round} and either meaning, and a similar analysis of
cats would only link {s} to ‘plural’ via the word CAT:plural. This separation of form
from meaning follows from the homonymy argument: if meanings are correlated (as
they are) with syntactic behaviour, they must be a property of words, not forms; so
homonyms must be words that share the same form, not meanings that (directly) share
the same form.
So far, then, the argument seems to support a rather traditional four-level
analysis of language in which semantic structures relate to words, which relate to
morphs, which relate to phones (or whatever the units of phonological structure may
be). However, one of the weaknesses of the traditional view of language architecture
is its restrictiveness; it not only claims that meanings are related to words, but it also
claims that they cannot be related to morphs or to phones. This is psychologically
very implausible, for the simple reason that we learn by recording correlated patterns,
so if a morph correlates with a particular meaning, there is nothing to prevent us from
learning the correlation as a link between the two. So if the morph {un} correlates
with the meaning ‘not’, it seems likely that we will record this link in addition to any
more specific link that we may establish between ‘not’ and words such as UNTIDY.
There is no reason to believe that we avoid redundancy – in fact, redundancy is to be
expected in any adaptive system – so we can assume considerable direct linkage
between morphs and meanings. Moreover, one of the continuing embarrassments for
the traditional model has always been the existence of ‘phonesthemes’ and other kinds
of sound symbolism, such as the meaning ‘indolence or carelessness’ shared by words
such as slack, slattern, sleazy, slob, slut (Reay 2006). Such examples seem to show a
direct connection between phonological patterns and meanings – a connection which
is even more obvious in the case of intonation, where neither morphs nor words are
available to mediate the link to meaning.
The conclusion of this argument, then, is that homonymy automatically pushes
learners to create separate mental representations for morphs and for words.
Typically, it is words that are directly linked to meanings, while morphs only have
meanings indirectly by virtue of the words that they help to realize, and phones are
even more indirectly related to meanings via morphs and words; but exceptions can
be found in which morphs, or even phones, have meanings. This is a very different
model from construction grammar, in which a construction is by definition a pairing
of a meaning with a form, and words and morphs co-exist on the same level as
‘forms’. Nor, on the other hand, is it quite the same as published versions of Word
Grammar (Hudson 2007, Hudson 2010), which all assume that only words can have
meaning.
Encouragingly, the argument from homonymy takes us to exactly the same
conclusion that many linguists reached in the 1950s by simply looking at the facts of
language. In those days the choice was between the American Structuralist approach
called ‘Item and Arrangement’ (with its process-based alternative called ‘Item and
Process’) and the European approach called ‘Word and Paradigm’ (Robins 2001). The
argument centred round the place of the word in grammatical analysis, with the
Americans tending to deny it any special place and the Europeans making it central;
for the Americans, the grammar within the word is simply a downwards extension of
syntax, whereas for the Europeans it was different. The European argument centred
on languages such as Latin in which it is very easy to show that morphs have no direct
or simple relation to meanings; for example, in the Latin verb amo, ‘I love’, the suffix
{o} expresses person, number and tense all bundled up together. The conclusion is
that, unlike words, morphs have no meaning in themselves; instead, they help to
realise word-classes (such as ‘first person singular present tense’), and it is these that
have meaning. Of course this is no longer a debate between Europe and America, as
many leading American morphologists accept the same conclusion (Aronoff 1994,
Sadock 1991, Stump 2001); but it is very noticeable that the literature of construction
grammar follows the American Structuralist path rather than engaging with the
Equally encouragingly, the separation of phonology, morphology and syntax
is confirmed by psycholinguistic priming experiments, which offer a tried and tested
way to explore mental network structures (Reisberg 2007:257). In a priming
experiment, the dependent variable is the time (in milliseconds) taken by a person to
recognise a word that appears on a screen in front of them, and to make some decision
about it which depends on this recognition; for example, as soon as the word doctor
appears, the subject may have to decide whether or not it is an English word and then
press one of two buttons. The independent variable is the immediately preceding
word, which may or may not be related to the current word; for instance, doctor might
follow nurse (related) or lorry (unrelated). What such experiments show is that a
related word ‘primes’ the current word by making its retrieval easier, and therefore
faster; and crucially, they allow us to distinguish the effects of different kinds of
priming:
• semantic (nurse – doctor; Bolte and Coenen 2002, Hutchison 2003)
• lexical (doctor – doctor; Marslen-Wilson 2006)
• morphological (factor – doctor; Frost and others 2000)
• phonological (nurse – verse; Frost and others 2003)
Once again, therefore, we have evidence for a three-level analysis which breaks down
the general notion of ‘form’ into three kinds of structure, each with its own distinctive
units: phonology, morphology and syntax (whose units are words). This is the
architecture claimed in Word Grammar (and many other theories of grammar), but it
conflicts with construction grammar and even more so with Cognitive Grammar.
One possible objection to this line of argument is that it could be the thin end
of a very large wedge which would ultimately reduce the argument to absurdity. Why
stop at just the three formal levels of phonology, morphology and syntax? Why not
recognise four or five levels, or, for that matter, forty or fifty? This question requires a
clear understanding of how levels are differentiated: in terms of abstractness, and not
in terms of either size or specificity. For instance, the only difference between the
word DOG, the morph {dog} and the syllable /dɒg/ is their abstractness, because they
all have the same size and the same degree of specificity. DOG has properties such as
its meaning and its word-class which are more abstract than those of {dog}, which in
turn are more abstract than those of /dɒg/. The question, therefore, is whether three
levels of abstractness is in some way natural or normal. In relation to English, might
we argue for further levels? Are there other languages where the evidence points to
many more levels? Maybe, but it seems unlikely given that we already have evidence
from massively complicated morphological systems that a single level of morphology
is enough even for them (Sadock 1991).
Phrases or dependencies
Another structural issue which has received very little attention in the cognitive
linguistics literature is the choice between two different views of syntax. Once again,
cognitive linguistics is firmly located in the tradition of American Structuralism rather
than in its European rival. The American tradition dates from the introduction of
‘immediate constituent analysis’, which inspired modern Phrase Structure Grammar
(PSG; Bloomfield 1933, Percival 1976). In contrast, the European tradition of
Dependency Grammar (DG) dates back a great deal further, and certainly at least to
the Arabic grammarians of the 8th century (Percival 1990, Owens 1988), and provided
the basis for a great deal of school grammar on both sides of the Atlantic. Among the
theories of grammar that are aligned with cognitive linguistics, Word Grammar is the
only one that even considers Dependency Grammar as a possible basis for syntax.
The essential difference between the two approaches lies in their assumptions
about what relations can be recognised in syntax. For PSG, the only possibility is the
very elementary relation of meronymy: the relation between a whole and its parts.
This restriction follows automatically from the definition of a phrase-structure tree as
equivalent to a bracketed string. Brackets are very elementary devices whose purpose
is simply to distinguish parts from non-parts; for instance, the brackets in ‘a [b c]’
show that b and c are both parts of the larger unit ‘b c’, but a is not. A bracketed string
is inherently incapable of giving information about the relations between parts and
non-parts (e.g. between a and b). In contrast, DG focuses on the relations between
individual words, recognising traditional relations such as ‘subject’, ‘complement’
and ‘modifier’. To take a very simple example, consider Small birds sing. PSG
recognises the phrase small birds as well as the whole sentence, but it recognises no
direct relation at all between small and birds, or between birds and sing. In contrast,
the individual words and their direct relations are all that a DG recognises, although
the phrase small birds is implicit in the link between small and birds.
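The contrast between the two kinds of structure can be made concrete in a few lines of Python. This is a minimal sketch under stated assumptions: the nested-tuple encoding of the bracketed string, the triple format (head, relation, dependent) and all function names are introduced here for illustration, and each word is assumed to occur only once in the sentence. The sketch shows that a bracketing records only whole-part pairs, whereas a dependency analysis records word-word links from which the phrases can still be derived.

```python
# PSG: the bracketed string [[small birds] sing] as nested tuples.
psg = (("small", "birds"), "sing")


def part_whole_pairs(tree):
    """Return (whole, part) pairs, the only relations a bracketing encodes."""
    pairs = []
    if isinstance(tree, tuple):
        for part in tree:
            pairs.append((tree, part))
            pairs.extend(part_whole_pairs(part))
    return pairs


# DG: labelled links between individual words, as (head, relation, dependent).
dg = [("birds", "modifier", "small"), ("sing", "subject", "birds")]
order = ["small", "birds", "sing"]


def directly_related(a, b, pairs):
    """True if the two items are joined by a single link in either direction."""
    return (a, b) in pairs or (b, a) in pairs


def implied_phrase(head, deps, order):
    """The phrase implicit in the dependencies: a head plus all its dependents."""
    words = {head}
    for h, _, d in deps:
        if h == head:
            words |= set(implied_phrase(d, deps, order))
    return [w for w in order if w in words]


psg_pairs = part_whole_pairs(psg)
dg_pairs = [(h, d) for h, _, d in dg]

print(directly_related("small", "birds", psg_pairs))  # False: only part-whole links
print(directly_related("small", "birds", dg_pairs))   # True: a direct word-word link
print(implied_phrase("birds", dg, order))             # ['small', 'birds']
```

The last line recovers small birds from the dependency links alone, without any phrase node being stored.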
Of course, there are versions of PSG which include the traditional relations as
an enrichment of the basic part-whole relation, so that small birds is recognised
explicitly as the subject of the sentence. This is true in Lexical Functional Grammar
and Head-Driven Phrase Structure Grammar, as well as in other ‘functional’ theories
such as Relational Grammar and Systemic Functional Grammar. More relevantly
here, it is also a characteristic of construction grammar (except Radical Construction
Grammar). But this compromise should not obscure the fact that the relations
concerned are still basically part-whole relations. All these versions of PSG, including
those found in CL, still exclude direct relations between words, such as those between
small and birds and between birds and sing.
To clarify the issues it will again help to consider concrete diagrammatic
structures, so here are three diagrams for the syntactic structure of Small birds sing.
(A) is pure PSG, without function labels; (B) is a compromise analysis which is at
least within the spirit of construction grammar in terms of the information conveyed,
even if tree notation is not popular in the cognitive linguistics literature; and (C) is an
example of DG enriched with function labels. The ‘stemma’ notation in (C) was
introduced by the first DG theorist (Tesnière 1959) and is widely used in DG circles.
It has the advantage of comparability with the tree notation of PSG, but I shall suggest
a better notation for dependencies below. All three diagrams show syntax without
semantics, but this is simply because the present topic is syntactic relations. This is
also why the nodes are unclassified.
Figure 5: Phrase structure or Dependency structure
The issue can be put in concrete terms as two related questions about the
example Small birds sing: What is the subject, and what is it the subject of? In the
European dependency tradition, the subject is birds, and it is the subject of sing. In
contrast, the American PSG tradition takes small birds as the subject of the sentence
Small birds sing. The PSG tradition has had such an impact on grammatical theory that
most of its followers take it as obviously true, but it has serious weaknesses. These
weaknesses are particularly clear if we start from the Cognitive Assumption, which makes
PSG especially problematic for cognitive linguistics. This is already recognised in
Cognitive Grammar:
Symbolic assemblies exhibit constituency when a composite structure ... also
functions as component structure at another level of organization. In Cognitive
Grammar, however, grammatical constituency is seen as being variable,
nonessential and nonfundamental. (Langacker 2007:442)
I shall start with the specifically cognitive weaknesses, before turning to more
familiar ‘purely linguistic’ weaknesses.
From a cognitive point of view, PSG has two weaknesses, to do with relations
and with tokens. The first weakness is the extreme poverty of its assumptions about
possible relations, in contrast with the assumption accepted by almost every
psychologist that cognitive structure is an associative network (Ferreira 2005). By
excluding all relations except that between a whole and its parts, it excludes a great
deal of normal cognitive structure – and not least, the whole of social structure, based
as it is entirely on relations such as ‘mother’ and ‘friend’. The relation tenet (6) asserts
that relations of many different types can be learned and distinguished from one
another, so there is no reason to prioritise the part-whole relation. And if other
relations are possible, then we can freely relate any object to any other object. So just
as we can relate one human to another in our social world, in the realm of syntax we
can relate one word directly to another. Although Bloomfieldian Immediate
Constituent Analysis had roots in German psychology, Chomsky’s formalisation in
terms of bracketed strings and rewrite rules removed any semblance of psychological
The second cognitive weakness of PSG is its assumption about tokens, i.e.
about the sentences that a grammar generates. The symbols in a sentence structure are
seen as mere copies of those used in the grammar; so if the sentence structure
contains, say, the symbol N (for ‘noun’), this is the same symbol as the N in the
grammar; and the symbol birds in the sentence is the same as the one in the grammar.
Once again we find a very impoverished view of cognitive capacity, in which the only
operation that we can perform is to copy symbols and structures from the grammar
onto the sentence structure. In contrast, the chunking tenet (8) says that when we
encounter a new experience, we build a new node for it, and use relevant parts of our
stored knowledge to enrich this node. In this view, the nodes for tokens are much
more than mere copies of those in the grammar for the relevant types; although the
tokens inherit most of the properties of the types, they also have a great many other
properties, reflecting the particularities of their context. If the sentence contains the
word birds, this is not just the stored word birds, less still the lexeme BIRD or the
category ‘plural’; instead, it is a distinct concept from all these, with properties that
include the other words in the sentence.
Suppose, then, that we take the relation and chunking tenets seriously. What
mental structure would we expect for the sentence Small birds sing? First, small and
birds are clearly related, so we expect a direct relation between these words; equally
clearly, the relation between small and birds is asymmetrical because small modifies
the meaning of birds, rather than the other way round. In traditional terminology,
small is the ‘modifier’ and depends on birds, which is the implied phrase’s head. But
if the meaning of birds is modified by small, it follows that birds – i.e. this particular
token of birds – does not mean ‘birds’, but means ‘small birds’; so there is no need to
postulate a larger unit small birds to carry this meaning. In other words, the meaning
of a phrase is carried by its head word, so the phrase itself is redundant. Furthermore,
the link between small and birds explains why they have to be positioned next to each
other and in that order. We return to questions of word order in section 7, but we can
already see that direct word-word links will play an important part in explaining word
order.
Similar arguments apply to the subject link. Once again, this is a direct
asymmetrical link between two single word-tokens: birds (not: small birds) and sing,
bearing in mind that birds actually means ‘small birds’. As in the first example, the
dependent modifies the head’s meaning, though in this case the kind of modification
is controlled by the specifics of the dependency, traditionally called ‘subject’. Given,
then, that birds is the subject of sing, and that birds means ‘small birds’, it follows
that the word-token sing doesn’t just mean ‘sing’, but means ‘small birds sing’. So
once again, the head word (sing) carries the meaning of the whole phrase, and no
separate phrasal node is needed. In this case, the ‘phrase’ is the entire sentence, so we
conclude that the sentence, as such, is not needed. The proposed analysis is presented
in Figure 6.
Figure 6: Small birds sing: syntax and semantics
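The claim that each head word-token carries the meaning of its implied phrase can be sketched as a short recursive procedure. The Python below is an illustration only: the triple format and the function name token_meaning are assumptions introduced here, and semantic modification is crudely modelled as joining glosses in surface order, which is far simpler than any real semantic analysis.

```python
# Dependencies for "Small birds sing" as (head, relation, dependent).
deps = [("birds", "modifier", "small"), ("sing", "subject", "birds")]
order = ["small", "birds", "sing"]


def token_meaning(word, deps, order):
    """Meaning of a word token: its own sense enriched by its dependents.

    Each dependent contributes its own (already enriched) token meaning,
    so no separate phrase node is needed to hold the combined meaning.
    """
    parts = {word: word}
    for head, _, dep in deps:
        if head == word:
            parts[dep] = token_meaning(dep, deps, order)
    return " ".join(parts[w] for w in order if w in parts)


for w in order:
    print(w, "means", repr(token_meaning(w, deps, order)))
```

Running the loop gives ‘small’ for small, ‘small birds’ for birds and ‘small birds sing’ for sing, so the meaning of the whole sentence ends up on the token of the verb.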
This diagram includes the earlier ‘stemma’ notation for syntactic relations
alongside a more obvious notation for the ‘meaning’ relation. This is inconsistent, as
the ‘meaning’ relation and the syntactic dependency relations look more different than
they should. After all, part of the argument above in favour of dependency analysis is
that our conceptual apparatus already contains plenty of relations of various kinds in
domains outside language, so the same apparatus can be applied to syntax. The
trouble with ‘stemma’ notation is that it is specially designed for syntax, so it
obscures similarities to other domains. In contrast, the notation for the ‘meaning’
relation is simply a general-purpose notation for relations as defined by the relation
tenet of (6). This general-purpose notation works fine for syntax, and indeed avoids
problems raised by the rather rigid stemma notation, so we now move to standard
Word Grammar syntactic notation, as in Figure 7, with arrows pointing from a word
to its dependents.
Figure 7: Small birds sing: Word Grammar notation
This notation should appeal to cognitive linguists looking for an analysis
which treats syntax as an example of ordinary conceptual structures. After all, it
makes syntax look like a network, which (according to the network tenet of (3)) is
what ordinary conceptual structure is like; in contrast, the notation used for syntax in
the cognitive linguistics literature is hard to relate to general conceptual structure.
Admittedly, the structure in Figure 7 may not look obviously like a network, but more
complicated examples certainly do. The sentence in Figure 8 makes the point, since
every arrow in this diagram is needed in order to show well-known syntactic patterns
such as complementation, raising or extraction. This is not the place to justify the
analysis (Hudson 2007, Hudson 2010), but it is worth pointing to one feature: the
mutual dependency between which and do. If this is indeed the correct analysis, then
the case for dependency structure is proven, because mutual dependency is impossible
in PSG. (And even assuming dependency structure, it is impossible in stemma
notation.) In the diagram, ‘s’ and ‘c’ stand for ‘subject’ and ‘complement’. The
notational convention of drawing some arrows above the words and others below will
be explained in the discussion of word order in section 7.
Figure 8: A complex syntactic network
Another way to bring out the network-like nature of syntactic structure is to go
beyond the structure of the current tokens in order to show how this structure is
derived from the underlying grammar. Returning to the simpler example, Small birds
sing, we know that birds can be the subject of sing because the latter requires a
subject, a fact which is inherited from the ‘verb’ node in the grammar; and similarly,
small can be the dependent of birds because it is an adjective, and adjectives are
allowed to modify nouns. The word tokens inherit these properties precisely because
they are part of the grammar, thanks to ‘isa’ links to selected word types. This is all as
predicted by the inheritance tenet (5), and can be visualised in the diagram below,
where the small triangle is the Word Grammar notation for the ‘isa’ relation. In
words, because the token small isa the lexeme SMALL which isa adjective, and
because an adjective is typically the modifier of a noun (represented by the left-hand
dot, which isa noun), it can be predicted (by inheritance) that small is also the
modifier of a noun, which (after some processing) must be birds. A similar logic
explains the subject link from sing to birds. The main point is that the syntactic
structure for the sentence is a small part of a much larger network, so it must itself be
a network.
Figure 9: Small birds sing and its underlying grammar
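The inheritance logic just described can be sketched in Python. This is a minimal illustration under stated assumptions: the dictionary encoding, the node labels (including BIRD:plural, modelled on CAT:plural above), the attribute names and the requirement that a subject isa noun are all simplifications introduced here, not claims about the Word Grammar formalism.

```python
# 'isa' links from tokens to lexemes and from lexemes to word-classes.
isa = {
    "small (token)": "SMALL",
    "SMALL": "adjective",
    "birds (token)": "BIRD:plural",
    "BIRD:plural": "noun",
    "sing (token)": "SING",
    "SING": "verb",
    "adjective": "word",
    "noun": "word",
    "verb": "word",
}

# Properties stored once, on the general categories in the grammar.
properties = {
    "adjective": {"modifier of": "noun"},
    "verb": {"subject": "noun"},  # assumption: subjects are nouns
}

tokens = ["small (token)", "birds (token)", "sing (token)"]


def inherit(node, attribute):
    """Walk up the isa chain and return the nearest stored value, if any."""
    while node is not None:
        if attribute in properties.get(node, {}):
            return properties[node][attribute]
        node = isa.get(node)
    return None


def is_a(node, category):
    """True if `category` is reachable from `node` by isa links."""
    while node is not None:
        if node == category:
            return True
        node = isa.get(node)
    return False


def candidates(token, attribute):
    """Other tokens in the sentence that satisfy the inherited requirement."""
    category = inherit(token, attribute)
    return [t for t in tokens if t != token and is_a(t, category)]


print(candidates("small (token)", "modifier of"))  # ['birds (token)']
print(candidates("sing (token)", "subject"))       # ['birds (token)']
```

Nothing about modifiers or subjects is stored on the tokens themselves. Both links are predicted from the categories that the tokens isa, which is why the sentence structure is simply an extension of the larger grammar network.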
The main conclusion of this section is that syntax looks much more like
general conceptual structure if we allow words to relate directly to one another, as in
dependency analysis, than if we apply a rigid PSG analysis. Dependency structure is
very similar to many other areas of conceptual structure, such as social structure,
whereas it is hard to think of any other area of cognition which allows nothing but
part-whole relations. To make a rather obvious point, it is arguably PSG that has
encouraged so many of our colleagues to accept the idea that language is unique; but
PSG is simply a method of analysis, not a demonstrated property of language. The
only way to prove that PSG is in fact correct is to compare it with the alternatives,
especially dependency analysis; but this has hardly happened in the general literature,
let alone in the cognitive linguistics literature.
Quite apart from the cognitive arguments for direct dependencies between
words, it is easy to find ‘purely linguistic’ evidence; for example, most lexical
selection relations involve individual words rather than phrases. Thus the verb
DEPEND selects the preposition ON, so there must be a direct relation between these
two words which is impossible to show, as a direct relation, in PSG. In the next
section I shall show how this advantage of dependency structure applies in handling
However, it is important once again not to lurch from one extreme position to
its opposite. I have objected to PSG on the grounds that it rules out direct links
between words, a limitation which is arbitrary from a cognitive point of view; but it
would be equally arbitrary to rule out ‘chunking’ in syntax. In our simple example,
maybe we recognise small birds as a chunk as well as recognising the relations
between its parts. Given that we clearly do recognise part-whole relations in other
areas of cognition, it would be strange if we were incapable of recognising them in
syntax. Even in the area of social relations, we can recognise collectivities such as
families or departments as well as relations among individuals. And in syntax there are
some phenomena which, at least at first sight, seem to be sensitive to phrase
boundaries. One such is mutation in Welsh (Tallerman 2009), which seems to be
triggered by the presence of a phrase boundary and cannot be explained satisfactorily
in terms of dependencies. As we have already seen, there is no reason to think that
cognition avoids redundancies; indeed, it is possible that gross redundancy is exactly
what we need for fast and efficient thinking. This being so, we cannot rule out the
possibility that, even if syntactic structure includes direct word-word dependencies, it
also includes extra nodes for the phrases that these dependencies imply (Rosta 1997,
Rosta 2005).
Idioms and constructions
I have argued for a rather traditional view of language structure in which words are
central both by virtue of being the units that define a level of language which is
distinct from phonology and morphology, and also as the main (and perhaps only)
units on that level. My main argument was based on the Cognitive Assumption and its
tenets, but I also showed that the conclusion is required by more traditional types of
evidence. We now consider how this model of structure accommodates the
grammatical patterns that are so familiar from the constructional literature: idioms (9),
(10), clichés or formulaic language (11), meaning-bearing constructions (12), and
non-canonical constructions (13).
(9) He kicked the bucket.
(10) He pulled strings.
(11) It all depends what you mean.
(12) Frank sneezed the tissue off the table.
(13) How about a cup of tea?
All such examples show some kind of irregularity or detail that supports the idea of
usage-based learning (rather than mere ‘triggering’ of an inbuilt propensity), which in
turn is predicted by the learning tenet (2) and the inheritance tenet (5). But if the
proposed view of language structure makes these patterns relatively easy to
accommodate, this will also count as further ‘purely linguistic’ evidence for this view.
The following comments build on a number of earlier discussions of how Word
Grammar might handle constructions (Gisborne 2009, Gisborne 2011, Holmes and
Hudson 2005, Hudson 2008, Sugayama 2002).
One unifying feature of all the patterns illustrated in these examples is that
they involve individual words rather than phrases. This is obvious in most of the
examples; for instance, the meaning ‘die’ is only available if the verb is KICK
combined with BUCKET. An apparent counterexample is (12), from Goldberg
1995:152, where it is given as an example of the ‘caused-motion construction’.
According to Goldberg, ‘the semantic interpretation cannot be plausibly attributed to
the main verb’, but this is not at all obvious. After all, it is precisely because the verb
sneezed describes an action that it can be turned into a causative, and it is because it
needs no object in its own right that an extra one can be added. I shall argue below
that this is a straightforward example of word-formation, in which the explanation for
the syntactic pattern and its meaning revolves around one word, the verb. If this is so,
then individual words play a crucial role in every pattern that has been claimed to
require ‘constructions’; and crucially, phrase structure, as such, plays no such role.
We shall now work through the various types of ‘construction’ listed above,
starting with idioms. The main challenge of idioms is that the idiomatic meaning
overrides the expected literal meaning, so we need an analysis that includes the literal
meaning as well as the idiomatic one, while also showing that the literal meaning is
merely potential. A system that generated only one analysis would miss the point as
much as an analysis of a pun which showed only one of the two interpretations. This
view is supported by the psycholinguistic evidence that literal word meanings become
active during idiom production (Sprenger and others 2006), and that the syntactic
analysis of an idiom proceeds as normal (Peterson and others 2001). For example, in
processing He kicked the bucket, we activate all the normal syntactic and semantic
structures for KICK and BUCKET as well as the meaning ‘die’. A cognitive network
is exactly what is needed to explain a multiple analysis like this because it
accommodates the effects of spreading activation: the processes by which activation
from the word token kicked spreads to the lexeme KICK, and then as activation
converges from the and bucket, focuses on the sub-lexeme KICKdie, as found in kick
the bucket.
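The convergence of activation described here can be illustrated with a toy computation. The following Python sketch is only an illustration: the node names follow the text, but the link weights, the decay factor and the number of steps are invented assumptions, not part of Word Grammar or of the cited psycholinguistic studies.

```python
# Toy network: each node lists the neighbours it passes activation to.
# Node names follow the text; all weights are invented for illustration.
LINKS = {
    "kicked": {"KICK": 1.0},
    "the": {"KICK_die": 0.5},
    "bucket": {"BUCKET": 1.0, "KICK_die": 0.5},
    "KICK": {"'kick' (literal)": 0.6, "KICK_die": 0.4},
    "KICK_die": {"'die'": 1.0},
    "BUCKET": {"'bucket' (literal)": 1.0},
}


def spread(sources, links, steps=3, decay=0.5):
    """Propagate activation outward from the source nodes for a few steps."""
    activation = {node: 1.0 for node in sources}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, level in frontier.items():
            for neighbour, weight in links.get(node, {}).items():
                gain = level * weight * decay
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + gain
        # Add this step's incoming activation to the running totals.
        for node, gain in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + gain
        frontier = next_frontier
    return activation


only_kicked = spread(["kicked"], LINKS)
whole_idiom = spread(["kicked", "the", "bucket"], LINKS)
```

In this sketch the meaning 'die' receives more activation when all three word tokens are heard than when only kicked is heard, while the literal meaning of KICK remains active in both cases.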
The diagram in Figure 10 shows the end-state of this processing, but in
principle a series of diagrams complete with activation levels could have shown the
stages through which a speaker or hearer passes in reaching this end-state. In words,
the word token kicked isa KICKdie, which in turn isa KICK. According to the
inheritance tenet (5), kicked inherits the properties of both these nodes, so it inherits
from KICKdie the property of requiring the bucket as its object; and thanks to the
workings of default inheritance, the lower node’s meaning overrides the default.
Figure 10: An idiom and its literal counterpart
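The inheritance pattern just described can be sketched in code. This is a minimal sketch under stated assumptions: the class Node, the method inherit and the property names (meaning, object, past_form) are hypothetical labels chosen for illustration, and the lookup simply takes the nearest value on the isa chain, which is one simple way of modelling default inheritance.

```python
class Node:
    """A concept with local properties and an optional 'isa' parent."""

    def __init__(self, name, isa=None, **properties):
        self.name = name
        self.isa = isa
        self.properties = properties

    def inherit(self, prop):
        """Return the nearest value found when walking up the isa chain."""
        node = self
        while node is not None:
            if prop in node.properties:
                return node.properties[prop]
            node = node.isa
        return None


# The lexeme KICK supplies the default meaning and the morphology.
KICK = Node("KICK", meaning="kick", past_form="kicked")
# The sub-lexeme overrides the meaning and adds a requirement on the object.
KICK_DIE = Node("KICK_die", isa=KICK, meaning="die", object="the bucket")
# The word token isa the sub-lexeme, and so indirectly isa KICK.
token = Node("kicked (token)", isa=KICK_DIE)
```

Here token.inherit("meaning") yields the sub-lexeme's 'die' rather than the default 'kick', while token.inherit("past_form") still reaches the morphology stored on KICK.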
The key notion in this analysis of idioms is ‘sub-lexeme’, one of the
distinctive ideas of Word Grammar. Sub-lexemes are common throughout the lexicon,
and not just in handling idioms; for example, the lexeme GROW carries important
shared properties such as irregular morphology, but its transitive and intransitive uses
combine different syntactic patterns with different meanings, so each requires a
different sub-lexeme of GROW. This analysis reveals the similarities of morphology
as well as the differences of syntax and meaning. The same apparatus seems well
suited to the analysis of idioms, which combine special syntax and meaning with a
close link to the literal pattern.
Unfortunately, Jackendoff thinks otherwise.
Another solution would be to say that kick has a second meaning, ‘die’, that
only can be used in the context of the bucket. Coincidentally, in just this
context, the and bucket must also have second meanings that happen to be
null. Then the meaning of the idiom can in fact be introduced with a single
word. The difficulty with this solution is its arbitrariness. There is no non-theory-internal reason to concentrate the meaning in just one of the
morphemes. (Jackendoff 2008)
His objection is strange, given the rather obvious fact that the idiom’s meaning ‘die’
is the literal meaning of a verb, so the verb is the obvious word to receive the idiom’s
meaning. Moreover, the analysis that he himself offered ten years earlier did
‘concentrate the meaning in just one of the morphemes’ by linking the meaning ‘die’
directly to the verb kick (Jackendoff 1997:169).
The apparatus of sub-lexemes and default inheritance allows us to model
different degrees of irregularity in idioms. The classic discussion of idioms (Nunberg
and others 1994) distinguished just two types: ‘idiomatic phrases’ such as kick the
bucket and ‘idiomatically combining expressions’ such as pull strings, which allow a
great deal more syntactic flexibility (e.g. Strings have been pulled; Strings are easy to
pull; He pulled a lot of strings.) However, the historical development of idioms argues
against such a clear division. Today’s metaphors tend to turn into tomorrow’s idioms,
which become increasingly opaque as the original metaphor vanishes from sight. It
seems much more likely that there is a continuum of irregularity, with kick the bucket
at one end and pull strings near the other end. Whereas kick the bucket overrides the
entire meaning of KICK, pull strings builds on a living metaphor, so the idiomatic
meaning isa the literal one: if I pull strings for my nephew, this is presented as a
deviant example of literally pulling strings. This explains the syntactic freedom. In
between these two examples, we find idioms such as cry wolf which derives its
meaning from a fairy story which some people know, but whose meaning is related in
such a complicated way to the story that it shows virtually no syntactic freedom.
Turning next to formulaic language such as It all depends (on) what you mean,
usage-based learning guarantees that we store a vast number of specific exemplars,
including entire utterances as well as individual word tokens. Every time we hear an
example of a stored utterance its stored form becomes more entrenched and more
accessible, which encourages us to use it ourselves, thereby providing a feed-back
loop which maintains the overall frequency of the pattern in the general pool of
Word Grammar provides a very simple mechanism by which formulaic
expressions are stored: the concepts that represent the word tokens do not fade away
as most tokens do, but persist and become permanent (Hudson 2007:54). For instance,
imagine someone hearing (or saying) our earlier example: Small birds sing.
Immediately after the event, that person’s mind includes a structure like the one
shown in Figure 9, but the nodes for the word tokens are destined to degenerate and
disappear within seconds. That is the normal fate of word tokens, but sometimes the
tokens are memorable and are remembered – which means that they are available if
the same tokens occur again. This process is logically required by any theory which
accounts for the effects of frequency:
While the effects of frequency are often not noted until some degree of
frequency has accumulated, there is no way for frequency to matter unless
even the first occurrence of an item is noted in memory. Otherwise, how
would frequency accumulate? (Bybee 2010:18)
In short, formulaic language is exactly what we expect to find, in large quantities, in
the language network; and it is represented in exactly the same way as the utterances
from which it is derived.
Meaning-bearing constructions such as the ‘caused-motion construction’ are
more challenging precisely because they go beyond normal usage. If anyone actually
said Frank sneezed the tissue off the table, they would certainly be aware of breaking
new linguistic ground, which is the exact opposite of the situation with idioms and
formulaic language. The standard constructional analysis of such cases was presented
by Goldberg (Goldberg 1995:152-79), so we can take this analysis as our starting
point. Figure 11 shows Goldberg’s diagram (from page 54) for the transitive use of
SNEEZE. This diagram is the result of unifying two others: the one in the middle line
for ordinary intransitive SNEEZE, and the one for the caused-motion construction
(which accounts for the top and bottom lines). The letter ‘R’ stands for the relation
between these two patterns, which at the start of the middle line is explained as
‘means’, expressing the idea that sneezing is the ‘means’ of the motion (rather than,
say, its cause or manner).
[Diagram not reproduced; surviving labels: ‘R: means’, SNEEZE, < sneezer >, < cause, theme >]
Figure 11: Caused motion in constructional notation
The constructional notation is unhelpful in a number of respects, but the most
important is its semantic rigidity. The trouble is that it requires a one-one match
between words and semantic units, so one verb can only express one semantic unit,
whose arguments are expressed by the verb’s various dependents. This is a problem
because the sentence Frank sneezed the tissue off the table actually describes two
separate events: Frank sneezing, and the tissue moving off the table. Frank is the
‘sneezer’, but only in relation to the sneezing; and the tissue is the theme of the
moving, but not of the sneezing. Similarly, if Frank (rather than the sneeze) is a cause,
it is in relation to the moving, and not the sneezing; and off the table describes the
direction of the moving, and not of the sneezing. Collapsing these two events into a
single semantic structure is at best confusing, and arguably simply wrong.
In contrast, the network notation of Word Grammar in Figure 12 provides
whatever flexibility is needed. The analysis keeps as close as possible to Goldberg’s,
and the example is simplified, in order to focus on the benefits of flexibility in the
semantic structure. The dotted lines link the words to their meanings, and as usual the
little triangles show ‘isa’ relations. The main benefit of this network notation is the
possibility of separating the node labelled ‘Frank sneeze it off’ from the one labelled
‘it off’. The former is the meaning of the verb token sneezed, which, as usual, shows
the effects of all the dependents that modify the verb. The latter is a single semantic
entity which is contributed jointly by the object and directional adjunct (Goldberg’s
‘oblique’). In itself it isa motion, with a theme and a direction, but in relation to the
sneezing, it is the ‘result’ of the verb’s action. Notice how the two-event analysis
avoids all the problems noted above; so Frank is the sneezer, but plays no role at all in
the movement, and contrariwise ‘it’ and ‘off’ define the movement but have nothing
directly to do with the sneezing.
[Diagram not reproduced; surviving node labels: ‘Frank sneeze it off’, ‘it off’, SNEEZE, cause-move]
Figure 12: Caused motion in Word Grammar notation
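The two-event analysis can also be written out as a small data structure. The sketch below is an illustration only: the class Concept and its fields are hypothetical, the isa values for Frank, ‘it’ and ‘off’ are assumptions, and only the relation names (sneezer, result, theme, direction) are taken from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    label: str
    isa: str
    relations: dict = field(default_factory=dict)


frank = Concept("Frank", isa="person")
it = Concept("it", isa="thing")
off = Concept("off", isa="direction")

# The moving event: contributed jointly by the object and the adjunct.
moving = Concept("it off", isa="motion",
                 relations={"theme": it, "direction": off})

# The sneezing event: Frank is the sneezer, and the motion is its result.
sneezing = Concept("Frank sneeze it off", isa="sneezing",
                   relations={"sneezer": frank, "result": moving})
```

Because the two events are separate nodes, Frank has no role in the motion and the theme has no role in the sneezing; the only link between the events is the ‘result’ relation.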
The last kind of ‘construction’ is represented here by How about a cup of tea? Such
examples have always been central to Word Grammar (Hudson 1990:5-6), where I
call them ‘non-canonical’, but they also play an important part in the literature of
constructions where the classic discussion calls What’s X doing Y a ‘non-core’
construction because the normal ‘core’ rules are suspended (Kay and Fillmore 1999).
To change examples, the ‘core’ rules require a sentence to have a finite verb; but there
are exceptions without finite verbs such as the how about pattern in How about a cup
of tea?. Once again, Word Grammar provides the necessary amount of flexibility,
thanks to the focus on word-word dependencies and the possibility of ‘sub-lexemes’
mentioned above. Indeed, it seems that these ‘constructions’ all consist of a
continuous chain of dependencies, a claim which cannot even be expressed in terms
of phrase structure, so dependency structure is more appropriate than phrase structure
(Holmes and Hudson 2005). The diagram in Figure 13 shows how the sub-lexemes
HOWx and ABOUTx relate to each other and to their super-lexemes, and hints at a
semantic analysis paired with the syntactic one. (Note how the notation allows us to
ignore the vertical dimension when this is convenient; in this case, the isa triangles
face upwards, reversing the normal direction.) This semantic sketch will be developed
in the next section as an example of what is possible in a network-based analysis of
‘how about x?’
Figure 13: How about x? Dependency syntax
This section has shown how easily Word Grammar accommodates the
idiosyncratic patterns that lie at the heart of what I have called ‘constructional’
analyses: idioms such as kick the bucket and pull strings, clichés or formulaic
language such as It all depends what you mean, meaning-bearing constructions such
as the caused-motion construction, and non-canonical constructions such as how
about X?. The key pieces of theoretical apparatus are sub-lexemes, default
inheritance, token-based learning, flexible networks with classified relations and, of
course, dependency structure in syntax. But how does this argument leave the notion
of ‘construction’?
It all depends what you mean by ‘construction’ (and ‘construction grammar’),
and as we have already seen, this varies a great deal from theory to theory, and from
person to person. If ‘construction grammar’ is simply the same as ‘cognitive
linguistics’, then nothing changes. Since Word Grammar is definitely part of
cognitive linguistics, it must also be an example of ‘construction grammar’ (though it
is hard to see what is gained by this double naming). Similar conclusions follow if
‘construction grammar’ is a grammatical theory that rejects the distinction between
‘grammar’ and ‘lexicon’, and recognises very specific grammatical patterns alongside
the very general ones. Here too Word Grammar is an ordinary example of
construction grammar (Gisborne 2008).
However, the debate becomes more interesting if we give ‘construction’ a
more precise meaning, and define ‘construction grammar’ as a grammar that
recognises nothing but constructions. For many authors, a construction is by
definition a pairing of a ‘form’ (some kind of formal pattern, whether phonological,
morphological or syntactic) with a meaning: ‘The crucial idea behind the construction
is that it is a direct form-meaning pairing that has sequential structure and may
include positions that are fixed as well as positions that are open.’ (Bybee 2010:9).
This definition implies a very strong claim indeed: that every pattern that can be
recognised at the levels of syntax or morphology can be paired with a single meaning.
We have already seen (in section 3) that this claim is untenable if morphology and
syntax are recognised as distinct levels, because (in this view) morphological
structures are typically not directly linked to meaning. Since Word Grammar does
recognise morphology and syntax, the grammatical patterns that it recognises cannot
be described as ‘constructions’. And yet Word Grammar can accommodate all the
idiosyncratic patterns that are often quoted as evidence for constructions.
In this sense, then, Word Grammar is a radical departure from ‘construction
grammar’, as radical as Croft’s Radical Construction Grammar, which departs in
exactly the opposite direction. For Word Grammar, the basic units of syntax are
words and the dependency relations between them. In contrast, Croft believes that the
basic units are meaning-bearing constructions:
Radical Construction Grammar ... proposes that constructions are the basic or
primitive elements of syntactic representation and defines categories in terms
of the constructions they occur in. For example, the elements of the
Intransitive construction are defined as Intransitive Subject and Intransitive
Verb, and the categories are defined as those words or phrases that occur in
the relevant role in the Intransitive construction. (Croft and Cruse 2004:284,
repeated as Croft 2007: 496).
Croft’s example of a non-construction that would not be recognised is ‘verb’.
Furthermore, the term ‘Intransitive Subject’ is presumably merely shorthand for
something more abstract such as ‘the noun that expresses the actor’, or less abstract,
such as ‘the noun before the verb’, since ‘there are no syntactic relations in Radical
Construction Grammar’ because such relations are redundant if morphosyntactic clues
are related directly to semantic relations (ibid:497).
As I commented earlier, the claims of Radical Construction Grammar are not
derived from the Cognitive Assumption; indeed, they conflict directly with some of
the tenets, including the principle for learning that Croft himself expressed so well. In
discussing the choice between fully general analyses without redundancy and fully
redundant listing of specific examples of general patterns, he concludes: ‘grammatical
and semantic generality is not a priori evidence for excluding the more specific
models’ (Croft 1998). This is generally accepted as one of the characteristics of
usage-based learning, so we can assume that any construction is stored not only as a
single general pattern, but also as a collection of individual exemplars that illustrate
the pattern. Indeed, the learning tenet requires the exemplars to be learned before the
general pattern, because these are the material from which the general pattern is
learned. But if exemplars can be mentally represented separately from the general
construction, and if they are even represented before the construction, how can the
construction be more basic? In short, the basic tenets of the Cognitive Assumption
support the more traditional approach which Croft criticizes as ‘reductionist’, in
which more abstract and general patterns are built out of more concrete and specific
Semantic/encyclopedic structures
If the Cognitive Assumption (1) is right, it follows that there can be no boundary
between ‘linguistic meaning’ and general conceptual structure, and therefore no
boundary between ‘dictionary’ meaning and ‘encyclopedic information’. The typical
meaning of a word or a sentence is simply the part of general conceptual structure that
is activated in the mind of the speaker and hearer. This view of meaning is one of the
tenets of cognitive linguistics (including Word Grammar) in contrast with the more
‘classical’ or ‘objectivist’ approaches to semantics that have dominated linguistic
semantics. Cognitive linguistics cannot match the massive apparatus of formal logic
that these approaches bring to bear on the analysis of meaning, but once again the
Cognitive Assumption may be able to guide us towards somewhat more formal
analyses than have been possible so far.
The most relevant consequence of the Cognitive Assumption is the recycling
tenet (4), the idea that each new concept is defined in terms of existing concepts. This
immediately rules out any boundary between ‘dictionary’ and ‘encyclopedia’, because
any dictionary entry is bound to recycle an encyclopedic entry. For example, take a
child learning the word CAT and its meaning: the child stores the word and looks for
a potential meaning in general memory, where the (encyclopedic) concept ‘cat’ is the
obvious candidate. Recycling guarantees that this concept is the one that the child
uses as the meaning of CAT.
Recycling also rules out a popular approach to lexical semantics in which
lexical meanings are defined in terms of a pre-existing metalanguage such as the
‘Natural Semantic Metalanguage’ suggested by Wierzbicka (Wierzbicka 1996). The
argument goes like this (Hudson and Holmes 2000): Once a concept has been created,
it is available as a property of other concepts, and should be recycled in this way
whenever it is relevant. But new concepts cannot be used in this way if the only
elements permitted in a definition are drawn from the elementary semantic
metalanguage. To take a concrete example, consider Wierzbicka’s definition of a
bicycle (Wierzbicka 1985:112), which refers to the pedals in at least three places: as a
part of the structure, as the source of power, and as the place for the rider’s feet. The
problem is that ‘pedal’ is not part of the metalanguage, so a circumlocution (‘parts for
the feet’) has to be used, obscuring the fact that each reference is to the same object.
In contrast, the recycling tenet requires us to recognise the concept ‘pedal’ and to
name this concept whenever it is relevant; but this of course is totally incompatible
with any attempt to define every concept solely in terms of a fixed list of primitives.
Another tenet highly relevant to semantic structure is the network tenet (3),
which requires every scrap of information to be expressed in terms of network
structures. This means that network notation has to be available for every analysis that
can be expressed in other notations such as the predicate calculus. Take, for example,
the universal and existential quantifiers which distinguish semantically between
sentences such as the following:
(14) Everyone left.
∀x, person(x) → left(x) (‘For every x, if x is a person then x left’)
(15) Someone left.
∃x, person(x), left(x) (‘There is an x such that x is a person and x left’)
The sentences undeniably have different meanings, and the linear notation of formal
semantics distinguishes them successfully, but the challenge is to translate the linear
notation into a network. Thanks to the inheritance tenet (5), the solution is
surprisingly easy (Hudson 2007:33-4). Universal quantification is simply inheritance,
because any instance of a category automatically inherits all of that category’s
properties (unless of course they are overridden). Consequently, we can represent the
meaning of Everyone left as shown in the first diagram of Figure 14. According to
this diagram, the typical person left, so one of the properties to be inherited by any
example of ‘person’ is that they left. In contrast, the diagram for Someone left shows
that some particular person (represented by the dot) left, so leaving is not a property
of ‘person’ and therefore cannot be inherited by other people.
Figure 14: Everyone left and someone left.
This simple and natural analysis of universal and existential quantification
shows the benefit of starting from the Cognitive Assumption; and of course this
assumption also leads to an analysis which is cognitively more plausible than
traditional logic because default inheritance allows exceptions. As in ordinary
conversation, the generalisation that everyone left is not, in fact, overturned
completely if it turns out that a few people did not leave; exceptions are to be
expected in human reasoning.
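The contrast between the two diagrams, and the tolerance of exceptions, can be mimicked in a few lines of code. This is a minimal sketch under stated assumptions: the class Concept, the property name left and the individuals Alice, Bob and the straggler are all hypothetical, and nearest-value lookup on the isa chain stands in for default inheritance.

```python
class Concept:
    """A node with local properties and an optional 'isa' parent."""

    def __init__(self, name, isa=None, **properties):
        self.name = name
        self.isa = isa
        self.properties = properties

    def get(self, prop):
        """Default inheritance: the nearest value on the isa chain wins."""
        node = self
        while node is not None:
            if prop in node.properties:
                return node.properties[prop]
            node = node.isa
        return None


# 'Everyone left': leaving is a property of the category 'person' itself.
person_a = Concept("person", left=True)
alice = Concept("Alice", isa=person_a)
straggler = Concept("straggler", isa=person_a, left=False)  # an exception

# 'Someone left': leaving belongs to one particular, unnamed person.
person_b = Concept("person")
someone = Concept("(some person)", isa=person_b, left=True)
bob = Concept("Bob", isa=person_b)
```

In the first network Alice inherits the property of having left and the straggler overrides it without disturbing the generalisation; in the second, nothing about leaving can be inherited by Bob.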
Word Grammar offers structural analyses for many other areas of semantics
(Gisborne 2010; Hudson 1984:131-210; Hudson 1990:123-166; Hudson 2007:211-248; Hudson 2010:220-241), all of which are informed by the Cognitive Assumption.
The example of universal and existential quantification illustrates a general
characteristic of these analyses: patterns that other theories treat as special to
semantics turn out to be particular cases of much more general cognitive patterns that
are therefore found in other areas of language structure. We shall consider another
example in the discussion of word order (section 7), where I argue that word order
rules build on two relations also found in semantics: the landmark relation expressed
by spatio-temporal prepositions, and the temporal relations expressed by tense. This
sharing of patterns across linguistic levels is exactly as expected given the Cognitive
Assumption, but it is rarely discussed in other theories.
This article is not the place to summarise all the possibilities of Word
Grammar semantics, but it may be helpful to illustrate them through the analysis of
one concrete example. In the previous section I gave a syntactic analysis of how about
x? in Figure 13 as an example of a non-canonical construction, with a promise of a
fuller semantic analysis which I can now redeem. The meaning of the syntactic
pattern how about x? is given as a node labelled ‘how about x’, an analysis which
does at least show that ‘x’ is defined by the complement of about, though it leaves the
rest of the meaning completely unanalyzed and undefined. But how might we define a
notion like this? Constructional analyses generally leave semantic elements without
definition; for example, Kay and Fillmore’s analysis of the meaning of the WXDY
construction recognises an element called ‘incongruity-judgment’ made by a
pragmatically-identified judge called ‘prag’ about the entity defined by the ‘x’ word.
That is as far as the semantic analysis goes. But according to the network tenet (3),
concepts are defined by their links to other concepts, so any semantic element can be
defined by links to other concepts.
The first challenge in the analysis of how about X? is its illocutionary force. If
I say How about a cup of tea?, I am asking a yes-no question, just as in Is it raining?
The only oddity is that this yes-no question is introduced by a wh-word, how (or
what). Even if the construction originated in a wh-question such as What do you think
about ..., the meaning has now changed as much as the syntax (just as it has in how
come ....?). How, then, can a semantic network indicate illocutionary force? This is a
rather fundamental question for any model, but especially for a usage-based model in
which all the contextual features of utterances are part of the total analysis; but
cognitive linguistics has so far produced few answers. In contrast, Word Grammar has
always had some suggestions for the structural analysis of illocutionary force. The
earliest idea was that it might be defined in terms of how ‘knowledge’ was distributed
among participants (Hudson 1984:186-197), and knowledge is clearly part of the
However, I now believe that the recycling principle (4) points to a simpler first
step: linking to the notions ‘ask’ and ‘tell’, which are already needed as the meanings
of the lexemes ASK and TELL. This is just like the ‘performative hypothesis’ of
Generative Semantics (Ross 1970) except that the ‘performative’ structure is firmly in
the semantics rather than in syntax (however ‘deep’). And as in the performative
analysis, the speaker is the asker or teller, the addressee is the ‘askee’ or the ‘tellee’,
and the content of the sentence is what we can call the ‘theme’ – the information
transferred from one person to the other. For most theories, this analysis would be
very hard to integrate into a linguistic structure because of the deictic semantics
involved in ‘speaker’ and ‘addressee’ which link a word token to a person, thereby
bridging the normal gulf between ‘language’ and ‘non-language’; but for Word
Grammar there is no problem because the Cognitive Assumption rejects any boundary
between language and non-language. The analysis of a word token is a rich package
which includes a wide range of deictic information specific to the token – its speaker,
its addressee, its time and place, and its purpose: what the speaker was trying to
achieve by uttering it. In the case of the sentence-root, which carries the meaning of
the entire sentence, its purpose may be to give information or to request it – in other
words, the token’s purpose is its illocutionary force (Gisborne 2010).
This treatment of illocutionary force is applied to How about X? in Figure 15,
which is based in turn on Figure 13 above. The one relation which is not labelled in
this diagram is that between ‘how about x?’ and ‘x’. We might be tempted to label it
‘theme’, but we should resist this temptation because x is not in fact the thing
requested; for example, How about John? is not a request for John. I develop this
point below.
[Diagram not reproduced; surviving labels: ‘how about x?’, addressee, purpose]
Figure 15: The meaning of How about x? – illocutionary force
The next challenge, therefore, is to decide what the ‘theme’ of the question is.
If, for example, How about John? is not a request for John, what does it want as an
answer? Clearly, it is a request for either ‘yes’ or ‘no’, information about the truth of
some proposition which we can call simply ‘p’; but what is ‘p’? This varies with the
situation as illustrated by the following scenarios:
(16) We need someone strict as examiner for that thesis, so how about John?
(17) You say you don’t know any linguists, but how about John?
(18) If you think Mary is crazy, how about John?
Similar variation applies to How about a cup of tea?, but this is so entrenched and
conventional that it needs no linguistic context. In every case, then, the x of How
about x? is suggested as a possible answer to a currently relevant question of identity
– the identity of a possible examiner in (16), of a linguist in (17) and of someone
crazy in (18). What is needed in the structural analysis of How about x?, therefore, is
an extra sub-network representing this identity-question, combined with a
representation of x as a possible answer and the truth value of the answer.
This supplementary network has two parts: one part which relates p to the
‘theme’ of the query, the choice between true and false, and another part which relates
p to x. Starting with the choice, this involves the Word Grammar treatment of truth in
terms of the primitive relation ‘quantity’, whose values range over numbers such as 0
and 1 (Hudson 2007:224-8). A node’s quantity indicates how many examples of it are
to be expected in experience; so 1 indicates precisely one, and 0 none. This contrast
applies to nouns as expected; so a book has a referent with quantity 1, while the
referent of no book has quantity 0; but it also applies to finite verbs, where it can be
interpreted in terms of truth. For example, the verb snowed in It snowed refers to a
situation with quantity 1, meaning that there was indeed a situation where it snowed;
whereas the root-word in It did not snow has a referent with quantity 0, meaning that
no such situation existed. Seen in this light, a yes/no question presents a choice
between 1 (true) and 0 (false) and asks the addressee to choose one of them. The
‘quantity’ relation is labelled ‘#’ in diagrams, so Figure 16 shows that the proposition
‘p’ has three quantities: ?, 1 and 0.
[Diagram not reproduced; surviving labels: {1, 0, ?}, ‘how about x?’, addressee, purpose]
Figure 16: The meaning of How about x? - content
The mechanics of choice in Word Grammar are somewhat complicated because
they involve two further primitive relations called ‘or’ and ‘binding’; these have a
special status alongside ‘isa’ and a small handful of other relations. A choice is
defined by a set that includes the alternatives in the mutually exclusive ‘or’ relation,
and a variable labelled ‘?’ which is simply a member (Hudson 2010:44-47). When
applied to the choice between 1 and 0, we recognise a set {1, 0, ?} which contains 1
and 0 as its mutually-exclusive ‘or’ members as well as an ordinary member called ‘?’
which can bind to either of them. The ‘or’ relation is shown by an arrow with a
diamond at its base while binding is represented by a double arrow; so the sub-network at the top of Figure 16 shows that ‘?’, the theme of ‘how about x’, is either 1
or 0.
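The logic of such a choice set can be sketched as follows. The sketch is illustrative only: the class Choice and its method bind are hypothetical names, and the code models nothing more than the two ideas in the text, namely a set of mutually exclusive alternatives and a variable ‘?’ that can be bound to exactly one of them.

```python
class Choice:
    """A set with mutually exclusive 'or' members and a variable '?'."""

    def __init__(self, alternatives):
        self.alternatives = tuple(alternatives)
        self.bound_to = None  # the variable '?' starts out unbound

    def bind(self, value):
        """Bind '?' to one of the alternatives, and to nothing else."""
        if value not in self.alternatives:
            raise ValueError(f"{value!r} is not one of {self.alternatives}")
        self.bound_to = value
        return self


# The quantity of the proposition p: 1 (true) or 0 (false).
quantity_of_p = Choice([1, 0])
```

On this view, asking a yes/no question presents the unbound variable to the addressee, and answering ‘yes’ amounts to calling quantity_of_p.bind(1).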
We now have an analysis which shows that How about x? means ‘I am asking you
whether p is true’, where p has some relation to x. The remaining challenge is to
explain how p relates to x. It will be recalled that p is a proposition (which may be
true or false), but of course everything in this network must be a concept (because that
is all we find in conceptual networks), so propositions must be a particular kind of
concept. In this case, the proposition is the ‘state of affairs’ (Pollard and Sag 1994:19)
in which two arguments, labelled simply ‘a’ and ‘b’, are identical: the proposition ‘a =
b’. For instance, in (16) the proposition p is ‘the examiner we need = John’. The
identity is once again shown by the primitive binding relation introduced above,
which is shown in Figure 16 as binding a concept ‘q’ to x. Of these two concepts, we
already know x as the referent of word x; for example, in How about John?, x is John.
The other concept, q, is more challenging because it is the variable concept,
the hypothetical examiner, linguist or crazy person in (16), (17) and (18). What these
concepts have in common is that they have some currently active relationship to some
other currently active entity. The entity and relationship may be explicit, as in (16)
(examiner of that thesis), but How about a cup of tea? shows that they need not be.
The analysis in Figure 16 shows the connection to currently active structures by
means of the binding procedure (which triggers a search for the currently most active
relevant node); so node e at the top of the diagram needs to be bound to some active
entity node, illustrating how the permanent network can direct processes that are often
considered to be merely ‘pragmatic’. Most importantly, however, the same process
applies to the relationship node labelled ‘r’, binding it to an active relationship; so
relationships and entities have similar status in the network and are subject to similar
mental operations. This similarity of relationships and entities is exactly as required
by the relation tenet (6), which recognises relationships (other than primitives) as a
particular kind of concept; and it is built into the formal apparatus of Word Grammar.
I am not aware of any other theory that treats relationships in this way.
This completes the semantic analysis of What about x?, showing that it means
something like: ‘I am asking you whether it is true that x is relevantly related to the
relevant entity’, where relevance is defined in terms of activation levels. The main
point, of course, is not the correctness of this particular analysis, but the formal
apparatus that is needed, and that Word Grammar provides. The main facilities were
binding, relational concepts, quantities, mutually exclusive or-relations and network
notation (with the possibility of adding activation levels). Exactly as we might expect
given the Cognitive Assumption, none of these facilities is unique to semantics.
Order of words, morphs etc.
In this section we return to the ‘formal’ levels of syntax, morphology and phonology,
but we shall find that they share some of the formal structures found in semantics. The
question is how our minds handle the ordering of elements – word order in syntax,
morph order in morphology and phone order in phonology. Once again the choice of
notation is crucial. The standard notation of ‘plain vanilla’ syntax (or morphology or
phonology) builds strongly on the conventions of our writing system, in which the
left-right dimension represents time; so it is easy to think that a diagram such as
Figure 2, the box-notation constructional analysis of Cats miaow, already shows word
order adequately. But the network tenet requires every analysis to be translated into
network notation, and the fact is that a network has no ‘before’ or ‘after’, or left-right
dimension. In a network, temporal ordering is one relationship among many and has
to be integrated with the whole. But of course the ordering of elements presents many
challenges for linguistic theory beyond questions of notation, so what we need is an
analysis which will throw some light on these theoretical questions.
Another issue that arises in all questions of ordering is whether the ordering is
spatial or temporal. Once again, given our experience of left-right ordering in writing
we all tend to think in spatial terms, and indeed linguists often talk about ‘leftward
movement’ or even ‘fronting’ (assuming that the ‘front’ of a sentence is its leftmost
part). But given the logical primacy of spoken language, this spatial orientation is
actually misleading, so we need to think of words and sounds as events in time; and
that means that ordering is temporal, rather than spatial. Admittedly, this choice is
blurred by the tendency for temporal relations to be described metaphorically as
though they were spatial; but for consistency and simplicity the terminology will be
temporal from now on.
Temporal ordering of behaviour, including language, requires analysis at two
levels of abstraction. On the one hand we have the ‘surface’ ordering which simply
records the order of elements in a chain. This is the only kind of ordering there is if
the chain is arbitrarily ordered, as in a telephone number (or a psychological memory
experiment): so in a series such as ‘3231’, all we can say is that a 3 is followed by a 2,
which is followed by another 3, which is followed by a 1. The same is true in any
ordered series such as the digits, the alphabet or the days of the week. The only
relation needed to record this relation is ‘next’ (which, to judge by the difficulty of
running a series backwards, seems to spread activation in only one direction, so it may
be a primitive relation). In network notation we might show this relation as a straight
but dotted arrow pointing towards the next element, as in Figure 17. Notice in this
diagram how the default ‘next’ relations are overridden by the observed ones, and
how much easier it would have been to remember the series in the default order: 1 2 3
Figure 17: The 'next' relation in a series of numbers
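As a rough illustration of surface ordering, the series '3231' can be stored as tokens connected only by a one-way 'next' link and compared with the default 'next' relations among the digits. The sketch below is an assumption-laden toy: the names DEFAULT_NEXT, link_series, read_forward and overridden_links are all hypothetical, and activation is not modelled.

```python
# Default 'next' relations among the digits (0 -> 1, 1 -> 2, ... 8 -> 9).
DEFAULT_NEXT = {d: d + 1 for d in range(9)}


def link_series(values):
    """Build tokens for an observed series; each token points only to its 'next'."""
    tokens = [{"value": v, "next": None} for v in values]
    for i in range(len(tokens) - 1):
        tokens[i]["next"] = i + 1  # index of the following token
    return tokens


def read_forward(tokens, start=0):
    """Follow 'next' links from a starting token; there is no backward link."""
    out, i = [], start
    while i is not None:
        out.append(tokens[i]["value"])
        i = tokens[i]["next"]
    return out


def overridden_links(tokens):
    """Return (value, observed next value) pairs that override the default 'next'."""
    overrides = []
    for tok in tokens:
        if tok["next"] is None:
            continue
        observed = tokens[tok["next"]]["value"]
        if DEFAULT_NEXT.get(tok["value"]) != observed:
            overrides.append((tok["value"], observed))
    return overrides


series = link_series([3, 2, 3, 1])
```

Here two of the three observed links (3 to 2, and 3 to 1) conflict with the default order, while the link from 2 to 3 simply follows it.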
Like any other organised behaviour, then, a string of words has a surface
ordering which can be described simply in terms of the ‘next’ relation; and for some
purposes, the ‘next’ relation is highly relevant to language processes. This is most
obviously true in phonology, where adjacency is paramount as when a morph’s final
consonant assimilates to the first consonant of the ‘next’ morph. However, cognitive
science also recognises a ‘deep’ ordering behind the surface order. The point is that
behaviour follows the general patterns which have sometimes been analyzed in terms
of ‘scripts’ for scenarios such as birthday parties or cleaning one’s teeth (Cienki
2007). Whereas the ‘next’ relation records what actually happens, these deeper
analyses explain why events happened in this order rather than in other possible
permutations. For example, tooth-cleaning is organised round the brushing, so picking
up the brush and putting paste on it precedes the brushing, and rinsing and putting
away the brush follow it. A general feature of organised behaviour seems to be its
hierarchical structure, with problems (e.g. brushing one’s teeth) generating sub-problems (e.g. picking up the brush and putting it down again); and this structure
produces a typical ordering of events which can be described as a series linked by
‘next’. But notice that the deep ordering of problems and sub-problems is more
abstract than the ‘next’ relation into which they will eventually be translated; for
example, although picking up the tooth brush precedes brushing, we cannot say that
brushing is the ‘next’ of picking up the brush because they may be separated by other
events such as applying tooth-paste. In short, deep ordering mediates between abstract
hierarchical relations and surface ordering.
Returning to language, we find the Cognitive Assumption once again pushing
us towards a particular view of language structure, in which deep ordering mediates
between abstract hierarchical relations and surface ordering. This deep ordering is
especially important in syntax, unsurprisingly perhaps because the ordering of words
is the greatest leap in the chain of levels linking a completely unordered network of
meanings to a completely ordered chain of sounds. For instance, in Cats miaow, the
deep ordering mediates between the relation ‘subject’ and the surface relation ‘next’:
• cats is the subject of miaow.
• a typical verb follows its subject.
• therefore, miaow is the ‘next’ of cats.
What we need, then, is a suitable relation for describing deep ordering.
Cognitive Grammar has already defined the relation that we need: the relation
between a ‘trajector’ and its ‘landmark’, as in ‘the book (trajector) on the table
(landmark)’ (Langacker 2007:436). Langacker recognises that this relation applies to
temporal relations, giving the example of a trajector event occurring either ‘before’ or
‘after’ its landmark event. This kind of analysis is normally reserved for semantics,
where it plays an important part in defining the meaning of prepositions (such as
before and after) and tenses, where a past-tense verb refers to an event whose time is
before the verb’s time of utterance. Word Grammar, however, extends
trajector/landmark analysis right into the heart of grammar, as the basis for all deep
ordering. For example, in the day after Christmas, the word after takes its position
from the word day, on which it depends; so the relation between the word day
(landmark) and after (trajector) is an example of the same relation as that between
Christmas (landmark) and the day after (trajector), because in both cases the trajector
(the day after or the word after) follows its landmark (Christmas or the word day).
One of the salient qualities of the trajector/landmark relation is that it is
asymmetrical, reflecting the inequality of the ‘figure’ (trajector) and its ‘ground’
(landmark). For example, describing a book as ‘on the table’ treats the table as in
some sense more easily identified than the book – as indeed it would be, given that
tables are usually much bigger than books; it would be possible to describe a table as
‘under the book’, but generally this would be unhelpful. The same is true for events
such as the parts of a tooth-cleaning routine. The main event is the brushing, and the
associated sub-events are subordinate so we naturally think of them taking their
position from the brushing, rather than the other way round. And of course the same is
even more obviously true in syntax, especially given the dependency view advocated
in section 4 in which the relations between words are inherently asymmetrical.
Once again, then, we find general cognitive principles supporting the Word
Grammar version of syntax as based on word-word dependencies, and not on phrase
structure. It could of course be objected that another kind of asymmetrical relation is
meronymy, the relation between a whole and its parts; and at least in syntax this may
help to explain why the words in a phrase tend to stay together (Rosta 1994) – though
we shall see below that even this tendency can largely be explained without invoking
phrases. But meronymy cannot in itself be the abstract relation which determines deep
ordering because a whole has the same relation to all its parts, so the part:whole
relation, in itself, cannot determine how the parts are ordered among themselves. The
only relation which can determine this is a relation between the parts, such as word-word dependencies.
Let us suppose, then, that word-word dependencies determine deep ordering –
in other words, that a word acts as landmark for all its dependents. Precisely how does
this determine the ordering of words? Saying that cats has miaow as its landmark does
not in itself guarantee any particular ordering of these two words, so how can we
combine the trajector/landmark relation with ordering rules? At this point we should
pay attention to the relation tenet (6), which allows relations to be classified and
(therefore) sub-classified. Given the more general relation ‘landmark’, any particular
specification of the relation gives a sub-relation linked by ‘isa’ to ‘landmark’; so in
this case we can recognise just two sub-relations: ‘before’ and ‘after’, each of which
isa ‘landmark’. In this terminology, if X is the landmark of Y, and more precisely X is
the ‘before’ of Y, then Y is before X (or, perhaps more helpfully, ‘X puts Y before
it’). Thus miaow is the ‘before’ of cats, meaning that cats is before
miaow. This analysis is shown in Figure 18, where miaow is the ‘next’ of cats because
it is the ‘before’ of cats, which in turn follows from cats being its subject.
Figure 18: Cats miaow, with deep and surface ordering
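The chain of inference from dependency to deep ordering to surface ordering can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of Word Grammar: the table ORDER_RULES and the functions surface_order and next_links are hypothetical names, and only the single rule mentioned in the text (a typical verb follows its subject) is encoded.

```python
# Each dependency type names the landmark sub-relation ('before' or 'after')
# that the parent imposes on the dependent. Only the rule from the text is given.
ORDER_RULES = {"subject": "before"}  # a typical verb puts its subject before it


def surface_order(parent, dependents):
    """Place each dependent before or after its landmark (the parent).

    dependents: list of (word, dependency_type) pairs.
    """
    before = [w for w, dep in dependents if ORDER_RULES[dep] == "before"]
    after = [w for w, dep in dependents if ORDER_RULES[dep] == "after"]
    return before + [parent] + after


def next_links(chain):
    """Translate an ordered chain into surface 'next' relations."""
    return dict(zip(chain, chain[1:]))


chain = surface_order("miaow", [("cats", "subject")])
links = next_links(chain)
```

The 'next' link from cats to miaow is thus derived rather than stipulated: it follows from the subject dependency together with the 'before' sub-relation of 'landmark'.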
One of the challenges of analysing general behaviour is that events can have
complicated relations to one another. Consider your tooth-cleaning routine once
again. Suppose you are a child being supervised by a parent, who wants to see you
using the paste. In this case you have two goals: to be seen using paste as well as to
actually brush your teeth. You pick up the paste and then the brush – the reverse of your
normal order – so when you are ready to put paste on the brush, it is already in your
hand. Notice how it is the dominant goal – being seen – that determines the ordering
of events, and that overrides the normal ordering. This simple example allows us to
predict similar complexities within language, and especially in syntax – which, of
course, we find in spadefuls.
For example, Figure 8 above gives the structure for sentence (19):
(19) Which birds do you think sing best?
Several words have multiple dependencies, in the sense that they depend on more than
one other word; for example, which is the ‘extractee’ of do (from which it takes its
position), but it is also the subject of sing, and you is the subject not only of do but
also of think. (The evidence for these claims can be found in any introduction to
syntax.) This is like the multiple dependencies between picking up the paste and the
two goals of being seen and of applying paste to the brush; and as in the tooth-cleaning example, it is the dominant (‘highest’) dependency that determines the
surface order; so which takes its position from do, and so does you, because do is the
sentence-root – the ‘highest’ word in all the chains of dependencies.
The conclusion, then, is that a word may depend on more than one other word,
and that some of these dependencies may determine its position, while others are
irrelevant to word order. In Word Grammar notation the two kinds of dependency are
distinguished by drawing the dependency arcs that are relevant to word order above
the word-tokens, leaving the remainder to be drawn below the words; the former, but
not the latter, are paired with trajector/landmark relations which can therefore be left
implicit. (This convention can be seen in Figure 8.) This distinction is exactly as we
might expect given the logic of default inheritance. By default, a word takes as its
landmark the word on which it depends; but if it depends on more than one word, it
takes the ‘highest’ in the dependency chain – in other words, ‘raising’ is permitted,
but ‘lowering’ is not. But even this generalisation has exceptions, in which a word
depends on two words but takes the ‘lower’ one as its landmark. One such exception is the
German pattern illustrated by (20) (Hudson 2007:144).
(20) Eine Concorde gelandet ist hier nie.
A Concorde landed is here never. ‘A Concorde has never landed here.’
No doubt there are many other exceptions.
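The default just stated (take the 'highest' parent as landmark) can be illustrated with a small sketch for example (19). The dependencies for which and you are those given in the text; the links from think to do and from sing to think are assumptions made here to complete the chain, since Figure 8 is not reproduced. The function names depth and default_landmark are hypothetical, and exceptions such as (20) are not handled.

```python
def depth(word, parents, _seen=frozenset()):
    """Number of dependency steps from a word up to the sentence root (root = 0)."""
    ps = parents.get(word, [])
    if not ps:
        return 0
    if word in _seen:
        raise ValueError("cyclic dependency chain")
    return 1 + min(depth(p, parents, _seen | {word}) for p in ps)


def default_landmark(word, parents):
    """Among the words that `word` depends on, choose the highest as its landmark."""
    ps = parents.get(word, [])
    if not ps:
        return None  # the sentence root has no landmark
    return min(ps, key=lambda p: depth(p, parents))


# Example (19): Which birds do you think sing best?
parents = {
    "which": ["do", "sing"],   # extractee of do, subject of sing (from the text)
    "you": ["do", "think"],    # subject of do and of think (from the text)
    "think": ["do"],           # assumed for illustration
    "sing": ["think"],         # assumed for illustration
}
```

Under these assumptions both which and you take do as their landmark, because do is the sentence root and therefore the highest of their parents.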
The main point of this discussion is that general cognitive principles support
Word Grammar in its claim that word order can, and should, be handled as a separate
relation, rather than simply left implicit in the left-right notation of plain-vanilla
syntax. However, word order raises another fundamental issue where Word Grammar
is different from other versions of cognitive linguistics. Why do the words of a phrase
tend so strongly to stay together? If syntactic structure consists of nothing but
dependencies between individual words (as I argued above in section 4), what is the
‘glue’ that holds the words of a phrase together? For example, why is example (21) ungrammatical?
(21) *She red likes wine.
In dependency terms, this structure should be permitted because all the local word-word dependencies are satisfied, as can be seen in Figure 19. In particular, red should
be able to modify wine because the order of these two words is correct (just as in red
wine), and similarly she should be able to act as subject of likes, just as in She likes
red wine. And yet the sentence is totally ungrammatical.
Figure 19: *She red likes wine and dependencies
The notation allows an obvious explanation in terms of crossing lines – and
indeed, this is very easy to explain when teaching syntactic analysis – but why should
this be relevant? After all, if a network has no left-right dimension then the
intersection of lines is just an artefact of our writing-based notation. Moreover,
intersecting arrows don’t matter at all when (following the Word Grammar
convention explained above) they are written below the words, so why should they
matter above the words? Phrase structure explanations face similar objections when
confronted with the need to be translated into network notation: if word order is
shown, network-fashion, by a relation such as ‘next’, it is meaningless to ban mother-daughter lines that intersect. The explanation that we need, therefore, must not depend
on any particular notation for linear order.
The Word Grammar explanation is based on the ‘Best Landmark principle’
(Hudson 2010:53), which is a principle for general cognition, and not just for syntax.
When we want to define the location of something, we have to choose some other
object as its landmark, and we always have a wide range of possibilities. For instance,
imagine a scene containing a church, a tree and a bench. As a landmark for the bench,
the church would be better than the tree because it is more prominent – larger, more
permanent, more important for the local community and, because of that, more
accessible in most people’s minds. But the preferred landmark also depends on
distance, because the closer things are, the more likely it is that their relationship is
distinctive, and has already been recorded mentally; so if the bench was next to the
tree but a long way from the church, the tree would make a better landmark. The best
landmark, therefore, is the one that offers the best combination of prominence and
nearness. These considerations may in fact reduce to a single issue, identifiability: a
prominent landmark is easy to identify in itself, while nearness makes the landmark’s
relation easy to identify. By hypothesis, this principle applies to memory by guiding
our choice of landmarks when recording properties; so in remembering the bench in
the scene just described, one of its properties would be its physical relationship to
either the church or the tree, depending on which qualified as the best landmark. The
principle also applies to communication, so when we describe the bench to someone
else we only call it the bench by the tree if the tree is the best landmark; and, when we
hear this phrase, we assume that the tree is the best landmark (at least in the speaker’s mind).
Returning to syntax, the Best Landmark principle also explains why the words
in a phrase hang together. As hearers, we always assume that the speaker has chosen
the best landmark for each word, and as speakers, we try to satisfy this expectation. In
syntax, prominence can easily be defined in terms of dependency, so the most
prominent word for a word W is always the word on which W depends, its ‘parent’.
Distance can be defined in terms of the number of intervening words, so W’s parent
will always be separated from W by as few words as possible, given the needs of
other words to be near to their respective landmarks. Consequently, any word W
should always be as close as possible to its parent, and the only words which can
separate the two words are other words which depend (directly or indirectly) on W’s
parent. Now consider the example *She red likes wine. The troublesome word is red,
which depends on wine, so wine should be its landmark; but it is separated from wine
by likes, which does not need to be there. This order is misleading for a hearer,
because it implies that likes must be the best landmark for red; so a speaker would
avoid using it. In short, the ungrammaticality of *She red likes wine has just the same
explanation as the infelicity of She sat on the bench by the church, if the bench is in
fact much closer to the tree.
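The syntactic consequence of the Best Landmark principle (only words that depend, directly or indirectly, on W's parent may separate W from that parent) can be checked mechanically, without reference to crossing lines in any diagram. The sketch below is a hypothetical encoding: words are identified by position, parent[i] gives the position of the word on which word i depends, and the function names are invented for illustration. The dependency of wine on likes is assumed here; the other dependencies are those stated in the text.

```python
def depends_on(i, ancestor, parent):
    """True if word i depends directly or indirectly on the word at `ancestor`."""
    seen = set()
    while parent[i] is not None and i not in seen:
        seen.add(i)
        i = parent[i]
        if i == ancestor:
            return True
    return False


def landmark_violations(words, parent):
    """List (word, intruder) pairs where an intervening word does not depend on
    the word's parent, so the parent no longer looks like the best landmark."""
    violations = []
    for w, p in enumerate(parent):
        if p is None:
            continue  # the sentence root has no landmark
        lo, hi = sorted((w, p))
        for k in range(lo + 1, hi):
            if not depends_on(k, p, parent):
                violations.append((words[w], words[k]))
    return violations


# *She red likes wine: she -> likes, red -> wine, wine -> likes (likes is the root)
bad = landmark_violations(["She", "red", "likes", "wine"], [2, 3, None, 2])
# She likes red wine: the same dependencies in the grammatical order
good = landmark_violations(["She", "likes", "red", "wine"], [1, None, 3, 1])
```

Note that in the starred example red may separate She from likes, because red depends indirectly on likes; the only offending pair is red and the intervening likes, as the text argues.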
The most general conclusion of this paper is that the Cognitive Assumption has
important consequences for language structure. This is hardly surprising considering
the history of our ideas about language structure. Our present ideas rest on four
thousand years of thinking about language structure, very little of which was driven
by a desire to explore mental structures, or was informed by reliable information
about general cognition. Other influences were much more powerful, and not least the
influence of writing. The teaching of literacy has always been the driving force behind
a lot of thinking about language structure, so it is hardly surprising that the technology
of writing has profoundly influenced our thinking. This influence is one of the themes
of this paper, emerging in at least two areas: the difficulty of conceptually separating
types and tokens, and the temptation to treat word order in terms of the left-right
conventions of writing. Another major source of influence was the rejection of Latin-based grammar which encouraged many American structuralists to look for simplified
models in analysing ‘exotic’ languages; the effect of this was the ‘plain-vanilla
syntax’ of the American structuralists. In some ways these two influences pull in
opposite directions, but their common feature is that neither of them is concerned at
all with cognition. Similarly, standardisation, another driving force behind the study
of language, encourages us to think of language ‘out there’ in the community rather
than in any individual’s mind; and the publisher’s distinction between grammars and
dictionaries suggests a similar distinction in language structure.
These non-cognitive pressures have had a predictable impact on widely
accepted views of ‘how language works’, but this is the tradition on which we all
build (and within which most of us grew up). And as with any cultural change, it is all
too easy to include unwarranted old beliefs in the new order. To simplify, we have
seen two different ‘cognitive’ movements in the last few decades. In 1965, Chomsky
launched the idea that a grammar could model competence, while also declaring
language unique, which in effect made everything we know about general cognition
irrelevant to the study of language. As a result, Chomsky’s claims about the nature of
language structure derived more from the American Structuralists and mathematics
than from psychology, so early transformational grammar can be seen as a
continuation of Harris’s theory of syntax combined with some cognitive aims.
Similarly, Cognitive Grammar developed out of generative semantics, and
construction grammar out of a number of current approaches (including HPSG).
Unsurprisingly, perhaps, the view of language structure which these theories offer is
rather similar to the traditions out of which they grew.
No doubt the same can be said about Word Grammar, but in this case the
history is different, and the resulting mix of assumptions is different. Moreover, the
theory has been able to change a great deal since its birth in the early 1980s, and at
every point cognitive considerations have been paramount, so it is not merely a happy
coincidence that Word Grammar tends to be compatible with the assumptions of
general cognitive science. The following list includes the main claims about language
structure discussed in the paper, all of which appear to be well supported by what we
know about general cognition:
• Morphology and syntax are distinct levels, so language cannot consist of
nothing but ‘symbols’ or ‘constructions’.
• Syntactic structure is a network of (concepts for) words, not a tree of phrases.
This network provides the flexibility needed for analysing all the various kinds
of ‘construction’.
• Semantic structure is also a network, and allows detailed analyses of both
compositional and lexical meaning.
• The order of elements in syntax or morphology involves just the same
cognitive mechanisms as we use in thinking about how things or events are
related in place or time, notably the ‘landmark’ relation and a primitive ‘next’
relation.
Aronoff, Mark 1994. Morphology by Itself. Stems and inflectional classes.
Cambridge, MA: MIT Press
Barabasi, Albert L. 2009. 'Scale-Free Networks: A Decade and Beyond', Science 325:
Bloomfield, Leonard 1933. Language. New York: Holt, Rinehart and Winston
Bolte, J. and Coenen, E. 2002. 'Is phonological information mapped onto semantic
information in a one-to-one manner?', Brain and Language 81: 384-397.
Bybee, Joan 2010. Language, Usage and Cognition. Cambridge: Cambridge
University Press
Bybee, Joan and Slobin, Dan 1982. 'Rules and schemas in the development and use of
the English past tense.', Language 58: 265-289.
Casad, Eugene and Langacker, Ronald 1985. ''Inside' and 'outside' in Cora grammar',
International Journal of American Linguistics 51: 247-281.
Chomsky, Noam 2011. 'Language and Other Cognitive Systems. What Is Special
About Language?', Language Learning and Development 7: 263-278.
Cienki, Alan 2007. 'Frames, idealized cognitive models, and domains', in Dirk
Geeraerts & Hubert Cuyckens (eds.) The Oxford Handbook of Cognitive
Linguistics. Oxford: Oxford University Press, pp.170-187.
Croft, William 1998. 'Linguistic evidence and mental representations', Cognitive
Linguistics 9: 151-173.
Croft, William 2007. 'Construction grammar', in Dirk Geeraerts & Hubert Cuyckens
(eds.) The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford
University Press, pp.463-508.
Croft, William and Cruse, Alan 2004. Cognitive Linguistics. Cambridge University
Evans, Vyvyan and Green, Melanie 2006. Cognitive Linguistics. An introduction.
Edinburgh: Edinburgh University Press
Ferreira, Fernanda 2005. 'Psycholinguistics, formal grammars, and cognitive science',
The Linguistic Review 22: 365-380.
Fillmore, Charles 1982. 'Frame semantics.', in Anon (ed.) Linguistics in the Morning
Calm. Seoul: Hanshin; Linguistic Society of Korea, pp.111-138.
Fillmore, Charles and Atkins, Sue 1992. 'Towards a frame-based lexicon: the
semantics of RISK and its neighbours.', in Adrienne Lehrer & Eva Kittay
(eds.) Frames, Fields and Contrasts. New essays in semantic and lexical
organisation. Hillsdale, NJ: Erlbaum, pp.75-102.
Fillmore, Charles, Kay, Paul, and O'Connor, Mary 1988. 'Regularity and idiomaticity
in grammatical constructions: the case of let alone.', Language 64: 501-538.
Frost, Ram, Deutsch, Avital, Gilboa, Orna, Tannenbaum, Michael, and Marslen-Wilson, William 2000. 'Morphological priming: Dissociation of phonological,
semantic, and morphological factors', Memory & Cognition 28: 1277-1288.
Frost, Ram., Ahissar, M., Gotesman, R., and Tayeb, S. 2003. 'Are phonological
effects fragile? The effect of luminance and exposure duration on form
priming and phonological priming', Journal of Memory and Language 48:
Geeraerts, Dirk and Cuyckens, Hubert 2007. The Oxford Handbook of Cognitive
Linguistics. Oxford: Oxford University Press
Gisborne, Nikolas 2008. 'Dependencies are constructions', in Graeme Trousdale &
Nikolas Gisborne (eds.) Constructional approaches to English grammar. New York:
Mouton, pp.
Gisborne, Nikolas 2009. 'Light verb constructions.', Journal of Linguistics
Gisborne, Nikolas 2010. The event structure of perception verbs. Oxford: Oxford
University Press
Gisborne, Nikolas 2011. 'Constructions, Word Grammar, and grammaticalization',
Cognitive Linguistics 22: 155-182.
Goldberg, Adele 1995. Constructions. A Construction Grammar Approach to
Argument Structure. Chicago: University of Chicago Press
Halle, Morris and Marantz, Alec 1993. 'Distributed morphology and the pieces of
inflection.', in Ken Hale & Samuel Keyser (eds.) The view from Building 20:
essays in linguistics in honor of Sylvain Bromberger. Cambridge, MA: MIT
Press, pp.111-176.
Halliday, Michael 1961. 'Categories of the theory of grammar.', Word 17: 241-292.
Halliday, Michael 1966. 'Lexis as a linguistic level', in Charles Bazell, John Catford,
Michael Halliday, & Robert Robins (eds.) In Memory of J. R. Firth. London:
Longman, pp.148-162.
Harris, Zellig 1951. Structural Linguistics. Chicago: University of Chicago Press
Hockett, Charles 1958. A Course in Modern Linguistics. New York: Macmillan
Holmes, Jasper and Hudson, Richard 2005. 'Constructions in Word Grammar', in Jan-Ola Östman & Mirjam Fried (eds.) Construction Grammars. Cognitive
grounding and theoretical extensions. Amsterdam: Benjamins, pp.243-272.
Hudson, Richard 1984. Word Grammar. Oxford: Blackwell.
Hudson, Richard 1990. English Word Grammar. Oxford: Blackwell.
Hudson, Richard 2007. Language networks: the new Word Grammar. Oxford: Oxford
University Press
Hudson, Richard 2008. 'Word Grammar and Construction Grammar', in Graeme
Trousdale & Nikolas Gisborne (eds.) Constructional approaches to English
grammar. New York: Mouton, pp.257-302.
Hudson, Richard 2010. An Introduction to Word Grammar. Cambridge: Cambridge
University Press
Hudson, Richard and Holmes, Jasper 2000. 'Re-cycling in the Encyclopedia', in Bert
Peeters (ed.) The Lexicon/Encyclopedia Interface. Amsterdam: Elsevier,
Hutchison, Keith 2003. 'Is semantic priming due to association strength or feature
overlap? A microanalytic review.', Psychonomic Bulletin & Review 10: 785-813.
Jackendoff, Ray 1997. The Architecture of the Language Faculty. Cambridge, MA:
MIT Press
Jackendoff, Ray 2008. 'Alternative Minimalist Visions of Language.', in R Edwards, P
Midtlying, K Stensrud, & C Sprague (eds.) Chicago Linguistic Society 41: The
Panels. Chicago: Chicago Linguistic Society, pp.189-226.
Karmiloff-Smith, Annette 1992. Beyond Modularity. A developmental perspective on
cognitive science. Cambridge, MA: MIT Press
Karmiloff-Smith, Annette 1994. 'Precis of Beyond modularity: A developmental
perspective on cognitive science.', Behavioral and Brain Sciences 17: 693-745.
Kay, Paul and Fillmore, Charles 1999. 'Grammatical constructions and linguistic
generalizations: The what's X doing Y? Construction', Language 75: 1-33.
Lakoff, George 1977. 'Linguistic gestalts', Papers From the Regional Meeting of the
Chicago Linguistics Society 13: 236-287.
Lamb, Sydney 1998. Pathways of the Brain. The neurocognitive basis of language.
Amsterdam: Benjamins
Langacker, Ronald 1987. Foundations of Cognitive Grammar: Theoretical
prerequisites. Stanford: Stanford University Press
Langacker, Ronald 1990. Concept, Image and Symbol. The Cognitive Basis of
Grammar. Berlin: Mouton de Gruyter
Langacker, Ronald 2007. 'Cognitive grammar', in Dirk Geeraerts & Hubert Cuyckens
(eds.) The Oxford Handbook of Cognitive Linguistics. Oxford: Oxford
University Press, pp.421-462.
Lindblom, Björn, MacNeilage, Peter, and Studdert-Kennedy, Michael 1984. 'Self-organizing processes and the explanation of language universals.', in Brian
Butterworth, Bernard Comrie, & Östen Dahl (eds.) Explanations for Language
Universals. Berlin/New York: Walter de Gruyter, pp.181-203.
Luger, George and Stubblefield, William 1993. Artificial Intelligence. Structures and
strategies for complex problem solving. New York: Benjamin Cummings
Marslen-Wilson, William 2006. 'Morphology and Language Processing', in Keith
Brown (ed.) Encyclopedia of Language & Linguistics, Second edition. Oxford:
Elsevier, pp.295-300.
Nunberg, Geoffrey, Sag, Ivan, and Wasow, Thomas 1994. 'Idioms', Language 70:
Owens, Jonathan 1988. The Foundations of Grammar: an Introduction to Mediaeval
Arabic Grammatical Theory. Amsterdam: Benjamins
Percival, Keith 1976. 'On the historical source of immediate constituent analysis.', in
James McCawley (ed.) Notes from the Linguistic Underground. London:
Academic Press, pp.229-242.
Percival, Keith 1990. 'Reflections on the History of Dependency Notions in
Linguistics.', Historiographia Linguistica. 17: 29-47.
Peterson, Robert R., Burgess, Curt, Dell, Gary, and Eberhard, Kathleen 2001.
'Dissociation between syntactic and semantic processing during idiom
comprehension.', Journal of Experimental Psychology: Learning, Memory, &
Cognition. 27: 1223-1237.
Pollard, Carl and Sag, Ivan 1994. Head-Driven Phrase Structure Grammar. Chicago:
University of Chicago Press
Reay, Irene 2006. 'Sound Symbolism', in Keith Brown (ed.) Encyclopedia of
Language & Linguistics (Second Edition). Oxford: Elsevier, pp.531-539.
Reisberg, Daniel 2007. Cognition. Exploring the Science of the Mind. Third media
edition. New York: Norton
Robins, Robert 2001. 'In Defence of WP (Reprinted from TPHS, 1959)', Transactions
of the Philological Society 99: 114-144.
Ross, John 1970. 'On declarative sentences', in Roderick Jacobs & Peter Rosenbaum
(eds.) Readings in English Transformational Analysis. Waltham, Mass: Ginn,
Rosta, Andrew 1994. 'Dependency and grammatical relations.', UCL Working Papers
in Linguistics 6: 219-258.
Rosta, Andrew 1997. English Syntax and Word Grammar Theory. PhD dissertation,
UCL, London.
Rosta, Andrew 2005. 'Structural and distributional heads', in Kensei Sugayama &
Richard Hudson (eds.) Word Grammar: New Perspectives on a Theory of
Language Structure. London: Continuum, pp.171-203.
Sadock, Jerrold 1991. Autolexical Syntax: A theory of parallel grammatical
representations. Chicago: University of Chicago Press
Sprenger, Simone, Levelt, Willem, and Kempen, Gerard 2006. 'Lexical Access during
the Production of Idiomatic Phrases', Journal of Memory and Language 54:
Stump, Gregory 2001. Inflectional Morphology: A Theory of Paradigm Structure.
Cambridge: Cambridge University Press
Sugayama, Kensei 2002. 'The grammar of Be to: from a Word Grammar point of
view.', in Kensei Sugayama (ed.) Studies in Word Grammar. Kobe: Research
Institute of Foreign Studies, Kobe City University of Foreign Studies, pp.97-111.
Tallerman, Maggie 2009. 'Phrase structure vs. dependency: the analysis of Welsh
syntactic soft mutation', Journal of Linguistics 45: 167-201.
Taylor, John 2007. 'Cognitive linguistics and autonomous linguistics', in Dirk
Geeraerts & Hubert Cuyckens (eds.) The Oxford Handbook of Cognitive
Linguistics. Oxford: Oxford University Press, pp.566-588.
Tesnière, Lucien 1959. Éléments de syntaxe structurale. Paris: Klincksieck
Wierzbicka, Anna 1985. Lexicography and Conceptual Analysis. Ann Arbor: Karoma
Wierzbicka, Anna 1996. Semantics: Primes and universals. Oxford: Oxford
University Press
Zwicky, Arnold 1985. 'The case against plain vanilla syntax', Studies in the Linguistic
Sciences 15: 1-21.