The Lexical Diffusion of Postverbal Negation in French
Draft Dissertation Proposal
Angus B. Grieve-Smith
April 21, 2006
During the history of the French language, predicate negation has shifted from
being expressed by the preverbal particle ne alone, to the “double negation” construction
ne … pas, to the postverbal particle pas alone. This change is a well-studied
phenomenon in French, but not much has been written on the intermediate “double
negation” stage, where the construction spread from being used with just a few verbs in
the sixteenth century to being almost universally required with every verb in the
twentieth. In much of the literature this stage is treated as a homogeneous change, but on
closer examination it can be seen as a rich, complex change that bears out several
predictions from recent theories on language change.
I am examining this change from the perspective of grammaticization theory, a
subfield of functional linguistics. To keep the research manageable, I have restricted my
focus to the genre of theater, with a wider study possible in the future. I have already
conducted a pilot study indicating that the lexical diffusion of ne … pas in French theater
follows the principles described by Bybee and Thompson (1997), and for my dissertation
I propose an expanded version of this study, with a larger and more representative corpus.
Background
The history of French negation is one of the most popular examples of
grammaticization. Schwenter (2003) points out that it is one of the three examples used
by Jespersen (1917) to formulate general principles of the evolution of negation, later
named “Jespersen’s Cycle” by Dahl (1979):
The history of negative expressions in various languages makes us witness
the following curious fluctuation: the original negative adverb is first
weakened, then found insufficient and therefore strengthened, generally
through some additional word, and this in its turn may be felt as the
negative proper and may then in course of time be subject to the same
development as the original word (Jespersen 1917).
Data background
To begin with, it is important to introduce a number of words that will be referred
to throughout this study.
ne                  this old Latin predicate negator is rarely used in modern conversational French
pas                 derived from a noun meaning “step,” the “forclusive” (i.e. postverbal negative) particle in the Parisian dialect of Old French, and now the most common negator
point               the forclusive particle of the Norman dialects of Old French, usually negating NPs, and a competitor to pas in Middle French and Classical French
mie                 a common negator in the Rhine dialects of Old French
nient               preferred negator for the Northern dialects of Old French
mot, goutte, grain  “expressions of minimal value” occurring in particular constructions, e.g. il ne dit mot, “he won’t say a word”
Table 1. Words used in French negation.
Since the change from ne to ne...pas concerns negation of whole predicates, I will
be excluding from the study a number of words and constructions that are used with the
negative particle ne in contexts other than predicate negation. These are summarized in
Table 2.
rien, personne, jamais, aucun, nullement, plus, ni, que, sinon
                      negative quantifiers, equivalent to English “nothing,” “no one,” “never,” “none,” “in no way,” “not any more,” “neither” and “only”; these will be excluded from the study
the subjunctive mood  ne is occasionally used with some “irrealis” subjunctive constructions, such as avant que “before”; these cases will also be excluded from the study
comparatives          ne is also used with some comparative constructions, which will be excluded from the study
Table 2. Words and constructions used in contexts other than predicate negation.
DRAFT 6/21/2017
Over the evolution of the French language, negation has shifted from being
expressed by the preverbal particle ne alone to the postverbal particle pas alone. Table 3
summarizes the stages of these changes.
Vulgar Latin (100 - 900)                   ne alone, plus several idiomatic expressions derived from object nouns; largely unwritten
Old French and Middle French (900 - 1500)  mostly ne alone, but other expressions with both ne and one of pas, point, mie or nient
Classical French (1500 - 1900)             most negations use both ne and either pas or point; several uses of ne alone remain
Modern French (1900 - Present)             pas is the negator used in almost all cases; ne alone remains in a few idioms; both are used together in formal language
Table 3. Stages in the evolution of French negation.
The original French negator was ne (or non, according to some scholars), a reflex
of the original Indo-European sentence negator, as we see in example (1), from the
Chanson de Roland, one of the earliest extant French texts.
1) s   il   ne  cumbat  a   cele  gent    hardie
   if  3sg  NE  combat  to  that  people  hardy
   “if he does not fight that hardy people”
   -Chanson de Roland (ca. 1000), line 2600
While the language was still considered Latin, ne preceded the verb, and a number of
supplemental particles were added to specify the scope of the negation. Many of these
particles were negative quantifiers, determiners and adverbs. Examples of these include
(in their modern forms), jamais “never” (as shown in example 2), rien “nothing,” plus
“not any more,” que “only,” aucun/aucune “no” (determiner) and nulle part “nowhere.”
2) n   asemblereit    jamais  Carl     si    grant  esforz
   NE  assemble-COND  never   Charles  such  big    force
   “Charles would never put together such a large force”
   -Chanson de Roland (ca. 1000), line 655
Even in Late Latin, however, certain negative particles began to be used as
“emphatic” negators, to supplement ne. As example (3) shows, these particles are found
in the earliest French texts. The most common of these particles in Old French were mie
(which originally meant “crumb”), point (“dot,” “point,” “stitch”), pas (“step”) and nient
(“a nothing”):
3) lor       enseignes  n   i      unt      mie  ubliees
   3pl.POSS  teachings  NE  there  AUX.3pl  not  forget.PPl
   “[the Franks and peasants] have not forgotten their teachings”
   -Chanson de Roland (ca. 1000), line 3549
Schwenter (to appear) takes a closer look at the notion of an “emphatic negator”
in data from present-day Catalan, (standard) Italian and Brazilian Portuguese, where the
double-negation constructions no … pas, non … mica and não … não, respectively, exist
alongside simple predicate negators no, non and não. In these languages, he argues that
the double-negation constructions have the restricted usage of marking the denial of a
discourse-activated proposition. It may be possible that this was the case in Middle
French as well, and that the usage was semantically and pragmatically extended at the
same time as it became more frequent.
Price (1997) mentions giens “person”, grain “grain, particle” and gote “drop” as
additional postverbal negative particles in Old and Middle French. There are also a
number of expressions which Price calls “idioms of minimal value,” such as bouton
“button,” clou, “nail,” and areste “fish bone,” and which have only ever been attested in a
small number of idiomatic expressions. The list of acceptable negative particles has
grown shorter and shorter over the past thousand years, until in the present day (as shown
in example 4) ne … pas is the only form used in conventional writing, and aside from a
few fixed expressions, ne … point is considered an affected usage.
4) il       semble  que   le   phénomène   ne  soit     pas  dû       à   un   effet   spécial
   PRO.3sg  seem    COMP  DET  phenomenon  NE  be.SUBJ  not  owe.PPl  to  DET  effect  special
   “it appears that the phenomenon is not due to a special effect”
   -L’Express, August 27, 1998
To the extent that there was a difference in meaning or usage between the various
particles, it was slight and difficult to determine. Price (1997) makes a convincing case
that in Old and Middle French, ne … pas, ne … mie and ne … nient were commonly used
as negative adverbs, while ne … point was used more often in conjunction with partitive
constructions, as in example 5.
5) il       n'  y a    poinct  d'   enchantement
   PRO.3sg  NE  EXIST  not     DET  enchantment
   “there is no enchantment”
   -Gargantua (1542), line 753
He also shows that the distribution of ne … pas, ne … mie and ne … nient was largely
dialectal, with ne … nient used in Wallonia, ne … mie in Lorraine and ne … pas in the
rest of the oïl region, although poets who were speakers of one dialect would
occasionally borrow the form of another dialect to make a rhyme.
Since most of the negative particles are identical in form to nouns, it is often
supposed that they are grammaticized from these nouns; for example, pas is also the
word for “step,” as shown in example 6:
6) segnurs  le   pas   tenez
   lord.PL  DET  step  hold-IMP
   “my lords, stay on track”
   -Chanson de Roland (ca. 1000), line 2851
so it is plausible that it once existed in constructions meaning, “I’m not going one step,”
and that this then came to mean “I’m not going at all,” and then simply “I’m not going.”
Although there may be evidence for this early stage of grammaticization in the corpus of
Late Latin texts, by the time of Old French (900-1200) these words are clearly not being
used as nouns in these negative constructions. In Old French it is common for nouns to
take articles, and these particles never do. Many of them also have distinctly adverbial
uses.
In the Middle French period (1200-1500), the cultural power of Paris grew, and
with it the strength of norms based on Parisian usage. Thus Lorrain ne ... mie and
Walloon ne ... nient fell out of favor, although according to Price (1997) they were
common in regional spoken varieties as recently as the Atlas linguistique de la France of
1902. In Classical and Contemporary French (1500-present), then, the choices for
negating sentences are ne alone, ne … pas or ne … point.
The distinction in meaning between ne … pas and ne … point has troubled
scholars since at least the middle of the seventeenth century. As Price (1997) points out,
the Old French syntactic distribution where point was used primarily as a substantive
while ne … pas was used as an adverb appears to have gradually eroded, until ne … pas
could be used with some partitive constructions by the late seventeenth century, and with
all by the early nineteenth. Example 7 shows ne … pas used with a partitive construction
in the seventeenth century.
7) il       n'  y a    pas  de   réplique  à   cela
   PRO.3sg  NE  EXIST  not  DET  reply     to  that
   “there is no reply to that”
   -L’Avare (1668), line 1002
From the middle of the nineteenth century on, ne … point is rarely used in written text,
other than to provide a “country” feel.
From the late sixteenth to early seventeenth century (the beginning of the
Classical French period), ne … pas displaces ne alone as the negative construction of
choice. From that point on, ne alone accounts for a relatively small number of negative
constructions, and eventually dwindles to small numbers in written text, and is not used at
all in casual speech.
Recently, then, ne … pas has become almost obligatory in French predicate
negation. Once this happened, the ne became redundant, and this allowed it to be
reduced to nothing. It has been dropped more and more from casual speech, until in
present-day speech postverbal pas is the sole negator. Ne is still required by the norms of
written language, although it is often omitted to suggest a colloquial or trendy style, as in
example 8:
8) la   Sibérie,  c’       est  pas  trop      le   pays     du  cyber
   DET  Siberia   PRO.3sg  be   not  too-much  DET  country  of  high-technology
   “Siberia isn’t exactly a cyberland”
   -Title of an article in the Webjournal Cyberculture from Canal Plus
We have thus come full circle, from ne alone to pas alone.
In Jespersen’s (1917) discussion, French differs from the other two languages,
Danish and English, in one significant respect: the original negative particle ne is placed
before the verb and the “reinforcing” particles after the verb, while in Danish and
English, both particles were placed after the verb. In English, ne was followed by a wiht,
“a thing,” and the whole expression was eventually reduced to naught and then to not. In
contrast, in French ne and pas are almost always separated by the verb, which means that
they cannot contract into a single word.
Consistent with the notion of grammaticization, the steps described above can be
summarized in a grammaticization cline, as used by Hopper and Traugott (1993). The
pathway looks as follows:
Pathway   noun                          >  emphatic negator                            >  embracing negation    >  sole negator
Period    Latin, LL                        OFr, MFr                                       Classical Fr             Modern Fr
Examples  passum, punctus, mica, nient     ne … pas, ne … point, ne … mie, ne … nient     ne … pas, ne … point     pas
Previous approaches
Various explanations have been suggested by other scholars to account for these
changes. In this context, it is important to note when an explanation is being offered for
the underlying motivation of the change in general, and when it is being put forth as the
mechanism of a specific part of the change. To clarify this distinction, I will deal with
the underlying motivations first, and then turn to descriptions of specific mechanisms.
Several scholars see the change in French negation as a small part in a large-scale
shift in French syntax. Some argue that cognitive constraints implied by the Greenberg
universals direct the entire process. Vennemann (1974) suggests that the shift to
postverbal negation was the tail end of a greater shift from the Latin verb-final word
order to French SVO word order. Harris (1978), on the other hand, claims that the
negation shift is just the beginning of a shift to object-verb word order in French. Posner
(1985), building on Vennemann’s claims, accounts for the retention of preverbal cognates
of ne in other Romance languages by positing that in those languages the negator changes
in category from an adverb to a clitic, but does not provide any evidence as to why the two
categories would be treated differently by speakers.
Posner (1985) also suggests the alternative hypothesis that the shift may have
been due to influence from German. However, Schwegler (1983) points out that there is
no evidence for the strengthening of ne … pas and subsequent loss of ne before 1200, and
that German influence in France ended by 900; this is therefore an unlikely explanation
for France proper, although it may have played a role in similar changes in Franco-Provençal and Romantsch.
Other research takes a more piecemeal approach to the process, focusing on small
components of the change. Price (1997) suggests that the discontinuous (ne … particle)
construction (in its early form as “emphatic negation”) may have been borrowed into
Late Latin from the Celtic languages it supplanted through a period of “Celtic-Romance
bilingualism.” Schwegler (1983) offers the semantic explanation that emphatic negators
are a useful part of communication, and that languages can invent them spontaneously.
Posner attributes to “classic histories of the language” the position that at some
point between 1300 and 1500, French stress shifted from the old Latin pattern of
penultimate pitch-contour stress to its current system of lengthening the last syllable. As
a result the preverbal clitics, including ne, went from being occasionally stressed to never
stressed. The loss of this stress heightened the phonological reduction of ne and
motivated greater use of ne … pas. As Posner points out, this would not account for
neighboring varieties including Occitan and some of the Northern dialects of Italian,
which have postverbal negation without the French stress pattern.
Ewert (1933) offers a frequency-based mechanism for the increase of the use of
ne … pas. His argument is that “the constant use of these words as negative particles
resulted in a weakening of their meaning and their extension,” after which ne became
redundant and dispensable. Ne, in turn, became reduced through frequent use.
In this case, it is possible that everyone could be right to some extent. Or almost
everyone; it seems unlikely that these changes could have been due to both Vennemann’s
XV -> VX shift and Harris’s VX -> XV shift. On the other hand, general cognitive
impulses could play a role in making one construction “feel more comfortable” to the
speakers. Language contact could also have provided a model that speakers emulated,
and changes in stress patterns, semantic bleaching and reduction could have triggered the
rise of ne … pas. What remains to be seen is the relative importance of all these factors.
Of the studies mentioned above, most rely simply on the existence or absence of a
particular form in the texts from the relevant period. Only Price counts the relative
frequency of items, which lends more support to his claims.
Theoretical background
In many studies of language change, the changes are assumed to be gradual and
constant, affecting every lexical item identically. This is sometimes a useful abstraction,
but on closer examination it is clear that every change starts with a small subset of the
lexicon and spreads, in a process known as lexical diffusion. This phenomenon has been
observed at least as early as Schuchardt (1885), and described in greater detail by Wang
(1969) and Labov (1981).
Joan Bybee (Hooper 1976, Bybee 2002) refined the notion and incorporated it
into the subset of functional linguistics known as grammaticization or
grammaticalization theory. In a nutshell, grammaticization describes the process by
which grammatical elements are formed from non-grammatical words and constructions.
According to Hopper and Traugott (1993), although grammaticization was first named by
Meillet (1912), some of the processes involved were described by Humboldt (1825).
In relating lexical diffusion and grammaticization, Bybee and Thompson (1997)
make a distinction between “token frequency,” which is simply the number of
occurrences of particular words or constructions as a fraction of the total words in a text,
and “type frequency,” which “counts how many different lexical items a certain pattern
or construction is applicable to,” i.e. how many different contexts it appears in.
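To make the distinction concrete, here is a minimal Python sketch. It assumes that the negated clauses of a text have already been extracted as (verb lemma, negator) pairs; the toy data, the function names and the choice of the verb lemma as the “lexical item” are illustrative assumptions, not part of Bybee and Thompson’s account.

```python
# Hypothetical toy data: one (verb lemma, negator) pair per negated clause.
clauses = [
    ("savoir", "ne"), ("savoir", "ne"), ("pouvoir", "ne"),
    ("aller", "ne...pas"), ("dire", "ne...pas"), ("voir", "ne...pas"),
    ("dire", "ne...point"),
]
total_words = 100  # assumed length of the text in words


def token_frequency(clauses, negator, total_words):
    """Occurrences of a construction as a fraction of all words in the text."""
    count = sum(1 for _, neg in clauses if neg == negator)
    return count / total_words


def type_frequency(clauses, negator):
    """Number of distinct lexical items (here, verb lemmas) found with the construction."""
    return len({verb for verb, neg in clauses if neg == negator})


# ne alone: three clauses in a 100-word text, spread over two verbs.
print(token_frequency(clauses, "ne", total_words))
print(type_frequency(clauses, "ne"))
# ne...pas: three clauses as well, but spread over three different verbs.
print(type_frequency(clauses, "ne...pas"))
```

In this toy sample ne alone and ne … pas have the same token frequency, but ne … pas has the higher type frequency because it occurs with more distinct verbs.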
Bybee and Thompson describe three major effects of frequency on syntactic
structure. In Analogical Extension, patterns with high type frequency are more
“productive” in terms of spreading to other contexts. “The more lexical items that are
heard in a certain position in a construction, the less likely it is that the construction will
be associated with a particular lexical item and the more likely it is that a general
category will be formed over the items that occur in that position.” (Bybee and
Thompson 1997: 71) In the Conserving Effect, lexical items with high token frequency
are resistant to the effects of analogical extension. “The more a form is used, the more its
representation is strengthened, making it easier to access the next time. Words that are
strong in memory and easy to access are not likely to be replaced by new forms created
with the regular pattern.” (Bybee and Thompson 1997: 65) The third frequency effect,
the Reduction Effect, is not relevant to this study.
Many examples of grammaticization are complicated by the fact that in the early
stages there is often more than one lexical item filling closely related roles. This is often
called specialization, but some have opted for the less confusing term of competition.
The competition between the postverbal French negators pas, point, mie and nient is used
as an example of this by Hopper and Traugott (1993).
Proposed study
Since the evolution of negation in French is literally a textbook case of
grammaticization, it is an ideal test case for Bybee and Thompson’s theories about lexical
diffusion. I propose that the adoption of ne … pas is a form of analogical extension, and
will thus show evidence of the Extension Effect of type frequency on production, and of
the Conserving Effect of token frequency on entrenchment.
Hypotheses
Bybee and Thompson’s theory makes specific predictions about negation in
French, and my aim is to test these hypotheses against data from a corpus. Here are the
predictions.
1. The Progression hypothesis: that the spread of language change is gradual,
not abrupt, and thus that the token frequencies of the various predicate
negators in a text are in direct relation to the date of production of the text.
2. The Analogical Extension hypothesis: that the spread of an analogical
change is related to type frequency, and thus that the token frequencies of the
various predicate negators with a particular lexical item in a text have direct
relationships with the type frequency of that lexical item during the period
when the text was produced.
3. The Entrenchment hypothesis: that the spread of analogical change is
blocked in high-token-frequency contexts, thus that the rate of decline of the
relative token frequency of a particular lexical item with ne alone over a
certain period has a direct inverse relationship with the per-word token
frequency of that lexical item during the preceding period.
4. The Dead End hypothesis: that once an analogical change is started, the old
form ceases to be productive, and any new constructions or lexical items will
take the new form instead.
In their discussion of the analogical extension and entrenchment hypotheses,
Bybee and Thompson do not make explicit predictions as to the exact nature of the
relationships between these changes and frequency. There are strong and weak
versions of each of these hypotheses. The weak version is that there is a one-to-one mapping between the order of the items taking part in the change and their type or
token frequency rank. The strong version is that there is not just a mapping, but a
gradient relationship between the date of production and the frequency (token or type) of
the lexical item or context. Such a gradient relationship could be linear, quadratic or
described by some other function.
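The difference between the two versions can be sketched in Python. The sketch below is only an illustration under stated assumptions: it uses Spearman’s rank correlation as one possible measure for the weak (rank-order) version and Pearson’s correlation as one possible measure for a linear form of the strong version. Neither statistic is prescribed by Bybee and Thompson, and the numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: measures a linear (gradient) relationship."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks (order only)."""
    return pearson(ranks(xs), ranks(ys))


# Invented data: token frequency of four verbs and the year in which
# ne...pas overtakes ne alone with each verb.
token_freq = [50, 20, 10, 5]
change_year = [1700, 1650, 1620, 1600]

weak = spearman(token_freq, change_year)   # only the ordering matters
strong = pearson(token_freq, change_year)  # the spacing matters too
print(weak, strong)
```

With these invented figures the ordering is perfectly preserved, so the rank measure is at its maximum, while the linear measure is slightly lower because the spacing between items is not proportional. A quadratic or other function would call for transforming one variable before correlating.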
It is necessary to account for the effects of competition among the various
postverbal negators, particularly for the Progression hypothesis. There are two
possibilities: the date of production may be correlated with the frequency of individual
negators, or it may be inversely correlated with the frequency of the old preverbal
negation. In other words, is it the spread of particular forms we are concerned with, or
the spread of things that are not ne alone? Once a form “loses” the competition and
begins to decline, it will likely be subject to the Entrenchment and Dead End
hypotheses as well.
5. The Loser’s Entrenchment hypothesis: that when a form loses a
grammaticization competition, its decline is blocked in high-token-frequency
contexts, thus that the rate of decline of the relative token frequency of a
particular lexical item with that form over a certain period has a direct inverse
relationship with the per-word token frequency of that lexical item during the
preceding period.
6. The Loser’s Dead End hypothesis: that when a form loses a
grammaticization competition, it ceases to be productive, and any new
constructions or lexical items will take the winning form instead.
Corpus
The best way to test a hypothesis about language evolution is by measuring a
corpus of texts from the relevant period. The hypothesis makes a claim about the
language in general, and the texts are selected to approximate the language. In this case,
the specific claims made by Bybee and Thompson (1997:65 and 71, as quoted above) are
that if a lexical item has a high type frequency in an individual’s input there will be
extension, and if it has a high token frequency in an individual’s output there will be
entrenchment. It is not clear whether Bybee and Thompson intended to make an
input/output distinction or not. Do these frequency effects relate more to perception (a
sense of “the language” shaped by input from the rest of the community) or to production
(the individual’s daily language practice)? What is the role of norms and standards,
whether genre-specific or language-wide?
The challenge, as in every corpus-based study, is to find something that
approximates the language user’s input and/or output. While in this age of pervasive
surveillance it may be possible to capture the entirety of a person’s language input and
output, this was absolutely impossible in 1900, much less in 1500. However, the premise
of corpus linguistics is that we are able to approximate a language user’s input and output
through a representative selection of texts from a period of the user’s life.
An ideal corpus contains a balance of genres that approximates the input that a
typical member of the language community perceives. Of course, the majority of most
people’s language input and output is spontaneous conversation, and the proportion of
conversation only gets higher as we look back in time to periods when there was simply
less to read. It makes sense that most of the transmission of unconscious language
changes like grammaticization took place primarily through spontaneous conversation.
Because of this, spontaneous conversation is a high priority in any corpus.
Since the invention of the phonograph, and especially since the invention of
audiotape, we have samples of conversation for a corpus, but before that the technology
was simply not sufficient for recording conversations in real time. There are some fairly
detailed transcripts available, but they are not necessarily representative.
Anthony Lodge prepared a corpus of “Paris speech of the past,” for his
Sociolinguistic History of Parisian French (Lodge 2004). He assembled three sources:
the journal of Jean Héroard, the personal physician of Louis XIII for the first twenty-seven years of his life (1601-1628), which includes long transcripts of things that the
young Dauphin said; the journal of Jacques-Louis Ménétra, an 18th-century Parisian
glassworker; and a collection of political pamphlets claiming to present the words of
working-class Parisians, but actually containing fictionalized dialogues written by
educated upper-class men. These journals and pamphlets are thus not necessarily
representative and don’t provide adequate coverage of the period under study.
Once we lose the possibility of studying spontaneous conversation, we also lose
the possibility of a truly balanced and representative corpus. At that point, the question is
primarily what kind of corpus most closely mimics the input of the language users. If we
understood more about the relations between genres, it might be possible to set up a
complicated system of genre substitutions and thus provide an approximation to the input
that users were exposed to. Since we don’t, and since a genre imbalance can significantly
skew results, it may be simplest to choose one genre and look for the effect there.
The genre that is next closest to speech is theater. Unlike novels and newspapers,
printed versions of theatrical works in French are available for almost the entire history of
the language, at least as far back as the publication of Li gieus de Robin et Marion in
1275. Theater is fiction, but there is an expectation of naturalness in the dialogue that is
not necessarily present in legal, historical or philosophical works. It is true that the
degree of naturalness is determined by convention and has changed over the years, but it
has always been there to some extent. Theater is thus in many ways the next best thing to
speech, and theatrical texts are available for the entire period under study. For these
reasons, I have chosen to restrict my corpus to theatrical texts for this investigation.
The selection of texts is a problematic issue. Ideally, the texts in the corpus will
be representative of the dramatic works that a typical French speaker was exposed to. Or
should it be what a typical playwright was exposed to? Either way, the texts available
will not necessarily be representative in their totality. It is inevitable that many theatrical
works, especially from earlier periods, have either not been written down or not been
published, or the written versions have not survived. Of the texts that survived, many of
them may not currently be available in electronic format.
Another problem is the issue of typicality. From any period, the theatrical works
that are published, preserved and digitized are likely to be the most original and most
popular, or at least most popular among those doing the publication, preservation and
digitization. These are not necessarily going to represent the typical theatergoing
experience at any given time. Because of this, it will most likely be necessary to find
additional texts and digitize them.
It is also important to note that many plays are translations from other languages,
most notably Italian. The translators may have chosen expressions that were closer to the
original Italian and not necessarily typical French. In addition, many of the earlier plays
were in verse, some of them accompanied by music. It will be necessary to test the
corpus to determine whether any of these factors affect the use of negation.
When studying spontaneous conversation, it is important to control for the effects
of linguistic accommodation and thus not directly compare speech from two participants
in the same conversation. In the case of plays, it is not clear that playwrights would be
aware of accommodation or make any significant effort to reproduce it. Thus it does not
seem necessary to control for accommodation in theatrical texts. However, it is
important to separate dialogue from stage directions, and to be aware if particular
characters are presented as speaking a marked dialect, as non-native speakers, or as having
speech difficulties.
Analysis
The heart of the study consists of testing each of the hypotheses above against the
corpus. The hypotheses are repeated below for convenience. Fortunately,
predicate negation is relatively straightforward to study in French. There are a limited
number of negative particles, and although most of them have uses other than
predicate negation, it is relatively easy to separate out those uses.
1. The Progression hypothesis: that the spread of language change is gradual,
not abrupt, and thus that the token frequencies of the various predicate
negators in a text are in direct relation to the date of production of the text.
This is the easiest hypothesis to test: simply count the proportions of all the
predicate negators in each text and test for correlation with various functions of the date
of publication. Possible functions for testing may be suggested by scatter plots or by
additional theoretical work. To account for the possible effects of competition among the
various postverbal negators here, it will also be necessary to test for inverse correlation
between the type frequency of ne alone and the date of publication.
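As a minimal sketch of this test, the Python below computes each negator's share of a text's predicate negations and the Pearson correlation between such a share and a function of the publication date. The function names, the logistic candidate function, and its midpoint and slope parameters are assumptions chosen for illustration; they are not taken from the pilot study scripts, and no real counts are shown.

```python
import math

def negator_proportions(counts):
    """Convert raw token counts per negator into proportions of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no predicate negations")
    return {negator: n / total for negator, n in counts.items()}

def logistic(year, midpoint, slope):
    """One candidate 'function of the date' (an assumed S-shaped curve)."""
    return 1.0 / (1.0 + math.exp(-slope * (year - midpoint)))

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(var_x * var_y)
```

In use, one list would hold the publication dates (or a function of them, such as logistic(year, midpoint, slope)) and the other the proportion of a given negator in each text. A coefficient near -1 for ne alone would be consistent with the inverse correlation described above.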
2. The Analogical Extension hypothesis: that the spread of an analogical
change is related to type frequency, and thus that the token frequencies of the
various predicate negators with a particular lexical item in a text have direct
relationships with the type frequency of that lexical item during the period
when the text was produced.
This will likely be the most difficult of the three hypotheses to test, because of the
difficulty involved in determining type frequency. What counts as a distinct “type”?
How is it determined? What is a lexical item in this context? Is it a verb form? A
lemmatized verb? Are participles, infinitives, object noun phrases, sentential
complements and adverbs relevant to determining a distinct type?
Another issue involves how to determine the period when the text was produced.
Does the text itself count as input? Should the dates be broken into absolute ranges (for
example, 1625 – 1650), or should the period be specific to each text (for example, the
period for L’Illusion Comique, published in 1630 could be 1605 – 1630)? Should it count
as the literate lifetime of the playwright?
For the strong form of this hypothesis, once the various lexical items and their
type frequencies have been established, the remaining problem is similar to that for the
progression hypothesis: test the type frequencies of each lexical item for correlation with
various functions of the token frequency with that negator. For the weak form it is more
straightforward: test for simple correlation between the type frequency rank of each
lexical item during a particular period and the order in which the token frequency of each
lexical item with each negator reaches a particular threshold.
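The weak form amounts to a rank correlation. The sketch below assumes that a Spearman coefficient is an acceptable way to compare the two orderings; the proposal itself only says "simple correlation", so the choice of statistic and the function names are illustrative assumptions.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when all ranks are tied")
    return sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / (sx * sy)
```

Here xs would be the type frequency of each lexical item in a period and ys the date at which that item's token frequency with the negator first reaches the chosen threshold. If higher type frequency goes with earlier adoption, the coefficient should be negative.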
Competition does not seem to be relevant to the Analogical Extension
hypothesis. The hypothesis claims that the spread of a particular change will be
correlated with the type frequency of the lexical items as they change, so it applies to one
change at a time.
3. The Entrenchment hypothesis: that the spread of analogical change is
blocked in high-token-frequency contexts, thus that the rate of decline of the
relative token frequency of a particular lexical item with ne alone over a
certain period has a direct inverse relationship with the per-word token
frequency of that lexical item during the preceding period.
Token frequency is much more straightforward to measure than type frequency,
but measuring the rate of change is somewhat problematic. What period is used? What
is used for the preceding period? I will need to count the token frequencies of each
lexical item with each predicate negator in each text and test for correlation with various
functions of the token frequency with that negator. For the weak form it is even more
straightforward: test for simple correlation between the token frequency rank of each
lexical item during a particular period and the order in which the token frequency of each
lexical item with each negator reaches a particular threshold.
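The three quantities named in the Entrenchment hypothesis can be written down as follows. This is only a sketch: the choice of periods is left open in the text, so the start and end years are plain parameters, and the average yearly drop is one assumed way of measuring the rate of decline.

```python
def per_word_frequency(item_tokens, total_words):
    """Token frequency of a lexical item per word of running text."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return item_tokens / total_words

def relative_frequency(tokens_with_ne_alone, tokens_negated):
    """Share of an item's negated tokens that use ne alone."""
    if tokens_negated <= 0:
        raise ValueError("tokens_negated must be positive")
    return tokens_with_ne_alone / tokens_negated

def decline_rate(rel_freq_start, rel_freq_end, year_start, year_end):
    """Average yearly drop in relative frequency; positive means decline."""
    if year_end <= year_start:
        raise ValueError("year_end must be later than year_start")
    return (rel_freq_start - rel_freq_end) / (year_end - year_start)
```

Under the hypothesis, items with a higher per_word_frequency in the preceding period would be expected to show a lower decline_rate for ne alone in the period that follows.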
4. The Dead End hypothesis: that once an analogical change is started, the old
form ceases to be productive, and any new constructions or lexical items will
take the new form.
This hypothesis is simple to test, and does not even require any counting. All that is
necessary is to set a date for the beginning of the analogical extension, make a list of all
of the constructions using the old form, and check to see whether they were all attested
before the beginning of the change.
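The check just described can be sketched in a few lines of Python. The input format (a mapping from each old-form construction to the year of its first attestation) and the function name are assumptions made for illustration.

```python
def late_attestations(first_attested, change_start):
    """Return old-form constructions first attested at or after change_start.

    first_attested maps a construction label to the year of its earliest
    attestation. An empty result is consistent with the Dead End hypothesis.
    """
    return sorted(
        construction
        for construction, year in first_attested.items()
        if year >= change_start
    )
```

Any construction returned by this function would need closer inspection, since it may be a genuinely new use of the old form or simply a gap in the attested record.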
5. The Loser’s Entrenchment hypothesis: that when a form loses a
grammaticization competition, its decline is blocked in high-token-frequency
contexts, thus that the rate of decline of the relative token frequency of a
particular lexical item with that form over a certain period has a direct inverse
relationship with the per-word token frequency of that lexical item during the
preceding period.
6. The Loser’s Dead End hypothesis: that when a form loses a
grammaticization competition, it ceases to be productive, and any new
constructions or lexical items will take the winning form instead.
These two competition hypotheses can be tested in the same way as the regular
Entrenchment and Dead End hypotheses.
It will also be necessary to test other possible relationships, if only to rule them
out. A factor analysis can determine whether the relationships hypothesized above are
more important than relationships between token frequency and verb class, tense, subject,
clitic pronouns, dislocation constructions, the presence of other adverbs, and possibly
other factors.
Pilot study
In 1998 I conducted a pilot study of thirteen texts published between 1000 and
1998 (for a total of 250,000 words). The study is not of publishable quality, but it is
consistent with the lexical diffusion pattern described by Bybee and Thompson (1997).
The analogical extension of these constructions is led by the high-type-frequency verbs
être and avoir, and gradually diffuses through the other verbs beginning with vouloir, and
continuing in descending order of type frequency. The extension was resisted by the
relatively low-type-frequency, high-token-frequency verbs pouvoir, savoir and faire.
After the extension began, no new constructions using ne alone appeared in French,
according to Ewert (1933), confirming Bybee and Thompson’s prediction that the
entrenchment of such constructions is resistant to change.
The texts were analyzed with a series of Perl programs written especially for this
project. The first, vtag, allowed me to tag each negation by type: ne alone, ne … pas,
ne … point, ne … mie, ne … nient, partial negation, subordinate irrealis or examples
which were not after all instances of ne. Another script, count1, reported frequency
counts for each tag in the text, and for the total number of predicate negators. A third
script, verba, helped me hand-count the verbs used in these negations. For this study I
chose to count all the various conjugated forms of a verb as instances of the same verb,
with five exceptions, which were counted as separate verbs: the auxiliary uses of avoir
and être in the passé composé, the use of être as an auxiliary in passive constructions,
the use of aller as an auxiliary in the futur proche, and the existential y avoir. Type
frequency for a negator in a text was determined by counting this list of verbs with the
Unix wc utility.
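The Perl scripts themselves are not reproduced here. The Python below is only a sketch of the kind of tallies they are described as producing, working from a list of (tag, verb) pairs. The tag labels, the excluded "not_ne" tag, and the reading of type frequency as the number of distinct verbs per negator are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def count_negations(tagged, excluded_tags=("not_ne",)):
    """Tally hand-tagged negations.

    tagged is an iterable of (tag, verb) pairs, one per tagged example.
    Returns (token counts per tag, distinct-verb counts per tag, total).
    Tags listed in excluded_tags (hypothetical label for examples that
    turned out not to be negations) are left out of every tally.
    """
    tokens = Counter()
    verbs = defaultdict(set)
    for tag, verb in tagged:
        if tag in excluded_tags:
            continue
        tokens[tag] += 1
        verbs[tag].add(verb)
    types = {tag: len(seen) for tag, seen in verbs.items()}
    return tokens, types, sum(tokens.values())
```

Dividing each entry of the token counts by the total gives the per-text proportions used for the Progression hypothesis.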
The following chart shows that the Progression hypothesis held true for the verbs
in the corpus:
[Chart: Occurrence of Predicate Negators. Vertical axis: percent of total predicate negations. Horizontal axis: year, 1000 to 2000. Series: ne...pas, ne...point, ne...mie, ne...nient, ne alone.]
The relative frequency of ne alone seems to follow an S-shaped logistic curve.
The relative frequency of ne … pas would necessarily have followed an inverse of that
S-curve, except that competition from ne … point takes many of the tokens. Once
ne … point is on the decline, we see that its tokens are transferred to ne … pas.
The next chart and tables show that the texts in the corpus followed the
Analogical Extension hypothesis:
[Chart: Verbs without Pas or Point, 1542-1665. Vertical axis: occurrences with ne alone (percent). Horizontal axis: date, 1540 to 1666. Series: être, avoir, pouvoir, savoir, vouloir.]
Table 4 shows type frequencies for the most frequent verbs in Gargantua (1542)
and Médée (1556), the two sixteenth-century plays in the corpus. Table 5 shows similar
figures for L’Illusion Comique (1630).
Verb       Gloss    Token frequency    Type frequency (n=44)
être       be       11                 6
avoir      have     9                  3
vouloir    want     2                  2
aller      go       2                  1
entrer     enter    4                  1
Table 4. Type frequency of ne ... pas for high token-frequency constructions in the 16th
century.
Verb        Gloss              Token frequency    Type frequency (n=58)
être        be, auxiliary      24                 24
avoir       have, auxiliary    5                  4
vouloir     want               2                  2
mériter     deserve            2                  2
attendre    wait, expect       2                  2
Table 5. High type-frequency constructions for ne ... pas in L’Illusion comique.
These numbers suggest that the relatively high type frequency of être, avoir and
vouloir shown in Table 1 did indeed influence the productivity of these verbs in the
ne … pas negation shown in the chart above, as predicted by the Analogical Extension
hypothesis.
With respect to the Entrenchment and Dead End hypotheses, it is well-documented
in traditional grammars of French that there are a small number of contexts
in which pas and point have only infrequently been used in print, but an explanation is
rarely offered. A typical example is Ewert (1933:260):
In spite of the generalization of the negative complementary
particle, Mod.F. still makes a fairly extensive use of the simple ne: (a) in
fixed locutions (A Dieu ne plaise; N’importe; Il n’est pire eau que l’eau
qui dort) or verbal constructions of the type n’avoir garde, n’avoir cure;
(b) optionally with cesser + inf. (Il ne cesse de parler), with oser, savoir,
pouvoir; [...] (d) after exclamatory qui or que (Que n’est-il venu!; Qui ne
voit la raison de tout cela!); (e) often in subordinate clauses introduced by
conditional si, final que, or depending upon a negative, or expressing a
condition by inversion (Je n’ai pas d’amis qui ne soient les vôtres; N’eût
été la guerre ‘had it not been for the war’).
The Entrenchment hypothesis predicts that the constructions mentioned by Ewert
will have relatively high token frequencies. Table 6 shows the token frequency of
constructions that occurred more than three times out of a total of 211 occurrences of ne
alone in Gargantua (1542), the last text in the corpus before this shift occurred.
Construction    Gloss                             Frequency
être            be (auxiliary uses excluded)      56
si              conditional construction          26
pouvoir         be able                           19
savoir          know; be able                     15
faire           make; do                          9
avoir           have (auxiliary uses excluded)    8
final que       restrictive clauses               6
Table 6. Frequency of constructions appearing more than three times in Gargantua.
The two most frequent verbs after être are pouvoir and savoir, which are also the
two most likely to not take pas in Chart 4. Since the change to ne...pas was led by être
and avoir, we would not expect them to be entrenched, but all the high-frequency
constructions other than être, avoir and faire are among those listed by Ewert. In fact,
faire is occasionally used with ne alone through 1668. One possible explanation is that
faire is often used with point during the seventeenth century. Perhaps its high type
frequency was enough to overcome the entrenchment in this case.
The Dead End hypothesis predicts that there will be very few, if any, new
constructions using ne alone; in fact, we should expect to see all of Ewert’s constructions
attested in the Classical period. This is indeed the case: in addition to the constructions
listed in Table 6, the following constructions mentioned by Ewert are attested in
Gargantua: exclamatory qui or que; à Dieu ne plaise; n’importe; n’avoir cure; cesser;
oser; savoir; pouvoir. N’avoir garde is attested in L’Avare (1668).
In the pilot study I did not address the two Loser’s hypotheses for the ne … point
construction, and it is possible that the corpus was too small to support any investigation
of that.
The pilot study suggested that postverbal negation follows the Bybee and
Thompson model, but the corpus used was problematic in a few different ways. The first
is that it may have been too small. Time constraints prevented me from performing tests
of statistical significance in the original study, but statistical significance is necessary for
this study to have implications for anything beyond the texts studied. The various
hypotheses tested in this study require a variety of tests of statistical significance, and it
may well be that some of these tests require more texts, longer texts, or both.
The second issue is that the corpus was drawn from digitized texts easily available
on the World Wide Web, and thus not necessarily representative of French theater during
the period. It also mixed theatrical texts with film reviews, which fails to control for the
effect of genre on the data.
Another issue is that while some of the plays are in regular prose format, many of
them are in rhymed verse. Prose plays are rare before 1650 and verse plays become more
and more rare after 1750, so to cover the entire period it will be necessary to study both
genres. In the pilot study I compared the verse play Les Plaideurs by Jean Racine with
two contemporary plays by Molière, and found only small differences in the token
frequencies of the various negation constructions. I would expect that the two genres
would not differ significantly in their use of negation, but this will need to be confirmed.
It may be advisable to perform similar comparisons for comedies, dramas and other
genres as well.
Dissertation Timeline
Literature review, expert consultation and corpus collection
So far I have reviewed a portion of the scientific literature on the history of
French negation, and I have collected a number of other works on this topic. During the
initial literature review, I expect to obtain and read the vast majority of the work
published in the United States, Canada and the United Kingdom on this topic, and much
of the work published in France, Belgium and Switzerland.
I have not yet encountered another study that even acknowledges the lexical
diffusion of the embracing negation constructions in French, but I have not yet been able
to do a search that is comprehensive enough. Most of the articles on French negation that
I have found and read are written by researchers from the United States, Great Britain and
other countries, but relatively few of them are written by French researchers. There is a
significant possibility that there has been research done on this topic in France that I have
been unable to find in the United States or on the Internet. I already know of
four articles that are not held by any of the libraries to which I have access.
To conduct a more exhaustive search that would include French sources, I will
need to travel to France. This will also allow me to consult with French scholars working
in related areas. As a side benefit, it will allow me to acquire additional texts to complete
a representative corpus. Over the next several months I expect to collect texts to be used
in the corpus, but to be truly representative, it may be necessary to include texts that are
not available in the United States.
There are a number of libraries in France where corpus texts and secondary
sources may be found. Many of them are fortunately located in the Paris region,
including the Bibliothèque Nationale and the university libraries Sainte-Geneviève,
Sorbonne and Nanterre. More specifically, the Université de Paris III has libraries
specializing in linguistics and in literature that would be very useful. Between now and
the summer, I expect to compile an initial list of articles and books that I am unable to
find in the United States, and to make initial contact with French scholars in this area.
After having done research in France, I expect to have acquired copies of a
number of books and journal articles relevant to this topic. I also expect to have
consulted with both American and French experts in this area of linguistics, and to be
aware of related and possibly overlapping research. At that point I will know
whether anyone else has investigated lexical diffusion in this area, and can change my
topic if necessary. Finally, I expect to have acquired the sources necessary for a representative
corpus of French theater.
Data Analysis
Since the results from the pilot study were satisfactory, I plan to use substantially
the same methods with the dissertation study. The most important addition will be the
use of tests for statistical significance, including the Pearson product-moment correlation and
regression analysis, depending on the measurement. I also plan to test the two Loser’s
hypotheses.
Conclusion
When this is all over I hope to have a PhD and a tenure-track job, and have made
a significant contribution to the theory of grammaticization.
Appendix A: Data Sources
ABU: L’Association des Bibliophiles Universels, <http://cedric.cnam.fr/ABU>
ATHENA: Project at the University of Geneva, <http://un2sg4.unige.ch/athena>
Canal Plus “Cyberculture” journal,
<http://www.cplus.fr/html/cyberculture/cybermain.htm>
Chantez-vous français? <http://www.home.ch/~spaw2928/robin>
L’Express: <http://www.lexpress.fr>
Ministry of Culture:
<http://www.france.diplomatie.fr/culture/france/biblio/foire_aux_textes/>
ANONYMOUS. ca. 1000. La chanson de Roland, epic poem. ABU.
CORNEILLE, PIERRE. 1630. L’Illusion Comique, drama in verse. ABU.
DE LA HALLE, ADAM. 1215. Li Gieus de Robin et Marion, drama in verse. Chantez-vous
Français?
GANDILLOT, THIERRY, ed. 1998. L’Express, weekly magazine with film reviews.
L’Express.
LA PERUSE, JEAN DE. 1556. Médée, drama in verse. ATHENA.
MARIVAUX, PIERRE. 1730. Le Jeu de l’amour et du hasard, drama in prose. ABU.
MOLIERE (JEAN-BAPTISTE POQUELIN). 1665. Dom Juan, drama in prose. ABU.
1668. L’Avare, drama in prose. ABU.
1671. Les Fourberies de Scapin, drama in prose. ABU.
MUSSET, ALFRED DE. 1834. Lorenzaccio, drama in prose. Ministry of Culture.
RABELAIS, FRANÇOIS. 1542. Gargantua, drama in verse. ABU.
RACINE, JEAN. 1668. Les Plaideurs, drama in verse. Ministry of Culture.
ROSTAND, EDMOND. 1897. Cyrano de Bergerac, drama in prose. ABU.
References
BYBEE, JOAN. 2003. Mechanisms of change in grammaticization: The role of frequency.
Handbook of Historical Linguistics, ed. by Richard Janda and B. Joseph. Oxford:
Blackwell.
BYBEE, JOAN AND SANDRA THOMPSON. 1997. Three frequency effects in syntax.
Berkeley Linguistic Society.
EWERT, ALFRED. 1933. The French language. London: Faber & Faber.
HARRIS, MARTIN. 1978. The evolution of French syntax: A comparative approach. New
York: Longman.
HOPPER, PAUL J. AND ELIZABETH CLOSS TRAUGOTT. 1993. Grammaticalization. New
York: Cambridge University Press.
POSNER, REBECCA. 1985. Post-verbal negation in non-standard French: A historical and
comparative view. Romance Philology 39: 170-197.
PRICE, GLANVILLE. 1971. The French language: Present and past.
1997. Negative particles in French. De mot en mot: Aspects of medieval
linguistics, ed. by Stewart Gregory and D. A. Trotter. Cardiff: University of
Wales Press.
SCHWEGLER, ARMIN. 1983. Predicate negation and word-order change: A problem of
multiple causation. Lingua 61: 297-334.
VENNEMANN, THEO. 1974. Topics, subjects and word order: From SXV to SVX via
TVX. Historical linguistics I: Syntax, morphology, internal and comparative
reconstruction, ed. by S. C. Dik and J. G. Kooij. Amsterdam: North Holland.