The Lexical Diffusion of Postverbal Negation in French
Draft Dissertation Proposal
Angus B. Grieve-Smith
April 21, 2006

During the history of the French language, predicate negation has shifted from being expressed by the preverbal particle ne alone, to the “double negation” construction ne … pas, to the postverbal particle pas alone. This change is a well-studied phenomenon in French, but not much has been written on the intermediate “double negation” stage, where the construction spread from being used with just a few verbs in the sixteenth century to being almost universally required with every verb in the twentieth. In much of the literature this stage is treated as a homogeneous change, but on closer examination it can be seen as a rich, complex change that bears out several predictions from recent theories on language change. I am examining this change from the perspective of grammaticization theory, a subfield of functional linguistics. To keep the research manageable, I have restricted my focus to the genre of theater, with a wider study possible in the future. I have already conducted a pilot study indicating that the lexical diffusion of ne … pas in French theater follows the principles described by Bybee and Thompson (1997), and for my dissertation I propose an expanded version of this study, with a larger and more representative corpus.

Background

The history of French negation is one of the most popular examples of grammaticization. Schwenter (2003) points out that it is one of the three examples used by Jespersen (1917) to formulate general principles of the evolution of negation, later named “Jespersen’s Cycle” by Dahl (1979):

    The history of negative expressions in various languages makes us witness the following curious fluctuation: the original negative adverb is first weakened, then found insufficient and therefore strengthened, generally through some additional word, and this in its turn may be felt as the negative proper and may then in course of time be subject to the same development as the original word (Jespersen 1917).

Data background

To begin with, it is important to introduce a number of words that will be referred to throughout this study.

ne                  this old Latin predicate negator is rarely used in modern conversational French
pas                 derived from a noun meaning “step,” the “forclusive” (i.e. postverbal negative) particle in the Parisian dialect of Old French, and now the most common negator
point               the forclusive particle of the Norman dialects of Old French, usually negating NPs, and a competitor to pas in Middle French and Classical French
mie                 a common negator in the Rhine dialects of Old French
nient               preferred negator for the Northern dialects of Old French
mot, goutte, grain  “expressions of minimal value” occurring in particular constructions, e.g. il ne dit mot, “he won’t say a word”

Table 1. Words used in French negation.

Since the change from ne to ne … pas concerns negation of whole predicates, I will be excluding from the study a number of words and constructions that are used with the negative particle ne in contexts other than predicate negation. These are summarized in Table 2.

rien, personne, jamais, aucun, nullement, plus, ni, que, sinon
                       negative quantifiers, equivalent to English “nothing,” “no one,” “never,” “none,” “in no way,” “not any more,” “neither” and “only”; these will be excluded from the study
the subjunctive mood   ne is occasionally used with some “irrealis” subjunctive constructions, such as avant que “before”; these cases will also be excluded from the study
comparatives           ne is also used with some comparative constructions, which will be excluded from the study

Table 2. Words and constructions used in contexts other than predicate negation.

DRAFT 6/21/2017

Over the evolution of the French language, negation has shifted from being expressed by the preverbal particle ne alone to the postverbal particle pas alone. Table 3 summarizes the stages of these changes.

Vulgar Latin (100 - 900)                      ne alone, plus several idiomatic expressions derived from object nouns; largely unwritten
Old French and Middle French (900 - 1500)     mostly ne alone, but other expressions with both ne and one of pas, point, mie or nient
Classical French (1500 - 1900)                most negations use both ne and either pas or point; several uses of ne alone remain
Modern French (1900 - Present)                pas is the negator used in almost all cases; ne alone remains in a few idioms; both are used together in formal language

Table 3. Stages in the evolution of French negation.

The original French negator was ne (or non, according to some scholars), a reflex of the original Indo-European sentence negator, as we see in example (1), from the Chanson de Roland, one of the earliest extant French texts.

1) s   il   ne  cumbat  a   cele  gent    hardie
   if  3sg  NE  combat  to  that  people  hardy
   “if he does not fight that hardy people”
   -Chanson de Roland (ca. 1000), line 2600

While the language was still considered Latin, ne preceded the verb, and a number of supplemental particles were added to specify the scope of the negation. Many of these particles were negative quantifiers, determiners and adverbs. Examples of these include (in their modern forms) jamais “never” (as shown in example 2), rien “nothing,” plus “not any more,” que “only,” aucun/aucune “no” (determiner) and nulle part “nowhere.”

2) n   asemblereit    jamais  Carl     si    grant  esforz
   NE  assemble-COND  never   Charles  such  big    force
   “Charles would never put together such a large force”
   -Chanson de Roland (ca.
1000), line 655

Even in Late Latin, however, certain negative particles began to be used as “emphatic” negators, to supplement ne. As example (3) shows, these particles are found in the earliest French texts. The most common of these particles in Old French were mie (which originally meant “crumb”), point (“dot,” “point,” “stitch”), pas (“step”) and nient (“a nothing”):

3) lor       enseignes  n   i      unt      mie  ubliees
   3pl.POSS  teachings  NE  there  AUX.3sg  not  forget.PPl
   “[the Franks and peasants] have not forgotten their teachings”
   -Chanson de Roland (ca. 1000), line 3549

Schwenter (to appear) takes a closer look at the notion of an “emphatic negator” in data from present-day Catalan, (standard) Italian and Brazilian Portuguese, where the double-negation constructions no … pas, non … mica and não … não, respectively, exist alongside simple predicate negators no, non and não. He argues that in these languages the double-negation constructions have the restricted usage of marking the denial of a discourse-activated proposition. It is possible that this was the case in Middle French as well, and that the usage was semantically and pragmatically extended at the same time as it became more frequent.

Price (1997) mentions giens “person”, grain “grain, particle” and gote “drop” as additional postverbal negative particles in Old and Middle French. There are also a number of expressions which Price calls “idioms of minimal value,” such as bouton “button,” clou “nail,” and areste “fish bone,” which have only ever been attested in a small number of idiomatic expressions.

The list of acceptable negative particles has grown shorter and shorter over the past thousand years, until in the present day (as shown in example 4) ne … pas is the only form used in conventional writing, and aside from a few fixed expressions, ne … point is considered an affected usage.

4) il       semble  que   le   phénomène   ne  soit     pas
   PRO.3sg  seem    COMP  DET  phenomenon  NE  be.SUBJ  not
   dû       à   un   effet   spécial
   owe.PPl  to  DET  effect  special
   “it appears that the phenomenon is not due to a special effect”
   -L’Express, August 27, 1998

To the extent that there was a difference in meaning or usage between the various particles, it was slight and difficult to determine. Price (1997) makes a convincing case that in Old and Middle French, ne … pas, ne … mie and ne … nient were commonly used as negative adverbs, while ne … point was used more often in conjunction with partitive constructions, as in example 5.

5) il       n'  y a    poinct  d'   enchantement
   PRO.3sg  NE  EXIST  not     DET  enchantment
   “there is no enchantment”
   -Gargantua (1542), line 753

He also shows that the distribution of ne … pas, ne … mie and ne … nient was largely dialectal, with ne … nient used in Wallonia, ne … mie in Lorraine and ne … pas in the rest of the oïl region, although poets who were speakers of one dialect would occasionally borrow the form of another dialect to make a rhyme.

Since most of the negative particles are identical in form to nouns, it is often supposed that they are grammaticized from these nouns; for example, pas is also the word for “step,” as shown in example 6:

6) segnurs  le   pas   tenez
   lord.PL  DET  step  hold-IMP
   “my lords, stay on track”
   -Chanson de Roland (ca. 1000), line 2851

so it is plausible that it once existed in constructions meaning “I’m not going one step,” and that this then came to mean “I’m not going at all,” and then simply “I’m not going.”

Although there may be evidence for this early stage of grammaticization in the corpus of Late Latin texts, by the time of Old French (900-1200) these words are clearly not being used as nouns in these negative constructions. In Old French it is common for nouns to take articles, and these particles never do.
Many of them also have distinctly adverbial uses.

In the Middle French period (1200-1500), the cultural power of Paris grew, and with it the strength of norms based on Parisian usage. Thus Lorrain ne … mie and Walloon ne … nient fell out of favor, although according to Price (1997) they were common in regional spoken varieties as recently as the Atlas linguistique de la France of 1902.

In Classical and Contemporary French (1500-present), then, the choices for negating sentences are ne alone, ne … pas or ne … point. The distinction in meaning between ne … pas and ne … point has troubled scholars since at least the middle of the seventeenth century. As Price (1997) points out, the Old French syntactic distribution where point was used primarily as a substantive while ne … pas was used as an adverb appears to have gradually eroded, until ne … pas could be used with some partitive constructions by the late seventeenth century, and with all by the early nineteenth. Example 7 shows ne … pas used with a partitive construction in the seventeenth century.

7) il       n'  y a    pas  de   réplique  à   cela
   PRO.3sg  NE  EXIST  not  DET  reply     to  that
   “there is no reply to that”
   -L’Avare (1668), line 1002

From the middle of the nineteenth century on, ne … point is rarely used in written text, other than to provide a “country” feel.

From the late sixteenth to early seventeenth century (the beginning of the Classical French period), ne … pas displaces ne alone as the negative construction of choice. From that point on, ne alone accounts for a relatively small number of negative constructions, and eventually dwindles to small numbers in written text, and is not used at all in casual speech.

Recently, then, ne … pas has become almost obligatory in French predicate negation. Once this happened, the ne became redundant, and this allowed it to be reduced to nothing.
It has been dropped more and more from casual speech, until in present-day speech postverbal pas is the sole negator. Ne is still required by the norms of written language, although it is often omitted to suggest a colloquial or trendy style, as in example 8:

8) la   Sibérie,  c'       est  pas  trop      le   pays     du  cyber
   DET  Siberia   PRO.3sg  be   not  too-much  DET  country  of  high-technology
   “Siberia isn’t exactly a cyberland”
   -Title of an article in the Webjournal Cyberculture from Canal Plus

We have thus come full circle, from ne alone to pas alone.

In Jespersen’s (1917) discussion, French differs from the other two languages, Danish and English, in one significant respect: the original negative particle ne is placed before the verb and the “reinforcing” particles after the verb, while in Danish and English, both particles were placed after the verb. In English, ne was followed by a wiht, “a thing,” and the whole expression was eventually reduced to naught and then to not. In contrast, in French ne and pas are almost always separated by the verb, which means that they cannot contract into a single word.

Consistent with the notion of grammaticization, the steps described above can be summarized in a grammaticization cline, as used by Hopper and Traugott (1993). The pathway looks as follows:

Pathway                Period         Examples
noun                   Latin, LL      passum, punctus, mica, nient
> emphatic negator     OFr, MFr       ne … pas, ne … point, ne … mie, ne … nient
> embracing negation   Classical Fr   ne … pas, ne … point
> sole negator         Modern Fr      pas

Previous approaches

Various explanations have been suggested by other scholars to account for these changes. In this context, it is important to note when an explanation is being offered for the underlying motivation of the change in general, and when it is being put forth as the mechanism of a specific part of the change.
To clarify this distinction, I will deal with the underlying motivations first, and then turn to descriptions of specific mechanisms.

Several scholars see the change in French negation as a small part of a large-scale shift in French syntax. Some argue that cognitive constraints implied by the Greenberg universals direct the entire process. Vennemann (1974) suggests that the shift to postverbal negation was the tail end of a greater shift from the Latin verb-final word order to French SVO word order. Harris (1978), on the other hand, claims that the negation shift is just the beginning of a shift to object-verb word order in French. Posner (1985), building on Vennemann’s claims, accounts for the retention of preverbal cognates of ne in other Romance languages by positing that in those languages the negator changes in category from an adverb to a clitic, but does not provide any evidence for why the two categories would be treated differently by speakers.

Posner (1985) also suggests the alternative hypothesis that the shift may have been due to influence from German. However, Schwegler (1983) points out that there is no evidence for the strengthening of ne … pas and subsequent loss of ne before 1200, and that German influence in France ended by 900; this is therefore an unlikely explanation for France proper, although it may have played a role in similar changes in Franco-Provençal and Romantsch.

Other research takes a more piecemeal approach to the process, focusing on small components of the change. Price (1997) suggests that the discontinuous (ne … particle) construction (in its early form as “emphatic negation”) may have been borrowed into Late Latin from the Celtic languages it supplanted through a period of “Celtic-Romance bilingualism.” Schwegler (1983) offers the semantic explanation that emphatic negators are a useful part of communication, and that languages can invent them spontaneously.
Posner attributes to “classic histories of the language” the position that at some point between 1300 and 1500, French stress shifted from the old Latin pattern of penultimate pitch-contour stress to its current system of lengthening the last syllable. As a result the preverbal clitics, including ne, went from being occasionally stressed to never stressed. The loss of this stress heightened the phonological reduction of ne and motivated greater use of ne … pas. As Posner points out, this would not account for neighboring varieties including Occitan and some of the Northern dialects of Italian, which have postverbal negation without the French stress pattern.

Ewert (1933) offers a frequency-based mechanism for the increase of the use of ne … pas. His argument is that “the constant use of these words as negative particles resulted in a weakening of their meaning and their extension,” after which ne became redundant and dispensable. Ne, in turn, became reduced through frequent use.

In this case, it is possible that everyone could be right to some extent. Or almost everyone; it seems unlikely that these changes could have been due to both Vennemann’s XV -> VX shift and Harris’s VX -> XV shift. On the other hand, general cognitive impulses could play a role in making one construction “feel more comfortable” to the speakers. Language contact could also have provided a model that speakers emulated, and changes in stress patterns, semantic bleaching and reduction could have triggered the rise of ne … pas. What remains to be seen is the relative importance of all these factors.

Of the studies mentioned above, most rely simply on the existence or absence of a particular form in the texts from the relevant period. Only Price counts the relative frequency of items, which lends more support to his claims.
Theoretical background

In many studies of language change, the changes are assumed to be gradual and constant, affecting every lexical item identically. This is sometimes a useful abstraction, but on closer examination it is clear that every change starts with a small subset of the lexicon and spreads, in a process known as lexical diffusion. This phenomenon has been observed at least as early as Schuchardt (1885), and described in greater detail by Wang (1969) and Labov (1981).

Joan Bybee (Hooper 1976, Bybee 2002) refined the notion and incorporated it into the subset of functional linguistics known as grammaticization or grammaticalization theory. In a nutshell, grammaticization describes the process by which grammatical elements are formed from non-grammatical words and constructions. According to Hopper and Traugott (1993), although grammaticization was first named by Meillet (1912), some of the processes involved were described by Humboldt (1825).

In relating lexical diffusion and grammaticization, Bybee and Thompson (1997) make a distinction between “token frequency,” which is simply the number of occurrences of particular words or constructions as a fraction of the total words in a text, and “type frequency,” which “counts how many different lexical items a certain pattern or construction is applicable to,” i.e. how many different contexts it appears in.

Bybee and Thompson describe three major effects of frequency on syntactic structure. In Analogical Extension, patterns with high type frequency are more “productive” in terms of spreading to other contexts.
“The more lexical items that are heard in a certain position in a construction, the less likely it is that the construction will be associated with a particular lexical item and the more likely it is that a general category will be formed over the items that occur in that position.” (Bybee and Thompson 1997: 71)

In the Conserving Effect, lexical items with high token frequency are resistant to the effects of analogical extension.

“The more a form is used, the more its representation is strengthened, making it easier to access the next time. Words that are strong in memory and easy to access are not likely to be replaced by new forms created with the regular pattern.” (Bybee and Thompson 1997: 65)

The third frequency effect, the Reduction Effect, is not relevant to this study.

Many examples of grammaticization are complicated by the fact that in the early stages there is often more than one lexical item filling closely related roles. This is often called specialization, but some have opted for the less confusing term of competition. The competition between the postverbal French negators pas, point, mie and nient is used as an example of this by Hopper and Traugott (1993).

Proposed study

Since the evolution of negation in French is literally a textbook case of grammaticization, it is an ideal test case for Bybee and Thompson’s theories about lexical diffusion. I propose that the adoption of ne … pas is a form of analogical extension, and will thus show evidence of the Extension Effect of type frequency on productivity, and of the Conserving Effect of token frequency on entrenchment.

Hypotheses

Bybee and Thompson’s theory makes specific predictions about negation in French, and my aim is to test these hypotheses against data from a corpus. Here are the predictions.

1. The Progression hypothesis: that the spread of language change is gradual, not abrupt, and thus that the token frequencies of the various predicate negators in a text are in direct relation to the date of production of the text.

2. The Analogical Extension hypothesis: that the spread of an analogical change is related to type frequency, and thus that the token frequencies of the various predicate negators with a particular lexical item in a text have direct relationships with the type frequency of that lexical item during the period when the text was produced.

3. The Entrenchment hypothesis: that the spread of analogical change is blocked in high-token-frequency contexts, thus that the rate of decline of the relative token frequency of a particular lexical item with ne alone over a certain period has a direct inverse relationship with the per-word token frequency of that lexical item during the preceding period.

4. The Dead End hypothesis: that once an analogical change is started, the old form ceases to be productive, and any new constructions or lexical items will take the new form instead.

In their discussion of the analogical extension and entrenchment hypotheses, Bybee and Thompson do not make explicit predictions as to the exact nature of the relationships between these changes and frequency. There are strong and weak versions of each of these hypotheses. The weak version is that there is a one-to-one mapping between the order of the items taking part in the change and their type or token frequency rank. The strong version is that there is not just a mapping, but a gradient relationship between the date of production and the frequency (token or type) of the lexical item or context. Such a gradient relationship could be linear, quadratic or described by some other function.
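The contrast between the weak and strong versions can be illustrated with a small sketch. It assumes, as one possible reading, that the weak version corresponds to a rank correlation (only the order of items matters) and that one candidate strong version corresponds to a linear correlation on the raw values. The function names and the numbers are placeholders invented for illustration; they are not drawn from any corpus.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented placeholder values, NOT corpus data: a per-item frequency in an
# earlier period, and the year each item crosses some adoption threshold.
item_frequency = [1, 2, 4, 8, 16]
threshold_year = [1580, 1590, 1600, 1610, 1620]

weak = spearman(item_frequency, threshold_year)   # order of items only
strong = pearson(item_frequency, threshold_year)  # linear relationship
print(f"rank correlation: {weak:.3f}, linear correlation: {strong:.3f}")
```

With these placeholder values the ordering is perfectly preserved, so the rank correlation is at its maximum, while the linear correlation is lower because the frequencies double at each step. Data of this shape would satisfy the weak version and would call for a non-linear function under the strong version.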
It is necessary to account for the effects of competition among the various postverbal negators, particularly for the Progression hypothesis. There are two possibilities: the date of production may be correlated with the frequency of individual negators, or it may be inversely correlated with the frequency of the old preverbal negation. In other words, is it the spread of particular forms we are concerned with, or the spread of things that are not ne alone?

Once a form “loses” the competition and begins to decline, it will likely be subject to the Entrenchment and Dead End hypotheses as well.

5. The Loser’s Entrenchment hypothesis: that when a form loses a grammaticization competition, its decline is blocked in high-token-frequency contexts, thus that the rate of decline of the relative token frequency of a particular lexical item with that form over a certain period has a direct inverse relationship with the per-word token frequency of that lexical item during the preceding period.

6. The Loser’s Dead End hypothesis: that when a form loses a grammaticization competition, it ceases to be productive, and any new constructions or lexical items will take the winning form instead.

Corpus

The best way to test a hypothesis about language evolution is by measuring a corpus of texts from the relevant period. The hypothesis makes a claim about the language in general, and the texts are selected to approximate the language. In this case, the specific claims made by Bybee and Thompson (1997:65 and 71, as quoted above) are that if a lexical item has a high type frequency in an individual’s input there will be extension, and if it has a high token frequency in an individual’s output there will be entrenchment. It is not clear whether Bybee and Thompson intended to make an input/output distinction or not.
Do these frequency effects relate more to perception (a sense of “the language” shaped by input from the rest of the community) or to production (the individual’s daily language practice)? What is the role of norms and standards, whether genre-specific or language-wide?

The challenge, as in every corpus-based study, is to find something that approximates the language user’s input and/or output. While in this age of pervasive surveillance it may be possible to capture the entirety of a person’s language input and output, this was absolutely impossible in 1900, much less in 1500. However, the premise of corpus linguistics is that we are able to approximate a language user’s input and output through a representative selection of texts from a period of the user’s life. An ideal corpus contains a balance of genres that approximates the input that a typical member of the language community perceives.

Of course, the majority of most people’s language input and output is spontaneous conversation, and the proportion of conversation only gets higher as we look back in time to periods when there was simply less to read. It makes sense that the transmission of unconscious language changes like grammaticization took place primarily through spontaneous conversation. Because of this, spontaneous conversation is a high priority in any corpus.

Since the invention of the phonograph, and especially since the invention of audiotape, we have samples of conversation for a corpus, but before that the technology was simply not sufficient for recording conversations in real time. There are some fairly detailed transcripts available, but they are not necessarily representative. Anthony Lodge prepared a corpus of “Paris speech of the past,” for his Sociolinguistic History of Parisian French (Lodge 2004).
He assembled three sources: the journal of Jean Héroard, the personal physician of Louis XIII for the first twenty-seven years of his life (1601-1628), which includes long transcripts of things that the young Dauphin said; the journal of Jacques-Louis Ménétra, an 18th-century Parisian glassworker; and a collection of political pamphlets claiming to present the words of working-class Parisians, but actually containing fictionalized dialogues written by educated upper-class men. These journals and pamphlets are thus not necessarily representative and don’t provide adequate coverage of the period under study.

Once we lose the possibility of studying spontaneous conversation, we also lose the possibility of a truly balanced and representative corpus. At that point, the question is primarily what kind of corpus most closely mimics the input of the language users. If we understood more about the relations between genres, it might be possible to set up a complicated system of genre substitutions and thus provide an approximation to the input that users were exposed to. Since we don’t, and since a genre imbalance can significantly skew results, it may be simplest to choose one genre and look for the effect there.

The genre that is next closest to speech is theater. Unlike novels and newspapers, printed versions of theatrical works in French are available for almost the entire history of the language, at least as far back as the publication of Li gieus de Robin et Marion in 1275. Theater is fiction, but there is an expectation of naturalness in the dialogue that is not necessarily present in legal, historical or philosophical works. It is true that the degree of naturalness is determined by convention and has changed over the years, but it has always been there to some extent. Theater is thus in many ways the next best thing to speech, and theatrical texts are available for the entire period under study.
For these reasons, I have chosen to restrict my corpus to theatrical texts for this investigation.

The selection of texts is a problematic issue. Ideally, the texts in the corpus will be representative of the dramatic works that a typical French speaker was exposed to. Or should it be what a typical playwright was exposed to? Either way, the texts available will not necessarily be representative in their totality. It is inevitable that many theatrical works, especially from earlier periods, have either not been written down or not been published, or the written versions have not survived. Of the texts that survived, many may not currently be available in electronic format.

Another problem is the issue of typicality. From any period, the theatrical works that are published, preserved and digitized are likely to be the most original and most popular, or at least most popular among those doing the publication, preservation and digitization. These are not necessarily going to represent the typical theatergoing experience at any given time. Because of this, it will most likely be necessary to find additional texts and digitize them.

It is also important to note that many plays are translations from other languages, most notably Italian. The translators may have chosen expressions that were closer to the original Italian and not necessarily typical French. In addition, many of the earlier plays were in verse, some of them accompanied by music. It will be necessary to test the corpus to determine whether any of these factors affect the use of negation.

When studying spontaneous conversation, it is important to control for the effects of linguistic accommodation and thus not directly compare speech from two participants in the same conversation. In the case of plays, it is not clear that playwrights would be aware of accommodation or make any significant effort to reproduce it.
Thus it does not seem necessary to control for accommodation in theatrical texts. However, it is important to separate dialogue from stage directions, and to be aware if particular characters are presented as speaking a marked dialect, as non-native speakers, or with speech difficulties.

Analysis

The heart of the study consists of testing each of the hypotheses above against the corpus. These hypotheses are repeated below for convenience. Fortunately, predicate negation is relatively straightforward to study in French. There are a limited number of negative particles, and although most of them have uses other than predicate negation, it is relatively easy to separate out those uses.

1. The Progression hypothesis: that the spread of language change is gradual, not abrupt, and thus that the token frequencies of the various predicate negators in a text are in direct relation to the date of production of the text.

This is the easiest hypothesis to test: simply count the proportions of all the predicate negators in each text and test for correlation with various functions of the date of publication. Possible functions for testing may be suggested by a scatter plot or by additional theoretical work. To account for the possible effects of competition among the various postverbal negators here, it will also be necessary to test for inverse correlation between the type frequency of ne alone and the date of publication.

2. The Analogical Extension hypothesis: that the spread of an analogical change is related to type frequency, and thus that the token frequencies of the various predicate negators with a particular lexical item in a text have direct relationships with the type frequency of that lexical item during the period when the text was produced.

This will likely be the most difficult of the hypotheses to test, because of the difficulty involved in determining type frequency.
What counts as a distinct “type”? How is it determined? What is a lexical item in this context? Is it a verb form? A lemmatized verb? Are participles, infinitives, object noun phrases, sentential complements and adverbs relevant to determining a distinct type?

Another issue involves how to determine the period when the text was produced. Does the text itself count as input? Should the dates be broken into absolute ranges (for example, 1625 – 1650), or should the period be specific to each text (for example, the period for L’Illusion Comique, published in 1630, could be 1605 – 1630)? Should it count as the literate lifetime of the playwright?

For the strong form of this hypothesis, once the various lexical items and their type frequencies have been established, the remaining problem is similar to that for the progression hypothesis: test the type frequencies of each lexical item for correlation with various functions of the token frequency with that negator. For the weak form it is more straightforward: test for simple correlation between the type frequency rank of each lexical item during a particular period and the order in which the token frequency of each lexical item with each negator reaches a particular threshold.

It doesn’t seem that competition is relevant for the Analogical Extension hypothesis. The hypothesis makes a claim that the spread of a particular change will be correlated with the type frequency of the lexical items as they change, so it applies to one change at a time.

3. The Entrenchment hypothesis: that the spread of analogical change is blocked in high-token-frequency contexts, thus that the rate of decline of the relative token frequency of a particular lexical item with ne alone over a certain period has a direct inverse relationship with the per-word token frequency of that lexical item during the preceding period.
Token frequency is much more straightforward to measure than type frequency, but measuring the rate of change is somewhat problematic. What period is used? What is used for the preceding period? I will need to count the token frequencies of each lexical item with each predicate negator in each text and test for correlation with various functions of the token frequency with that negator. For the weak form it is even more straightforward: test for simple correlation between the token frequency rank of each lexical item during a particular period and the order in which the token frequency of each lexical item with each negator reaches a particular threshold.

4. The Dead End hypothesis: that once an analogical change is started, the old form ceases to be productive, and any new constructions or lexical items will take the new form.

This is a simple item to test, and doesn’t even require any counting. All that is necessary is to set a date for the beginning of the analogical extension, make a list of all of the constructions using the old form, and check to see whether they were all attested before the beginning of the change.

5. The Loser’s Entrenchment hypothesis: that when a form loses a grammaticization competition, its decline is blocked in high-token-frequency contexts, thus that the rate of decline of the relative token frequency of a particular lexical item with that form over a certain period has a direct inverse relationship with the per-word token frequency of that lexical item during the preceding period.

6. The Loser’s Dead End hypothesis: that when a form loses a grammaticization competition, it ceases to be productive, and any new constructions or lexical items will take the winning form instead.

These two hypotheses having to do with competition are tested in identical ways to the regular Entrenchment and Dead End hypotheses.
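As a rough illustration of the counting-and-correlation procedure described for the Progression hypothesis, the following Python sketch computes the share of ne alone among the predicate negations in each text and its Pearson product-moment correlation with the date of publication. The dates and counts are invented placeholders, not figures from the corpus, and the names proportions, pearson_r and toy_corpus are hypothetical. The sketch does not compute statistical significance, which would require an additional test.

```python
from math import sqrt


def proportions(counts):
    """Convert raw tag counts for one text into shares of all predicate negations."""
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder data (NOT corpus figures): (date of publication, counts per negator).
toy_corpus = [
    (1500, {"ne_alone": 80, "ne_pas": 10, "ne_point": 10}),
    (1600, {"ne_alone": 50, "ne_pas": 30, "ne_point": 20}),
    (1700, {"ne_alone": 20, "ne_pas": 60, "ne_point": 20}),
    (1800, {"ne_alone": 5, "ne_pas": 90, "ne_point": 5}),
]

dates = [date for date, _ in toy_corpus]
ne_alone_share = [proportions(counts)["ne_alone"] for _, counts in toy_corpus]

# A strongly negative r would be consistent with a gradual decline of ne alone.
r = pearson_r(dates, ne_alone_share)
print(f"r(date, share of ne alone) = {r:.3f}")
```

The same pearson_r function could be applied to a transformed date (for example, a logistic function of the year) if a scatter plot suggests that the relationship is not linear.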
It will also be necessary to test other possible relationships, if only to rule them out. A factor analysis can determine whether the relationships hypothesized above are more important than relationships between token frequency and verb class, tense, subject, clitic pronouns, the presence of other adverbs, the use of dislocation constructions, and possibly other factors.

Pilot study

In 1998 I conducted a pilot study of thirteen texts published between 1000 and 1998 (for a total of 250,000 words). The study is not of publishable quality, but it is consistent with the lexical diffusion pattern described by Bybee and Thompson (1997). The analogical extension of these constructions is led by the high-type-frequency verbs être and avoir, and gradually diffuses through the other verbs beginning with vouloir, and continuing in descending order of type frequency. The extension was resisted by the relatively low-type-frequency, high-token-frequency verbs pouvoir, savoir and faire. After the extension began, no new constructions using ne alone appeared in French, according to Ewert (1933), confirming Bybee and Thompson’s prediction that such entrenched constructions are resistant to change.

The texts were analyzed with a series of Perl programs written especially for this project. The first, vtag, allowed me to tag each negation by type: ne alone, ne … pas, ne … point, ne … mie, ne … nient, partial negation, subordinate irrealis, or examples which were not after all instances of ne. Another script, count1, reported frequency counts for each tag in the text, and for the total number of predicate negators.
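The tag-then-count step described above could be approximated along the following lines. This is only a sketch in Python, not the original Perl: the inline tag format "<neg:TAG>", the tag names, and the sample sentences are all assumptions made for illustration, since the actual output format of vtag is not described here.

```python
import re
from collections import Counter

# Hypothetical tag names for the predicate negators; the real vtag inventory
# and markup are not documented here, so "<neg:TAG>" is an assumed format.
PREDICATE_TAGS = {"alone", "pas", "point", "mie", "nient"}
TAG_PATTERN = re.compile(r"<neg:([a-z_]+)>")


def count_tags(tagged_text):
    """Return (counts for every tag found, total number of predicate negators).

    Tags outside PREDICATE_TAGS (e.g. partial negation) are still counted
    individually but are left out of the predicate-negator total.
    """
    counts = Counter(TAG_PATTERN.findall(tagged_text))
    total_predicate = sum(counts[tag] for tag in PREDICATE_TAGS)
    return counts, total_predicate


# Invented sample sentences, tagged by hand for the purposes of this sketch.
sample = (
    "Je ne<neg:alone> sais. Il ne<neg:pas> vient pas. "
    "Je ne<neg:partial> vois rien. Elle ne<neg:point> parle point."
)

counts, total = count_tags(sample)
for tag, n in sorted(counts.items()):
    print(tag, n)
print("predicate negators:", total)
```

Keeping the excluded categories as explicit tags, rather than deleting them, makes it possible to check later how many instances of ne were set aside and why.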
A third script, verba, helped me hand-count the verbs used in these negations; for this study I chose to count all the various conjugated forms of a verb as instances of the same verb, with five exceptions: the auxiliary uses of avoir and être in the passé composé, the use of être as an auxiliary in passive constructions, the use of aller as an auxiliary in the futur proche, and the existential y avoir were all counted as separate verbs. Type frequency for a negator in a text was determined by counting this list of verbs with the Unix wc utility.

The following chart shows that the Progression hypothesis held true for the verbs in the corpus:

[Chart: Occurrence of Predicate Negators. Vertical axis: percent of total predicate negations; horizontal axis: year, 1000 to 2000. Series: ne...pas, ne...point, ne...mie, ne...nient, ne alone.]

The relative frequency of ne alone seems to follow an S-shaped logistic curve. The relative frequency of ne … pas would necessarily have followed an inverse of that S-curve, except that competition from ne … point takes many of the tokens. Once ne … point is on the decline, we see that its tokens are transferred to ne … pas.

The next chart and tables show that the texts in the corpus followed the Analogical Extension hypothesis:

[Chart: Verbs without Pas or Point, 1542-1665. Vertical axis: occurrences with ne alone; horizontal axis: date, 1540 to 1666. Series: être, avoir, pouvoir, savoir, vouloir.]

Table 4 shows type frequencies for the most frequent verbs in Gargantua (1542) and Médée (1558), the two sixteenth-century plays in the corpus. Table 5 shows similar figures for L’Illusion Comique (1630).

Verb      Gloss   Token frequency   Type frequency (n=44)
être      be      11                6
avoir     have    9                 3
vouloir   want    2                 2
aller     go      2                 1
entrer    enter   4                 1

Table 4. Type frequency of ne ... pas for high token-frequency constructions in the 16th century.
Verb       Gloss             Token frequency   Type frequency (n=58)
être       be, auxiliary     24                24
avoir      have, auxiliary   5                 4
vouloir    want              2                 2
mériter    deserve           2                 2
attendre   wait, expect      2                 2

Table 5. High type-frequency constructions for ne ... pas in L’Illusion comique.

These numbers suggest that the relatively high type frequency of être, avoir and vouloir shown in Tables 4 and 5 did indeed influence the productivity of these verbs in the ne … pas negation shown in the chart above, as predicted by the Analogical Extension hypothesis.

With respect to the Entrenchment and Dead End hypotheses, it is well documented in traditional grammars of French that there are a small number of contexts in which pas and point have only infrequently been used in print, but an explanation is rarely offered. A typical example is Ewert (1933:260):

In spite of the generalization of the negative complementary particle, Mod.F. still makes a fairly extensive use of the simple ne: (a) in fixed locutions (A Dieu ne plaise; N’importe; Il n’est pire eau que l’eau qui dort) or verbal constructions of the type n’avoir garde, n’avoir cure; (b) optionally with cesser + inf. (Il ne cesse de parler), with oser, savoir, pouvoir; [...] (d) after exclamatory qui or que (Que n’est-il venu!; Qui ne voit la raison de tout cela!); (e) often in subordinate clauses introduced by conditional si, final que, or depending upon a negative, or expressing a condition by inversion (Je n’ai pas d’amis qui ne soient les vôtres; N’eût été la guerre ‘had it not been for the war’).

The Entrenchment hypothesis predicts that the constructions mentioned by Ewert will have relatively high token frequencies. Table 6 shows the token frequency of constructions that occurred more than three times out of a total of 211 occurrences of ne alone in Gargantua (1542), the last text in the corpus before this shift occurred.
Construction   Gloss                            Frequency
être           be (auxiliary uses excluded)     56
si             conditional construction         26
pouvoir        be able                          19
savoir         know; be able                    15
faire          make; do                         9
avoir          have (auxiliary uses excluded)   8
final que      restrictive clauses              6

Table 6. Frequency of constructions appearing more than three times in Gargantua.

The two most frequent verbs after être are pouvoir and savoir, which are also the two most likely to not take pas in Chart 4. Since the change to ne...pas was led by être and avoir, we would not expect them to be entrenched, but all the high-frequency constructions other than être, avoir and faire are among those listed by Ewert. In fact, faire is occasionally used with ne alone through 1668. One possible explanation is that faire is often used with point during the seventeenth century. Perhaps its high type frequency was enough to overcome the entrenchment in this case.

The Dead End hypothesis predicts that there will be very few, if any, new constructions using ne alone; in fact, we should expect to see all of Ewert’s constructions attested in the Classical period. This is indeed the case: in addition to the constructions listed in Table 6, the following constructions mentioned by Ewert are attested in Gargantua: exclamatory qui or que; à Dieu ne plaise; n’importe; n’avoir cure; cesser; oser; savoir; pouvoir. N’avoir garde is attested in L’Avare (1668).

In the pilot study I did not address the two Loser’s hypotheses for the ne … point construction, and it is possible that the corpus was too small to support any investigation of that.

The pilot study suggested that postverbal negation follows the Bybee and Thompson model, but the corpus used was problematic in a few different ways. The first is that it may have been too small.
Time constraints prevented me from performing tests of statistical significance in the original study, but statistical significance is necessary for this study to have implications for anything beyond the texts studied. The various hypotheses tested in this study require a variety of tests of statistical significance, and it may well be that some of these tests require more texts, longer texts, or both.

The second issue is that the corpus was drawn from digitized texts easily available on the World Wide Web, and thus not necessarily representative of French theater during the period. It also mixed theatrical texts with film reviews, which fails to control for the effect of genre on the data.

Another issue is that while some of the plays are in regular prose format, many of them are in rhymed verse. Prose plays are rare before 1650 and verse plays become more and more rare after 1750, so to cover the entire period it will be necessary to study both genres. In the pilot study I compared the verse play Les Plaideurs by Jean Racine with two contemporary plays by Molière, and found only small differences in the token frequencies of the various negation constructions. I would expect that the two genres would not differ significantly in their use of negation, but this will need to be confirmed. It may be advisable to perform similar comparisons for comedies, dramas and other genres as well.

Dissertation Timeline

Literature review, expert consultation and corpus collection

So far I have reviewed a portion of the scientific literature on the history of French negation, and I have collected a number of other works on this topic. During the initial literature review, I expect to obtain and read the vast majority of the work published in the United States, Canada and the United Kingdom on this topic, and much of the work published in France, Belgium and Switzerland.
I have not yet encountered another study that even acknowledges the lexical diffusion of the embracing negation constructions in French, but I have not yet been able to do a search that is comprehensive enough. Most of the articles on French negation that I have found and read are written by researchers from the United States, Great Britain and other countries, but relatively few of them are written by French researchers. There is a significant possibility that there has been research done on this topic in France that I have been unable to find in the United States or on the Internet. I already know of four articles that are not held at any of the libraries to which I have access.

To conduct a more exhaustive search that would include French sources, I will need to travel to France. This will also allow me to consult with French scholars working in related areas. As a side benefit, it will allow me to acquire additional texts to complete a representative corpus. Over the next several months I expect to collect texts to be used in the corpus, but to be truly representative, it may be necessary to include texts that are not available in the United States.

There are a number of libraries in France where corpus texts and secondary sources may be found. Many of them are fortunately located in the Paris region, including the Bibliothèque Nationale and the university libraries Sainte-Geneviève, Sorbonne and Nanterre. More specifically, the Université de Paris III has libraries specializing in linguistics and in literature that would be very useful. Between now and the summer, I expect to compile an initial list of articles and books that I am unable to find in the United States, and to make initial contact with French scholars in this area.

After having done research in France, I expect to have acquired copies of a number of books and journal articles relevant to this topic.
I also expect to have consulted with both American and French experts in this area of linguistics, and to be aware of related and possibly overlapping research. At that point I will know whether anyone else has investigated lexical diffusion in this area, and can change my topic if necessary. Finally, I expect to have acquired the sources necessary for a representative corpus of French theater.

Data Analysis

Since the results from the pilot study were satisfactory, I plan to use substantially the same methods with the dissertation study. The most important addition will be the use of tests for statistical significance, including the Pearson product-moment correlation and regression analysis, depending on the measurement. I also plan to test the two Loser’s hypotheses.

Conclusion

When this is all over I hope to have a PhD and a tenure-track job, and to have made a significant contribution to the theory of grammaticization.

Appendix A: Data Sources

ABU: L’Association des Bibliophiles Universels, <http://cedric.cnam.fr/ABU>
ATHENA: Project at the University of Geneva, <http://un2sg4.unige.ch/athena>
Canal Plus “Cyberculture” journal, <http://www.cplus.fr/html/cyberculture/cybermain.htm>
Chantez-vous français? <http://www.home.ch/~spaw2928/robin>
L’Express: <http://www.lexpress.fr>
Ministry of Culture: <http://www.france.diplomatie.fr/culture/france/biblio/foire_aux_textes/>

ANONYMOUS. ca. 1000. La chanson de Roland, epic poem. ABU.
CORNEILLE, PIERRE. 1630. L’Illusion Comique, drama in verse. ABU.
DE LA HALLE, ADAM. 1215. Li Gieus de Robin et Marion, drama in verse. Chantez-vous Français?
GANDILLOT, THIERRY, ed. 1998. L’Express, weekly magazine with film reviews. L’Express.
LA PERUSE, JEAN DE. 1556. Médée, drama in verse. ATHENA.
MARIVAUX, PIERRE. 1730. Le Jeu de l’amour et du hasard, drama in prose. ABU.
MOLIERE (JEAN-BAPTISTE POQUELIN). 1665. Dom Juan, drama in prose. ABU.
1668. L’Avare, drama in prose. ABU.
1671. Les Fourberies de Scapin, drama in prose. ABU.
MUSSET, ALFRED DE. 1834. Lorenzaccio, drama in prose. Ministry of Culture.
RABELAIS, FRANÇOIS. 1542. Gargantua, drama in verse. ABU.
RACINE, JEAN. 1668. Les Plaideurs, drama in verse. Ministry of Culture.
ROSTAND, EDMOND. 1897. Cyrano de Bergerac, drama in prose. ABU.

References

BYBEE, JOAN. 2003. Mechanisms of change in grammaticization: The role of frequency. Handbook of Historical Linguistics, ed. by Richard Janda and B. Joseph. Oxford: Blackwell.
BYBEE, JOAN AND SANDRA THOMPSON. 1997. Three frequency effects in syntax. Berkeley Linguistic Society.
EWERT, ALFRED. 1933. The French language. London: Faber & Faber.
HARRIS, MARTIN. 1978. The evolution of French syntax: A comparative approach. New York: Longman.
HOPPER, PAUL J. AND ELIZABETH CLOSS TRAUGOTT. 1993. Grammaticalization. New York: Cambridge University Press.
POSNER, REBECCA. 1985. Post-verbal negation in non-standard French: A historical and comparative view. Romance Philology 39: 170-197.
PRICE, GLANVILLE. 1971. The French language: Present and past.
1997. Negative particles in French. De mot en mot: Aspects of medieval linguistics, ed. by Stewart Gregory and D. A. Trotter. Cardiff: University of Wales Press.
SCHWEGLER, ARMIN. 1983. Predicate negation and word-order change: A problem of multiple causation. Lingua 61: 297-334.
VENNEMANN, THEO. 1974. Topics, subjects and word order: From SXV to SVX via TVX. Historical linguistics I: Syntax, morphology, internal and comparative reconstruction, ed. by S. C. Dik and J. G. Kooij. Amsterdam: North Holland.