323 Morphology: The Structure of Words
3. Lexicon and Rules
(This page last updated 25 OC 06)

3.1 Productivity and the Lexicon

• The lexicon is in theory infinite, but in practice it is limited. A human being knows only a finite amount of information at any one time; it is impossible for a human to know an infinite amount. This holds for the lexicon as well. Compare a lexicon to a dictionary (the printed lexemes): a dictionary can hold only so much information at one time. The list can grow and grow, but it is never infinite.

• The potential for making up new words by means of the rules of word building may be infinite, but this has never been proved. Nevertheless, it is possible to create a large number of words, larger than what most humans could possibly memorize. Thus we must distinguish between actual words and potential words.

• A neologism is a new word that has been created. Neologisms that do not catch on except occasionally are called occasionalisms. Note that this word was probably created recently and I doubt if it has really caught on. If true, then the word occasionalism is itself an occasionalism.

• Affixes that are readily adjoined to words (bases and stems) to create new words are called productive.

• E.g. The English suffix ‘-er’ can be added to most verbs that denote an agent-oriented action: doer, fixer, baker, worker, runner, swimmer, writer, and so forth. The same suffix can also denote an instrument: cooker, pickle slicer, popcorn maker, double-boiler. It is doubtful that this use is productive, though it may be productive if the semantic class is known. Other affixes are clearly not productive:

• E.g. ‘-ic’, ‘-ion’, ‘-ive’, ‘be-’, ‘de-’, ‘a-’, ‘re-’, and so forth.

• Another problem with unproductivity (sic) is that unproductive affixes easily change the meaning of the word. E.g. head, be-head; give, forgive; stand, understand; woman, womanize; and so forth.
There are affixes that are very productive, rather unproductive, somewhat unproductive, and very unproductive. H gives a finer scale of productivity (p. 42).

Another problem is posed by complex words that are lexical but whose underlying base is not. To illustrate this, consider disgruntled. It is derived from the base *‘gruntle’, which is not a lexeme with the associated meaning of disgruntled. I take the view that forming bases is productive given the restrictions on the base, but the base is not always a lexeme. There is no way to be absolutely sure whether a given base will or will not be a lexeme. As a consequence, all lexemes must be entered in the lexicon. If a base is created, one must check to see if it is a lexeme, or one may occasionally determine a lexical meaning for the new base, thus creating a new word, as I did with unproductivity above.

H argues that a word-form lexicon is more desirable. A word-form lexicon is one in which every declined or conjugated form of each word is listed. Inflected forms are generally predictable given the class forms of each lexeme, except the irregular ones such as oxen, children, brethren; is, are, be, was, were. (Being and been are regular, except for the pronunciation of been in the US and in Canada, where the American pronunciation has taken over the earlier one, which is still standard in Britain.) Even so, there is evidence that all the word-forms of everyday usage are memorized and listed in the lexicon. (I read a paper at SFU claiming that the lexicon is divided into two parts: the list of lexemes and the list of word-forms derived from them. Each set of word-forms derived from a lexeme is linked to that lexeme at little cost to the grammar.) Linking is another research topic of mine, which I cannot get into here.

H mentions that a lexicon should be elegant, which means using the smallest number of rules that will produce all the inflected forms.
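The two-part lexicon described above (a list of lexemes plus a list of memorized word-forms linked to them) can be sketched roughly as follows. This is only an illustration, not H's or the author's formalism: the names `LEXEMES`, `WORD_FORMS` and `word_form` are hypothetical, CAT is an assumed regular noun, and the regular plural is reduced to adding -s to the spelling.

```python
# Hypothetical two-part lexicon; all names here are illustrative assumptions.
LEXEMES = {"OX": "N", "CHILD": "N", "CAT": "N"}

# Memorized word-forms, each linked to its lexeme by a (lexeme, category) key.
WORD_FORMS = {
    ("OX", "plural"): "oxen",
    ("CHILD", "plural"): "children",
}


def word_form(lexeme: str, category: str) -> str:
    """Return a listed word-form if there is one, else build a regular one."""
    if lexeme not in LEXEMES:
        raise KeyError(f"{lexeme} is not a listed lexeme")
    listed = WORD_FORMS.get((lexeme, category))
    if listed is not None:
        return listed
    stem = lexeme.lower()
    # Simplified regular rule (spelling only): the plural adds -s.
    return stem + "s" if category == "plural" else stem


print(word_form("OX", "plural"))
print(word_form("CAT", "plural"))
```

The lookup comes first and the rule second, which mirrors the idea that irregular (and possibly all everyday) word-forms are listed, while unlisted forms are built from the lexeme.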
The lexical part of the lexicon contains a list of all lexemes that a speaker has. The word-form part of the lexicon contains the inflected forms for each inflectable lexeme (conjunctions, prepositions and other function words are not inflected in English):

    play      plays
         PLAY
    played    playing

The lexeme PLAY is connected to the word-forms play, plays, played, and playing by means of a link. The links are for information transference from the lexeme to the word-form, which we might call formation, and from the word-form to the lexeme; the latter is called interpretation. The most common word-forms are most likely memorized. The word-form component will differ for each speaker, just as each speaker probably knows a different set of lexemes; everybody’s experiences are unique to that individual. The hypothesis is that speakers normally draw from the set of word-forms in forming a sentence. To form an unusual word, the speaker must form the word-form from the lexeme using the rules of the grammar. The above diagram is incomplete, but it will suffice for now.

3.2 The form of Morphological Rules

A morphological rule is any kind of regularity that is ‘noticed’ by speakers and is reflected in their unconscious linguistic knowledge (H p. 44). Though there may be several formal descriptions that can be conjectured, H will discuss two formalisms: the morpheme based model and the word-form based model.

3.2.1 The morpheme based model

In this model morphemes are combined to form a new form, expressed by a set of word-building rules. H compares these to syntactic rules forming phrases, clauses and sentences. Consider the following words as examples:

E.g. fox -> fox-es, school + house -> school+house, build -> re-build, con-trast -> con-trast-ive-ness, sad -> sadd-est.

Word-structure (word-formation) rules:

word-form <--> stem (+ inflectional suffix)
stem <--> base + lexical meaning (bad format here)
base <--> {{(deriv. prefix +) {root, base} + (deriv. suffix)}, {stem + stem}}
inflectional suffix = -es, -est
derivational prefix = re-
derivational suffix = -ive, -ness
root = fox, school, house, build, happy, sad, down, never, do, be, and so forth.

Phrase-structure rules (top down and bottom up):

S <--> NP + VP
VP --> V + NP
NP <--> Det + Adj + N, or better NP <--> Det + [Adj + N] (an intermediate phrase).
N = car, house, mouse, stupidity, delight, forever, down, …
V = run, sleep, smoke, rise, depend, forage, …
D = {the, this, these, that, those}
Q = {{a, an, one, ø}, some, few, a few, several, … }
A = {happy, red, large, petite, long, deep, fuzzy, …}

Note: The symbol ‘<-->’ means that a form on the left side of the arrow is mapped into the structure on the right and the form on the right side is mapped into the left side.

Some syntacticians question whether rules such as the VP expansion rule are really necessary. For example, the lexical entry for DESTROY should include the fact that it requires a direct object (a complement): E.g. [V DESTROY + ____ NP]. They query whether the rule ‘VP -> V + (NP)’ is really necessary.

I don’t like the idea that the VP ‘rule’ is really a rule. Rather, it is a statement of sets: E.g. VP is a set that contains V and NP; V and NP are members of the set VP. I will go one step further and write it as: E.g. {VP} <--> <V, NP>.

Note: In set theory notation, the comma enclosed in angled brackets indicates linear order: VP is a set that contains the ordered set V then NP. This notation is not normally used in linguistics; the plus ‘+’ denotes order as shown above.

Note: The curly braces can be omitted once it is understood that VP, V and NP are each a set.

The lexical expansion above is a statement that in essence says: if one member of the set V is DESTROY, then the second set is NP, which is the complement of the verb. What remains in question is how to account for an optional member.
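The word-structure rules of the morpheme based model given earlier in this section can be sketched as simple concatenation over listed morphemes. This is a minimal illustration under stated assumptions: it works on spelling rather than phonemes, the function names are hypothetical, and `contrast` is treated as a single root (the notes segment it further as con-trast).

```python
# Morpheme inventories taken from the rules in the notes (spelling only).
DERIV_PREFIXES = {"re"}
DERIV_SUFFIXES = {"ive", "ness"}
INFL_SUFFIXES = {"es", "est"}
# "contrast" as one root is an assumption made for this sketch.
ROOTS = {"fox", "school", "house", "build", "contrast", "sad"}


def base(root, prefix=None, suffixes=()):
    """base <--> (deriv. prefix +) root (+ deriv. suffix ...)."""
    if root not in ROOTS:
        raise ValueError(f"unknown root: {root}")
    if prefix is not None and prefix not in DERIV_PREFIXES:
        raise ValueError(f"unknown derivational prefix: {prefix}")
    for suffix in suffixes:
        if suffix not in DERIV_SUFFIXES:
            raise ValueError(f"unknown derivational suffix: {suffix}")
    return (prefix or "") + root + "".join(suffixes)


def compound(stem1, stem2):
    """base <--> stem + stem."""
    return stem1 + stem2


def word_form(stem, inflectional_suffix=None):
    """word-form <--> stem (+ inflectional suffix)."""
    if inflectional_suffix is not None and inflectional_suffix not in INFL_SUFFIXES:
        raise ValueError(f"unknown inflectional suffix: {inflectional_suffix}")
    return stem + (inflectional_suffix or "")


print(word_form(base("fox"), "es"))
print(base("build", prefix="re"))
print(base("contrast", suffixes=("ive", "ness")))
print(compound(base("school"), base("house")))
```

Spelling adjustments such as the doubled consonant in sadd-est are deliberately left out; a fuller treatment would need a separate set of orthographic or phonological rules.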
In reality, there are no optional members. Recall that ø as a phonological sign is permitted in set theory. An optional member actually exists; it merely has ‘ø’ as its sign. The S ‘John likes to eat’ implies he likes to eat something. The pronoun may take on a zero form for certain verbs: [V EAT [NP ø]].

The lexical entry for EAT now should be: {V EAT, {complement, NP, {-ø, ø}}}. By ‘-ø’ I mean it must have a phonetic sign. Not all verbs take a zero complement; DESTROY, for example, does not: [V DESTROY, {complement, NP, -ø}].

The logic for the ø complement rests in set theory and the 3-component theory. The complement fills the function role, and its form is ø, and its sign is ø, in most cases anyway. If it has no form, how can it have a sign? Each component constitutes a set, usually called the complement of the verb, or an argument of the verb: E.g. {COOK, {-ø}, {-ø, +ø}}. The first argument is the agent, the second the theme. The fact that EAT takes a theme argument prompts this analysis. The function of the first ø is [-Pl] and that of the second one is [+Pl], or the reverse.

There are morphological forms that have no form, but they have a sign: E.g. {[+Pl] (of certain nouns), ø, voiced final obstruent}: E.g. calf, calves = /kæf+ø/, /kævz+ø/. /s/ and /z/ belong to the set usually called the alveolar fricatives. F = {/v/, /f/}. The sign is subtle; it is not a phoneme. It seems reasonable at this time to assume that the sign is a feature: [+Voice] on the final consonant.

In morphology, the plural suffix ‘-s’ = /z/ would have the grammatical feature indicating that it requires a noun as a host:

H: [/z/, N ___, ‘plural’].
D: {[+Plural], N ___, /z/}.

The ordering is not crucial, but it should be used consistently. The square brackets are often used to denote a feature. The ‘+’ (or ‘-’) is a binary value: E.g. [+Plural] = ‘plural’, [-Plural] = ‘singular’. This distinction becomes important once the theory of binary oppositions is adopted.
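The contrast between the entries for EAT (complement sign may be ø) and DESTROY (complement must have a phonetic sign) can be sketched as a small data structure. This is only an illustration: the class `VerbEntry`, its field names and the function `verb_phrase` are hypothetical, and signs are given in ordinary spelling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VerbEntry:
    lexeme: str
    sign: str
    # True models {-ø, ø}: the complement may be overt or zero.
    # False models -ø: the complement must have a phonetic sign.
    allows_zero_complement: bool


EAT = VerbEntry("EAT", "eat", True)
DESTROY = VerbEntry("DESTROY", "destroy", False)


def verb_phrase(verb: VerbEntry, complement: Optional[str] = None) -> List[str]:
    """Return the ordered pair <V, NP>; a missing NP is spelled 'ø'."""
    if complement is None:
        if not verb.allows_zero_complement:
            raise ValueError(f"{verb.lexeme} requires an overt complement")
        complement = "ø"  # the member exists, but its sign is zero
    return [verb.sign, complement]


print(verb_phrase(EAT))
print(verb_phrase(DESTROY, "the city"))
```

On this sketch the complement is never optional: `verb_phrase` always returns two members, and only the sign of the second one varies.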
3.2.2 The Word Based Model

Here, the word is of fundamental significance. Rather than breaking words down, a word-schema is formulated. For example, the following English verb-forms can be expressed as follows:

a) hits, sits, types, knows, feels, acts, procrastinates, regurgitates, and so forth.
b) [/Xz/, V, ‘third person singular of V’].

Here X is the lexeme of each verb in a): E.g. X = {hit, sit, type, know, feel, act, procrastinate, regurgitate (and so forth)}; V is the word class; and /z/ marks ‘third person singular of V’. /X/ is a phonemic string such as /plej/. The word-schema in b) states that there is a list of word-forms that end in /z/, that they are verbs (V), and that /z/ is the third person singular of V. There is a closely related schema:

E.g. [/X/, V, ‘x’]

Now the two schemas can be represented in the following mapping correspondence:

E.g. [/Xz/, V, ‘third person singular of X’] <--> [/X/, V, ‘x’].

The word based model eliminates the need for morphemes, stems, bases, or roots. Words are related by mapping one schema to another.

E.g. play/played: [/X/, V, ‘x’] <--> [/Xd/, V, ‘past tense of xd’]

That is, “there is a string /X/, it is a verb (go, write, play, cough, …), and it has a function ‘x’”; this corresponds with the string /Xd/, the same verb, and the past tense of ‘xd’. Here ‘xd’ is the function ‘x’ of /X/ together with ‘d’, which has the function ‘past tense’.

E.g. [/plej/, V, ‘engage in games’] corresponds with [/plejd/, V, ‘engage in games, past tense’]. PLAY <--> PLAYD. PLAY stands for the first bracketed sequence in the above line and PLAYD stands for the second sequence.

Now is this perfectly clear? Methinks not. These rule schemas don’t cut the mustard as far as I am concerned. Dr. A. told me that these are not explanatory, but just descriptive. If they cannot lead to an explanatory goal, why bother? At least we should try to become familiar with them, just in case I turn out to be on the wrong track.
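The two-way correspondence between the schemas [/X/, V, ‘x’] and [/Xz/, V, ‘third person singular of x’] can be sketched as a pair of mappings over whole word-forms, with no reference to stems or morphemes. This is a rough illustration, not H's formalism: the class `WordForm` and both function names are hypothetical, and the meaning is handled as a plain string.

```python
from typing import NamedTuple


class WordForm(NamedTuple):
    phonemes: str    # the string /X/ or /Xz/, written without slashes
    word_class: str  # e.g. "V"
    meaning: str     # the function 'x'


PREFIX = "third person singular of "


def to_third_singular(w: WordForm) -> WordForm:
    """Map [/X/, V, 'x'] to [/Xz/, V, 'third person singular of x']."""
    if w.word_class != "V":
        raise ValueError("the schema only covers verbs")
    return WordForm(w.phonemes + "z", "V", PREFIX + w.meaning)


def from_third_singular(w: WordForm) -> WordForm:
    """Map [/Xz/, V, 'third person singular of x'] back to [/X/, V, 'x']."""
    matches = (
        w.word_class == "V"
        and w.phonemes.endswith("z")
        and w.meaning.startswith(PREFIX)
    )
    if not matches:
        raise ValueError("word-form does not match the /Xz/ schema")
    return WordForm(w.phonemes[:-1], "V", w.meaning[len(PREFIX):])


play = WordForm("plej", "V", "engage in games")
plays = to_third_singular(play)
print(plays)
print(from_third_singular(plays) == play)
```

Having a function in each direction is one way to read the ‘<-->’ symbol: either word-form can be obtained from the other.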
Let us go along with set theory, followed by many logicians and mathematicians and possibly others in other fields. First, the following: “hits, sits, types, knows, feels, acts, procrastinates, regurgitates” form a set in their own right, just as the uninflected form (infinitive form) is a set and the proposed lexemes of them form a set: E.g. {HIT, SIT, TYPE, KNOW, FEEL, ACT, PROCRASTINATE, REGURGITATE, …}.

The third person singular is a set with one member: /z/. Recall that each lexeme and grammeme has three properties: function, form, and sign. E.g. {HIT, V stem, /hit/}; {3 P. Sg. (3PS), suffix, /z/}. The form hits is a set that contains two subsets (members): E.g. {HIT, 3PS} <--> {/hit/+/z/}. This tells us that {HIT, V, 3PS} can be spelled out as {/hit/}+{/z/} = {/hitz/}. Note that the phoneme symbol ‘/’ is a set marker for phonemic sets; using both ‘{’ and ‘/’ is totally redundant. I did so here to emphasize this point. Using one or the other is fine; it is just that ‘/’ gives us more information.

First, I’ll do writing as two morphemes:

              Lexeme      Inflectional suffix
function      WRITE       [+Prog]
form          V           [+Suffix, V-host]
sign          /rajt/      /iŋ/

The feature [+Suffix, V-host] accounts for the adjunction of the suffix to the verb stem: writing (writ-ing). This operation takes place in the syntax in the version of 322 that I taught until 2003. [+Prog] is short for progressive (aspect).

In the word-form based model we obtain:

function      ‘write’     <---->     ‘progressive of write’
form          V                      V
sign          /rajt/                 /rajtɪŋ/

The symbol ‘<---->’ means that there is a relation between the word-form on the left and the word-form on the right. The relationship should be in the function, not the form or the sign directly. Although the notion of derivation is not used in a synchronic word-based analysis, the relation does correspond to derivation. ‘Write’ is the key function in the relationship.
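The morpheme-based entries in this section, each with a function, a form and a sign, can be sketched as triples, with spell-out as concatenation of signs once the host feature has been checked. This is a minimal illustration under stated assumptions: the class `Unit`, the function `spell_out` and the noun HOUSE used in the usage note are hypothetical, and features are stored as plain strings.

```python
from typing import NamedTuple, Tuple


class Unit(NamedTuple):
    function: str          # e.g. "WRITE" or "[+Prog]"
    form: Tuple[str, ...]  # word class or features such as "+Suffix", "V-host"
    sign: str              # phonemic string, written without slashes


WRITE = Unit("WRITE", ("V",), "rajt")
HIT = Unit("HIT", ("V",), "hit")
PROG = Unit("[+Prog]", ("+Suffix", "V-host"), "iŋ")
THIRD_SG = Unit("3PS", ("+Suffix", "V-host"), "z")


def spell_out(lexeme: Unit, grammeme: Unit) -> str:
    """Adjoin a suffixal grammeme to the right edge of a compatible host."""
    if "+Suffix" not in grammeme.form:
        raise ValueError("this sketch only handles suffixes")
    # "V-host" means the host must be of class V, and so on.
    hosts = [f[: -len("-host")] for f in grammeme.form if f.endswith("-host")]
    if not any(h in lexeme.form for h in hosts):
        raise ValueError(f"{lexeme.function} is not a possible host")
    return lexeme.sign + grammeme.sign


print(spell_out(WRITE, PROG))
print(spell_out(HIT, THIRD_SG))
```

A noun entry such as `Unit("HOUSE", ("N",), "hæws")` would be rejected by `spell_out` with either suffix, because neither grammeme lists an N host.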
There is an alternate way to express such relations. If a set of verbs, nouns, or adjectives is to be used in the relation, the ‘x’ notation is used. Think of ‘x’ as a set that contains nouns such as ‘house, finger, desk, car, cougar and so forth’: E.g. ‘x’ = {‘house, finger, desk, car, cougar and so forth’}. The ‘x’ is used in the function and can be replaced by any member of the set ‘x’. /X/ stands for a phonemic string. For example, if ‘house’ is selected as ‘x’, its sign is /hæws/. ‘x’ will appear in the other box (technically a set) along with some other phonemes, and it is spelled out as /hæws/ there as well. ‘x’ cannot be spelled out as two strings with unrelated phonemes in a relation like the above. A system for suppletion will have to be worked out.

The use of ‘x’ and /X/ is illustrated: ‘x’ = {‘house, finger, desk, car, cougar and so forth’}.

function      ‘x’         <---->     ‘plural of x’
form          N                      N
sign          /X/                    /Xz/

If ‘house’ is selected for the ‘x’ on the right, then ‘x’ on the left is replaced with ‘house’.

Let’s look at an example of word derivation. E.g. work+agent <----> worker (more or less). This is a kind of mapping, another process that the symbol ‘<---->’ represents. The form on the left can be mapped to the morph on the right, and the one on the right can be mapped onto the one on the left.

[/X/]V, ‘x (= an agent)’] <--> [/X/]V, N, one who ‘xs’].

But is it really desirable to do so? H seems to think so, but I am being a bit reserved for reasons that I will talk about in the next chapter (I hope).

              Lexeme      Suffix
function      WRITE       [+Agent]
form          V           [+Suffix-N, [+Host, V]]
sign          /rajt/      /r/

function      ‘write’     <---->     ‘one who writes’
form          V                      N
sign          /rajt/                 /rajtr/

The word-class category of the lexeme must be part of the lexical entry. Note that the symbol ‘A’ refers to lexical modifiers; ‘Adj’ refers to the set of adjectives, and ‘Adv’ refers to the set of adverbs.
Degree words, such as very, quite, rather, somewhat, …, are not lexical and hence are not members of the set A. I will now represent A as a set: E.g. A = {Adj, Adv}. The lexical items in English include nouns, verbs, and lexical modifiers: E.g. Lexeme = {N, V, A}. (Note: we could use ‘L’ for ‘Lexeme’ as long as ‘L’ is not used for anything else.)

Additionally, some functional terms are useful. Above there is ‘agent’ = [+Agent], which has been around for decades. It corresponds with ‘one who Vs’. Another is ‘manner’ = [+Manner], which corresponds to ‘in an A manner’. Another term is iterate = [+Iterative], which means to do over and over again:

E.g. iterative <--> do over and over again. (V)
E.g. iterative <--> pertaining to doing over and over again. (Adj)

Which of the two above terms is appropriate to describe slowly?

Note that there is a correspondence between the way I have transposed H’s way of writing a lexical entry with paradigmatic (vertical) cells and with horizontal cells:

E.g. {/X/, V, ‘x’} =
    /X/
    V
    ‘x’

This is technically called a notational variant. This notation does not correspond with, and is not a notational variant of, the morpheme based notation. There are theoretical differences between the two notations. In the morpheme based notation, V is a class of verb stems, while in word-based theory, V is a verbal word-form. Stems are not recognized in word-based morphology.

H generalizes the word-based schema by replacing ‘write’ with ‘X’: E.g. ‘one who Xs’. ‘X’ stands for a simple word-form. In the morpheme based model, one can do similarly: {X, V-wordform, /x/}, where /x/ is the sign of X. X stands for any member of the set of lexemes.

There does not appear to be a big difference, with the exception that the word-based grammar uses repetition. H uses ‘X’, which really means the set of words of a given class.
The following section on morpheme subtraction seems to support the notion of a morpheme rather than a string of phonemes. Set theory predicts morphemes; word-based grammar does not. H mentions that bases are not necessary. Another difference is that in word-based grammar, the entire word-based form corresponds to another word-based form. In morpheme based grammar, the mapping is from function to sign and vice versa.

In the morpheme based model we need to be more specific in writing rules. The concatenation rules of morphology will join the sign of the lexeme and the sign of the grammeme. The feature [+Suffix] tells us that the grammeme is a suffix to be adjoined to the right end of a lexeme, and the feature [V-host] tells us that the host, the lexeme to which it must be adjoined, must be a verb. Similarly, [N-host] tells us that the host is a noun. Although Chomsky uses A for modifiers, we must be careful to exclude degree words and phrases from A: A = {adjective, adverb}. Therefore, WRITE and [+Progressive] ([+Prog]) <--> /rajt-iŋ/.

The rules for obtaining a verb and one of its inflectional suffixes are determined in the syntax, especially where syntax and morphology overlap. I taught this approach in syntax (L322) for several years. Whereas some linguists believe there is no formal division between syntax and morphology (the distinction is one of convenience rather than formal), there are others who believe that the words syntax and morphology should not be used in the same sentence. (I suspect that H is leaning in the direction of the latter.)

The schemas are not a theoretical device, but a descriptive device. Recall the 3 goals of a theory. The main point of word-based grammars is the schemas. That is where a true comparison will occur. The schemas have no explanatory value. Recall that our third goal is to find the best explanatory system that will account for the facts of the corpus. H will have more to say on this later.
Therefore, we should not draw any conclusions at this time, but we should try to understand both approaches.

3.3 The form of Morphological Rules

3.3.1 Pattern Loss

Pattern loss is the loss of one (or more) inflectional categories. For example, H mentions Ancient Greek and its evolution into New Testament Greek. The nominative case forms for Ancient Greek adelphós ‘brother’ are: adelphós (singular), adelphó: (dual), adelphoí (plural). By the time of NT Greek, the dual had disappeared completely without a trace, leaving two grammatical categories: Sg. and Pl.

3.3.2 Coalescence (Merger)

Coalescence is a diachronic change where two syntactically separate word-forms (related grammatically) coalesce or merge into one complex word-form. In Old Russian, the reflexive of a verb was formed with a reflexive functional word. By Modern Russian the reflexive had been phonologically reduced so that it could not bear independent stress. The form had become inseparably conjoined to the end of any word-form of a particular verb. Not all verbs form the reflexive in this way.

E.g. OR ‘mytɩ sebja’ to wash oneself. MR ‘mytsja’ to wash oneself.

In MR some verbs still take the modern form of the reflexive to form the reflexive. The reflexive suffix has other functions and can change the meaning of the stem. This form seems to be more of a weak clitic that must be adjoined to a verb. This is somewhat similar to H’s hypothesis that ‘walk did’ (roughly) -> ‘walked’. In pre-English, the past tense suffix became adjoined to the verb stem.

3.3.3 Analogical Change

Analogical change is when an inflectional pattern is modified to follow another pattern. There are verbs in English which were regular (weak) at one time but became irregular (strong) following the pattern of another verb class. E.g. the past tense of dive is dived in standard English. Over time the substandard form dove came into existence due to the pattern of strive - strove. Arrive follows the same pattern.
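The proportional analogy behind dove (strive : strove = dive : X) can be sketched as copying the ending alternation of the model pair onto a new verb. This is only an illustration working on spelling; the function name `analogize` is hypothetical, and real analogical change is of course not this mechanical.

```python
def analogize(model_base: str, model_past: str, target: str) -> str:
    """Solve model_base : model_past = target : X by swapping endings."""
    # Find the length of the longest shared prefix of the model pair.
    i = 0
    limit = min(len(model_base), len(model_past))
    while i < limit and model_base[i] == model_past[i]:
        i += 1
    old_ending, new_ending = model_base[i:], model_past[i:]
    # The target must share the ending that alternates in the model.
    if not old_ending or not target.endswith(old_ending):
        raise ValueError(f"{target} does not fit the {model_base} pattern")
    return target[: len(target) - len(old_ending)] + new_ending


print(analogize("strive", "strove", "dive"))
print(analogize("strive", "strove", "arrive"))
```

With the model pair strive/strove the alternating endings are -ive and -ove, so a verb such as dig, which does not end in -ive, is rejected rather than changed.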
The interesting point here is that, for some verbs, the analogical change applies to the past tense but not to the non-progressive participle (in my dialect, at least), even though the model is strive, strove, striven:

E.g. dive, dove, dived (*diven); arrive, arrove, arrived (*arriven).

But for other verbs it applies to the non-progressive participle as well, based on sting, stung, stung:

E.g. dig, dug, dug; drag, drug, drug (this one really gets the purists going).

3.3.4 Reanalysis

Reanalysis is when a morpheme (a root) loses its function and becomes part of another morpheme. H cites the Ancient Greek example of kithára:

kithára ‘guitar’
kithar-îzo ‘to play the guitar’: -izo is adjoined to the stem, ‘to use X’
kithar-is-té:s ‘a guitar player’ (one who does something with X, X a noun): -tés ‘one who uses X’
later: kithar-isté:s ‘a guitar player’ (one who does something with X, X a noun)

Through reanalysis, the two suffixes ‘-is-’ and ‘-tés’ became merged as the single suffix ‘-istés’, ‘one who does something with X’, which is added directly to noun stems.

Secretion

Secretion is defined (roughly) as the process by which a string of phonemes is reanalyzed so that the string becomes a morpheme with such-and-such a function. H cites the example of alcoholic (alcohol-ic), which served as the model for the blend workaholic. A new morpheme ‘aholic’ arose when ‘work-’ was subtracted from workaholic, leaving ‘aholic’ behind as a suffix, which can then be added to some nouns with the meaning of one who indulges in X to a relatively high degree.

3.3.5 Other Changes

Phonological changes cease to be regular and predictable: E.g. At one time the Old English word for ‘foot’ was ‘fo:t’ and ‘feet’ was ‘föti’. The umlauted vowel gradually became /e:/, and at some point in time the plural marker for Old English, /i/, was gradually lost, leaving /fo:t/ (Sg.) and /fe:t/ (Pl.). After the Great Vowel Shift (16th to 18th C), /e:/ -> /i/, and inexplicably, /o:/ in ‘foot’ laxed to /ʊ/.
After the loss of the word-final plural marker /i/, the conditioning factor was lost. The alternation is now morphological.

Semantic shift of the morpheme is another change. At one time in the earlier stages of Russian, the suffix /l/ marked a participle, but it is unclear what the original function of the participle was. The participle combined with an auxiliary verb which corresponds with the auxiliary be in English. In time the auxiliary was lost in this construction, as was the original way of forming the past tense. The l-participle took on the meaning of the past tense. Something similar probably occurred in the Germanic languages. Indo-European ‘t’ was a participle, and in Germanic this morpheme came to mark the past tense of weak verbs (the regular ones), and it remained to mark participles, breaking into two morphemes: [Past] and the non-progressive participle. Hence played is ambiguous.