323 Morphology
The Structure of Words
2. Basic Concepts
2.1 Lexemes and Word Forms
Words are not easy to define.
A preliminary definition is based on the English orthographic system.
Spaces in the orthography (usually) mark the boundaries between words.
Most dictionaries list only one word of an inflected set:
E.g. sing, sang, sung, singing, sings.
The form ‘sing’ is always chosen as a dictionary entry.
The form is technically an infinitive.
In linguistics the term lexeme represents the basic or dictionary form
of the word.
Lexemes are usually written in CAPS: SING
Lexemes are abstract representations, which presumably are listed in the
brain in a component called the lexicon.
Each inflected form of a lexeme is called a word-form.
E.g. ‘sing, sang, sung, singing, sings’ are each a word-form and
each one belongs to the lexeme SING.
The set of word-forms of a given lexeme is called a paradigm.
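The relation between lexemes, word-forms and paradigms can be sketched as a small lookup table. This is only an illustration: the dictionary PARADIGMS and the function lexeme_of are hypothetical names, and the tiny lexicon is made up from the examples in these notes.

```python
# Hypothetical mini-lexicon: each lexeme (written in caps) maps to its
# paradigm, i.e. the set of its word-forms.
PARADIGMS = {
    "SING": {"sing", "sings", "sang", "sung", "singing"},
    "MASK": {"mask", "masks"},
}

def lexeme_of(word_form):
    """Return the lexeme whose paradigm contains word_form, or None."""
    for lexeme, paradigm in PARADIGMS.items():
        if word_form in paradigm:
            return lexeme
    return None

print(lexeme_of("sang"))   # SING
print(lexeme_of("masks"))  # MASK
```

A word-form that belongs to no listed paradigm simply returns None, which mirrors the idea that only lexemes are listed in the lexicon.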
By convention in each language, the dictionary representation may be the
infinitive form of the verb as in Russian, the first person singular in Latin,
the third person singular in Arabic, or some other form. The entry form for
nouns is normally the singular nominative case form of the noun, as in Latin,
Russian, English, Czech, and German.
A lexeme family, or less formally a word family, is a set of lexemes that are
related. They should share some phonological properties and be related
semantically. The latter is easier said than determined.
E.g. print, printable, unprintable, printer, printability, reprint.
This list is not necessarily complete.
Complex lexemes are lexemes formed with an affix (a morpheme).
E.g. ‘able’, ‘un’, ‘er’, ‘ity’, ‘re’ in the above list.
Complex lexemes must each be listed separately in a dictionary as the
meaning may differ.
The various word-forms of a given lexeme do not change the
meaning of the lexeme.
Which affixes occur with which basic lexeme is not predictable.
E.g. we find in English un-happy, un-ripe, but not *un-sad, *un-red, *un-tall, and so forth.
Sometimes a lexeme with an affix occurs but the basic form does not
exist:
E.g. dis-gruntled but not *gruntled, in-cognito but not *cognito,
un-gainly but not *gainly.
Sometimes the expected affix does not occur but another affix does:
E.g. natural-ness in place of *natural-ity.
Or the expected affix occurs with another meaning:
E.g. cook, cook-er (an instrument for cooking, not a person who
cooks, which is simply the noun ‘cook’).
Kinds of morphological relationship
inflection: the relationship between the word-forms of a lexeme.
E. g. mask, masks; sit, sat, sitting, sits; blue, bluer, bluest.
derivation: the relationship between lexemes of a lexical family.
E. g. sing, singer; write, writer; cookV, cookN, cooker.
Derivation usually implies forming one lexeme from another lexeme in the same
lexical family.
E.g. sing -> singer, write -> writer, cookV, cookN and cooker.
Word is used whenever the distinction between derivation and inflection is
uncertain. (no examples currently).
Compound (lexeme) refers to words that are made up of two or more lexemes:
doghouse, catfish, greenhouse, whiplash, tattletale, and so forth.
2.2 Morphemes
A morpheme is the smallest constituent with a function. I prefer this definition to
‘smallest constituent with meaning’. There are some forms that appear to be
constituents and have no discernible meaning, but do have a function in terms of word
building:
E.g. doof-us, radi-us, cf. radi-al, radi-an.
Some inflectional morphemes have no true meaning, but they have a grammatical
function:
E.g. he, him; who, whom; they, them.
The suffix ‘-m’ marks the accusative (objective) Case. This is a syntactic relation and no
meaning can be associated with it.
The term function includes meaning.
To go one step further than H., the hierarchy for constituents is:
Sentence -> phrase -> word -> morpheme.
Phrases are very important constituents in syntax.
Some grammatical categories cannot be expressed in terms of morphemes. For
example, note the following partial inflection of the English verb sing and others similar
to it:
E.g. sing, sang, sung.
The past tense is marked by a change of the root vowel. The last form, sung, marks two
distinct grammatical functions: the passive form of the verb and the perfect form of the
verb. Each is a distinct morpheme with a different function, but the two are phonologically the
same.
2.3 Affixes, Bases and Roots
Affixes are morphemes that are adjoined to the left of the base of a word or to the
right of the base of a word:
A prefix is an affix that is adjoined to the left of the base of a word.
E.g. ‘un-’ in un-happy, un-regulated; ‘re-’ in re-do, re-heat, re-write, and so
forth.
A suffix is an affix that is adjoined to the right of the base of a word.
E.g. ‘s’ in book-s, cat-s; eat-s, smell-s; linguistic-s.
An infix is an affix that is inserted into the base of the word, forming a non-contiguous base. There are no infixes in English. Infixes occur in the Semitic
languages.
E. g. “ktb” is the base for book and write and words which refer to
book/write in some related sense. To form the noun in Arabic, the
infixes ‘i’ and ‘a’ are inserted into the base between the first two
consonants and the last two consonants, respectively:
E.g. kitab.
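The insertion of vowels into a consonantal base can be sketched as follows. This is a minimal illustration, not a model of Arabic morphology; the function name interdigitate and its parameters are assumptions made for the example.

```python
def interdigitate(root, vowels):
    """Insert vowels between successive consonants of a root.

    root   -- consonantal base, e.g. "ktb"
    vowels -- vowels to place after the 1st, 2nd, ... consonant
    """
    out = []
    for i, consonant in enumerate(root):
        out.append(consonant)
        if i < len(vowels):
            out.append(vowels[i])
    return "".join(out)

print(interdigitate("ktb", ["i", "a"]))       # kitab
print(interdigitate("ktb", ["a", "a", "a"]))  # kataba
```

The same base with a different vowel pattern yields a different word, which is why the base itself is non-contiguous.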
A circumfix is an affix that occurs on both sides of the base. (H.)
E. g. (per H) German ge-les-en ‘read’ (passive participle).
English dialects: a-walk-ing, a-read-ing.
The English “a-” is etymologically related to the German “ge-”.
Stem, Base and Root
A root is a morpheme that cannot be broken down into further morphemes.
A base is one or more morphemes formed of a root plus affixes.
A stem is a base that has lexical meaning (meaning that is stored in the
lexicon). It is also called a lexeme:
E. g. Pen is a root, a base and a stem (null affix) (=a lexical item).
E. g. Gruntle is a root, but not a stem since no lexeme corresponds to it.
E. g. Disgruntle is not a root; it is a base, but not a stem.
E. g. disgruntled is not a root, but it is a base and a stem.
E.g. -ed is a suffix; it is not a root, base or stem.
In English the word dog, for example, is a root since it cannot be broken
into further morphological units:
E. g. ‘do’ is not a morpheme of dog, it is basically a verb. There is no
morpheme ‘og’ that has any kind of function.
Dog is also a base. It has lexical meaning.
The English word disgruntled consists of three morphemes: dis-, gruntle,
and -ed. ‘dis-’ is a prefix, and ‘-ed’ is an inflectional affix marking the past
tense among other functions. The morpheme gruntle is a root, since two affixes
are adjoined to it. It is not a stem, since it has no lexical meaning
(what does gruntle mean?). Once both affixes are adjoined to it, then
disgruntled, which is a base, is a lexical stem since it does have meaning.
Technically, the prefix ‘dis-’ is adjoined first to gruntle to form the base
‘disgruntle’. Apparently this form has no lexical meaning and remains a
base. Once the adjectival suffix ‘-ed’ is added to disgruntle then the base
receives lexical meaning and is a stem.
English has several words usually considered compounds, where at least one
member of the compound doesn’t behave like a normal prefix or affix.
E. g. tele-graph. Although graph may have lexical meaning, tele- does not.
It does not occur in isolation. The form is borrowed from Greek where it means ‘far’.
It is more like a root that cannot become a stem in its own right, but it may be
adjoined to a stem to form a new stem.
2.4 Formal Operations
Some words such as derive imply a process. A true process is a historical
phenomenon and does not imply a process in terms of how language is
represented in the mind (the grammar of a language).
Diachronic refers to a temporal process.
E. g. Middle English -> modern English.
Synchronic refers to a grammar at a particular point in time.
E. g. Modern English, Modern French.
In addition to affixation and compounding, there is another ‘process’ or operation
that refers to inflection formed without affixation and word formation formed without
affixation. It is called a non-concatenative operation:
(Some) Albanian nouns form their plural by palatalizing their final
consonant:
E. g. armik (Sg.), armiq [c] (Pl.) ‘enemy, enemies’.
A few nouns in English voice their final consonant in the plural:
E. g. hoof (Sg.), hooves (Pl.).
Causative verbs in Arabic are formed by geminating the second root consonant.
Gemination is the doubling of a consonant:
E. g. darasa (root DRS) ‘learn’, darrasa ‘cause to learn’.
The first person singular of Huallaga Quechua verbs is formed by lengthening the final stem
vowel:
E. g. aywanqui ‘you (Sg.) go’, aywa: ‘I go’. Stem = AYWA.
Intransitive verbs are formed by shortening the stem vowel of Hindi transitive verbs:
E. g. maar ‘kill’ (trans.), mar ‘die’.
Adjectives in Chalcatongo Mixtec are derived from nouns by raising the tone on the final
vowel of the noun:
E. g. káʔba ‘filth’, káʔbá ‘dirty’.
In a certain class of English verbs, and in verbs of several other Indo-European languages,
the past tense is formed by replacing the stem vowel of the verb with another:
E. g. sit -> sat; sing -> sang; German sing-en -> sang ‘sing, sang’.
The replacement (which it really isn’t) of Arabic vowels in inflection and derivation is
called transfixation:
E. g. root KTB ‘write, book’, kataba ‘wrote’.
H mentions several other operations such as prereduplication, postreduplication, duplifixation, and
subtraction. (pp. 23 and 24).
With the addition of the above operations, the definition of a stem as a root plus affixes is obviously
insufficient.
Definition: a stem is a root plus morphological operations (both concatenative and non-concatenative).
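The definition of a stem as a root plus morphological operations can be sketched by treating each operation as a function from forms to forms. This is a rough illustration on spelled or romanized forms only; all function names (prefix, suffix, replace_vowel, geminate_second_consonant, build) are hypothetical.

```python
VOWELS = "aeiou"

def prefix(p):
    # Concatenative: adjoin p to the left of the base.
    return lambda base: p + base

def suffix(s):
    # Concatenative: adjoin s to the right of the base.
    return lambda base: base + s

def replace_vowel(old, new):
    # Non-concatenative: replace the first occurrence of the stem vowel.
    return lambda base: base.replace(old, new, 1)

def geminate_second_consonant(base):
    # Non-concatenative: double the second consonant of the base.
    seen = 0
    for i, ch in enumerate(base):
        if ch not in VOWELS:
            seen += 1
            if seen == 2:
                return base[:i] + ch + base[i:]
    return base

def build(root, *operations):
    """Apply the operations to the root, left to right."""
    form = root
    for op in operations:
        form = op(form)
    return form

print(build("happy", prefix("un")))               # unhappy
print(build("sing", suffix("er")))                # singer
print(build("sing", replace_vowel("i", "a")))     # sang
print(build("darasa", geminate_second_consonant)) # darrasa
```

Affixation and the non-concatenative operations have the same type here, which is the point of the revised definition: a stem is whatever results from applying any such operations to a root.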
I prefer to think of these morphological operations in terms of set theory. Set theory belongs to pure
logic and is used in various logic-related fields such as mathematics where extensive use of it is made.
I will delve into set theory just a tiny bit to enlighten the class.
A set is a group of one object or two or more objects. Any two or more objects can form a set. In linguistics,
we will restrict a set to two or more objects that are related in some linguistic way.
E. g. {noun, verb, adjective, adverb, quantifier (and so forth)}
E. g. {infix, suffix, prefix}
E. g. all vowels are a set: {a, e, i, o, u} in five-vowel languages.
E. g. all lexemes are a set.
E. g. all phrases are a set.
E. g. each alternating vowel in a paradigm is a set: {i, a, u} in S_NG.
One vowel is considered the default or basic vowel; the others are marked in that they occur in specific
contexts.
E. g. ‘a’ occurs in the context of the past tense, and ‘u’ occurs in the context of the passive and
the perfect grammatical categories of the English verb sing. ‘i’ is the default, since it occurs in word formation: sing-er.
The evidence that is beginning to appear in neural representations of language suggests that the formal lexical
representation of the lexeme SING is more properly S{I, A, U}NG. In the present tense, the default
vowel ‘i’ is selected from the set. In the past tense the vowel ‘a’ is selected from the set. I can’t go
too much further here at this time. I will make reference from time to time to set theory in linguistics, but I
will not make this the standard theory here.
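The representation S{I, A, U}NG can be sketched as a frame with a vowel slot plus a set of vowels keyed by grammatical context. This is only an illustration of the selection idea; the dictionary layout and the function realize are assumptions, not a claim about how the lexicon is actually stored.

```python
# Hypothetical encoding of S{I, A, U}NG: a frame with one vowel slot and
# the vowel chosen in each grammatical context.
SING = {
    "frame": "s{}ng",
    "vowels": {"default": "i", "past": "a", "passive": "u", "perfect": "u"},
}

def realize(lexeme, context="default"):
    """Fill the vowel slot; contexts not listed fall back to the default."""
    vowels = lexeme["vowels"]
    vowel = vowels.get(context, vowels["default"])
    return lexeme["frame"].format(vowel)

print(realize(SING))             # sing
print(realize(SING, "past"))     # sang
print(realize(SING, "perfect"))  # sung
print(realize(SING, "passive") + "")      # sung
print(realize(SING, "agent noun") + "er") # singer
```

The fallback in realize plays the role of the default vowel: any context that does not name a marked vowel, such as word formation, receives ‘i’.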
2.4 Morphemes and Allomorphs
An allomorph is one of the variant forms of a morpheme:
E. g. the two forms of the English lexeme HAVE: ha-, and hav(e), phonetically /hæ/ and
/hæv/ as in ha-d, ha-s; and have, hav-ing.
The dropping of the letter ‘e’ in the progressive participle is an orthographic (spelling) operation
unrelated to morphology.
In set theory, member is the generic name for the items that belong to a set; in linguistics each member of a
morpheme is called an allomorph.
This alternation is a true morphological alternation. The morpheme is a set, which
contains two members (allomorphs), {/hæ/ /hæv/}.
Note: the members of a set are not ordered, and there is no ordering between the two
allomorphs. The allomorphs are listed here without a comma, although in standard set
notation {/hæ/, /hæv/} would equally denote an unordered set.
One thing that most morphologists do is confuse the nature of the alternation. For
example, the past tense is almost always said to consist of three allomorphic suffixes: [-ɨd], [-d], and [-t]. In
a technical sense, the alternation is determined by phonological rules, not morphological rules. [-d] is the default (the elsewhere
condition), [-t] is the variant following voiceless obstruents, and [-ɨd] is the variant
which occurs between homorganic stops.
E. g. loved [lʌv-d], played [pley-d], cited [sayt-ɨd].
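The phonological conditioning of the past tense suffix can be sketched as a rule that looks only at the final segment of the stem. This is a simplified illustration: the function name past_suffix, the segment inventories, and the symbol used for the vowel of the third variant are assumptions, and the example with final [k] is not from the notes.

```python
# Simplified, assumed segment classes (not a full inventory of English).
ALVEOLAR_STOPS = {"t", "d"}
VOICELESS_OBSTRUENTS = {"p", "k", "f", "s", "ʃ", "tʃ", "θ"}

def past_suffix(final_segment):
    """Choose the past tense variant from the stem-final segment."""
    if final_segment in ALVEOLAR_STOPS:
        return "ɨd"  # avoids two adjacent homorganic stops
    if final_segment in VOICELESS_OBSTRUENTS:
        return "t"   # voicing agrees with the preceding obstruent
    return "d"       # default (the elsewhere condition)

print(past_suffix("v"))  # d   (loved)
print(past_suffix("y"))  # d   (played)
print(past_suffix("t"))  # ɨd  (cited)
print(past_suffix("k"))  # t   (e.g. a stem ending in [k])
```

Because the choice depends only on the sound that precedes the suffix, nothing has to be listed per lexeme, which is why the alternation counts as phonological.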
Note: allophones are members of the set called the phoneme.
We note from H that Korean has two allomorphs for the accusative singular suffix: {‘ul’ ‘lul’}. It is obvious
that ‘ul’ follows consonants and ‘lul’ follows vowels. If the phonological rules of Korean predict the
alternation between ‘ø’ and ‘l’, then the alternation is phonological. If they cannot, then the alternation is
morphological.
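The distribution of the two Korean variants can be sketched as a choice based on the last sound of the stem. This only describes the distribution and does not settle whether the alternation is phonological or morphological. The function name accusative, the romanized example stems ton ‘money’ and tali ‘leg’, and the treatment of a final vowel letter as a final vowel are all assumptions of the sketch.

```python
# Assumption: stems are romanized so that a vowel-final stem ends in one
# of these letters.
VOWEL_LETTERS = set("aeiou")

def accusative(stem):
    """Attach 'lul' after a vowel and 'ul' after a consonant."""
    if stem[-1] in VOWEL_LETTERS:
        return stem + "-lul"
    return stem + "-ul"

print(accusative("ton"))   # ton-ul
print(accusative("tali"))  # tali-lul
```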
In Turkish the first person possessive suffix has 5 forms: ‘im’, ‘üm’, ‘um’, ‘ım’ and ‘m’. Turkish is well known
for the vowel harmony which occurs in the language. All these variations are part of the
phonological system. H p. 26.
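The five forms of the Turkish suffix can be sketched as the outcome of vowel harmony: the suffix vowel copies the backness and rounding of the last stem vowel, and no vowel appears after a vowel-final stem. This is a simplified illustration on lowercase spelled forms; the function name, the table HARMONY, and the example words are assumptions, and stem-final consonant changes are ignored.

```python
# Suffix vowel selected by the last vowel of the stem (four-way harmony).
HARMONY = {
    "e": "i", "i": "i",  # front unrounded
    "ö": "ü", "ü": "ü",  # front rounded
    "a": "ı", "ı": "ı",  # back unrounded
    "o": "u", "u": "u",  # back rounded
}

def first_person_possessive(stem):
    """Return stem plus the harmonizing first person possessive suffix."""
    if stem[-1] in HARMONY:
        return stem + "m"  # vowel-final stem: bare 'm'
    for ch in reversed(stem):
        if ch in HARMONY:
            return stem + HARMONY[ch] + "m"
    return stem + "im"  # arbitrary fallback for input without vowels

for word in ["ev", "göz", "kız", "okul", "araba"]:
    print(first_person_possessive(word))
# evim, gözüm, kızım, okulum, arabam
```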
In German final voiced stops ‘become’ voiceless at the end of a word or if they precede a voiceless
obstruent. H p. 26. Assimilation is definitely phonological and if word boundaries count as phonological
markers (they do), then we can consider obstruent devoicing as part of the phonological system. In
English there are subtle differences in when voicing stops in final voiced obstruents. This is purely
phonological.
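German obstruent devoicing can be sketched as a rule over a string of segments: a voiced obstruent becomes voiceless at the end of a word or before a voiceless obstruent. This is a rough illustration using one character per segment; the function name, the segment tables, and the simplified transcriptions (such as "tag" for Tag) are assumptions.

```python
# Assumed, simplified obstruent pairs (one character per segment).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}
VOICELESS = set("ptkfs")

def final_devoicing(word):
    """Devoice obstruents word-finally or before a voiceless obstruent."""
    segs = list(word)
    # Work right to left so that a cluster devoices as a whole.
    for i in range(len(segs) - 1, -1, -1):
        at_end = i == len(segs) - 1
        before_voiceless = not at_end and segs[i + 1] in VOICELESS
        if segs[i] in DEVOICE and (at_end or before_voiceless):
            segs[i] = DEVOICE[segs[i]]
    return "".join(segs)

print(final_devoicing("tag"))   # tak
print(final_devoicing("tagə"))  # tagə
print(final_devoicing("libt"))  # lipt
```

The rule mentions only sounds and the word boundary, never a particular morpheme, which is what makes it part of the phonological system.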
In Russian and nearly all of the Slavic languages, the vowels /e/ and /o/ (which are reduced in Russian
and Byelorussian) are deleted if they are not stressed. This does not happen to all of them, just certain
classes. This is a morphological alternation and it is phonologically unpredictable.
Underlying Forms
The standard theory since about 1960 or so is that there is an underlying form for all the
allomorphs of a particular morpheme and all the allophones of a particular phoneme.