
Introduction to Linguistics II
Ling 2-121C, group b
Eleni Miltsakaki
AUTH
Spring 2006
1
Course outline
• Morphology
– Content words and function words
– Bound and free morphemes
– Word formation processes
• Syntax
• Semantics
• Pragmatics
• Historical Linguistics
2
What is morphology?
• The study of the structure of
words
– Words are part of our linguistic
knowledge
– Words are part of our mental
grammars
3
Basic questions for morphology
• What are words and how are they
formed?
• How are complex words formed from simpler parts?
• What are the basic building blocks in the formation of
complex words?
• How is the meaning of the complex
word related to the meaning of its
parts?
• How are individual words of a
language related to other words of the language?
4
What do we know when we
know a ‘word’?
• Phonological info: How it is
pronounced
• Morphological info: Its
internal structure
• Syntactic info: Part of speech
• Semantic info: What it means
• Pragmatic info: How we use it
5
What is a word?
• Video-show
• An arbitrary pairing of sound
and meaning
– E.g. house, casa, maison etc
6
Content and function words
• Content words
– They denote concepts
– They are open class
– They are nouns, verbs, adjectives, adverbs
• Function words
– They have a grammatical function
– They are closed class
– They are conjunctions, prepositions, articles,
demonstratives, pronouns
7
Simple and complex words
• Simple words
– Minimal unit
– Cannot be further analyzed
– E.g. tree
• Complex words
– Made of more than one part
– E.g. trees
→ We need a name for the parts which combine to make complex words
8
Morphemes
• Morphemes are the building blocks of complex
words
– ‘Trees’: base morpheme + plural morpheme
• Types of morphemes
– Free: independent words
– Bound: affixes
9
Types of affixes
• Prefixes: They are attached to the beginning of another
morpheme
– E.g. rewrite, rethink
• Suffixes: They are attached to the end of another morpheme
– E.g. modernize, centralize
• Infixes: They are attached within another morpheme (less
common but certain languages do have infixes)
– E.g. kayu = wood
-in- = product of a completed action
kinayu = gathered wood
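
The three affix types can be illustrated with a minimal Python sketch. The function names and the `position` parameter are hypothetical, chosen only for illustration. Placing the infix after the first consonant is an assumption read off the single kayu / kinayu pair, not a general rule.

```python
def add_prefix(prefix: str, base: str) -> str:
    """Attach a prefix to the beginning of a base morpheme."""
    return prefix + base


def add_suffix(base: str, suffix: str) -> str:
    """Attach a suffix to the end of a base morpheme."""
    return base + suffix


def add_infix(base: str, infix: str, position: int) -> str:
    """Insert an infix inside a base morpheme at a character position.

    The position is supplied by the caller; this sketch does not
    model how a real language decides where the infix goes.
    """
    return base[:position] + infix + base[position:]


print(add_prefix("re", "write"))    # rewrite
print(add_suffix("modern", "ize"))  # modernize
print(add_infix("kayu", "in", 1))   # kinayu
```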
10
How are new words created?
• Word formation rules (derivations)
• Coining
• Compounding
• Blending
• Acronyms
• Clippings
• Backformation
• Conversion
11
Derivational morphology
• Bound morphemes added to a root morpheme to form a
new word with new meaning are called derivational
morphemes.
• E.g. -ify, -cation
– pure → purify (to make pure) → purification (the process of making pure)
– “pouzy” → pouzify → pouzification
• The form that results from the addition of a derivational
morpheme is called a derived word
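
The chain pure, purify, purification can be sketched in Python as repeated suffix attachment. The function `attach` and its two spelling adjustments are assumptions made for this example; they cover only the words on this slide and are not a full account of English spelling.

```python
def attach(base: str, suffix: str) -> str:
    """Attach a derivational suffix with two toy spelling adjustments."""
    if suffix == "ify" and base[-1] in "ey":
        # Drop a final e or y before -ify: pure -> pur-, pouzy -> pouz-
        base = base[:-1]
    elif suffix == "cation" and base.endswith("y"):
        # Change final y to i before -cation: purify -> purifi-
        base = base[:-1] + "i"
    return base + suffix


# Each step produces a derived word that can serve as the next base.
verb = attach("pure", "ify")      # purify
noun = attach(verb, "cation")     # purification
print(verb, noun)

# The same rules apply to an invented adjective such as "pouzy".
print(attach(attach("pouzy", "ify"), "cation"))  # pouzification
```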
12
The hierarchical structure of
words
• Morphemes are added in a fixed order according
to the morphological rules of a language
• E.g. system → systematic → unsystematic
13
Tree diagrams
• The hierarchical organization of words can be
represented in a tree diagram
Adjective
  un
  Adjective
    Noun
      system
    atic
14
Adverb
  Adjective
    un
    Adjective
      Adjective
        Noun
          system
        atic
      al
  ly
*unsystem (not a possible word: un- attaches to adjectives, not to nouns)
15
More about trees
• Tree diagrams are the linguist’s hypothesis
of how speakers represent the internal
structure of words
• Take a look at ambiguous cases such as
unlockable
16
Not able to be locked:
Adjective
  un
  Adjective
    Verb
      lock
    able
Able to be unlocked:
Adjective
  Verb
    un
    Verb
      lock
  able
17
• If words were only strings of morphemes
without any internal organization, we could
not explain the ambiguity of words like
‘unlockable’
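
This point can be sketched in Python by writing each analysis as a nested tuple. The tuple encoding and the helper names `spell` and `bracket` are assumptions made for illustration. Both structures spell out the same string of morphemes but differ in their internal grouping.

```python
# A node is either a morpheme (a string) or a tuple whose first item
# is a category label and whose remaining items are its children.
NOT_LOCKABLE = ("Adjective", "un", ("Adjective", ("Verb", "lock"), "able"))
CAN_UNLOCK = ("Adjective", ("Verb", "un", ("Verb", "lock")), "able")


def spell(node) -> str:
    """Concatenate the morphemes at the leaves, left to right."""
    if isinstance(node, str):
        return node
    return "".join(spell(child) for child in node[1:])


def bracket(node) -> str:
    """Render a tree as a labeled bracketing."""
    if isinstance(node, str):
        return node
    label, *children = node
    return "[" + label + " " + " ".join(bracket(c) for c in children) + "]"


for tree in (NOT_LOCKABLE, CAN_UNLOCK):
    print(spell(tree), bracket(tree))
# unlockable [Adjective un [Adjective [Verb lock] able]]
# unlockable [Adjective [Verb un [Verb lock]] able]
```

A flat list of morphemes would give one entry for both readings. The nested form keeps them distinct.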
18
Inflectional morphology
• Inflectional morphology indicates
grammatical aspects of a word
– Plurality (boy – boys)
– Tense (walk – walked)
– Person (walk – walks)
• In English all inflectional morphemes are
suffixes
19
How many morphemes?
1. Retroactive
2. Befriended
3. Televise
4. Margin
5. Psychology
6. Unpalatable
7. Deactivation
8. Airsickness
9. Grandmother
10. Morphemic
20
Can you “tree” the ambiguity?
A: Have you finished your ten-page book report,
Norman?
B: I haven’t even started it.
A: But it’s due tomorrow! I started mine a month
ago! Why did you wait until the last minute??
B: Perhaps I have more confidence in my
intellectual abilities than you have in yours!
Besides, how long could it possibly take to read
a ten-page book?
21
Coining
• Speakers invent (coin) new words to
describe previously non-existent objects
• E.g., xerox, fax, nylon, vaseline etc
22
Compounding
• When two or more words are combined to
form a new word
• E.g., bittersweet, homework, spoonfeed,
sleepwalk etc.
• In English the rightmost word of a compound is
the head of the compound
– Noun+verb=verb, e.g., spoonfeed
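
The head rule can be sketched in Python as follows. The function `compound` and the (word, category) pair encoding are hypothetical. The category labels for bitter, sweet, home, and work are supplied here for the example and do not come from the slide.

```python
def compound(*parts: tuple[str, str]) -> tuple[str, str]:
    """Combine (word, category) pairs into a compound.

    The compound takes the category of its rightmost part, the head.
    """
    word = "".join(w for w, _ in parts)
    category = parts[-1][1]
    return word, category


print(compound(("spoon", "noun"), ("feed", "verb")))
# ('spoonfeed', 'verb')
print(compound(("bitter", "adjective"), ("sweet", "adjective")))
# ('bittersweet', 'adjective')
print(compound(("home", "noun"), ("work", "noun")))
# ('homework', 'noun')
```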
23
Meaning of compounds
• The meaning of a compound is not always the
sum of its parts
• E.g. a blackboard may be green or white
• Also
– A boathouse is a house for boats but a cathouse is not
a house for cats (slang for whorehouse)
– A jumping bean is a bean that jumps, a falling star is a
star that falls but a looking glass is not a glass that
looks
– Peanut oil and olive oil but baby oil?
24
Pronunciation of compounds
• In a compound the first word is usually
stressed:
• Compare: REDcoat (slang for British
soldier) with red COAT
25
Blending
• The combination of two separate forms to
produce a single new term
– Smoke + fog = smog
– Breakfast + lunch = brunch
– Motor + hotel = motel
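
A blend can be sketched in Python as the start of one word joined to the end of another. The function `blend` and its cut-point parameters are assumptions made for illustration. The cut points are chosen by hand for each pair, since the slide gives no rule for where the words are cut.

```python
def blend(first: str, second: str, keep_first: int, drop_second: int) -> str:
    """Join the first `keep_first` letters of one word to the
    remainder of another after dropping its first `drop_second` letters."""
    return first[:keep_first] + second[drop_second:]


print(blend("smoke", "fog", 2, 1))        # sm + og   = smog
print(blend("breakfast", "lunch", 2, 1))  # br + unch = brunch
print(blend("motor", "hotel", 2, 2))      # mo + tel  = motel
```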
26
Acronyms
• Acronyms are words derived from the initials of
several words
– NASA, from National Aeronautics and Space Administration
– UNESCO, from United Nations Educational, Scientific,
and Cultural Organization
– Radar, from radio detection and ranging
– Laser, from light amplification by stimulated emission
of radiation
– Scuba, from self-contained underwater breathing
apparatus
– RAM, from random access memory
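
Acronym formation can be sketched in Python by taking the first letter of each word. The function `acronym` and the set of skipped words are assumptions made for this example. Skipping "and", "by", and "of" happens to match UNESCO, laser, and scuba, but it is not a general rule: radar, for instance, takes more than one letter from "radio" and is not covered.

```python
import re

# Small words that contribute no letter in the examples below (an assumption).
SKIPPED = {"and", "by", "of"}


def acronym(phrase: str) -> str:
    """Build an acronym from the initials of the words in a phrase."""
    # Split on anything that is not a letter, so that a hyphenated
    # word such as "self-contained" yields two words.
    words = re.findall(r"[A-Za-z]+", phrase)
    initials = (w[0] for w in words if w.lower() not in SKIPPED)
    return "".join(initials).upper()


print(acronym("United Nations Educational, Scientific, and Cultural Organization"))
# UNESCO
print(acronym("light amplification by stimulated emission of radiation"))
# LASER
print(acronym("self-contained underwater breathing apparatus"))
# SCUBA
print(acronym("random access memory"))
# RAM
```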
27
Backformation
• A new word may enter the language
because of an incorrect morphological
analysis
– beggar → beg
– editor → edit
– enthusiasm → enthuse
28
Abbreviation
• Abbreviations of longer words may be
lexicalized
– Fax, from facsimile
– Telly, from television
– Gym, from gymnasium
29
Eponyms
• Eponyms are words derived from proper
names
– Sandwich: named for the fourth Earl of
Sandwich who put his food between two slices
of bread so that he could eat while he
gambled
30
Clipping
• Clipping occurs when a word of more than
one syllable is reduced to a shorter form
– Fan, from fanatic
– Plane, from airplane
– Pro, from professional
– Lab, from laboratory
– Gas, from gasoline
31
Conversion
• Conversion is a change in the function of a
word
– Verbs → nouns (guess, must, spy, etc.)
– Adjectives → verbs (dirty, empty, total, etc.)
– Particles → verbs (up, down)
32