PS: Introduction to Psycholinguistics
Winter Term 2005/06
Instructor: Daniel Wiechmann
Office hours: Mon 2-3 pm
Email: [email protected]
Phone: 03641-944534
Web: www.daniel-wiechmann.net
Session 4:
Understanding speech
• Problems with the recognition of speech
  - Segmentation problem (how to separate sounds in speech)
• Possible remedies:
  - Possible-word constraint
  - Metrical segmentation strategy
  - Stress-based segmentation
  - Syllable-based segmentation
Session 4:
Understanding speech
• Categorical perception
  - Experiment: Liberman et al. (1957)
  - A speech synthesizer creates a continuum of artificial syllables that differ in the place of articulation of one phoneme
  - Subjects placed the syllables into three categories (/b/, /d/, /g/)
Session 4:
Understanding speech
• Categorical perception
  - Voice onset time (VOT): voiced and unvoiced consonants (e.g. /b/, /d/ vs. /p/, /t/) differ with respect to VOT (difference ~60 ms)
  - Experimenters varied VOT in steps along a scale (e.g. 30 ms)
  - Subjects make 'either-or' distinctions (see the sketch below)
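To make the 'either-or' pattern concrete, here is a minimal sketch of an idealized identification function along a VOT continuum. The boundary of roughly 25 ms and the steepness value are illustrative assumptions, not Liberman et al.'s measured parameters; the point is only that responses jump from /b/ to /p/ near the boundary rather than changing gradually.

```python
import math

def prob_voiceless(vot_ms, boundary_ms=25.0, steepness=0.5):
    """Idealized identification function: probability of labelling a
    stimulus as /p/ (voiceless) given its voice onset time in ms.
    boundary_ms and steepness are illustrative assumptions."""
    return 1.0 / (1.0 + math.exp(-steepness * (vot_ms - boundary_ms)))

# Stimuli spaced along the VOT continuum (in ms)
for vot in range(0, 61, 10):
    p = prob_voiceless(vot)
    label = "/p/" if p > 0.5 else "/b/"
    print(f"VOT {vot:2d} ms -> P(/p/) = {p:.2f}  heard as {label}")
```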
Session 4:
Understanding speech
• Categorical perception
  - Selective adaptation: repeated presentation of /ba/ makes people less sensitive to the voicing feature (the feature detector fatigues)
  - The cut-off point for the /b/-/p/ distinction shifts toward the /p/ end of the continuum
Session 4:
Understanding speech
• Prelexical (phonetic) vs. postlexical (phonemic) code
  - The prelexical code is computed directly from perceptual analysis (bottom-up)
  - The postlexical code is computed from higher-level units such as words (top-down)
  - Foss and Blank (1980): phoneme-monitoring task
  - But cf. Foss and Gernsbacher (1983) and Marslen-Wilson and Warren (1994)
Session 4:
Understanding speech
• In summary:
  - There is a controversy about whether or not we identify phonemes before we recognize higher-level units (e.g. syllables or words)
Session 4:
Understanding speech
• The role of context in identifying sounds: the phonemic restoration effect (cf. Warren and Warren 1970)
Session 4:
Understanding speech
• It was found that the *eel was on the orange
• It was found that the *eel was on the axle
• It was found that the *eel was on the shoe
• It was found that the *eel was on the table
Session 4:
Understanding speech
• It was found that the peel was on the orange
• It was found that the wheel was on the axle
• It was found that the heel was on the shoe
• It was found that the meal was on the table
Understanding speech
• Phonemic restoration effect: two explanations
  - 1. Context interacts directly with bottom-up processes (sensitivity effect)
  - 2. Context may simply provide an additional source of information (response bias effect)
Understanding speech:
Samuel (1981, 1990)
• Method:
  - Subjects listened to sentences, and meaningless noise was presented during each sentence
  - On some trials, the noise was superimposed on one of the phonemes of a word
  - On other trials, the phoneme was deleted
  - Finally, the phoneme was sometimes predictable from context
• Task:
  - Decide whether or not the crucial phoneme had been presented
Understanding speech:
Samuel (1981, 1990)
• Phonemic restoration effect: two explanations
• Hypotheses:
  - 1. If context improves sensitivity, then the ability to discriminate between phoneme-plus-noise and noise alone should be improved by a predictable context
  - 2. If context affects response bias, then participants should simply be more likely to decide that the phoneme was presented when the word appeared in a predictable context (see the signal-detection sketch below)
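Samuel's contrast between a sensitivity effect and a response-bias effect is the standard signal-detection distinction. The sketch below is my own illustration of that logic, not Samuel's analysis or data: d′ indexes how well phoneme-plus-noise trials can be discriminated from noise-alone trials, while c indexes how willing listeners are to say "the phoneme was there". The hit and false-alarm rates are invented for illustration.

```python
from statistics import NormalDist

def dprime_and_bias(hit_rate, false_alarm_rate):
    """Standard signal-detection measures:
    d' (sensitivity)   = z(H) - z(FA)
    c  (response bias) = -(z(H) + z(FA)) / 2
    """
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -(z(hit_rate) + z(false_alarm_rate)) / 2
    return d_prime, criterion

# Hypothetical rates for illustration only (not Samuel's data):
# in predictable context, listeners say "phoneme was there" more often
# on BOTH kinds of trials -> bias shifts, sensitivity stays the same.
neutral     = dprime_and_bias(hit_rate=0.75, false_alarm_rate=0.30)
predictable = dprime_and_bias(hit_rate=0.85, false_alarm_rate=0.45)
print("neutral context:     d' = %.2f, c = %.2f" % neutral)
print("predictable context: d' = %.2f, c = %.2f" % predictable)
```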
Understanding speech:
Samuel (1981, 1990)
• Results:
  - Context affected response bias but not sensitivity
  - Contextual information does not have a direct effect on bottom-up processing
Understanding speech:
Models of speech recognition
• Most influential models:
  - Motor theory (Liberman et al. 1967): listeners mimic the articulatory movements of the speaker
  - Cohort theory (Marslen-Wilson and Tyler 1980)
  - TRACE model (McClelland and Elman 1986)
Understanding speech:
Models of speech recognition: neurons
Understanding speech:
Models of speech recognition: neuron (schematic)
Synapse: the junction across which a nerve impulse passes from an axon terminal to a neuron.
Understanding speech:
Models of speech recognition: neuronal networks
The brain is composed of some 10-100 billion nerve cells, or neurons, that communicate with one another through specialized contacts called synapses. Typically, a single neuron receives 2,000-5,000 synapses from other neurons; these synapses are located almost exclusively on the neuron's dendrites, long projections that radiate out from the neuron's cell body. In turn, the neuron's axon, a long thin process that grows out from the cell body, makes synaptic connections with about 1,000 other neurons. In this way, neuronal signals pass from neuron to neuron to form extensive and elaborate neural circuits.
Understanding speech:
Models of speech recognition: the number of neurons in the human brain
Understanding speech:
Models of speech recognition: introducing
connectionist models
• Two central assumptions of artificial neural nets (ANNs), illustrated in the sketch below:
  1) Processing occurs through the action of many simple, interconnected processing units (neurons)
  2) Activation spreads around the network in a way determined by the strength of the links, i.e. the connections between units
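As a very small illustration of these two assumptions (my own sketch; the units, weights, and values are invented, not taken from the slides), a network is just a set of simple units plus weighted links, and activation spreads in proportion to the link strengths:

```python
# Minimal sketch of the two assumptions: simple units + weighted links,
# with activation spreading according to the link weights.
activation = {"A": 0.8, "B": 0.6, "C": 0.0}

weights = {("A", "C"): 0.5,    # excitatory link from A to C
           ("B", "C"): -0.3}   # inhibitory link from B to C

# One step of spreading activation into unit C:
net_input = sum(activation[src] * w
                for (src, dst), w in weights.items() if dst == "C")
print(net_input)   # 0.8*0.5 + 0.6*(-0.3) = 0.22
```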
Understanding speech:
Models of speech recognition: introducing
connectionist models
• Some models learn
  - e.g. via back-propagation
• Some don't
  - The interactive activation (IAC) model of McClelland and Rumelhart (1981) does not learn
  - The TRACE model (McClelland and Elman 1986) is an IAC model
Understanding speech:
Models of speech recognition: from neural
networks to connectionist models
• Connections can be inhibitory or excitatory (facilitatory)
• Connections (or links) have different weights
• Threshold: the total amount of activation needed to make the node fire
Understanding speech:
Models of speech recognition: from neural
networks to connectionist models
Worked example (activations range from -1 to +1): inputs of +0.6 (excitatory), -0.5 (inhibitory) and +0.7 (excitatory) sum to 0.8, which is below the threshold of 1.0. Ergo: no firing. (A small code sketch of this computation follows below.)
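The same computation as a short code sketch, reproducing the slide's example values (the function name is mine):

```python
def node_fires(inputs, threshold=1.0):
    """A node fires only if its summed input
    (excitatory minus inhibitory) reaches the threshold."""
    return sum(inputs) >= threshold

# The slide's example: two excitatory inputs (+0.6, +0.7), one inhibitory (-0.5)
inputs = [+0.6, -0.5, +0.7]
print(f"total input = {sum(inputs):.1f}")   # 0.8
print(node_fires(inputs))                   # False -> below threshold 1.0, no firing
```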
Understanding speech:
Models of speech recognition: from neural
networks to connectionist models
Second worked example (figure): a node receives input over excitatory connections (+0.9, +0.4) and an inhibitory connection (-0.2), with further activation values of -0.5, -0.9 and +0.5 shown on the -1 to +1 scale; in this configuration the total input reaches the threshold of 1.0. Ergo: firing.
Understanding speech:
Models of speech recognition: from neural
networks to connectionist models
Interactive activation network (McClelland and Rumelhart 1981)
Understanding speech:
Models of speech recognition: TRACE
• TRACE model (McClelland and Elman 1986)
  - There are individual processing units, or nodes, at three different levels:
    - FEATURES (place and manner of production, voicing)
    - PHONEMES
    - WORDS
Understanding speech:
Models of speech recognition: TRACE
• TRACE model (McClelland and Elman 1986)
  - Feature nodes are connected to phoneme nodes
  - Phoneme nodes are connected to word nodes
  - Connections between levels operate in both directions and are only facilitatory (i.e. no inhibition)
Understanding speech:
Models of speech recognition: TRACE
• TRACE model (McClelland and Elman 1986)
  - There are connections among units, or nodes, at the same level
  - These connections are inhibitory
Understanding speech:
Models of speech recognition: TRACE
• TRACE model (McClelland and Elman 1986)
  - Nodes influence each other in proportion to their activation levels and the strength of their interconnections
  - As excitation and inhibition spread among nodes, a pattern of activation, or TRACE, develops
Understanding speech:
Models of speech recognition: TRACE
• TRACE model (McClelland and Elman 1986)
  - The word that is recognized is determined by the activation levels of the possible candidate words (a toy sketch of these dynamics follows below)
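As a rough illustration of these dynamics, the sketch below runs a toy interactive-activation loop in the spirit of TRACE: facilitatory connections in both directions between the phoneme and word levels, inhibitory connections within the word level, and recognition of the most highly activated word. The toy lexicon, parameter values, and update rule are my own assumptions, not McClelland and Elman's implementation, and the feature level is omitted for brevity.

```python
# Toy interactive-activation sketch in the spirit of TRACE.
# Lexicon, parameters, and update rule are illustrative assumptions.

LEXICON = {                      # toy lexicon: word -> its phonemes
    "cat": ["k", "a", "t"],
    "cap": ["k", "a", "p"],
    "dog": ["d", "o", "g"],
}
PHONEMES = sorted({p for ps in LEXICON.values() for p in ps})

BOTTOM_UP = 0.20   # excitation from the speech input to phoneme nodes
BETWEEN   = 0.05   # facilitatory connections, both directions between levels
WITHIN    = 0.10   # inhibitory connections within the word level
DECAY     = 0.10   # passive decay of every node toward zero

def recognize(heard, steps=30):
    phon = {p: 0.0 for p in PHONEMES}
    word = {w: 0.0 for w in LEXICON}
    for _ in range(steps):
        for p in heard:                          # bottom-up input
            phon[p] += BOTTOM_UP
        for w, ps in LEXICON.items():            # between-level facilitation
            word[w] += BETWEEN * sum(phon[p] for p in ps)   # phonemes -> word
            for p in ps:
                phon[p] += BETWEEN * word[w]                # word -> phonemes
        for w in LEXICON:                        # within-level inhibition
            word[w] -= WITHIN * sum(a for v, a in word.items() if v != w)
        for nodes in (phon, word):               # decay, keep activations in [0, 1]
            for k in nodes:
                nodes[k] = min(1.0, max(0.0, nodes[k] * (1 - DECAY)))
    # the most highly activated candidate word is the one recognized
    return max(word, key=word.get), word

winner, acts = recognize(["k", "a", "t"])
print(winner, {w: round(a, 2) for w, a in acts.items()})
```

Running the sketch with the input /k a t/ drives "cat" to a high activation while its rival "cap" is suppressed by within-level inhibition, which is the pattern of competition the slides describe.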