ISCA Distinguished Lectures 2017-2018
(1) Professor Jennifer S. Cole, Linguistics, Northwestern University, USA
1. Individual differences and attentional effects on cue weighting for prosody perception
Through the temporal pattern of F0, intensity, local tempo and other acoustic
properties, prosody conveys meaning related to the syntactic and semantic
properties of an utterance, and meaning related to the broader situational context.
Prosodically annotated speech materials are an important source of data for
researchers investigating prosodically encoded meaning and its phonetic expression,
but prosodic annotation is itself a challenging task, with varying degrees of
inter-annotator agreement. One likely source of annotator disagreement is
variability in the phonological specification of prosody, and in its phonetic
implementation, both within and across speakers, due to factors such as speaking
style, discourse context and the speaker’s emotional state. In this talk I examine
individual listener differences in the perception of prosodic features, which may be
another factor contributing to variable inter-annotator agreement. I present
findings from a study of rapid prosody transcription, with 32 untrained annotators
who marked the prosodic features of American English conversational speech, based
on auditory impression alone. Mixed-effects non-linear regression models are used
to examine the effects of acoustic and contextual properties of a word on the
perception of its prosodic status. The findings reveal individual differences among
listeners in the selection and weighting of acoustic and contextual cues to prosodic
features, though with consistent trends across listeners in the direction of the effect
that a given cue has on the perception of prosody. We also observe that listeners
cluster into roughly three groups according to the number and type of cues that are
predictive of their prosodic annotation, and that there is an implicational hierarchy
in the selection of cues, where cues with the strongest effect on prosodic rating are
selected by the greatest number of listeners. Further differences in cue weighting
are observed depending on task instructions that focus listeners’ attention on the
acoustic vs. meaning dimensions of the heard utterance. The overall finding of
variation in prosody perception is discussed in terms of individual and task-based
differences in attention and speech processing strategies.
2. Prosodic entrainment and its relation to dialogue conditions
A central question about prosody concerns the units by which prosody is encoded in
the mental representation of words and phrases, and how those units contribute to
signaling linguistic meaning. In this talk I present evidence for the cognitive
representation of prosody from the analysis of prosodic entrainment (a
phenomenon whereby conversation partners become more similar to one another in
their prosodic expression) and evidence from prosodic imitation. Prosodic patterns
that are entrained or imitated reveal those properties of a speaker’s prosodic
patterns that are perceptually salient to a listener, and subsequently reproduced on
the basis of a stored mental representation. Findings from two studies with
American English speakers are introduced—a study of intonational imitation, and a
study of intonational entrainment. Both studies assess the similarity of f0
parameters between paired productions—involving a stimulus and its explicit
imitation by a different speaker, or involving utterances with matching dialog-act
tags from potentially entraining conversation partners. The similarity of f0 patterns
is evaluated at different levels of granularity to determine if imitation or
entrainment is targeting patterns at the level of the intonational phrase, the
phonological pitch accent, and/or at the level of fine phonetic detail. Statistical
modeling using mixed-effects regression and clustering analyses provides evidence
from both studies for a hybrid encoding of intonational patterns that represents
both the acoustic detail of heard f0 patterns, and the abstract (phonological)
category that they represent. The entrainment study further shows that f0 contours
are encoded by the listener in relation to the dialog act in which they are produced,
indicating a tight association between prosodic and discourse information in
cognitive representation.
3. Memory for Prosody
Phonological accounts of speech perception postulate that listeners map variable
instances of speech to categorical features (phonemes, positional allophones,
syllables, etc.) that form the memory representations of heard words and phrases.
Other research maintains that listeners perceive and remember fine-grain phonetic
detail that distinguishes among exemplars or instances that belong to the same
category. In this talk I consider the nature of the memory representation of prosody
through experiments testing the perception and recall of pitch accents in their
phonological and phonetic specification, in American English. Two types of prosodic
variation are tested: phonological variation in the presence, absence and type of
pitch accent, and variation in the phonetic cues to pitch accent (F0 peak, word
duration). The findings from these experiments show that while listeners encode
both categorical distinctions and phonetic detail in memory, the categorical
distinctions are more reliably retrieved than the phonetic cues in later tests of episodic memory.
These findings also show that listeners may vary in the degree to which they
remember phonetic detail that cues prosody. I consider these findings in relation to
other recent evidence for variation and individual differences in the production and
perception of prosodic patterns, and discuss their implications for the
role of prosody in the processing of linguistic and para-linguistic meaning.