ISCA Distinguished Lectures 2017-2018 (1)

Professor Jennifer S. Cole, Linguistics, Northwestern University, USA

1. Individual differences and attentional effects on cue weighting for prosody perception

Through the temporal pattern of F0, intensity, local tempo and other acoustic properties, prosody conveys meaning related to the syntactic and semantic properties of an utterance, and meaning related to the broader situational context. Prosodically annotated speech materials are an important source of data for researchers investigating prosodically encoded meaning and its phonetic expression, but prosodic annotation is itself a challenging task, with varying degrees of inter-annotator agreement. One likely source of annotator disagreement is variability in the phonological specification of prosody, and in its phonetic implementation, both within and across speakers, due to factors such as speaking style, discourse context and the speaker’s emotional state. In this talk I examine individual listener differences in the perception of prosodic features, which may be another factor contributing to variable inter-annotator agreement. I present findings from a study of rapid prosody transcription, with 32 untrained annotators who marked the prosodic features of American English conversational speech, based on auditory impression alone. Mixed-effects non-linear regression models are used to examine the effects of acoustic and contextual properties of a word on the perception of its prosodic status. The findings reveal individual differences among listeners in the selection and weighting of acoustic and contextual cues to prosodic features, though with consistent trends across listeners in the direction of effect a given cue may have on the perception of prosody.
We also observe that listeners cluster into roughly three groups according to the number and type of cues that are predictive of their prosodic annotation, and that there is an implicational hierarchy in the selection of cues, where cues with the strongest effect on prosodic rating are selected by the greatest number of listeners. Further differences in cue weighting are observed depending on task instructions that focus listeners’ attention on the acoustic vs. meaning dimensions of the heard utterance. The overall finding of variation in prosody perception is discussed in terms of individual and task-based differences in attention and speech processing strategies.

2. Prosodic entrainment and its relation to dialogue conditions

A central question about prosody concerns the units by which prosody is encoded in the mental representation of words and phrases, and how those units contribute to signaling linguistic meaning. In this talk I present evidence for the cognitive representation of prosody from the analysis of prosodic entrainment (a phenomenon whereby conversation partners become more similar to one another in their prosodic expression) and evidence from prosodic imitation. Prosodic patterns that are entrained or imitated reveal those properties of a speaker’s prosodic patterns that are perceptually salient to a listener, and subsequently reproduced on the basis of a stored mental representation. Findings from two studies with American English speakers are introduced: a study of intonational imitation, and a study of intonational entrainment. Both studies assess the similarity of F0 parameters between paired productions, involving either a stimulus and its explicit imitation by a different speaker, or utterances with matching dialog-act tags from potentially entraining conversation partners.
The similarity of F0 patterns is evaluated at different levels of granularity to determine if imitation or entrainment is targeting patterns at the level of the intonational phrase, the phonological pitch accent, and/or at the level of fine phonetic detail. Statistical modeling using mixed-effects regression and clustering analyses provides evidence from both studies for a hybrid encoding of intonational patterns that represents both the acoustic detail of heard F0 patterns, and the abstract (phonological) category that they represent. The entrainment study further shows that F0 contours are encoded by the listener in relation to the dialog act in which they are produced, indicating a tight association between prosodic and discourse information in cognitive representation.

3. Memory for Prosody

Phonological accounts of speech perception postulate that listeners map variable instances of speech to categorical features (phonemes, positional allophones, syllables, etc.) that form the memory representations of heard words and phrases. Other research maintains that listeners perceive and remember fine-grained phonetic detail that distinguishes among exemplars or instances that belong to the same category. In this talk I consider the nature of the memory representation of prosody through experiments testing the perception and recall of pitch accents in their phonological and phonetic specification, in American English. Two types of prosodic variation are tested: phonological variation in the presence, absence and type of pitch accent, and variation in the phonetic cues to pitch accent (F0 peak, word duration). The findings from these experiments show that while listeners encode both categorical distinctions and phonetic detail in memory, the categorical distinctions are more reliably retrieved than the phonetic cues in later tests of episodic memory. These findings also show that listeners may vary in the degree to which they remember phonetic detail that cues prosody.
I consider these findings in relation to other recent evidence for variation and individual differences in the production and perception of prosodic patterns, and discuss their implications for the role of prosody in the processing of linguistic and para-linguistic meaning.