Atten Percept Psychophys. 2018 May;80(4):871-883. doi: 10.3758/s13414-018-1489-8.

Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

Lu L1, Bao X1, Chen J2,3, Qu T2,3, Wu X2,3, Li L4,5,6,7.

Author information

1. School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100080, China.
2. Department of Machine Intelligence, Peking University, Beijing, China.
3. Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China.
4. School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University, Beijing, 100080, China. liangli@pku.edu.cn.
5. Department of Machine Intelligence, Peking University, Beijing, China. liangli@pku.edu.cn.
6. Speech and Hearing Research Center, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China. liangli@pku.edu.cn.
7. Beijing Institute for Brain Disorders, Beijing, China. liangli@pku.edu.cn.

Abstract

Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study, we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. In addition, electrodermal (skin conductance) responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase in listening effort when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.

KEYWORDS: Attentional modulation; Cocktail-party problem; Emotion; Speech recognition; Unmasking

PMID: 29473143
DOI: 10.3758/s13414-018-1489-8
[Indexed for MEDLINE]
