Michelle Cohn (University of California, Davis), Eran Raveh (Saarland University), Kristin Predeck (University of California, Davis), Iona Gessinger (Saarland University), Bernd Möbius (Saarland University), and Georgia Zellou (University of California, Davis)
The present study compares how individuals perceive gradient acoustic realizations of emotion produced by a human voice versus an Amazon Alexa text-to-speech (TTS) voice. We manipulated semantically neutral sentences spoken by both talkers using identical emotional synthesis methods, with three levels of increasing 'happiness' (0%, 33%, and 66% 'happier'). On each trial, listeners (native speakers of American English, n = 99) rated a given sentence on two scales assessing dimensions of emotion: valence (negative-positive) and arousal (calm-excited). Participants also rated the Alexa voice on several parameters assessing anthropomorphism (e.g., naturalness and human-likeness). Results showed that the emotion manipulations increased perceived positive valence and excitement. Yet the effect differed by talker: the increasing 'happiness' manipulations led to larger changes for the human voice than for the Alexa voice. Additionally, we observed individual differences in perceived valence and arousal based on participants' anthropomorphism scores. Overall, this line of research can speak to theories of computer personification and elucidate our changing relationship with voice-AI technology.