Graduate Student: Peter Moriarty, M.S. 2017
Paralinguistic features of speech communicate emotion in the human voice. In addition to
semantic content, speakers imbue their messages with prosodic features, composed of
acoustic variations that listeners decode to extract meaning. Psychological science refers
to these acoustic variations as affective prosody. This process of encoding and decoding
emotion remains a phenomenon that has yet to be acoustically operationalized.
Studies aimed at identifying the salient acoustic cues in emotional speech are often
limited to reanalyzing material generated by other researchers.
This project presented an opportunity to analyze the communication of emotion
in a corpus of naturalistic emotional speech generated in collaboration with Penn
State’s Psychology Department. To this end, fifty-five participants were recorded
speaking the same semantic content in angry, happy, and sad expressive voicings in
addition to a neutral tone. Classic parameters were extracted, including pitch, loudness,
and timing, along with other low-level descriptors (LLDs). The LLDs were compared with
published evidence and theory. In general, results were congruent with previous
studies for portrayals of more highly aroused emotions like anger and happiness,
but less so for sadness. It was determined that a significant portion of the
deviations from the scientific consensus could be explained by baseline definitions
alone, i.e., whether deviations were measured against neutral or emotional LLD values.
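As an illustration of the kind of low-level descriptors named above, the following is a
minimal sketch in pure NumPy (an assumption for illustration; the thesis's actual
extraction toolchain is not specified here) computing a frame-wise loudness proxy (RMS
energy) and a simple spectral proxy (zero-crossing rate) from a test tone:

```python
import numpy as np

def extract_llds(signal, sr, frame_len=1024, hop=512):
    """Frame-wise RMS energy (loudness proxy) and zero-crossing rate.

    Illustrative only: real LLD sets also include pitch contours,
    timing measures, and many other descriptors.
    """
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    zcr = np.array([np.mean(np.abs(np.diff(np.sign(f))) > 0) for f in frames])
    return {"rms": rms, "zcr": zcr, "duration_s": len(signal) / sr}

# Example: one second of a 440 Hz tone at a 16 kHz sampling rate
sr = 16000
t = np.arange(sr) / sr
llds = extract_llds(np.sin(2 * np.pi * 440 * t), sr)
```

For a pure sine of unit amplitude, the frame-wise RMS values sit near 1/sqrt(2) and the
zero-crossing rate near 2 x 440 / 16000 crossings per sample, which is a quick sanity
check on the descriptors.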
A listening study was subsequently conducted to qualify and contrast the objectively
determined effects with perceptual judgments. Only three of the fifty-five speakers
were sampled owing to practical constraints on testing time. The study tested
whether the sampled recordings reflected naturally recognizable emotion, and the
perceived intensity of these emotions. Listeners were able to discriminate the
intended emotion of the speaker with success rates in excess of 87%. Perceptual
intensity ratings revealed that some of the prototypical acoustical cues did not
significantly correlate with the perception of emotional intensity. Results from both
rounds of analysis indicate that a wealth of emotionally salient acoustical
information has yet to be fully characterized.
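The cue-versus-intensity comparison described above can be sketched as a Pearson
correlation between one acoustic descriptor and listeners' intensity ratings. The data
below are simulated for illustration only (the thesis's cue values and ratings are not
reproduced here); the sketch shows only the shape of the computation:

```python
import numpy as np

# Hypothetical data: one LLD value per recording (e.g. mean pitch) and the
# mean perceived-intensity rating for that recording. Simulated, not the
# thesis's data.
rng = np.random.default_rng(0)
cue = rng.normal(size=30)
ratings = 0.4 * cue + rng.normal(size=30)

# Pearson r between the acoustic cue and perceived intensity; an |r| near
# zero would mirror the finding that some prototypical cues did not
# significantly correlate with perceived emotional intensity.
r = np.corrcoef(cue, ratings)[0, 1]
```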