Stressed speech processing: Human vs automatic in non-professional speakers scenario

S. Shukla; Prasanna S; S. Dandapat

doi:10.1109/NCC.2011.5734704

Published in 2011

Abstract

This study analyzes the effect of stress on human and automatic stressed speech processing tasks, using speech collected from non-professional speakers. A database of 33 keywords was collected from fifteen speakers under five stress conditions: neutral, angry, happy, sad and Lombard. The first study examines how well stress can be identified by human listeners and by automatic speech processing. The average performance of human stress classification is 59.44%. The average performance of the automatic stress classifier using vector quantization (VQ) and hidden Markov models (HMM) is 54.65% and 56.02%, respectively. The second study examines the effect of stress on human and automatic speech recognition. The average performance of human stressed speech recognition is 99.60%. The automatic stressed speech recognition performance using VQ and HMM is 82.42% and 76.79%, respectively. Even in the non-professional speakers scenario, human performance is better than automatic processing. Automatic processing also seems to show considerable degradation in performance, which warrants the development of new methods to handle stress information. © 2011 IEEE.
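The abstract names VQ as one of the automatic stress classifiers but does not describe its configuration. As a rough illustration of how a VQ classifier of this general kind usually works, the sketch below trains one codebook per stress class with plain k-means and labels a test utterance with the class whose codebook gives the lowest average distortion. The feature dimension, codebook size, function names and the synthetic Gaussian "features" are all assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def train_codebook(features, size=4, iters=20, seed=0):
    """Fit a VQ codebook to (n_frames, dim) features with plain k-means."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every frame to every codeword.
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):  # leave a codeword unchanged if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()

def classify(features, codebooks):
    """Return the class label whose codebook quantizes the features best."""
    return min(codebooks, key=lambda label: distortion(features, codebooks[label]))

# Toy stand-in data (assumption): two well-separated Gaussian clouds,
# not real speech features and not the database described above.
rng = np.random.default_rng(1)
train = {
    "neutral": rng.normal(0.0, 1.0, (200, 3)),
    "angry": rng.normal(5.0, 1.0, (200, 3)),
}
codebooks = {label: train_codebook(x) for label, x in train.items()}
test_utterance = rng.normal(5.0, 1.0, (50, 3))
predicted = classify(test_utterance, codebooks)
```

In a real system the rows of each feature matrix would be per-frame acoustic features extracted from the recordings, and the codebook size would be tuned on held-out data. The same minimum-distortion rule can serve for keyword recognition by training one codebook per keyword instead of one per stress class.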