Shouted and Normal Speech Classification Using 1D CNN
S. Baghel, M. Bhattacharjee, , P. Guha
Published in Springer
2019
Volume: 11942 LNCS
Pages: 472-480
Abstract
Automatic shouted speech detection systems usually model the spectral characteristics of shouted speech to differentiate it from normal speech. Most approaches to shouted speech detection have explored hand-crafted features. However, many works on audio processing suggest that approaches based on automatic feature learning are more robust than hand-crafted feature engineering. This work re-demonstrates this notion by proposing a 1D-CNN architecture for the shouted and normal speech classification task. The CNN learns features from the magnitude spectrum of speech frames. Classification is performed by fully connected layers at the later stages of the network. The performance of the proposed architecture is evaluated on three datasets and validated against three existing approaches. As an additional contribution, a discussion of the features learned by the CNN kernels is provided with relevant visualizations. © 2019, Springer Nature Switzerland AG.
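The pipeline the abstract describes (magnitude spectrum of speech frames, 1D convolution, then fully connected classification) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the frame length, hop, FFT size, kernel count and width, the global max pooling step, the single dense layer, and the random untrained weights are all assumptions chosen only to show the data flow, and every function name here is hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def magnitude_spectrum(frames, n_fft=512):
    """Hamming-windowed magnitude spectrum of each frame."""
    window = np.hamming(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, n=n_fft, axis=1))

def conv1d_relu(spec, kernels):
    """Valid 1-D cross-correlation along the frequency axis, then ReLU.

    spec:    (n_frames, n_bins)
    kernels: (n_kernels, width)
    returns: (n_frames, n_kernels, n_bins - width + 1)
    """
    width = kernels.shape[1]
    # windows has shape (n_frames, n_bins - width + 1, width)
    windows = np.lib.stride_tricks.sliding_window_view(spec, width, axis=1)
    out = np.einsum("fpw,kw->fkp", windows, kernels)
    return np.maximum(out, 0.0)

def classify(spec, kernels, W, b):
    """Pool each feature map over frequency, then apply dense layer + softmax."""
    feats = conv1d_relu(spec, kernels).max(axis=2)   # (n_frames, n_kernels)
    logits = feats @ W + b                           # (n_frames, 2)
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

# Demo on synthetic noise standing in for one second of 16 kHz speech.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
frames = frame_signal(signal)
spec = magnitude_spectrum(frames)

kernels = rng.standard_normal((8, 9)) * 0.1   # assumed: 8 kernels of width 9
W = rng.standard_normal((8, 2)) * 0.1         # assumed: one dense layer, 2 classes
b = np.zeros(2)
probs = classify(spec, kernels, W, b)         # per-frame [normal, shouted] scores
```

With trained weights, each row of `probs` would hold per-frame class probabilities; here the weights are random, so the values carry no meaning beyond showing the shapes involved.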