Low frequency region of vocal tract information for speech/music classification
B.K. Khonglah,
Published in Institute of Electrical and Electronics Engineers Inc.
2017
Pages: 2593 - 2597
Abstract
This work explores the energy variation in the low-frequency region, corresponding mostly to the first formant of speech, for speech/music classification. Speech alternates between high and low vowels, and the height of a vowel is characterized by the first formant. Because of this alternation, the energy around the first formant varies more in speech, whereas music shows no such behavior. The energy variation around the first formant can be expressed as the variance of filter bank energies. Different filter bank distributions are explored to find the frequency range that gives the best discrimination for the task in terms of first-formant energy variation. The best-performing filter bank energy variance coefficient, which represents this frequency range, is derived. This coefficient, called the Energy Variance of Inverse Mel Filter No. 2 (EVIMF2), is compared with existing state-of-the-art features. The proposed feature is observed to perform better than the existing features, and combining it with them yields a further improvement on the speech/music classification task. © 2016 IEEE.
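The idea of a filter bank energy variance can be illustrated with a minimal sketch. It is not the paper's EVIMF2 computation: the single rectangular band (300 to 900 Hz), the frame length, the hop size, and the function names `band_energy` and `band_energy_variance` are all assumptions chosen for illustration, standing in for the inverse mel filter described in the abstract. The sketch computes the log energy of each frame inside one low-frequency band and returns the variance of those values across frames.

```python
import math
from statistics import pvariance


def band_energy(frame, sample_rate, f_lo, f_hi):
    """Energy of one frame inside [f_lo, f_hi] Hz, using a direct DFT on the band's bins."""
    n = len(frame)
    k_lo = math.ceil(f_lo * n / sample_rate)   # first DFT bin inside the band
    k_hi = math.floor(f_hi * n / sample_rate)  # last DFT bin inside the band
    energy = 0.0
    for k in range(k_lo, k_hi + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        energy += re * re + im * im
    return energy


def band_energy_variance(signal, sample_rate, f_lo=300.0, f_hi=900.0,
                         frame_len=256, hop=128):
    """Variance across frames of the log band energy.

    The band edges, frame length, and hop are hypothetical defaults,
    not values taken from the paper.
    """
    log_energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Small constant avoids log(0) on silent frames.
        log_energies.append(math.log(band_energy(frame, sample_rate, f_lo, f_hi) + 1e-12))
    return pvariance(log_energies)
```

Under these assumptions, a signal whose in-band energy alternates between loud and quiet segments (loosely mimicking alternating vowel states) gives a larger value than a steady tone, whose per-frame band energy barely changes.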
About the journal
Journal: IEEE Region 10 Annual International Conference, Proceedings/TENCON
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISSN: 2159-3442