Robust Recognition of Tone Specified Mizo Digits Using CNN-LSTM and Nonlinear Spectral Resolution
V. Kothapalli, B.D. Sarma, A. Dey, P. Gogoi, W. Lalhminghlui, P. Sarmah, S.R. Nirmala, R. Sinha
Published in Institute of Electrical and Electronics Engineers Inc.
2018
Abstract
In this work, we attempt Mizo digit recognition under degraded conditions, using spectrograms as visual inputs to a Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) network. Because Mizo is a tone language, each digit is associated with a specific sequence of tones. To emphasize this tonal information, the resolution at low frequencies is increased by applying a nonlinear spectral resolution method. Nonlinear spectral resolution improves the recognition rate of the system, as is evident from a word error rate decrease of about 4% when the training data contains speech with noise profiles similar to those in the testing data. When the training data is clean, the improvement in recognition rate from the nonlinear spectral resolution method is about 2%. Compared with a Deep Neural Network-Hidden Markov Model (DNN-HMM) baseline system, the proposed method gives an improvement of around 40% and 15% at 0 dB and 5 dB SNR, respectively, when the noise profiles of the training and testing speech are similar. © 2018 IEEE.
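The abstract does not specify the exact form of the nonlinear spectral resolution, so the following is only a rough sketch of the general idea: resampling each spectrogram frame on a frequency axis that is denser at low frequencies, where tonal (pitch) information is concentrated. The exponential warping, the `alpha` parameter, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def warped_frequencies(n_out, f_max, alpha=4.0):
    """Return n_out frequencies in [0, f_max], packed more densely near 0 Hz.

    Assumed (hypothetical) warping: f = f_max * (e^(alpha*u) - 1) / (e^alpha - 1)
    for u spaced uniformly in [0, 1]. Larger alpha gives more low-frequency bins.
    """
    if n_out < 2:
        raise ValueError("n_out must be at least 2")
    scale = math.expm1(alpha)
    return [f_max * math.expm1(alpha * k / (n_out - 1)) / scale
            for k in range(n_out)]


def resample_frame(frame, f_max, targets):
    """Linearly interpolate one frame (uniform bins over 0..f_max) at `targets`."""
    n_in = len(frame)
    step = f_max / (n_in - 1)          # spacing of the original linear bins in Hz
    out = []
    for f in targets:
        pos = min(f / step, n_in - 1)  # fractional index into the original frame
        lo = min(int(pos), n_in - 2)   # left neighbour, clamped so lo + 1 is valid
        frac = pos - lo
        out.append((1.0 - frac) * frame[lo] + frac * frame[lo + 1])
    return out


def warp_spectrogram(spec, f_max, n_out, alpha=4.0):
    """Apply the same nonlinear frequency resampling to every frame of `spec`."""
    targets = warped_frequencies(n_out, f_max, alpha)
    return [resample_frame(frame, f_max, targets) for frame in spec]
```

With `alpha=4.0`, roughly four fifths of the output bins fall in the lower half of the frequency range, so an image fed to a CNN would devote most of its rows to the band that carries tone. Whether this matches the resolution scheme used in the paper cannot be determined from the abstract alone.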
About the journal
Journal: INDICON 2018 - 15th IEEE India Council International Conference
Publisher: Institute of Electrical and Electronics Engineers Inc.