Enhancements in Assamese spoken query system: Enabling background noise suppression and flexible queries
A. Dey, S. Shahnawazuddin, K.T. Deepak, S. Imani, , R. Sinha
Published in Institute of Electrical and Electronics Engineers Inc.
2016
Abstract
This paper discusses recent improvements to the previously developed Assamese spoken query (SQ) system for accessing the prices of agricultural commodities. The SQ system consists of interactive voice response (IVR) and automatic speech recognition (ASR) modules, both developed using open-source resources. The speech data used for developing the ASR system was collected under field conditions and therefore contained a significantly high level of background noise. This noise severely affected the recognition performance of the earlier version of the SQ system. To address this, a front-end noise suppression module based on zero frequency filtering has been added in the current version. Furthermore, we have incorporated subspace Gaussian mixture model (SGMM) and deep neural network (DNN)-based acoustic modeling approaches. These techniques are found to be more effective than the Gaussian mixture model (GMM)-based approach employed in the previous version. The combination of noise removal and DNN-based acoustic modeling yields a relative improvement of almost 32% in word error rate over the earlier reported GMM-HMM-based ASR system. The earlier SQ system was designed expecting user queries in the form of isolated words only; consequently, recognition performance degraded sharply whenever queries were given as continuous sentences. To overcome this, we present a simple technique that exploits the inherent patterns in user queries and incorporates these patterns into the language model. The modified language model is observed to significantly improve recognition performance for continuous queries. © 2016 IEEE.
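The "relative improvement in word error rate" quoted in the abstract is conventionally computed as the reduction in WER divided by the baseline WER. The sketch below illustrates that arithmetic only. The function name and the two WER values are hypothetical and chosen for illustration; the abstract does not report the absolute WERs of either system.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER.

    Both arguments are word error rates in the same unit (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values, NOT taken from the paper: a baseline system at
# 25.0% WER and an improved system at 17.0% WER.
reduction = relative_wer_reduction(25.0, 17.0)
print(f"Relative WER reduction: {reduction:.0%}")
```

A relative figure of this kind says nothing about the absolute error rates by themselves: the same relative reduction can correspond to very different absolute gains depending on the baseline.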
About the journal
Journal: 2016 22nd National Conference on Communication, NCC 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.