Header menu link for other important links
X
Speaker verification in sensor and acoustic environment mismatch conditions
G. Pradhan, B.C. Haris, , R. Sinha
Published in
2012
Volume: 15
   
Issue: 3
Pages: 381 - 392
Abstract
Our initial speaker verification study exploring the impact of mismatch in training and test conditions finds that the mismatch in sensor and acoustic environment results in significant performance degradation compared to other mismatches like language and style (Haris et al. in Int. J. Speech Technol., 2012). In this work we present a method to suppress the mismatch between the training and test speech, specifically due to sensor and acoustic environment. The method is based on identifying and emphasizing more speaker specific and less mismatch affected vowellike regions (VLRs) compared to the other speech regions. VLRs are separated from the speech regions (regions detected using voice activity detection (VAD)) using VLR onset point (VLROP) and are processed independently during training and testing of the speaker verification system. Finally, the scores are combined with more weight to that generated by VLRs as those are relatively more speaker specific and less mismatch affected. Speaker verification studies are conducted using the mel-frequency cepstral coefficients (MFCCs) as feature vectors. The speaker modeling is done using the Gaussian mixture model-universal background model and the state-of-the-art i-vector based approach. The experimental results show that for both the systems, proposed approach provides consistent performance improve-ment on the conversational approach with and without different channel compensation techniques. For instance, with IITG-MV Phase-II dataset for headphone trained and voice recorder test speech, the proposed approach provides a relative improvement of 25.08 % (in EER) for the i-vector based speaker verification systems with LDA and WCCN compared to conventional approach. © 2012 Springer Science+Business Media, LLC.
About the journal
JournalInternational Journal of Speech Technology
ISSN13812416