Robust phoneme classification for automatic speech recognition using hybrid features and an amalgamated learning model

Khwaja M.K; Vikash P; Arulmozhivarman P; Lui S.

doi:10.1007/s10772-016-9377-x

Journal Article

Khwaja M.K, Vikash P, Arulmozhivarman P, Lui S.

Published in Springer Science and Business Media LLC

2016

DOI: 10.1007/s10772-016-9377-x

Volume: 19

Issue: 4

Pages: 895 - 905

Abstract

Phoneme recognition is an important aspect of speech processing and recognition. It has been studied for many years, and numerous algorithms have been developed to improve its accuracy. This paper presents a quantitative analysis of phoneme recognition using supervised learning. Most approaches to phoneme recognition rely on mel frequency cepstrum based features to identify the phoneme class. In our approach, we take into consideration the vocal tract area function along with mel frequency cepstrum coefficients and analyze the change in accuracy obtained by its introduction into the feature set. Support Vector Machines have been an attractive approach to pattern recognition, and their use as a supervised learning model has been popular in the speech processing community. We compare Support Vector Machines with other supervised learning models, namely the Naïve Bayes, k-Nearest Neighbors and linear discriminant analysis classifiers, on our feature set. We impose a soft voting rule between the three best classifiers to produce our variation of a voting classifier. We enhance the accuracy of our classifier by using a priority based approach to estimate the three most likely phonemes after the predicted phoneme. Through a figurative and quantitative approach, we show that our modified algorithm outperforms other traditional methods. Experiments were conducted on the WSJCAM0 corpus, a British English corpus. © 2016, Springer Science+Business Media New York.
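
The soft voting rule and the ranked list of likely phonemes mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which three classifiers were selected or how the priority based step works, so the sketch assumes SVM, k-NN and LDA as the voters, assumes equal weights, and simply ranks phonemes by averaged class probability. The function names, phoneme labels and probability values are all hypothetical.

```python
from typing import Dict, List, Sequence


def soft_vote(probas: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average the per-class probabilities reported by several classifiers.

    Each element of `probas` maps a phoneme label to the probability one
    classifier assigns to it for a single frame. Equal weighting is an
    assumption; the paper may weight its classifiers differently.
    """
    if not probas:
        raise ValueError("at least one classifier output is required")
    classes = set()
    for p in probas:
        classes.update(p)
    n = len(probas)
    # A label missing from one classifier's output counts as probability 0.
    return {c: sum(p.get(c, 0.0) for p in probas) / n for c in classes}


def ranked_candidates(avg: Dict[str, float], k: int = 4) -> List[str]:
    """Return the predicted phoneme followed by the next k - 1 most likely.

    Ranking by averaged probability stands in for the priority based
    approach, whose details the abstract does not give. Ties are broken
    alphabetically so the result is deterministic.
    """
    return sorted(avg, key=lambda c: (-avg[c], c))[:k]


# Hypothetical posteriors for one frame from three classifiers.
svm = {"ae": 0.5, "eh": 0.3, "ih": 0.1, "iy": 0.1}
knn = {"ae": 0.2, "eh": 0.6, "ih": 0.2, "iy": 0.0}
lda = {"ae": 0.4, "eh": 0.4, "ih": 0.1, "iy": 0.1}

averaged = soft_vote([svm, knn, lda])
ranked = ranked_candidates(averaged)
print(ranked)
```

With k = 4 the first entry is the predicted phoneme and the remaining three are the next most likely candidates, which mirrors the "three most likely phonemes after the predicted phoneme" described above.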

About the journal

Journal	International Journal of Speech Technology
Publisher	Springer Science and Business Media LLC
ISSN	1381-2416
Open Access	0

Authors (1)

Arulmozhivarman P
