Processing of linear prediction residual in spectral and cepstral domains for speaker information
D. Pati,
Published in Kluwer Academic Publishers
2015
Volume: 18
Issue: 3
Pages: 333 - 350
Abstract
In this work, the linear prediction (LP) residual is processed in the spectral and cepstral domains to model speaker-specific excitation information. In the spectral domain, the excitation energy information is modeled by subband energies (SBE), and the excitation periodicity information is modeled by the power differences of spectrum in subband (PDSS) measure. This work refines the existing methods of extracting SBE and PDSS by exploiting the nature of the excitation spectrum. The SBE and PDSS values are computed from the mel-warped residual subband spectrum and are called residual mel subband energies (R-MSE) and mel power differences of subband spectra (M-PDSS), respectively. Speaker recognition studies performed on the NIST-99 and NIST-03 databases demonstrate that the R-MSE and M-PDSS features carry good speaker information. It is also demonstrated that the excitation energy information can be better modeled in the cepstral domain by residual mel frequency cepstral coefficients (R-MFCC). Further, the evidence provided by the M-PDSS and R-MFCC features is different and combines well, providing improved recognition performance. Combining the evidence from M-PDSS and R-MFCC with the vocal tract information further improves performance. Finally, a comparative study of processing the LP residual in the temporal, spectral and cepstral domains demonstrates that, with a small compromise in recognition performance, processing the LP residual in the spectral and cepstral domains provides a compact and effective way of representing the excitation information, as compared to temporal processing. © 2015, Springer Science+Business Media New York.
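The spectral-domain measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the LP residual is obtained with the autocorrelation method (Levinson-Durbin recursion), that SBE is the summed residual power in a band, and that PDSS is one minus the ratio of the geometric to the arithmetic mean of the residual power in a band. Uniform bands are used for brevity, whereas the paper works on a mel-warped subband spectrum. The function names, the LP order, and the band count are hypothetical choices made for illustration.

```python
import cmath
import math


def lp_residual(frame, order):
    """LP residual of one frame (autocorrelation method, Levinson-Durbin)."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(order + 1)]
    if r[0] == 0.0:
        return list(frame)  # silent frame: nothing to predict
    a = [1.0] + [0.0] * order  # error filter A(z) = 1 + a1 z^-1 + ...
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    # Inverse filtering: e[t] = sum_j a[j] * s[t - j]
    return [sum(a[j] * frame[t - j] for j in range(order + 1) if t - j >= 0)
            for t in range(n)]


def power_spectrum(x):
    """One-sided power spectrum via a naive DFT (fine for short frames)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]


def subband_features(power, n_bands):
    """Per-band energy (SBE) and flatness-based PDSS, uniform bands (assumed)."""
    width = len(power) // n_bands
    if width == 0:
        raise ValueError("more bands than spectral bins")
    eps = 1e-12  # guards log(0) and division by zero
    sbe, pdss = [], []
    for b in range(n_bands):
        band = power[b * width:(b + 1) * width]
        arith = sum(band) / len(band)
        geo = math.exp(sum(math.log(p + eps) for p in band) / len(band))
        sbe.append(sum(band))
        pdss.append(1.0 - geo / (arith + eps))
    return sbe, pdss


if __name__ == "__main__":
    # Synthetic test frame; real use would take a windowed speech frame.
    frame = [math.sin(0.3 * t) + 0.5 * math.sin(0.05 * t * t) for t in range(64)]
    residual = lp_residual(frame, order=8)
    sbe, pdss = subband_features(power_spectrum(residual), n_bands=4)
    print("SBE:", sbe)
    print("PDSS:", pdss)
```

Under these assumptions, a PDSS value near 0 indicates a flat, noise-like band, and a value near 1 indicates a peaky band with strong harmonic structure. Moving to mel-warped bands would change only how the band edges are chosen.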
About the journal
Journal: International Journal of Speech Technology
Publisher: Kluwer Academic Publishers
ISSN: 13812416