Enhancement of noisy speech by temporal and spectral processing

P. Krishnamoorthy; Prasanna S

doi:10.1016/j.specom.2010.08.011

This paper presents a method for enhancing noisy speech that combines linear prediction (LP) residual weighting in the time domain with spectral processing in the frequency domain, providing better noise suppression as well as better enhancement in the speech regions. The noisy speech is first processed by excitation-source (LP residual) based temporal processing, which identifies and enhances the speech-specific excitation-source features present at the gross and fine temporal levels. The gross-level features are identified by estimating the following speech parameters, all from the noisy speech signal: the sum of the peaks in the discrete Fourier transform (DFT) spectrum, the smoothed Hilbert envelope of the LP residual, and the modulation spectrum values. The fine-level features are identified using knowledge of the instants of significant excitation. A weight function is derived from the gross and fine weight functions to obtain the temporally processed speech signal. The temporally processed speech is then subjected to spectral-domain processing, which involves estimating and removing degrading components, and identifying and enhancing speech-specific spectral components. The proposed method is evaluated using different objective and subjective quality measures. The quality measures show that the proposed combined temporal and spectral processing method provides better enhancement than either temporal or spectral processing alone. © 2010 Elsevier B.V. All rights reserved.
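
The temporal processing stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single LP filter for the whole signal (a practical system would typically work frame by frame), derives a gross weight from the smoothed Hilbert envelope of the LP residual only, and omits the DFT-peak, modulation-spectrum, and fine-level (excitation-instant) cues. The function names and the parameters `order`, `smooth_len`, and `w_min` are hypothetical choices for illustration.

```python
import numpy as np

def lp_coefficients(x, order=10):
    """LP coefficients a[0..order] (a[0] = 1) via autocorrelation + Levinson-Durbin."""
    n = len(x)
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-12  # small floor avoids division by zero on silence
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a

def lp_residual(x, a):
    """Inverse-filter x with A(z) to obtain the LP residual."""
    return np.convolve(x, a)[:len(x)]

def synthesize(residual, a):
    """Pass a residual through the all-pole filter 1/A(z)."""
    order = len(a) - 1
    out = np.zeros(len(residual))
    for n in range(len(residual)):
        acc = residual[n]
        for k in range(1, min(order, n) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out

def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with the FFT."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def gross_weight(residual, smooth_len=41, w_min=0.1):
    """Assumed gross weight: smoothed Hilbert envelope mapped to [w_min, 1]."""
    env = hilbert_envelope(residual)
    kernel = np.ones(smooth_len) / smooth_len
    smooth = np.convolve(env, kernel, mode="same")
    norm = (smooth - smooth.min()) / (smooth.max() - smooth.min() + 1e-12)
    return w_min + (1.0 - w_min) * norm

def temporal_process(x, order=10):
    """Weight the LP residual and resynthesize (simplified temporal stage)."""
    a = lp_coefficients(x, order)
    e = lp_residual(x, a)
    return synthesize(gross_weight(e) * e, a)
```

Calling `temporal_process(noisy)` on a one-dimensional NumPy array returns a signal of the same length in which low-envelope regions of the residual are attenuated toward `w_min`. The spectral processing stage would then operate on this output.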

Journal	Speech Communication
ISSN	0167-6393