Improving Speech Intelligibility in Monaural Segregation System by Fusing Voiced and Unvoiced Speech Segments

Shoba S; R. Rajavel

doi:10.1007/s00034-018-1005-3

Improving speech intelligibility remains a challenging problem in digital hearing aids. This work proposes a new speech segregation algorithm that improves intelligibility by fusing the voiced and unvoiced segments of the speech signal using a genetic algorithm. The voiced speech segments are obtained using perceptual speech cues such as auto-correlation, cross-channel correlation and pitch. Segregation based on a further perceptual cue, speech onset and offset, produces segments for both voiced and unvoiced speech, so the unvoiced speech segments are obtained by subtracting the voiced segments from the onset/offset-based segments. The unvoiced segments obtained in this way may still contain interference. This work proposes a scheme to remove that interference from the unvoiced speech segments and to fuse the voiced and unvoiced segments using the genetic algorithm. The performance of the proposed algorithm is evaluated using the intelligibility measures coherence speech intelligibility index (CSII), normalized covariance metric (NCM) and short-time objective intelligibility (STOI). The experimental results show that, compared with existing systems, the proposed algorithm improves speech intelligibility by an average of 0.23 in CSII, 0.20 in NCM and 0.16 in STOI. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
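The subtraction step described in the abstract can be illustrated with a minimal sketch. It assumes that segments are represented as binary time-frequency masks (rows are frequency channels, columns are time frames), which the abstract does not state. The names `subtract_mask` and `fuse_masks` are hypothetical, and the plain union used for fusion is only a stand-in: the paper's interference removal and genetic-algorithm fusion are not reproduced here.

```python
from typing import List

# Assumed representation: 1 marks a time-frequency unit assigned to target speech.
Mask = List[List[int]]  # rows = frequency channels, columns = time frames


def subtract_mask(onset_offset: Mask, voiced: Mask) -> Mask:
    """Keep units found by onset/offset analysis that the voiced stage did not claim."""
    return [
        [int(bool(o) and not v) for o, v in zip(o_row, v_row)]
        for o_row, v_row in zip(onset_offset, voiced)
    ]


def fuse_masks(voiced: Mask, unvoiced: Mask) -> Mask:
    """Plain union of the two masks (a stand-in, NOT the paper's GA-based fusion)."""
    return [
        [int(bool(v) or bool(u)) for v, u in zip(v_row, u_row)]
        for v_row, u_row in zip(voiced, unvoiced)
    ]


# Toy masks: 2 channels x 4 frames (illustrative values, not data from the paper).
voiced = [[1, 1, 0, 0],
          [1, 1, 0, 0]]
onset_offset = [[1, 1, 1, 0],
                [0, 1, 1, 1]]

unvoiced = subtract_mask(onset_offset, voiced)
fused = fuse_masks(voiced, unvoiced)
```

In this toy case `unvoiced` keeps only the units that the onset/offset stage found outside the voiced mask, and `fused` covers both kinds of segment. A fuller implementation would replace the union with a search over candidate combinations scored by a fitness function.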

Journal	Circuits, Systems, and Signal Processing
Publisher	Birkhäuser Boston
ISSN	0278-081X