Classification of multi speaker shouted speech and single speaker normal speech

S. Baghel; Prasanna S; P. Guha

doi:10.1109/TENCON.2017.8228261

Published in Institute of Electrical and Electronics Engineers Inc.

2017

Volume: 2017-December

Pages: 2388 - 2392

Abstract

This work proposes a method for classifying multi-speaker shouted speech versus single-speaker normal speech, the scenario that occurs most frequently in news debates. Throughout, the multi-speaker shouted and single-speaker normal classes are referred to simply as shouted and normal speech, respectively. Both spectral and source features are explored for the classification task. The source characteristics are studied in terms of strength of excitation (SoE). The spectral features explored are spectral flux, spectral tilt, the sum of the ten largest spectral peaks (STLP), modulation spectrum energy (ModSE) and Mel frequency cepstral coefficients (MFCCs). Shouted and normal speech are classified using two approaches. In the first, all features except the MFCCs are non-linearly mapped and combined using a threshold-based technique. In the second, a Support Vector Machine (SVM) classifier with a predefined radial basis function (RBF) kernel is applied to the extracted features. Performance is evaluated in terms of F-score. A leave-one-out analysis is also performed to measure the strength of each individual feature for this task; by this analysis, SoE is the most important of the one-dimensional features. When all the features are combined for classification, the resulting forty-four-dimensional feature vector gives the highest F-score. © 2017 IEEE.
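The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "F-score" means the balanced F1 measure and that "leave one out" means dropping one feature at a time and observing the change in F-score. The names `f_score`, `leave_one_feature_out` and the `evaluate` callback are hypothetical, and the weights in the usage lines are toy values, not results from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def f_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Balanced F-score (F1) for the class labelled `positive` (assumed: shouted)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_feature_out(
    features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> List[Tuple[str, float]]:
    """Rank features by the F-score lost when each one is removed.

    `evaluate` is a hypothetical callback that trains and tests a classifier
    on the named features and returns its F-score.
    """
    baseline = evaluate(list(features))
    drops = []
    for name in features:
        subset = [f for f in features if f != name]
        drops.append((name, baseline - evaluate(subset)))
    # Largest drop first: that feature contributed the most.
    return sorted(drops, key=lambda item: item[1], reverse=True)


# Toy usage with made-up contributions (illustrative only).
toy_weights = {"SoE": 0.30, "flux": 0.10, "tilt": 0.05}
ranking = leave_one_feature_out(
    list(toy_weights), lambda names: sum(toy_weights[n] for n in names)
)
```

With a real classifier, `evaluate` would wrap feature extraction, training and scoring; the ranking then lists the feature whose removal hurts the F-score most first.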

Topics: Feature (machine learning) (54%), Mel-frequency cepstrum (53%), Spectral flux (53%) and Support vector machine (50%)

About the journal

Journal	IEEE Region 10 Annual International Conference, Proceedings/TENCON
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISSN	21593442

Authors (1)

Prasanna S
