Automatic Hate Speech Detection in English-Odia Code Mixed Social Media Data Using Machine Learning Techniques

Sudhir Kumar Mohapatra; Srinivas Prasad; Dwiti Krishna Bebarta; Kathiravan Srinivasan; Yuh-Chung Hu

doi:10.3390/app11188575

Published in Applied Sciences (Multidisciplinary Digital Publishing Institute)

2021

Volume: 11

Issue: 18

Pages: 1 - 21

Abstract

Hate speech on social media can spread quickly among online users and may subsequently escalate into local violence and heinous crimes. This paper proposes a hate speech detection model based on machine learning and text mining feature extraction techniques. The authors collected English-Odia code-mixed hate speech data from a public Facebook page and manually sorted them into three classes. To obtain both a ternary and a binary dataset, the three-class data were additionally converted into binary classes. Hate speech is modeled by combining a machine learning algorithm with a feature extraction method. Support vector machine (SVM), naïve Bayes (NB) and random forest (RF) models were trained on the whole dataset, with features based on word unigrams, bigrams, trigrams, combined n-grams, term frequency-inverse document frequency (TF-IDF), combined n-grams weighted by TF-IDF, and word2vec, for both datasets. Using the two datasets, two kinds of models were developed for each feature: binary models and ternary models. The SVM models with word2vec features performed better than the NB and RF models in both the binary and ternary settings. The results show that the ternary models exhibited less confusion between hate and non-hate speech than the binary models.
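The feature extraction named in the abstract (combined word n-grams weighted by TF-IDF) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the tokenization or the exact TF-IDF formula, so the whitespace tokenizer, the relative-frequency TF, the `ln(N / df)` IDF variant, the function names `word_ngrams` and `tfidf_vectors`, and the placeholder posts are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=3):
    """Return all word n-grams of length n_min..n_max (whitespace tokens, lowercased)."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=3):
    """Map each document to a sparse {n-gram: weight} dict.

    Assumed weighting: tf = count / n-grams in the document, idf = ln(N / df).
    Other TF-IDF variants (smoothing, normalization) would give different values.
    """
    grams_per_doc = [word_ngrams(d, n_min, n_max) for d in docs]
    # Document frequency: in how many documents each n-gram occurs.
    df = Counter()
    for grams in grams_per_doc:
        df.update(set(grams))
    n_docs = len(docs)
    vectors = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams) or 1  # guard against empty documents
        vectors.append(
            {g: (c / total) * math.log(n_docs / df[g]) for g, c in counts.items()}
        )
    return vectors


# Placeholder token strings, not taken from the paper's dataset.
posts = ["aa bb cc", "aa bb dd", "ee ff"]
vectors = tfidf_vectors(posts)
```

Each resulting sparse vector would then be passed, together with its binary or ternary label, to a classifier such as SVM, NB or RF. Setting `n_min` and `n_max` to the same value yields the unigram-only, bigram-only or trigram-only feature sets.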

About the journal

Journal	Applied Sciences
Publisher	Multidisciplinary Digital Publishing Institute
Open Access	Yes

Authors (1)

Kathiravan Srinivasan
- School of Computer Science and Engineering
- Vellore Campus
