DLRG@HASOC 2020: A hybrid approach for hate and offensive content identification in multilingual tweets
, B. Yashwanth Reddy
Published in CEUR-WS
2020
Volume: 2826
Pages: 304-310
Abstract
Most people now use social media platforms as a communication tool and express their views publicly and anonymously. Hate speech and offensive posts have become a major issue. Handling this problem requires automated methods that analyse social media posts and identify hate speech. Existing methods pay little attention to multilingual posts, which pose additional challenges, not only because of their linguistic properties but also because of class imbalance. The task of identifying hate and offensive content posted in Hindi or German has the same issues. To address class imbalance, we combine an oversampling technique with a suitable feature weighting method. In the proposed approach, a multi-class imbalance-based feature selection method is combined with an SVM classifier to classify a tweet as hate speech or not. This work was submitted to the Hate and Offensive Content Identification (HASOC) task at FIRE 2020 and ranked third. We achieved an accuracy of 80% and 72% on the released German and Hindi tweets, respectively. © 2020 Copyright for this paper by its authors.
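The abstract does not say which oversampling technique the authors used. As an illustration of the general idea only, the following minimal sketch assumes plain random oversampling: minority-class examples are duplicated until every class matches the size of the largest one. The function name `random_oversample`, the label strings, and the toy data are hypothetical and are not taken from the paper or the HASOC corpus; the feature weighting and SVM stages are not shown.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until every
    class has as many examples as the largest class.

    Assumption: plain random oversampling, chosen for illustration.
    The paper's actual oversampling technique is not stated in the abstract.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_texts, out_labels = list(texts), list(labels)
    for cls, n in counts.items():
        # All original examples belonging to this class.
        pool = [t for t, y in zip(texts, labels) if y == cls]
        # Add (target - n) random duplicates; zero for the majority class.
        for _ in range(target - n):
            out_texts.append(rng.choice(pool))
            out_labels.append(cls)
    return out_texts, out_labels


# Toy, made-up data with placeholder labels (not from the HASOC corpus).
texts = ["t1", "t2", "t3", "t4", "t5"]
labels = ["not_hate", "not_hate", "not_hate", "not_hate", "hate"]

bal_texts, bal_labels = random_oversample(texts, labels)
print(Counter(bal_labels))
```

In a full pipeline, a resampling step of this kind would be applied to the training split only, before feature extraction and classifier training, so that duplicated examples do not leak into the evaluation data.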
About the journal
Journal: CEUR Workshop Proceedings
Publisher: CEUR-WS
ISSN: 1613-0073