Header menu link for other important links
X
Twitter spam account detection based on clustering and classification methods
Adewole K.S, Han T, Wu W, Song H,
Published in Springer Science and Business Media LLC
2020
Volume: 76
   
Issue: 7
Pages: 4802 - 4837
Abstract
Twitter social network has gained more popularity due to the increase in social activities of registered users. Twitter performs dual functions of online social network (OSN), acting as a microblogging OSN, and at the same time as a news update platform. Recently, the growth in Twitter social interactions has attracted the attention of cybercriminals. Spammers have used Twitter to spread malicious messages, post phishing links, flood the network with fake accounts, and engage in other malicious activities. The process of detecting the network of spammers who engage in these activities is an important step toward identifying individual spam account. Researchers have proposed a number of approaches to identify a group of spammers. However, each of these approaches addressed a specific category of spammer. This paper proposes a different approach to detect spammers on Twitter based on the similarities that exist among spam accounts. A number of features were introduced to improve the performance of the three classification algorithms selected in this study. The proposed approach applied principal component analysis and tuned K-means algorithm to cluster over 200,000 accounts, randomly selected from more than 2 million tweets to detect the clusters of spammers. Experimental results show that Random Forest achieved the highest accuracy of 96.30%. This result is followed by multilayer perceptron with 96.00% and support vector machine, which achieved 95.60%. The performance of the selected classifiers based on class imbalance also revealed that Random Forest achieved the highest accuracy, precision, recall, and F-measure. © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
About the journal
JournalData powered by TypesetThe Journal of Supercomputing
PublisherData powered by TypesetSpringer Science and Business Media LLC
ISSN0920-8542
Open Access0