A Comparative Analysis of Classification Algorithms for Cyberbullying Crime Detection: An Experimental Study of Twitter Social Media in Indonesia

Ari Muzakir(1), Hadi Syaputra(2), Febriyanti Panjaitan(3)

(1) Department of Information System, Universitas Bina Darma, Indonesia
(2) Department of Computer Science, Universitas Bina Darma, Indonesia
(3) Department of Computer Science, Universitas Bina Darma, Indonesia


Purpose: This research aims to identify cyberbullying content on Twitter and to compare four classification algorithms: Naive Bayes (NB), Decision Tree (DT), Logistic Regression (LR), and Support Vector Machine (SVM). The dataset consists of Twitter data that were manually labeled and validated by language experts. It contains 1,065 tweets, of which 638 are labeled non-bullying and 427 are labeled bullying.
Methods: Words are weighted using the bag-of-words (BoW) method with three word-vector features: unigram, bigram, and trigram. The experiment was conducted in two scenarios. The first scenario tests for the best accuracy obtained with each of the three features. The second scenario compares the overall performance of the algorithms across all the features used.
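The unigram, bigram, and trigram features described above can be illustrated with a minimal sketch. It assumes plain whitespace tokenization, lowercasing, and raw occurrence counts; the study's actual preprocessing and weighting details are not stated here, and the function names and the sample sentence are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bow_vector(text, n):
    """Count n-gram occurrences in a lowercased, whitespace-tokenized text.

    Assumption: raw counts are used as weights; the study may weight differently.
    """
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))


# Hypothetical example sentence, not taken from the study's dataset.
sample = "you are so dumb you are"

unigram_counts = bow_vector(sample, 1)  # single words
bigram_counts = bow_vector(sample, 2)   # pairs of adjacent words
trigram_counts = bow_vector(sample, 3)  # triples of adjacent words

print(unigram_counts)
print(bigram_counts)
print(trigram_counts)
```

Higher-order n-grams capture more context per feature but occur less often, which is one reason a comparison across the three feature types can be informative.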
Result: The experimental results show that, for accuracy measured across features and algorithms, the SVM classification algorithm outperformed the other algorithms with an accuracy of 76%. For average recall, the DT classification algorithm outperformed the other algorithms with an average of 76%. For overall performance, measured by the F-measure (which combines precision and recall), the SVM classification algorithm outperformed the other algorithms with an F-measure of 82%.
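For reference, the reported metrics follow the standard definitions for binary classification, where the F-measure is the harmonic mean of precision and recall. The sketch below computes them from confusion-matrix counts. The function name and the counts are hypothetical illustrations, not figures from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F-measure from confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives, and
    true negatives for the positive (here: bullying) class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts chosen only to illustrate the calculation.
metrics = classification_metrics(tp=80, fp=20, fn=20, tn=80)
print(metrics)
```

Because accuracy, recall, and F-measure weigh errors differently, different algorithms can lead on different metrics, as with SVM and DT in the results above.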
Value: Based on the experiments conducted, the SVM classification algorithm can detect cyberbullying content on social media.


Keywords: Cyberbullying, Model Comparison, Machine Learning, Bag of Words, Classification

