Optimization of the Naïve Bayes Algorithm for Classifying Indonesian-Language Tweets to Address Hate Speech on the X Platform

Authors

  • Muhammad Haikal
  • Alamsyah Alamsyah

DOI:

https://doi.org/10.15294/zphjtr75

Keywords:

Hate speech detection, Naïve Bayes, tweet classification

Abstract

Hate speech is expression that conveys hatred, is often destructive, and targets individuals or particular groups for various reasons. It is frequently found on social media, especially during election seasons, which recur regularly. Combating hate speech requires monitoring, including censoring words that may offend or attack personal attributes such as ethnicity, religion, and race. Previous research has generally focused only on sentiment analysis of tweets, determining whether their polarity is positive or negative. This research extends the study of hate speech by applying the Naïve Bayes algorithm, specifically its Multinomial and Gaussian variants, together with an automatic censorship system, with the aim of improving classification accuracy. The system is implemented on social media with the hope of significantly reducing the amount of hate speech. A dataset of 13,169 collected tweets was classified into 12 categories, with a highest accuracy of 90%. The test results are stored as a hate speech dictionary of inappropriate words, which allows the algorithm to detect and automatically censor tweets containing hate speech.
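
The pipeline outlined in the abstract (a Naïve Bayes classifier followed by dictionary-based censoring) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Multinomial variant with add-one smoothing, a simple alphabetic tokenizer, two toy labels instead of the paper's 12 categories, and the placeholder token "badword" standing in for a dictionary entry. The names tokenize, MultinomialNB, and censor are hypothetical, and the Gaussian variant is omitted.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNB:
    """Illustrative Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = sum(self.class_counts.values())
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def censor(text, dictionary):
    """Mask every word found in the dictionary with asterisks of equal length."""
    def mask(match):
        word = match.group(0)
        return "*" * len(word) if word.lower() in dictionary else word
    return re.sub(r"[A-Za-z]+", mask, text)


# Toy data (hypothetical, not taken from the paper's dataset).
texts = [
    "you are a badword",
    "badword people go away",
    "nice weather today",
    "have a nice day",
]
labels = ["hate", "hate", "neutral", "neutral"]

model = MultinomialNB().fit(texts, labels)
hate_dictionary = {"badword"}  # stands in for the stored hate speech dictionary

tweet = "You are a BADWORD"
if model.predict(tweet) == "hate":
    tweet = censor(tweet, hate_dictionary)
print(tweet)
```

In this sketch, classification decides whether a tweet is censored at all, and the dictionary decides which words are masked. A real system would need Indonesian-specific preprocessing and a far larger training set than the four toy sentences used here.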

Published

2024-10-22

Article ID

15322