Detecting Hate Speech Tweets and Abusive Tweets In Indonesian Language Using Random Forest and Support Vector Machine with Voting Classifier Technique

  • Dandi Indra Wijaya, Universitas Negeri Semarang
  • Riza Arifudin
Keywords: Sentiment Analysis, Support Vector Machine, Random Forest, Voting Classifier

Abstract

Social media has become a central part of everyday life because its features make it easy for people to communicate and disseminate information. One of the most widely used platforms is Twitter, whose main feature lets users publish posts called tweets. The freedom to write tweets has a negative side: some tweets contain content that harms other people or communities. The problem that follows is distinguishing hate speech tweets from abusive tweets. Hate speech and abusive speech are often treated as the same thing, but the differences between them need to be considered because such tweets can have a negative impact on social life. Sentiment analysis, an application of natural language processing and machine learning, is used to distinguish the two. The algorithms used in this research are Support Vector Machine, Random Forest, and a Voting Classifier with soft voting, whose estimators are the Support Vector Machine and Random Forest. TF-IDF and N-grams were used for feature extraction. The dataset consists of tweets labeled as neutral, hate speech, or abusive speech. Model accuracy was measured using a confusion matrix. The highest accuracy, 82.57%, was achieved by the Voting Classifier with TF-IDF feature extraction on unigrams (N = 1).
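The soft-voting step and the confusion-matrix accuracy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the label names, the helper functions, and the probability values below are invented for illustration, and the two probability lists merely stand in for the outputs of a trained Support Vector Machine and Random Forest. Soft voting averages the per-class probabilities of the estimators and predicts the class with the highest average.

```python
from collections import Counter

# Hypothetical label names for the three classes named in the abstract.
LABELS = ["neutral", "hate_speech", "abusive"]


def soft_vote(prob_sets, weights=None):
    """Average per-class probabilities across classifiers, then take the argmax."""
    weights = weights or [1.0] * len(prob_sets)
    total = sum(weights)
    avg = [
        sum(w * probs[c] for w, probs in zip(weights, prob_sets)) / total
        for c in range(len(LABELS))
    ]
    best = max(range(len(LABELS)), key=lambda c: avg[c])
    return LABELS[best]


def confusion_matrix(y_true, y_pred):
    """Rows are true labels, columns are predicted labels, both in LABELS order."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in LABELS] for t in LABELS]


def accuracy(matrix):
    """Accuracy is the diagonal (correct predictions) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


# Made-up class probabilities for four tweets (illustrative values only).
svm_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.45, 0.45], [0.3, 0.3, 0.4]]
rf_probs = [[0.5, 0.2, 0.3], [0.3, 0.3, 0.4], [0.2, 0.2, 0.6], [0.5, 0.3, 0.2]]
y_true = ["neutral", "hate_speech", "abusive", "abusive"]

y_pred = [soft_vote([s, r]) for s, r in zip(svm_probs, rf_probs)]
matrix = confusion_matrix(y_true, y_pred)
print(y_pred)
print(matrix, accuracy(matrix))
```

In the fourth toy tweet the two models disagree, and the averaged probabilities favor "neutral" even though the stand-in SVM preferred "abusive"; this is how soft voting lets a confident estimator outweigh an uncertain one. Assuming a scikit-learn-style setup (the abstract does not name the library), the equivalent would be a `VotingClassifier` with `voting="soft"`, which requires each estimator to expose class probabilities.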

Published
2022-12-08
How to Cite
Wijaya, D., & Arifudin, R. (2022). Detecting Hate Speech Tweets and Abusive Tweets In Indonesian Languange Using Random Forest and Support Vector Machine with Voting Classifier Technique. Journal of Advances in Information Systems and Technology, 4(1), 24-32. https://doi.org/10.15294/jaist.v4i1.59521
Section
Articles