Sentiment Analysis on Twitter Social Media Regarding Covid-19 Vaccination with Naive Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT)

  • Angga Riski Dwi Saputra Universitas Negeri Semarang
  • Budi Prasetiyo Universitas Negeri Semarang
Keywords: Sentiment Analysis, Naïve Bayes Classifier (NBC), Bidirectional Encoder Representations from Transformers (BERT)

Abstract

The Covid-19 vaccine is an important tool for stopping the Covid-19 pandemic; however, public opinion on the vaccine is divided between support and opposition.

Purpose: The public expresses these views in many ways, including on social media such as Twitter. Public responses to Covid-19 vaccination can be analyzed and categorized as having positive, neutral, or negative sentiment.

Methods: This study performs sentiment analysis of Twitter posts about Covid-19 vaccination using the Naïve Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT) algorithms. The dataset consists of 29,447 public English-language tweets about Covid-19 vaccination.

Result: Sentiment analysis begins with preprocessing, which normalizes and cleans the dataset before classification. Word vectorization was then performed with TF-IDF, and the data were classified using the NBC and BERT algorithms. The classification results show an accuracy of 73% for NBC and 83% for BERT.
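The preprocessing, TF-IDF vectorization, and Naïve Bayes classification steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the cleaning rules in `tokenize`, the smoothed IDF formula, the smoothing constant `alpha`, and the six toy tweets are all assumptions made for illustration, and the BERT side of the comparison is not shown.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, strip URLs and @mentions, and keep alphabetic tokens (assumed cleaning rules)."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(doc, idf):
    """TF-IDF weights for one tokenized doc; terms unseen in training are dropped."""
    tf = Counter(term for term in doc if term in idf)
    return {term: count * idf[term] for term, count in tf.items()}


class NaiveBayes:
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing."""

    def fit(self, texts, labels, alpha=1.0):
        docs = [tokenize(t) for t in texts]
        self.idf = fit_idf(docs)
        # Accumulate the TF-IDF mass of each term per class.
        weights = defaultdict(lambda: defaultdict(float))
        for doc, label in zip(docs, labels):
            for term, w in tfidf(doc, self.idf).items():
                weights[label][term] += w
        vocab_size = len(self.idf)
        self.log_prior = {c: math.log(k / len(labels)) for c, k in Counter(labels).items()}
        self.log_like = {}
        for c in self.log_prior:
            total = sum(weights[c].values()) + alpha * vocab_size
            self.log_like[c] = {t: math.log((weights[c][t] + alpha) / total) for t in self.idf}
        return self

    def predict(self, text):
        """Return the class with the highest log posterior score."""
        features = tfidf(tokenize(text), self.idf)
        scores = {
            c: self.log_prior[c] + sum(w * self.log_like[c][t] for t, w in features.items())
            for c in self.log_prior
        }
        return max(scores, key=scores.get)


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label matches the given label."""
    return sum(model.predict(t) == y for t, y in zip(texts, labels)) / len(labels)


# Hypothetical toy tweets, not taken from the study's dataset.
train_texts = [
    "Got my vaccine today, feeling grateful and safe",
    "So happy the vaccine rollout is going great",
    "Vaccine side effects are terrible, I feel awful",
    "I do not trust this vaccine, it is dangerous",
    "The clinic opens at 9 am for vaccine appointments",
    "Vaccine appointments are available at the clinic on Monday",
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("feeling grateful and happy after my vaccine"))
print(model.predict("the side effects are terrible and dangerous"))
```

On a real dataset, `accuracy` would be computed on a held-out test split rather than on the training tweets; the split ratio used in the study is not stated in this abstract.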

Novelty: A direct comparison between classical models such as NBC and modern deep learning models such as BERT offers new insights into the advantages and disadvantages of both approaches in processing Twitter data. Additionally, this study proposes temporal sentiment analysis, which allows evaluating changes in public sentiment regarding vaccination over time. Another innovation is the implementation of a hybrid approach to data cleansing that combines traditional methods with the natural language processing capabilities of BERT, which more effectively addresses typical Twitter data issues such as slang and spelling errors. Finally, this research also expands sentiment classification to be multi-label, identifying more specific sentiment categories such as trust, fear, or doubt, which provides a deeper understanding of public opinion.
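One simple way to picture the temporal sentiment analysis mentioned above is to group classified tweets by month and compute the share of each sentiment label. The sketch below assumes monthly buckets and uses hypothetical `(date, label)` pairs; the abstract does not specify the time granularity or aggregation the authors propose.

```python
from collections import Counter, defaultdict
from datetime import date


def monthly_sentiment_share(records):
    """Map 'YYYY-MM' to the fraction of tweets per sentiment label in that month."""
    counts = defaultdict(Counter)
    for day, label in records:
        counts[day.strftime("%Y-%m")][label] += 1
    return {
        month: {label: n / sum(c.values()) for label, n in c.items()}
        for month, c in sorted(counts.items())
    }


# Hypothetical (date, predicted label) pairs, not taken from the study's dataset.
records = [
    (date(2021, 1, 5), "negative"),
    (date(2021, 1, 20), "positive"),
    (date(2021, 2, 3), "positive"),
    (date(2021, 2, 14), "positive"),
    (date(2021, 2, 27), "neutral"),
    (date(2021, 2, 28), "negative"),
]

shares = monthly_sentiment_share(records)
for month, share in shares.items():
    print(month, share)
```

Comparing the per-month shares side by side shows whether, for example, the positive fraction rises or falls as a vaccination campaign progresses.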

Published
2024-09-30
How to Cite
Saputra, A., & Prasetiyo, B. (2024). Sentiment Analysis on Twitter Social Media Regarding Covid-19 Vaccination with Naive Bayes Classifier (NBC) and Bidirectional Encoder Representations from Transformers (BERT). Recursive Journal of Informatics, 2(2), 106 - 113. https://doi.org/10.15294/rji.v2i2.67502