Sentiment Analysist of the TPKS Law on Twitter Using InSet Lexicon with Multinomial Naïve Bayes and Support Vector Machine Based on Soft Voting

  • Salsabila Rahadatul Aisy, Universitas Negeri Semarang
  • Budi Prasetiyo, Universitas Negeri Semarang
Keywords: sentiment analysis, TPKS Law, multinomial naive Bayes, support vector machine, soft voting, InSet lexicon

Abstract

The Indonesian Sexual Violence Law (TPKS Law) regulates forms of sexual violence. It drew both support and opposition during drafting and was officially ratified on April 12, 2022. Support and opposition have persisted since ratification, and the implementation of the law still requires oversight.

Purpose: This study examines how soft voting can be applied to the multinomial naïve Bayes and support vector machine algorithms and how accurate the resulting classifier is. It also gauges public opinion on the TPKS Law as a supporting tool for evaluating the law.

Methods/Study design/approach: The data are labeled with the InSet lexicon and then classified by soft voting over the multinomial naïve Bayes and support vector machine algorithms.
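
As a rough illustration of lexicon-based labeling, the sketch below sums per-word polarity weights and maps the total to a sentiment class. The lexicon entries, the function name `label_text`, and the zero-threshold rule are assumptions made for this example; they are not taken from the InSet lexicon or from the study.

```python
# Toy polarity lexicon. The words and weights are hypothetical placeholders,
# not real InSet entries.
TOY_LEXICON = {"dukung": 3, "bagus": 4, "tolak": -3, "buruk": -4}


def label_text(text, lexicon=TOY_LEXICON):
    """Label a text by the sign of its summed word weights (assumed rule)."""
    tokens = text.lower().split()
    # Words missing from the lexicon contribute a weight of zero.
    score = sum(lexicon.get(token, 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label_text("saya dukung uu tpks"))
print(label_text("saya tolak aturan buruk"))
```

A text whose words are all absent from the lexicon falls into the neutral class under this rule; whether the study keeps or discards such texts is not stated in the abstract.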

Result/Findings: Under 10-fold cross-validation, soft voting reaches an accuracy of 84.31% with a weight ratio of 1:3 for multinomial naïve Bayes and support vector machine, respectively. Soft voting is more accurate than either standalone classifier and works well for sentiment analysis of the TPKS Law.
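
To make the 1:3 weighting concrete, the sketch below computes a weighted soft vote: each classifier's class probabilities are averaged using the weights, and the class with the highest average is chosen. The function name `soft_vote` and the probability values are illustrative assumptions, not outputs from the study.

```python
def soft_vote(prob_lists, weights):
    """Return (weighted average probabilities, index of the winning class)."""
    total = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for probs, w in zip(prob_lists, weights)) / total
        for c in range(n_classes)
    ]
    winner = max(range(n_classes), key=averaged.__getitem__)
    return averaged, winner


# Hypothetical class probabilities for one text: [negative, positive].
mnb_probs = [0.7, 0.3]  # multinomial naive Bayes leans negative
svm_probs = [0.4, 0.6]  # support vector machine leans positive

_, equal_winner = soft_vote([mnb_probs, svm_probs], [1, 1])
_, weighted_winner = soft_vote([mnb_probs, svm_probs], [1, 3])
print(equal_winner)     # class 0: equal weights follow naive Bayes here
print(weighted_winner)  # class 1: the 1:3 weights follow the SVM here
```

In scikit-learn, the same rule corresponds to `VotingClassifier` with `voting="soft"` and `weights=[1, 3]`, where the SVM must expose class probabilities (for example `SVC(probability=True)`). Whether the authors used that library is not stated in the abstract.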

Novelty/Originality/Value: This study combines two lexicons (the Colloquial Indonesian lexicon and the InaNLP formalization dictionary) in the normalization process and uses the InSet lexicon for automatic labeling in sentiment analysis of the TPKS Law.
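
One plausible way to combine two normalization lexicons is to merge them into a single lookup table and replace each informal token with its standard form. The sketch below assumes that approach. The dictionary entries, the function names, and the precedence given to the colloquial lexicon on conflicts are all hypothetical choices for illustration, not details reported by the study.

```python
# Hypothetical sample entries; the real lexicons are far larger.
COLLOQUIAL_SAMPLE = {"gak": "tidak", "bgt": "banget"}
FORMALIZATION_SAMPLE = {"yg": "yang", "gak": "tak"}


def build_normalizer(colloquial, formalization):
    """Merge both mappings; colloquial entries win on conflict (assumption)."""
    merged = dict(formalization)
    merged.update(colloquial)
    return merged


def normalize(text, mapping):
    """Replace each known informal token with its standard form."""
    return " ".join(mapping.get(token, token) for token in text.lower().split())


mapping = build_normalizer(COLLOQUIAL_SAMPLE, FORMALIZATION_SAMPLE)
print(normalize("Yg gak setuju bgt", mapping))
```

Merging the two sources widens coverage, since a token missing from one lexicon can still be normalized by the other.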

Published
2023-09-29
How to Cite
Aisy, S., & Prasetiyo, B. (2023). Sentiment Analysist of the TPKS Law on Twitter Using InSet Lexicon with Multinomial Naïve Bayes and Support Vector Machine Based on Soft Voting. Recursive Journal of Informatics, 1(2), 93-101. https://doi.org/10.15294/rji.v1i2.68324