Diagnosis of Heart Disease Using Optimized Naïve Bayes Algorithm with Particle Swarm Optimization and Gain Ratio

  • Anisa Meidina, Universitas Negeri Semarang
  • Zaenal Abidin, Universitas Negeri Semarang
Keywords: Diagnosis, Heart, Naïve Bayes, PSO, Gain Ratio

Abstract

Purpose: This study applies particle swarm optimization (PSO) and gain ratio as feature selection methods for the naïve Bayes algorithm, and measures classification accuracy before and after feature selection in the diagnosis of heart disease.
Methods/Study design/approach: The study uses the Cleveland dataset from the UCI Machine Learning Repository, which contains 303 samples. The data first pass through a preprocessing stage. The naïve Bayes algorithm serves as the classifier, while PSO and gain ratio perform feature selection.
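The abstract does not state how gain ratio is computed. The following is a minimal sketch that assumes the standard definition (information gain divided by split information) for discrete features. The function names, feature names, and toy records are hypothetical and are not taken from the study or from the Cleveland dataset.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

# Toy data, invented for illustration only.
chest_pain = ["typical", "atypical", "typical", "none", "none", "atypical"]
fasting_sugar = ["high", "high", "low", "low", "high", "low"]
diagnosis = [1, 0, 1, 0, 0, 0]

scores = {
    "chest_pain": gain_ratio(chest_pain, diagnosis),
    "fasting_sugar": gain_ratio(fasting_sugar, diagnosis),
}
# Features with a higher gain ratio are ranked first.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In a ranking of this kind, a feature selection step would keep the top-scoring features. How the study combines this ranking with PSO is not described in the abstract.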
Result/Findings: On the Cleveland dataset, the naïve Bayes algorithm achieves a classification accuracy of 86.88% without feature selection and 93.44% after PSO and gain ratio are applied. Using PSO and gain ratio as feature selection algorithms therefore improves classification accuracy by 6.56 percentage points.
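The reported improvement is an absolute difference between two accuracies. The short sketch below reproduces that arithmetic from the two figures in the abstract. The relative gain and the error reduction are derived here for illustration and are not reported by the authors.

```python
baseline_acc = 86.88  # naive Bayes without feature selection (%), from the abstract
selected_acc = 93.44  # naive Bayes with PSO and gain ratio (%), from the abstract

# Absolute difference, in percentage points (the 6.56 figure in the abstract).
absolute_gain = selected_acc - baseline_acc
# The same gain expressed relative to the baseline accuracy (derived, not reported).
relative_gain = absolute_gain / baseline_acc * 100
# Share of the baseline error rate that is removed (derived, not reported).
error_reduction = absolute_gain / (100 - baseline_acc) * 100

print(f"{absolute_gain:.2f} percentage points")
print(f"{relative_gain:.2f}% relative to baseline")
print(f"{error_reduction:.1f}% of baseline errors removed")
```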
Novelty/Originality/Value: This study combines PSO and gain ratio feature selection with the naïve Bayes algorithm on the Cleveland dataset. The research model is enriched by four preprocessing stages: data cleaning, changing the number of class labels, data normalization, and data discretization. The results show that combining PSO and gain ratio feature selection improves the accuracy of the naïve Bayes algorithm in diagnosing heart disease.
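The four preprocessing stages named above can be sketched as follows. The abstract does not specify how each stage is carried out, so every concrete choice here is an assumption: records with a "?" marker are dropped, the label is collapsed to two classes, min-max scaling is used for normalization, and equal-width binning is used for discretization. The column names and records are invented for illustration.

```python
def clean(rows, missing="?"):
    """Data cleaning: drop any record that contains the missing-value marker."""
    return [r for r in rows if missing not in r.values()]

def binarize_label(rows, key="num"):
    """Class label change: collapse a multi-level label into 0 (absent) / 1 (present)."""
    return [{**r, key: 1 if float(r[key]) > 0 else 0} for r in rows]

def min_max(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def discretize(values, bins=3):
    """Discretization: equal-width binning of values already scaled to [0, 1]."""
    return [min(int(v * bins), bins - 1) for v in values]

# Invented records; the column names are assumptions, not the study's schema.
raw = [
    {"age": "63", "chol": "233", "ca": "0", "num": "0"},
    {"age": "67", "chol": "286", "ca": "3", "num": "2"},
    {"age": "41", "chol": "204", "ca": "?", "num": "0"},
    {"age": "56", "chol": "236", "ca": "0", "num": "1"},
    {"age": "37", "chol": "250", "ca": "0", "num": "3"},
]

rows = binarize_label(clean(raw))
age_bins = discretize(min_max([float(r["age"]) for r in rows]))
labels = [r["num"] for r in rows]
print(age_bins, labels)
```

Discretizing continuous attributes in this way lets a naïve Bayes classifier estimate per-class frequencies for each bin rather than fitting a continuous density, which is one plausible reason for including the step.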

Published
2023-09-29
How to Cite
Meidina, A., & Abidin, Z. (2023). Diagnosis of Heart Disease Using Optimized Naïve Bayes Algorithm with Particle Swarm Optimization and Gain Ratio. Recursive Journal of Informatics, 1(2), 47-54. https://doi.org/10.15294/rji.v1i2.67278