Comparative Analysis of ADASYN-SVM and SMOTE-SVM Methods on the Detection of Type 2 Diabetes Mellitus

Nur Ghaniaviyanto Ramadhan(1)


(1) Institut Teknologi Telkom Purwokerto

Abstract

Most people with diabetes worldwide have type 2 diabetes. Diabetes can be detected early, and undesirable outcomes prevented, by having blood sugar and insulin levels checked by a doctor. People with diabetes can also be identified from the data in diabetes examination results. However, most health examination results contain several parameters that are difficult for the public to understand. This problem can be addressed with automatic classification. A further problem is the imbalance between the number of records for diabetics and non-diabetics. This can be addressed by balancing the data, either by increasing the share of the under-represented class or by decreasing the share of the over-represented class. Purpose: This study aims to detect type 2 diabetes mellitus with an SVM classification model and to determine which of two data balancing techniques, SMOTE or ADASYN, gives the better result. Methods/Study design/approach: The research begins with collecting the diabetes dataset, which is then cleaned by checking for null values. Two oversampling methods are then applied to determine which is the more appropriate. After oversampling, the data are classified with a support vector machine model and the resulting accuracy is measured. Result/Findings: The ADASYN-SVM method outperforms SMOTE-SVM. ADASYN-SVM reaches an accuracy of 87.3%, while SMOTE-SVM reaches 85.4%. Novelty/Originality/Value: The data used in this study came from the Karya Medika clinic, Indonesia, and contain parameters related to type 2 diabetes.
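The difference between the two oversampling techniques named in the abstract can be illustrated with a minimal sketch. This is not the code used in the study: the toy 2-D points, the function names, and the parameter `k` are hypothetical, the ADASYN variant draws seed points at random in proportion to their weights (the original algorithm assigns a fixed number of samples per point), and the SVM classification step is omitted. In practice, library implementations such as `SMOTE` and `ADASYN` in imbalanced-learn and `SVC` in scikit-learn would normally be used instead.

```python
import math
import random


def nearest_indices(i, points, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def interpolate(a, b, rng):
    """Random point on the line segment between a and b."""
    gap = rng.random()
    return tuple(x + gap * (y - x) for x, y in zip(a, b))


def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style sampling: every minority point is equally likely to be a seed."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(nearest_indices(i, minority, k))
        synthetic.append(interpolate(minority[i], minority[j], rng))
    return synthetic


def hardness_ratios(minority, majority, k=2):
    """Share of majority points among the k nearest neighbours of each minority point."""
    points = minority + majority  # indices below len(minority) are minority points
    n_min = len(minority)
    return [
        sum(j >= n_min for j in nearest_indices(i, points, k)) / k
        for i in range(n_min)
    ]


def adasyn(minority, majority, n_new, k=2, seed=0):
    """ADASYN-style sampling: minority points near the majority class seed more samples."""
    rng = random.Random(seed)
    weights = hardness_ratios(minority, majority, k)
    if sum(weights) == 0:
        weights = [1.0] * len(minority)  # classes do not overlap: fall back to uniform
    synthetic = []
    for _ in range(n_new):
        # Simplification: seeds are drawn in proportion to the hardness ratio.
        i = rng.choices(range(len(minority)), weights=weights)[0]
        j = rng.choice(nearest_indices(i, minority, k))
        synthetic.append(interpolate(minority[i], minority[j], rng))
    return synthetic


# Hypothetical toy data: three minority points in a clean cluster and one
# minority point sitting inside a group of majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (3.0, 3.0)]
majority = [(3.1, 3.2), (2.9, 2.8), (3.3, 3.0), (5.0, 5.0),
            (5.2, 4.9), (4.8, 5.1), (5.1, 5.3), (4.9, 4.7)]

smote_samples = smote(minority, n_new=4)
adasyn_samples = adasyn(minority, majority, n_new=4)
print("SMOTE :", smote_samples)
print("ADASYN:", adasyn_samples)
```

In this toy setup only the minority point at (3.0, 3.0) has majority neighbours, so the ADASYN-style sampler seeds all of its synthetic samples from that point, while the SMOTE-style sampler spreads its seeds over all four minority points. Whether this focus on hard-to-learn regions explains the accuracy gap reported in the abstract cannot be concluded from the sketch.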

Keywords

Diabetes Type 2; Classification; SVM; ADASYN; SMOTE

Scientific Journal of Informatics (SJI)
p-ISSN 2407-7658 | e-ISSN 2460-0040
Published By Department of Computer Science Universitas Negeri Semarang
Website: https://journal.unnes.ac.id/nju/index.php/sji

This work is licensed under a Creative Commons Attribution 4.0 International License.