Revolutionizing Healthcare: Comprehensive Evaluation and Optimization of SVM Kernels for Precise General Health Diagnosis

Wardatus Sholihah(1), Ade Silvia Handayani(2), Sarjana Sarjana(3)


(1) Department of Electrical Engineering, Politeknik Negeri Sriwijaya, Indonesia
(2) Department of Electrical Engineering, Politeknik Negeri Sriwijaya, Indonesia
(3) Department of Electrical Engineering, Politeknik Negeri Sriwijaya, Indonesia

Abstract

Purpose: This study has two objectives. First, it seeks to optimize Support Vector Machine (SVM) algorithms by comprehensively evaluating different SVM kernel variants, with the aim of improving their versatility and applicability in domains beyond the healthcare sector. Second, in the context of general health diagnosis, it assesses how well each SVM kernel supports precise predictive modeling. SVM was chosen for its proven effectiveness in classification and regression tasks in data mining. SVMs perform well on high-dimensional classification problems and achieve high accuracy, which makes them valuable for refining machine learning methodologies and advancing diagnostic systems. The best-performing SVM model is then implemented in a real-world application, specifically wireless, non-invasive healthcare devices. This deployment is a step toward improving healthcare practice and has potential implications for other fields.

Methods: Data for this study were collected from publicly accessible datasets on Kaggle covering a broad range of general health information. The dataset, comprising clinical data and vital-signs data, was preprocessed through data cleaning, feature extraction, and categorization of health status into ‘healthy’ and ‘requiring further attention’. Predictive models were then built using SVM algorithms with four kernel functions: Linear, RBF, Polynomial, and Sigmoid. The models were trained and tested on the preprocessed dataset to assess their efficacy in general health diagnosis. Model performance was evaluated using accuracy, precision, recall, F1-score, the Receiver Operating Characteristic (ROC) curve and its Area Under the Curve (AUC), and cross-validation. The most effective SVM kernel was selected in line with industry standards and best practices to support integration into health diagnostic systems. The chosen model was then tested on new datasets obtained from wireless non-invasive healthcare devices and the pre-existing AHD application. Hyperparameter tuning was performed to maximize accuracy.
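To make the four kernel functions concrete, the following is a minimal sketch of their standard textbook definitions, written in plain Python. The parameter names (gamma, coef0, degree) follow the parameterization commonly used by SVM libraries; the default values and the sample feature vectors are illustrative assumptions, not the settings or data used in the study.

```python
import math


def dot(x, y):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = <x, y>
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=0.0):
    # K(x, y) = (gamma * <x, y> + coef0) ** degree
    # degree, gamma and coef0 defaults are assumptions for illustration.
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


def gram_matrix(samples, kernel):
    """Pairwise kernel values for a list of samples."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Hypothetical, already-scaled feature vectors (not study data).
    samples = [[0.2, 0.7], [0.9, 0.1], [0.5, 0.5]]
    for name, k in [("linear", linear_kernel), ("rbf", rbf_kernel),
                    ("polynomial", polynomial_kernel), ("sigmoid", sigmoid_kernel)]:
        print(name, gram_matrix(samples, k))
```

An SVM trained with one of these kernels only ever sees the data through such pairwise values, which is why the choice of kernel and its parameters can change the decision boundary substantially.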

Results: The Polynomial kernel was selected as the body health diagnostic model over the Linear, RBF, and Sigmoid kernels. It has a training time of 0.8 seconds and a testing time of 0.1 seconds. On the training metrics, it achieved 97% accuracy, precision, recall, and F1-score; on the testing metrics, it achieved 99% on all four. On new datasets, the accuracy of the Polynomial kernel model decreased to 0.88; setting the hyperparameter C to 100 gave the highest accuracy, 0.945.
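For readers unfamiliar with how the reported metrics relate to one another, the sketch below computes accuracy, precision, recall, and F1-score from binary predictions in plain Python. It assumes that label 1 stands for ‘requiring further attention’ and label 0 for ‘healthy’, and that metrics are computed for the positive class only; the abstract does not state the label encoding or the averaging scheme, and the example labels are made up.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels: 1 = 'requiring further attention', 0 = 'healthy'.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(y_true, y_pred))
```

When all four metrics coincide, as in the reported 97% and 99% figures, false positives and false negatives are roughly balanced; a drop in accuracy on new data is best inspected through the same four counts.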

Novelty: This study introduces an approach that rigorously optimizes SVM algorithms and applies the Polynomial kernel to general health diagnosis. Compared with the other kernels evaluated, the Polynomial kernel achieved exceptional accuracy (up to 99%) and precision. The methodology, which combines industry standards with careful hyperparameter tuning, is designed to support integration into real-world healthcare systems. Deploying the optimized model in wireless non-invasive healthcare devices brings together theoretical innovation and practical implementation in machine learning for healthcare.

Keywords

Machine learning, SVM kernels, Health diagnosis

Scientific Journal of Informatics (SJI)
p-ISSN 2407-7658 | e-ISSN 2460-0040
Published By Department of Computer Science Universitas Negeri Semarang
Website: https://journal.unnes.ac.id/nju/index.php/sji

This work is licensed under a Creative Commons Attribution 4.0 International License.