Hybrid Feature Selection for Effective Heart Disease Detection: A Multi-Algorithm Machine Learning Approach

Authors

  • Syahrani Lonang, Department of Information Technology, Universitas Qamarul Huda Badaruddin Bagu, Central Lombok, Indonesia. ORCID: https://orcid.org/0009-0007-2510-8148
  • Ahmad Fatoni Dwi Putra, Department of Computer Science, Universitas Qamarul Huda Badaruddin Bagu, Central Lombok, Indonesia
  • Fahmi Syuhada, Department of Computer Science, Universitas Qamarul Huda Badaruddin Bagu, Central Lombok, Indonesia. ORCID: https://orcid.org/0009-0005-9577-6729
  • Asno Azzawagama Firdaus, Department of Computer Science, Universitas Qamarul Huda Badaruddin Bagu, Central Lombok, Indonesia. ORCID: https://orcid.org/0009-0005-1564-6167
  • Alya Masitha, Department of Software Engineering, Institut Teknologi Statistika dan Bisnis Muhammadiyah Semarang, Indonesia

DOI:

https://doi.org/10.15294/sji.v13i1.38815

Keywords:

Heart disease, Hybrid feature selection, SMOTEENN, Machine learning, Random forest

Abstract

Purpose: This research aims to develop an effective early detection model for heart disease that combines data balancing with hybrid feature selection. The study seeks to enhance predictive accuracy and minimize classification errors, providing a robust model for clinical decision support systems.

Methods: The study used the Heart Failure Prediction dataset obtained from Kaggle. A novel hybrid framework was implemented, integrating SMOTEENN (Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbors) for data balancing and a Hybrid Feature Selection (HFS) method combining Chi-square and Backward Elimination. Eight machine learning algorithms were evaluated: Logistic Regression, Naïve Bayes, Decision Tree, K-Nearest Neighbor, Random Forest, Gradient Boosting, Support Vector Machine, and XGBoost. Performance was assessed using accuracy, precision, recall, F1-score, specificity, AUC score, fall-out, and miss rate.
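
To make the balancing step concrete, the following is a minimal pure-Python sketch of the SMOTEENN idea: SMOTE adds synthetic minority samples by interpolating between minority neighbours, and Edited Nearest Neighbors then removes samples whose label disagrees with their neighbours. It is not the study's implementation. The function names, the toy points, the neighbour counts, and the majority-vote editing rule are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(points: Sequence[Point], query: Point, k: int, skip: int = -1) -> List[int]:
    """Indices of the k points closest to `query`, ignoring index `skip`."""
    order = sorted(
        (i for i in range(len(points)) if i != skip),
        key=lambda i: math.dist(points[i], query),
    )
    return order[:k]


def smote(minority: List[Point], n_new: int, k: int, rng: random.Random) -> List[Point]:
    """Create n_new synthetic points on segments between minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(nearest(minority, minority[i], k, skip=i))
        gap = rng.random()  # position along the segment from point i to point j
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


def edited_nearest_neighbours(
    X: List[Point], y: List[int], k: int
) -> Tuple[List[Point], List[int]]:
    """Keep only samples whose label matches the majority vote of their k neighbours."""
    kept_X, kept_y = [], []
    for i, (point, label) in enumerate(zip(X, y)):
        votes = Counter(y[j] for j in nearest(X, point, k, skip=i))
        if votes.most_common(1)[0][0] == label:
            kept_X.append(point)
            kept_y.append(label)
    return kept_X, kept_y


def smote_enn(
    X: List[Point], y: List[int], k_smote: int = 2, k_enn: int = 3, seed: int = 0
) -> Tuple[List[Point], List[int]]:
    """Oversample the minority class to parity, then clean with ENN (binary labels)."""
    rng = random.Random(seed)
    (_, n_major), (minority_label, n_minor) = Counter(y).most_common(2)
    minority = [x for x, label in zip(X, y) if label == minority_label]
    new_points = smote(minority, n_major - n_minor, k_smote, rng)
    X_bal = list(X) + new_points
    y_bal = list(y) + [minority_label] * len(new_points)
    return edited_nearest_neighbours(X_bal, y_bal, k_enn)


# Toy data (made up): six class-0 points near the origin, three class-1 points
# near (5, 5), and one class-0 point placed inside the class-1 cluster as noise.
X = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.7, 0.7), (0.3, 0.9), (0.9, 0.1),
     (5.0, 5.0), (5.5, 5.2), (5.1, 5.8), (5.2, 5.3)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

X_res, y_res = smote_enn(X, y)
print(Counter(y), "->", Counter(y_res))
```

In this toy run SMOTE raises the minority class from three to seven samples, and the editing step discards the single class-0 point that sits inside the class-1 cluster. A production pipeline would more likely rely on a maintained library implementation, applied to the training split only.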

Result: The proposed framework significantly improved classification performance across all algorithms. The Random Forest model emerged as the best classifier, achieving an accuracy of 99.44% and an AUC score of 99.98%, and reducing the miss rate from a baseline of 10.03% to 0.92%. The HFS method reduced the feature space by 54%, identifying 'ExerciseAngina', 'FastingBS', 'ST_Slope', 'ChestPainType', and 'Sex' as the most critical predictors. The model outperformed standard approaches and recent state-of-the-art benchmarks by over 10% in accuracy.
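
For readers less familiar with the reported metrics, the sketch below shows how the count-based ones are conventionally derived from a binary confusion matrix. The class name, function name, and example counts are hypothetical and are not taken from the study. AUC is omitted because it is computed from predicted scores rather than from the four counts.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # diseased patients correctly flagged
    fp: int  # healthy patients incorrectly flagged
    tn: int  # healthy patients correctly cleared
    fn: int  # diseased patients missed


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Standard metrics derived from binary confusion-matrix counts."""
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": c.tn / (c.tn + c.fp),
        "fall_out": c.fp / (c.fp + c.tn),   # false positive rate, 1 - specificity
        "miss_rate": c.fn / (c.fn + c.tp),  # false negative rate, 1 - recall
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

If the study uses these standard definitions, miss rate and recall sum to one, so a miss rate of 0.92% corresponds to a recall of 99.08%.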

Novelty: This study introduces a synergistic integration of SMOTEENN with hybrid feature selection. The combination significantly improves model performance in early heart disease detection.
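
The hybrid feature selection stage pairs a filter method (Chi-square) with a wrapper method (Backward Elimination). The abstract does not state how the two are chained, so the sketch below assumes one common arrangement: rank features by their chi-square statistic against the label, keep the top few, then drop features one at a time while a scoring function does not get worse. The toy rows, the 'Noise' column, the `keep` cut-off, and the lookup-table scorer are all hypothetical stand-ins.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence


def chi_square_score(feature: Sequence, labels: Sequence) -> float:
    """Pearson chi-square statistic for a categorical feature against a class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def chi_square_filter(columns: Dict[str, Sequence], labels: Sequence, keep: int) -> List[str]:
    """Filter stage: names of the `keep` features with the highest chi-square score."""
    ranked = sorted(
        columns, key=lambda name: chi_square_score(columns[name], labels), reverse=True
    )
    return ranked[:keep]


def backward_elimination(
    features: List[str], evaluate: Callable[[List[str]], float]
) -> List[str]:
    """Wrapper stage: remove one feature at a time while the score does not drop."""
    selected = list(features)
    best = evaluate(selected)
    while len(selected) > 1:
        candidates = [
            (evaluate([f for f in selected if f != drop]), drop) for drop in selected
        ]
        score, drop = max(candidates)
        if score < best:
            break  # every possible removal hurts, so stop
        selected.remove(drop)
        best = score
    return selected


# Toy data (made up): eight patients, three categorical columns, binary label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "ExerciseAngina": ["Y", "Y", "Y", "N", "N", "N", "N", "Y"],
    "ST_Slope": ["Flat", "Flat", "Flat", "Flat", "Up", "Up", "Up", "Up"],
    "Noise": ["a", "b", "a", "b", "a", "b", "a", "b"],
}


def lookup_accuracy(names: List[str]) -> float:
    """Hypothetical scorer: training accuracy of a majority-vote lookup table."""
    groups: Dict[tuple, Counter] = {}
    for i, label in enumerate(labels):
        key = tuple(columns[name][i] for name in names)
        groups.setdefault(key, Counter())[label] += 1
    return sum(max(c.values()) for c in groups.values()) / len(labels)


filtered = chi_square_filter(columns, labels, keep=2)
selected = backward_elimination(filtered, lookup_accuracy)
print("after filter:", filtered)
print("after backward elimination:", selected)
```

In practice the scorer would be a cross-validated metric from one of the classifiers under study rather than training accuracy, which is used here only to keep the example dependency-free.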

Published

16-03-2026

Article ID

38815

Section

Articles

How to Cite

Hybrid Feature Selection for Effective Heart Disease Detection: A Multi-Algorithm Machine Learning Approach. (2026). Scientific Journal of Informatics, 13(1), 119-132. https://doi.org/10.15294/sji.v13i1.38815