Gene Expression-Based Lung Cancer Prediction in Smokers Using SVM and Moth-Flame Optimization Algorithm

Authors

  • Salma Safira Ramandha, School of Computing, Telkom University, Bandung, Indonesia
  • Angel Metanosa Afinda, School of Electrical Engineering, Telkom University, Bandung, Indonesia
  • Isman Kurniawan, School of Computing, Telkom University, Bandung, Indonesia

DOI:

https://doi.org/10.15294/sji.v13i1.38268

Keywords:

lung cancer gene expression, support vector machine, moth flame optimization, metaheuristics

Abstract

Purpose: Lung cancer remains one of the leading causes of death worldwide, especially among active smokers, yet early detection is still difficult because traditional imaging methods have limited sensitivity for identifying early-stage abnormalities. This study addresses the need for a more accurate computational approach that detects lung cancer at the molecular level using gene expression data. The goal is to build a model that reliably distinguishes cancerous from non-cancerous samples based on genomic features.

Methods: This study uses the GSE4115 gene-expression dataset consisting of 187 bronchial epithelial samples and 22,215 gene features. The Moth-Flame Optimization (MFO) algorithm was implemented to select the most informative subset of genes from this high-dimensional dataset. A Support Vector Machine (SVM) classifier was then trained using multiple kernels, with hyperparameter tuning performed to identify the optimal configuration for each kernel.
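The feature-selection step described above can be illustrated with a minimal sketch of MFO wrapped around an SVM. It is not the authors' implementation: the abstract does not state the fitness function, the binarization rule, the population size, or the spiral constant, so the 0.5 threshold, the `alpha` weighting, the cross-validation setup, and the helper names `fitness` and `mfo_select` are all assumptions, and synthetic data from scikit-learn stands in for GSE4115.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def fitness(position, X, y, alpha=0.99):
    """Cost to minimize (assumed form): CV error plus a small penalty on subset size."""
    mask = position > 0.5              # assumed rule: threshold position into a gene mask
    if not mask.any():
        return 1.0                     # an empty subset gets the worst possible cost
    acc = cross_val_score(SVC(kernel="poly", degree=2), X[:, mask], y, cv=3).mean()
    return alpha * (1.0 - acc) + (1.0 - alpha) * mask.mean()


def mfo_select(X, y, n_moths=8, n_iter=10, b=1.0, seed=0):
    """Return (boolean feature mask, cost) of the best flame found."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    moths = rng.random((n_moths, dim))           # continuous positions in [0, 1]
    flames, flame_cost = None, None
    for it in range(n_iter):
        cost = np.array([fitness(m, X, y) for m in moths])
        # Flames are the best positions seen so far, sorted by cost.
        if flames is None:
            pool, pool_cost = moths, cost
        else:
            pool = np.vstack([flames, moths])
            pool_cost = np.concatenate([flame_cost, cost])
        order = np.argsort(pool_cost)[:n_moths]
        flames, flame_cost = pool[order].copy(), pool_cost[order]
        # The number of active flames shrinks over time to focus the search.
        n_flames = int(round(n_moths - it * (n_moths - 1) / n_iter))
        r = -1.0 - it / n_iter                   # lower bound of t moves from -1 toward -2
        for i in range(n_moths):
            j = min(i, n_flames - 1)             # surplus moths share the last active flame
            t = (r - 1.0) * rng.random(dim) + 1.0  # t drawn from [r, 1]
            dist = np.abs(flames[j] - moths[i])
            # Logarithmic spiral flight of moth i around flame j.
            moths[i] = dist * np.exp(b * t) * np.cos(2 * np.pi * t) + flames[j]
        moths = np.clip(moths, 0.0, 1.0)
    return flames[0] > 0.5, flame_cost[0]


# Small synthetic stand-in for an expression matrix (samples x genes).
X_demo, y_demo = make_classification(
    n_samples=60, n_features=30, n_informative=5, random_state=0
)
best_mask, best_cost = mfo_select(X_demo, y_demo, n_moths=6, n_iter=5)
print("selected features:", int(best_mask.sum()), "cost:", round(float(best_cost), 3))
```

Because previous flames are carried into each new ranking, the best cost never gets worse from one iteration to the next. On a matrix with tens of thousands of genes, the wrapper evaluation inside `fitness` dominates the run time, so the population size and iteration count would need to be chosen with that in mind.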

Results: Experimental results show that the Polynomial kernel achieved the highest performance using 286 MFO-selected features, reaching an accuracy of 0.84 and an F1-score of 0.85. These results confirm that combining MFO with SVM improves classification performance compared to using raw gene data without feature selection.
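A per-kernel comparison of this kind could be set up roughly as follows. This is a sketch under stated assumptions rather than the study's protocol: the abstract names only the Polynomial kernel, so the linear and RBF entries, the parameter grids, the 75/25 split, and the 5-fold search are hypothetical, and the data are synthetic rather than the MFO-selected GSE4115 features.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the selected-gene matrix (samples x features).
X, y = make_classification(
    n_samples=120, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One hypothetical hyperparameter grid per kernel.
param_grids = {
    "linear": {"svc__C": [0.1, 1, 10]},
    "rbf": {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01]},
    "poly": {"svc__C": [0.1, 1, 10], "svc__degree": [2, 3]},
}

results = {}
for kernel, grid in param_grids.items():
    # Scaling lives inside the pipeline so it is refit on each CV training fold.
    pipe = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    search = GridSearchCV(pipe, grid, scoring="f1", cv=5)
    search.fit(X_train, y_train)
    pred = search.predict(X_test)
    results[kernel] = {
        "params": search.best_params_,
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

for kernel, row in results.items():
    print(kernel, row["params"], round(row["accuracy"], 2), round(row["f1"], 2))
```

Tuning each kernel separately and scoring it on a held-out split keeps the comparison between kernels fair, since every kernel is evaluated at its own best configuration.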

Novelty: This study provides the first application of MFO-based feature selection for lung cancer prediction in smokers using the GSE4115 dataset. The findings demonstrate the value of nature-inspired optimization for handling high-dimensional genomic data and offer a promising direction for developing early computational detection methods.

Published

01-02-2026

Article ID

38268

Section

Articles

How to Cite

Gene Expression-Based Lung Cancer Prediction in Smokers Using SVM and Moth-Flame Optimization Algorithm. (2026). Scientific Journal of Informatics, 13(1). https://doi.org/10.15294/sji.v13i1.38268