Classification of Early Stages of Lung Cancer based on First and Second Order Statistical Variations using Decision Tree Method

Soeparmi Soeparmi(1), Umi Salamah(2), Arnita Ayu Ningrum(3), Mohtar Yunianto(4)


(1) Department of Physics, Universitas Sebelas Maret, Surakarta, Indonesia
(2) Department of Informatics, Universitas Sebelas Maret, Surakarta, Indonesia
(3) Department of Physics, Universitas Sebelas Maret, Surakarta, Indonesia
(4) Department of Physics, Universitas Sebelas Maret, Surakarta, Indonesia

Abstract

Purpose: This research aims to achieve the best performance in identifying early-stage lung cancer classes from CT-scan images using the decision tree classification method, and to determine which of the tested variations gives the best classification performance.

Methods: The CT-scan image classification process for early-stage lung cancer classes, based on tumor density measurements, consists of five steps. First, image data preparation: 280 CT-scan images of 607 x 607 pixels in PNG format were taken from the LIDC-IDRI database (https://www.cancerimagingarchive.net/), which holds a total of 1010 CT scans. Second, the grayscaling stage converts each RGB image to grayscale. Third, a combination of a high-pass filter and a Gaussian smoothing filter is used to remove salt-and-pepper noise and to smooth the image. Fourth, the feature extraction stage uses first- and second-order statistics, giving 22 features. Fifth, the classification stage uses a decision tree, which is then validated with the k-fold method with k=10 so that all image data are tested.
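The feature extraction step can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not list the 22 features, so the features below (histogram statistics for the first order, and a few statistics of a gray-level co-occurrence matrix for the second order) are assumed, commonly used examples, and the function names and the tiny 4-level test image are hypothetical.

```python
import math
from collections import Counter


def first_order_features(image):
    """Histogram-based (first-order) statistics of a grayscale image.

    `image` is a list of rows of integer gray levels (an assumed format).
    """
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    # Skewness and kurtosis are undefined for a constant image; report 0.0.
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std else 0.0
    probs = [count / n for count in Counter(pixels).values()]
    entropy = -sum(q * math.log2(q) for q in probs)
    return {"mean": mean, "variance": variance, "skewness": skewness,
            "kurtosis": kurtosis, "entropy": entropy}


def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def second_order_features(p):
    """Texture (second-order) statistics computed from a normalized GLCM."""
    size = len(p)
    cells = [(i, j, p[i][j]) for i in range(size) for j in range(size)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),  # angular second moment
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }


# Hypothetical 4 x 4 image with 4 gray levels, used only for illustration.
sample = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 2, 2, 2],
          [2, 2, 3, 3]]
features = {**first_order_features(sample),
            **{"glcm_" + k: v for k, v in second_order_features(glcm(sample, 4)).items()}}
```

In a setup like the one described, a feature vector of this kind would be computed per filtered image and passed to the decision tree. A real 8-bit CT image would normally be quantized to fewer gray levels before the co-occurrence matrix is built; that choice is also an assumption here.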

Result: Accuracy was 90.51% at the training stage and 89.99% at the testing stage. A stage I lung cancer detection program based on CT-scan images was successfully created; the variation with the highest PSNR value gave the best testing-phase accuracy, precision, and recall, at 89.99%, 91.24%, and 89.64%, respectively.
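For readers unfamiliar with the reported quantities, the following sketch shows how PSNR, accuracy, precision, and recall are commonly computed. It uses the standard definitions only. The abstract does not say how precision and recall were averaged over classes, so the macro average here is an assumption, and the function names and class labels are hypothetical.

```python
import math


def psnr(reference, processed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_errors = [(a - b) ** 2
                      for row_a, row_b in zip(reference, processed)
                      for a, b in zip(row_a, row_b)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def classification_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for paired label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls = [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # A class with no predictions (or no true samples) scores 0.0 here.
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    macro_precision = sum(precisions) / len(labels)
    macro_recall = sum(recalls) / len(labels)
    return accuracy, macro_precision, macro_recall


# Hypothetical labels, used only for illustration.
truth = ["stage_1", "stage_1", "normal", "normal"]
predicted = ["stage_1", "normal", "normal", "normal"]
accuracy, precision, recall = classification_scores(truth, predicted)
```

With 10-fold validation, scores like these would typically be computed on each held-out fold and then averaged. A higher PSNR means the filtered image differs less from the reference image.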

Novelty: In a search of previous research, the authors found no prior work that used machine learning to classify early-stage lung cancer. Punithavathy et al. (2015) and Meliala (2021) stated that early detection of lung cancer can increase survival by 60%-70%. This research produces a new method for determining early-stage lung cancer.

 

Keywords

Lung cancer; CT-Scan; LIDC-IDRI; decision tree






Scientific Journal of Informatics (SJI)
p-ISSN 2407-7658 | e-ISSN 2460-0040
Published By Department of Computer Science Universitas Negeri Semarang
Website: https://journal.unnes.ac.id/nju/index.php/sji

This work is licensed under a Creative Commons Attribution 4.0 International License.