Application of Stacking Ensemble Learning for Classifying the Health Effects of Air Pollution

  • Budi Sunarko Universitas Negeri Semarang
  • Uswatun Hasanah Universitas Negeri Semarang
  • Syahroni Hidayat Universitas Negeri Semarang
  • Naufal Muhammad Universitas Negeri Semarang
  • Muhammad Irfan Ardiansyah Universitas Negeri Semarang
  • Briska Putra Ananda Universitas Negeri Semarang
  • Muhammad Khikam Hakiki Universitas Negeri Semarang
  • Luluk Taufiqul Baroroh Universitas Negeri Semarang
Keywords: Health impact, Ensemble learning, Health effect classification, Air pollution, Stacking

Abstract

Air pollution is a serious problem that harms human health. Air pollutants such as fine particulate matter, sulfur dioxide, nitrogen oxides, and ozone can cause respiratory disorders, heart disease, lung cancer, and other health problems. Classifying the health effects of air pollution is therefore important for understanding its impact; such a classification groups health effects by pollutant type, dose, and exposure duration. This study proposes an ensemble learning classification method to identify the pollutants that affect health and their level of health risk. Ensemble learning is a machine learning technique that combines several models to improve predictive accuracy. Stacking ensemble learning is one such method, used here to classify the health effects of air pollution by integrating several base models such as Logistic Regression, Decision Tree, K-Nearest Neighbor, Support Vector Machine, and Multi-Layer Perceptron. The results show that the Stacking model achieves the highest performance, with an accuracy of approximately 99.9% on both balanced and imbalanced datasets. The Decision Tree and K-Nearest Neighbor models also perform very well, however. Training time is an important consideration: K-Nearest Neighbor and Decision Tree train far faster than the Stacking model.
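
The stacking arrangement described in the abstract can be illustrated with a minimal sketch using scikit-learn. It is not the authors' implementation. The abstract does not state the meta-learner, the hyperparameters, or the preprocessing, so the Logistic Regression final estimator, the feature scaling, the 5-fold internal cross-validation, and the synthetic data from `make_classification` are all assumptions made for illustration. The helper name `build_stacking_model` is likewise hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stacking_model(random_state=0):
    """Stack the five base models named in the abstract (hypothetical helper)."""
    base_models = [
        # Scaling is an assumption; it helps the distance- and gradient-based models.
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("dt", DecisionTreeClassifier(random_state=random_state)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier())),
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=random_state))),
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPClassifier(max_iter=500, random_state=random_state),
        )),
    ]
    # Assumed meta-learner: Logistic Regression trained on the base models'
    # out-of-fold predictions (cv=5).
    return StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


# Synthetic stand-in data; the study itself uses a pollutant/health-effect dataset.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacking_model()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Because the stacked model fits every base model several times during internal cross-validation and then once more on the full training set, it costs considerably more to train than a single Decision Tree or K-Nearest Neighbor classifier, which is consistent with the training-time trade-off noted above.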

Published
2023-09-29
How to Cite
Sunarko, B., Hasanah, U., Hidayat, S., Muhammad, N., Ardiansyah, M., Ananda, B., Hakiki, M., & Baroroh, L. (2023). Penerapan Stacking Ensemble Learning untuk Klasifikasi Efek Kesehatan Akibat Pencemaran Udara. Edu Komputika Journal, 10(1), 55–63. https://doi.org/10.15294/edukomputika.v10i1.72080