Performance Evaluation of Machine Learning Models for Soil Fertility Classification Based on the Indian Soil Fertility Dataset

Authors

  • Yoga Pristyanto, Universitas Amikom Yogyakarta
  • Ibrahim Aji Fajar Romadhon, Universitas Amikom Yogyakarta
  • Anggit Ferdita Nugraha, Universitas Amikom Yogyakarta
  • Atik Nurmasani, Universitas Amikom Yogyakarta
  • Irma Rofni Wulandari, Universitas Amikom Yogyakarta

DOI:

https://doi.org/10.15294/edukom.v12i1.10317

Keywords:

Classification, Ensemble Classifier, Machine Learning, Single Classifier, Soil Fertility Classification

Abstract

Rice farming productivity worldwide has been declining because of improper soil management practices, including excessive use of chemical fertilizers and irregular irrigation. The main challenge is to classify soil fertility levels accurately, so that land use can be optimized and resource waste reduced, especially when the available data are imbalanced. This study compares the performance of single classifiers and ensemble classifiers in classifying soil fertility. The single classifiers are K-Nearest Neighbor (KNN), Naive Bayes, Decision Tree, Support Vector Machine (SVM), and Artificial Neural Network (ANN); the ensemble classifiers are Random Forest and XGBoost. The Indian Soil Fertility Dataset, obtained from Kaggle, contains 880 samples with 12 features and one class label. The methodology consists of data acquisition, preprocessing, data splitting, standardization, and classification, with performance evaluated using a confusion matrix. The results show that the ensemble classifiers, Random Forest and XGBoost, outperform the single classifiers on the imbalanced dataset, with accuracy, precision, recall, and F1-score values above 92% to 95% in every split scenario. These findings indicate that Random Forest and XGBoost can serve as reliable models to help farmers and agricultural experts evaluate soil conditions, minimize unnecessary fertilizer use, and improve rice farming productivity.
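The four metrics named in the abstract can all be derived from a confusion matrix. The following is a minimal sketch of that derivation, not the authors' evaluation code. The three-class matrix, its counts, the function names, and the use of macro averaging are assumptions made for illustration; the abstract does not state the number of classes or how per-class scores were averaged.

```python
from typing import Dict, List

Matrix = List[List[int]]


def per_class_metrics(cm: Matrix) -> List[Dict[str, float]]:
    """Precision, recall and F1 for each class.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true class k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class is another one
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({"precision": precision, "recall": recall, "f1": f1})
    return results


def summarize(cm: Matrix) -> Dict[str, float]:
    """Overall accuracy plus macro-averaged precision, recall and F1."""
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(len(cm))) / total
    per_class = per_class_metrics(cm)
    summary = {"accuracy": accuracy}
    for name in ("precision", "recall", "f1"):
        # Macro average: every class counts equally, regardless of its size.
        summary["macro_" + name] = sum(m[name] for m in per_class) / len(per_class)
    return summary


# Hypothetical, deliberately imbalanced counts (NOT results from the study):
# rows are true classes, columns are predicted classes.
example_cm = [
    [90, 5, 0],
    [4, 40, 1],
    [1, 2, 7],
]

summary = summarize(example_cm)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up matrix the smallest class is recognized least reliably, so macro recall falls below accuracy. This is why precision, recall, and F1-score are reported alongside accuracy when classes are imbalanced.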

Published

2025-08-30

Article ID

10317

How to Cite

Pristyanto, Y., Romadhon, I. A. F., Nugraha, A. F., Nurmasani, A., & Wulandari, I. R. (2025). Performance Evaluation of Machine Learning Models for Soil Fertility Classification Based on the Indian Soil Fertility Dataset. Edu Komputika Journal, 12(1), 21–30. https://doi.org/10.15294/edukom.v12i1.10317