Deep Learning-Based Eye Disorder Classification: A K-Fold Evaluation of EfficientNetB and VGG16 Models

Authors

  • Cinantya Paramita, Dinus Research Group for AI in Medical Science (DREAMS), Universitas Dian Nuswantoro, Indonesia
  • Sindhu Rakasiwi, Dinus Research Group for AI in Medical Science (DREAMS), Universitas Dian Nuswantoro, Indonesia
  • Pulung Nurtantio Andono, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
  • Guruh Fajar Shidik, Research Center for Intelligent Distributed Surveillance and Security (IDSS), Universitas Dian Nuswantoro, Indonesia
  • Shier Nee Saw, Department of Artificial Intelligence, Universiti Malaya, Malaysia
  • Muhammad Ivan Rafsanjani, Department of Computer Science, Universitas Dian Nuswantoro, Indonesia

DOI:

https://doi.org/10.15294/sji.v12i3.26257

Keywords:

Modern CNN, EfficientNetB, Color fundus photography, K-Fold, VGG16, Machine learning, Grad-CAM

Abstract

Purpose: The study evaluates the EfficientNetB3 and VGG16 deep learning architectures for image classification, focusing on stability, accuracy, and interpretability. It uses Gradient-weighted Class Activation Mapping (Grad-CAM) to improve transparency and robustness, with the aim of building reliable AI-based diagnostic tools.

Methods: The study used a dataset of 4,217 color retinal fundus images divided into four classes: cataract, diabetic retinopathy, glaucoma, and normal. The dataset was split into 70% for training, 10% for validation, and 20% for testing. The researchers used a transfer learning approach with EfficientNetB3 and VGG16 models pretrained on ImageNet. Real-time augmentation was applied to prevent overfitting and improve generalization. The models were compiled with the Adam optimizer and trained with categorical cross-entropy loss. Early stopping was implemented to use computational resources efficiently and reduce overfitting, and a learning rate scheduler (ReduceLROnPlateau) lowered the learning rate when validation loss stopped improving. EfficientNetB3 is far more compact, with about 12 million parameters versus VGG16's 138 million, making it better suited to resource-constrained mobile or embedded systems. The final evaluation was done on the held-out test set.
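The abstract mentions a stratified K-fold evaluation but does not detail how folds were built. As a minimal stdlib-only sketch (not the authors' implementation, which uses TensorFlow/Keras), stratified fold assignment can be done by dealing each class's sample indices round-robin across folds so every fold keeps roughly the dataset's class proportions:

```python
from collections import defaultdict

def stratified_kfold(labels, k=5):
    """Split sample indices into k folds, preserving class proportions.

    labels: one class label per sample.
    Returns a list of k index lists (the folds).
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal this class's samples round-robin across the folds,
        # so each fold receives a near-equal share of the class.
        for i, idx in enumerate(indices):
            folds[i % k].append(idx)
    return folds

# Toy example with the paper's four classes (counts are illustrative,
# not the real per-class counts of the 4,217-image dataset).
labels = (["cataract"] * 10 + ["diabetic_retinopathy"] * 10
          + ["glaucoma"] * 10 + ["normal"] * 10)
for i, fold in enumerate(stratified_kfold(labels, k=5)):
    print(i, len(fold), sorted({labels[j] for j in fold}))
```

In practice a library routine such as scikit-learn's `StratifiedKFold` does this (with shuffling options); the sketch only shows why stratification keeps each fold's class balance close to the full dataset's.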

Result: The EfficientNetB3 architecture outperforms VGG16 in classification accuracy and loss stability, with an average accuracy of 93%. It also exhibits better transparency and predictive accuracy, making it a reliable model for medical image classification.

Novelty: This work introduces a framework integrating the EfficientNetB3 architecture, stratified cross-validation, L2 regularization, and Grad-CAM-based interpretability, emphasizing openness and explainability in model evaluation.

Published

09-09-2025

Article ID

26257

Section

Articles

How to Cite

Deep Learning-Based Eye Disorder Classification: A K-Fold Evaluation of EfficientNetB and VGG16 Models. (2025). Scientific Journal of Informatics, 12(3), 441-452. https://doi.org/10.15294/sji.v12i3.26257