Performance Evaluation of Otsu and Sauvola Thresholding for Structured Document Binarization

Authors

  • Muhammad Noko Darpito, Master Program of Informatics, Faculty of Industrial Technology, Universitas Ahmad Dahlan
  • Kartika Firdausy, Department of Electrical Engineering, Faculty of Industrial Technology, Universitas Ahmad Dahlan
  • Abdul Fadlil, Department of Electrical Engineering, Faculty of Industrial Technology, Universitas Ahmad Dahlan

DOI:

https://doi.org/10.15294/sji.v13i1.40245

Keywords:

Image thresholding, Otsu, Sauvola, CLAHE, Scanned documents

Abstract

Purpose: Digitizing public administration records, particularly structured forms such as the Transport of Plants and Wildlife Abroad (Surat Angkut Tumbuhan dan Satwa Liar Luar Negeri / SATS-LN), requires careful preprocessing before any downstream analysis can be accurate. Most images in the SATS-LN archives are scans with inconsistent lighting, varying resolution, and background noise, which makes it difficult to separate the text from the background and read it reliably. This work identifies the SATS-LN binarization approach that best preserves textual structure and suppresses background artifacts.

Methods: A four-stage pipeline is used. First, Detectron2 localizes seven key SATS-LN fields. Second, binarization is investigated with global Otsu and adaptive Sauvola thresholding under three parameter configurations. Third, following binarization, Contrast-Limited Adaptive Histogram Equalization (CLAHE) boosts local contrast. Finally, Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Distance Reciprocal Distortion (DRD), Precision, Recall, F1-score, and Foreground Ratio are assessed on 200 annotated SATS-LN documents (150 scanner-based/DOC and 50 camera-captured/CAM).
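The two thresholding families compared in the Methods can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the `R = 128` dynamic range, the default `k = 0.5`, and the convention that dark pixels are text are all assumptions made for illustration, and the abstract does not state how the labels Otsu_T10, Sauvola_k04, or Sauvola_k05 map to parameter values. Only the 25x25 window comes from the abstract.

```python
import math


def otsu_threshold(pixels):
    """Global Otsu: pick the 8-bit level t that maximizes between-class variance.

    `pixels` is a flat list of integers in 0..255.
    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below t
    sum0 = 0  # intensity sum of pixels at or below t
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def sauvola_binarize(img, window=25, k=0.5, R=128.0):
    """Adaptive Sauvola: per-pixel threshold T = m * (1 + k * (s / R - 1)).

    `img` is a list of rows of 0..255 integers; m and s are the mean and
    standard deviation inside a window centred on the pixel (clipped at borders).
    Returns a mask where 1 marks text (assumed dark) and 0 marks background.
    This direct version is slow; integral images are the usual speed-up.
    """
    h, w = len(img), len(img[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(0, y - r), min(h, y + r + 1))
                for i in range(max(0, x - r), min(w, x + r + 1))
            ]
            n = len(vals)
            mean = sum(vals) / n
            var = sum(v * v for v in vals) / n - mean * mean
            std = math.sqrt(max(var, 0.0))
            thresh = mean * (1.0 + k * (std / R - 1.0))
            out[y][x] = 1 if img[y][x] <= thresh else 0
    return out
```

Otsu derives a single threshold from the whole histogram, which explains its sensitivity to uneven lighting, whereas Sauvola recomputes the threshold from local statistics at every pixel.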

Result: Binarization performance on the 200 SATS-LN documents depends on both the acquisition domain and the evaluation metric. Global Otsu_T10 has the highest median PSNR (21.19 dB) and the lowest median MSE (494.69), indicating a visually cleaner background. However, segmentation-based metrics show better stroke preservation with Sauvola: Sauvola_k05 achieves the strongest text–background separation in the DOC domain (F1 = 0.938). In the CAM domain, where illumination variability dominates, Sauvola performs better across structural and segmentation metrics, with Sauvola_k04 performing best overall (F1 = 0.980) and mitigating the over-segmentation tendency of strict global thresholds. With a 25x25 Sauvola window and a CLAHE clip limit of 1.0, the results suggest using Sauvola_k05 for DOC and Sauvola_k04 for CAM to preserve text integrity and reduce background artifacts.
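As a rough illustration of how the reported figures relate, the sketch below computes PSNR from MSE and Precision, Recall, and F1 from binary masks. It assumes 8-bit images (peak value 255) and masks in which 1 marks text; the function names are hypothetical and the abstract does not describe the authors' evaluation code. Under that assumption, an MSE of 494.69 corresponds to roughly 21.19 dB, which is consistent with the reported medians.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error, assuming an 8-bit peak of 255."""
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def precision_recall_f1(pred, truth):
    """Pixel-level scores for flat 0/1 masks of equal length (1 = text pixel)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Reported median MSE for Otsu_T10, converted to dB under the 8-bit assumption.
print(round(psnr_from_mse(494.69), 2))
```

PSNR and MSE reward a clean background across the whole image, while F1 counts only correctly recovered text pixels, which is one reason the two metric families can rank the methods differently.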

Novelty: This study presents a novel field-level binarization assessment that combines automated cropping and ground-truth evaluation, providing practical guidance for robust preprocessing that supports scalable, reliable, and cross-device public document digitization.

Published

31-03-2026

Article ID

40245

Section

Articles

How to Cite

Performance Evaluation of Otsu and Sauvola Thresholding for Structured Document Binarization. (2026). Scientific Journal of Informatics, 13(1), 189-200. https://doi.org/10.15294/sji.v13i1.40245