From Classic to Cutting-Edge: A Near-Perfect Global Thresholding Approach with Machine Learning

Bibliographic Details
Main Authors: Nicolae Tarbă, Costin-Anton Boiangiu, Mihai-Lucian Voncilă
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/14/8096
Description
Summary: Image binarization is an important process in many computer-vision applications. It converts the original image from its color space into black and white. Global thresholding is a quick and reliable way to achieve binarization, but it is inherently limited by image noise and uneven lighting. This paper introduces a global thresholding method that uses the results of classical global thresholding algorithms, together with other global image features, to train a regression model via machine learning. We show through nested cross-validation that the model can predict the best possible global threshold with an average F-measure of 90.86% and a confidence of 0.79%. We apply our approach to a popular computer-vision problem, document image binarization, and compare popular metrics against both the best possible values achievable through global thresholding and the values obtained by the algorithms used to train our model. Our results show a significant improvement over these classical global thresholding algorithms, achieving near-perfect scores on all the computed metrics. We also compare our results with state-of-the-art binarization algorithms and outperform them on certain datasets. The global threshold obtained through our method closely approximates the ideal global threshold and could be used in a mixed local-global approach for better results.
ISSN: 2076-3417
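
Illustrative sketch (not part of the catalog record): the summary describes predicting the "best possible global threshold" from the outputs of classical thresholding algorithms and other global image features. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The record does not name the classical algorithms, the features, or the regression model, so Otsu's method, the mean/standard-deviation features, and the least-squares fit are all assumptions chosen for illustration, as are the function names. Foreground (text) is assumed to be darker than the background.

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method on a uint8 image: maximize between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))    # cumulative intensity mean
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = np.where(denom > 0, (mu[-1] * omega - mu) ** 2 / denom, 0.0)
    return int(np.argmax(sigma_b))

def f_measure(pred, truth):
    """F-measure of a boolean foreground mask against the ground truth."""
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return float(2 * precision * recall / (precision + recall))

def ideal_global_threshold(gray, truth):
    """Brute-force the threshold with the highest F-measure (needs ground truth)."""
    best_t, best_score = 0, -1.0
    for t in range(256):
        score = f_measure(gray <= t, truth)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

def global_features(gray):
    """Assumed feature vector: Otsu threshold, mean, std, and a bias term."""
    return np.array([otsu_threshold(gray), gray.mean(), gray.std(), 1.0])

def fit_threshold_regressor(images, truths):
    """Least-squares stand-in for the regression model (an assumption)."""
    X = np.stack([global_features(g) for g in images])
    y = np.array([ideal_global_threshold(g, t)[0]
                  for g, t in zip(images, truths)], dtype=np.float64)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights

def predict_threshold(weights, gray):
    """Predict a global threshold for an unseen image, clipped to 0..255."""
    return int(np.clip(round(float(global_features(gray) @ weights)), 0, 255))
```

In this sketch the brute-force search plays the role of the "ideal" threshold used as the regression target, which is only available when a ground-truth mask exists; the fitted model is what would be applied to new images. The nested cross-validation mentioned in the summary is not reproduced here.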