The 26th ACM Symposium on Document Engineering

August 25, 2026 to August 28, 2026
HES-SO / University of Fribourg, Switzerland

Photographed Document Image Binarization — Quality, Time & Size Assessment

Organizers

Overview

Photographed document images are far more complex than scanned ones: their quality is device-dependent, their resolution is uneven, and they usually suffer from interference from environmental light sources.

This competition assesses the performance of binarization algorithms for photographed documents, using images acquired with recently released mobile phones. The test set includes several kinds of documents (offset, laser, and inkjet printed), photographed under different environmental conditions with the strobe flash on, off, and in auto mode.

Three aspects of the algorithms are analyzed: the quality of the generated image, the time performance, and the potential use of the output in image compression schemes. The submitted algorithms will be compared not only among themselves but also with 102 of the best-known binarization algorithms.

Quality Assessment

The quality of the final monochromatic image is the most important assessment criterion in this competition. The quality measure used to evaluate the binarization algorithms is the PSNR (Peak Signal-to-Noise Ratio) [1], a standard image quality measure, computed between each assessed image and a ground-truth (GT) image. The GT image is obtained by computing the "Boolean and" of the images produced by the two top-quality algorithms from the previous competitions, visually inspecting the result, and, if needed, hand-correcting it.
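
As an illustration of the two steps described above, the following is a minimal Python sketch and not the organizers' evaluation code. It assumes that images are lists of rows whose pixels are encoded as 0 or 1 (so the peak value is 1) and that 1 marks ink, which means the "Boolean and" keeps only the pixels both algorithms classify as ink. The function names `boolean_and` and `psnr` are hypothetical, and the visual inspection and hand-correction steps are not modeled.

```python
import math


def _check_same_shape(img_a, img_b):
    """Raise ValueError unless both images have identical dimensions."""
    if len(img_a) != len(img_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")


def boolean_and(img_a, img_b):
    """Pixel-wise AND of two binary images (assumed encoding: 1 = ink)."""
    _check_same_shape(img_a, img_b)
    return [
        [a & b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]


def psnr(image, ground_truth, peak=1):
    """PSNR in dB: 10 * log10(peak^2 / MSE); infinite for identical images."""
    _check_same_shape(image, ground_truth)
    n_pixels = sum(len(row) for row in ground_truth)
    if n_pixels == 0:
        raise ValueError("images must not be empty")
    squared_error = sum(
        (p - g) ** 2
        for row, gt_row in zip(image, ground_truth)
        for p, g in zip(row, gt_row)
    )
    if squared_error == 0:
        return math.inf
    mse = squared_error / n_pixels
    return 10 * math.log10(peak ** 2 / mse)


# Toy 2x3 outputs of two hypothetical top-ranked algorithms.
first = [[1, 1, 0], [0, 1, 0]]
second = [[1, 0, 0], [0, 1, 1]]
gt_candidate = boolean_and(first, second)  # [[1, 0, 0], [0, 1, 0]]

# A submission that misses one ink pixel out of six.
submission = [[1, 0, 0], [0, 0, 0]]
print(round(psnr(submission, gt_candidate), 2))
```

For 0/1 images the mean squared error is simply the fraction of mismatched pixels, so a higher PSNR means fewer pixels disagree with the GT image.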

All images will be publicly available after the competition at the DIB website: https://dib.cin.ufpe.br/

The PSNR measures for each algorithm in each image class are ranked in the same way as in [2]. First, the algorithms are ranked on each document in a class. Then, the sum of each algorithm's ranks over all documents in the class defines its final ranking. The results are visually inspected to check their consistency.
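
The rank-sum procedure can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure's official implementation: the function name `rank_sum` and the input layout are hypothetical, a higher PSNR is taken to earn a better (lower) rank, and tied algorithms are assumed to share the best rank, a detail the text above does not specify.

```python
def rank_sum(scores):
    """Rank algorithms by the sum of their per-document ranks.

    `scores` maps an algorithm name to its list of PSNR values, one per
    document in the class. Returns (name, rank_sum) pairs, best first.
    """
    if not scores:
        raise ValueError("no algorithms to rank")
    n_docs = len(next(iter(scores.values())))
    if any(len(values) != n_docs for values in scores.values()):
        raise ValueError("every algorithm needs one score per document")

    totals = {name: 0 for name in scores}
    for doc in range(n_docs):
        doc_scores = [values[doc] for values in scores.values()]
        for name, values in scores.items():
            # Rank = 1 + number of algorithms with a strictly higher PSNR
            # on this document (assumed tie rule: ties share the best rank).
            totals[name] += 1 + sum(other > values[doc] for other in doc_scores)

    # A lower rank sum means a better final position.
    return sorted(totals.items(), key=lambda item: item[1])


# Made-up PSNR values for three hypothetical algorithms on three documents.
example = {
    "A": [20.0, 18.0, 25.0],
    "B": [22.0, 17.0, 24.0],
    "C": [19.0, 16.0, 23.0],
}
print(rank_sum(example))  # [('A', 4), ('B', 5), ('C', 9)]
```

Summing ranks rather than averaging raw PSNR values keeps a single unusually easy or hard document from dominating the class result.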

Size Assessment

Image binarization is widely used as a key step in image compression schemes. Once the top-quality images are identified, the mean size of the monochromatic files produced by each evaluated algorithm is also assessed. The files are stored in TIFF (G4) format, which is possibly the most efficient lossless compression scheme for binary images [3].
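
A minimal Python sketch of the mean-size computation is given below. It assumes that each algorithm's outputs have already been saved as TIFF (G4) files with a `.tif` extension in one directory per algorithm; that layout and the function name `mean_file_size` are hypothetical, and the sketch does not perform the G4 encoding itself.

```python
import tempfile
from pathlib import Path
from statistics import mean


def mean_file_size(directory, pattern="*.tif"):
    """Return the mean size in bytes of the files in `directory` matching `pattern`."""
    sizes = [
        path.stat().st_size
        for path in Path(directory).glob(pattern)
        if path.is_file()
    ]
    if not sizes:
        raise ValueError(f"no files matching {pattern!r} in {directory}")
    return mean(sizes)


if __name__ == "__main__":
    # Demo with placeholder files standing in for real TIFF (G4) outputs.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "doc1.tif").write_bytes(b"\x00" * 100)
        (Path(tmp) / "doc2.tif").write_bytes(b"\x00" * 300)
        print(mean_file_size(tmp))
```

Comparing the resulting means across algorithms only makes sense when every algorithm is run on the same set of documents.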

Time Assessment

The processing time evaluation provides the order of magnitude of the time elapsed for binarizing each of the image datasets. The competitors are responsible for training their AI-based algorithms; training times are not computed in this competition.
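
One way to obtain such an order-of-magnitude figure is sketched below in Python. This is an assumption-laden illustration, not the competition's timing harness: `time_dataset`, `order_of_magnitude`, and the toy `global_threshold` binarizer are hypothetical names, images are assumed to be already loaded in memory as lists of grayscale rows, and only inference is timed, in line with training times being excluded.

```python
import math
import time


def time_dataset(binarize, images):
    """Run `binarize` on every image; return (outputs, elapsed_seconds)."""
    start = time.perf_counter()
    outputs = [binarize(image) for image in images]
    elapsed = time.perf_counter() - start
    return outputs, elapsed


def order_of_magnitude(seconds):
    """Return the base-10 order of magnitude, e.g. 42 s -> 1 and 0.5 s -> -1."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return math.floor(math.log10(seconds))


def global_threshold(image, threshold=128):
    """Toy stand-in binarizer: 1 for pixels darker than `threshold`, else 0."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


if __name__ == "__main__":
    # A tiny made-up dataset of one 2x2 grayscale image.
    dataset = [[[0, 200], [130, 90]]]
    outputs, elapsed = time_dataset(global_threshold, dataset)
    print(outputs, f"{elapsed:.6f} s")
```

Reporting only the order of magnitude makes the comparison less sensitive to small run-to-run fluctuations in the measured time.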

Dataset

The competition dataset is hosted at the Document Image Binarization (DIB) website:

https://dib.cin.ufpe.br/

Important Dates

Competition opens to participants: April 24, 2026
Registration & submission deadline: July 16, 2026
Final results announced at DocEng 2026: August 25, 2026

The submission deadline is 23:59 AoE (Anywhere on Earth). Each submission must include the executable code together with a short description of the participants' binarization scheme.

References

  1. Gonzalez, R.C. and Woods, R.E. (2018). Digital Image Processing. 4th Edition, Pearson Education, New York.
  2. Ntirogiannis, K., Gatos, B., Pratikakis, I. (2013). Performance Evaluation Methodology for Historical Document Image Binarization. IEEE Transactions on Image Processing 22(2), 595–609.
  3. Lins, R.D. and Machado, D. (2004). Comparative study of file formats for image storage and transmission. Journal of Electronic Imaging 13(1), 175–181. https://doi.org/10.1117/1.1634591

Contact

For questions about this competition, please contact Rafael Dueire Lins.