Optical character recognition guided image super resolution

Philipp Hildebrandt, Maximilian Schulze, Sarel Cohen, Vanja Doskoč, Raid Saabni, Tobias Friedrich

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Recognizing disturbed text in real-life images is a difficult problem, as information that is missing due to low resolution or out-of-focus text has to be recreated. Combining deep learning models for text super-resolution and optical character recognition can be a valuable tool to enlarge and enhance text images for better readability, and to recognize the text automatically afterwards. We achieve improved peak signal-to-noise ratio and text recognition accuracy scores over the state-of-the-art text super-resolution model TBSRN on the real-world low-resolution dataset TextZoom, while having a smaller theoretical model size through the use of quantization techniques. In addition, we show how different training strategies influence the performance of the resulting model.
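For readers unfamiliar with the peak signal-to-noise ratio (PSNR) mentioned above, the following is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). It is not the authors' evaluation code; the function name `psnr`, the flat-list image representation, and the default `max_value` of 255 (8-bit pixels) are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the PSNR in dB between two equally sized pixel sequences.

    Hypothetical helper: images are given as flat sequences of pixel
    intensities, and max_value is the largest possible intensity.
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error between the two images.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)

    # Identical images have zero error, so the ratio is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value indicates that the super-resolved image is closer to the high-resolution reference. PSNR measures only pixel-level fidelity, which is presumably why the abstract reports text recognition accuracy alongside it.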

Original language: English
Title of host publication: DocEng 2022 - Proceedings of the 2022 ACM Symposium on Document Engineering
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450395441
State: Published - 20 Sep 2022
Externally published: Yes
Event: 22nd ACM Symposium on Document Engineering, DocEng 2022 - Virtual, Online, United States
Duration: 20 Sep 2022 to 23 Sep 2022

Publication series

Name: DocEng 2022 - Proceedings of the 2022 ACM Symposium on Document Engineering

Conference

Conference: 22nd ACM Symposium on Document Engineering, DocEng 2022
Country/Territory: United States
City: Virtual, Online
Period: 20/09/22 to 23/09/22

Keywords

  • deep learning
  • image super-resolution
  • optical character recognition
  • unfocused images
