arXiv:2012.02033 [cs.CV]

SuperOCR: A Conversion from Optical Character Recognition to Image Captioning

Baohua Sun, Michael Lin, Hao Sha, Lin Yang

Published 2020-11-21, Version 1

Optical Character Recognition (OCR) has many real-world applications. Existing methods typically detect where the characters are and then recognize the character at each detected location, so the accuracy of character recognition depends on the performance of character detection. In this paper, we propose a method that recognizes characters without detecting the location of each character, by converting the OCR task into an image captioning task. One advantage of the proposed method is that labeled bounding boxes for the characters are not needed during training. Experimental results show that the proposed method outperforms existing methods on both license plate recognition and watermeter character recognition tasks. The proposed method is also deployed on a low-power (300 mW) CNN accelerator chip connected to a Raspberry Pi 3 for on-device applications.
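
To illustrate why a captioning formulation needs no per-character bounding boxes, here is a minimal sketch of how a label string alone could be turned into a caption-style training target. The abstract does not describe the paper's tokenization or model, so the special tokens, the helper names (build_vocab, encode_caption, decode_caption), and the fixed maximum length are all assumptions made for illustration.

```python
# Illustrative only: token names and helpers are hypothetical, not from the paper.
START, END, PAD = "<start>", "<end>", "<pad>"

def build_vocab(labels):
    """Map the special tokens and every character seen in the labels to integer ids."""
    chars = sorted({ch for text in labels for ch in text})
    return {tok: i for i, tok in enumerate([PAD, START, END] + chars)}

def encode_caption(text, vocab, max_len):
    """Turn a label string into a fixed-length target: <start> chars <end> <pad>..."""
    tokens = [START] + list(text) + [END]
    if len(tokens) > max_len:
        raise ValueError("label longer than max_len")
    tokens += [PAD] * (max_len - len(tokens))
    return [vocab[t] for t in tokens]

def decode_caption(ids, vocab):
    """Invert encode_caption: collect characters until <end>, skipping <start> and <pad>."""
    inv = {i: tok for tok, i in vocab.items()}
    out = []
    for i in ids:
        tok = inv[i]
        if tok == END:
            break
        if tok not in (START, PAD):
            out.append(tok)
    return "".join(out)

# The only supervision per image is its text (e.g. a plate number), with no boxes.
labels = ["AB123", "7C4"]
vocab = build_vocab(labels)
targets = [encode_caption(t, vocab, max_len=8) for t in labels]
```

In this sketch each training example is just an image paired with a token sequence, which is the same supervision an image captioning model consumes; a sequence decoder would then emit one character token per step until it produces the end token.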

Comments: 8 pages, 2 figures, 2 tables

Categories: cs.CV, eess.IV

Keywords: optical character recognition, image captioning, watermeter character recognition tasks, conversion, cnn accelerator chip

Related articles:

arXiv:1903.12020 [cs.CV] (Published 2019-03-28)

Describing like humans: on diversity in image captioning

Qingzhong Wang, Antoni B. Chan

arXiv:2210.10914 [cs.CV] (Published 2022-10-19)

Prophet Attention: Predicting Attention with Future Attention for Improved Image Captioning

Fenglin Liu, Xuewei Ma, Xuancheng Ren, Xian Wu, Wei Fan, Yuexian Zou, Xu Sun

arXiv:2107.14178 [cs.CV] (Published 2021-07-29, updated 2022-07-14)

ReFormer: The Relational Transformer for Image Captioning