[Search] OCR

1. E2E OCR

1.1 Donut

Donut: Document Understanding Transformer without OCR
[Paper] https://arxiv.org/pdf/2111.15664v1.pdf
[Review] https://yhkim4504.tistory.com/15?category=843360
[Code] N/A

1.2 LayoutLMv2

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

[Paper] https://arxiv.org/abs/2012.14740
[Review] https://www.youtube.com/watch?v=BI2Mx5cdc60&feature=youtu.be
[Code] https://github.com/microsoft/unilm/tree/master/layoutlmv2

1.3 E2E-MLT

E2E-MLT: an Unconstrained End-to-End Method for Multi-Language Scene Text (2018)

[Paper] https://arxiv.org/ftp/arxiv/papers/1801/1801.09919.pdf
[Code] https://github.com/MichalBusta/E2E-MLT

1.4 FOTS

FOTS: Fast Oriented Text Spotting with a Unified Network (2018)

[Paper] https://arxiv.org/abs/1801.01671
[Code] https://github.com/jiangxiluning/FOTS.PyTorch

1.5 PGNet

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network (2021) (**)

[Paper] https://arxiv.org/pdf/2104.05458v1.pdf
[Code] https://github.com/PaddlePaddle/PaddleOCR-

1.6 STN-OCR

STN-OCR: A single Neural Network for Text Detection and Text Recognition (2017) (**)

[Paper] https://arxiv.org/pdf/1707.08831v1.pdf
[Code] https://github.com/Bartzi/stn-ocr

1.7 TransVTSpotter

A Bilingual, OpenWorld Video Text Dataset and End-to-end Video Text Spotter with Transformer (TransVTSpotter)
[Paper] https://arxiv.org/pdf/2112.04888.pdf
[Code] https://github.com/weijiawu/transvtspotter

1.8 Cost-effective End-to-end Information Extraction for Semi-structured Document Images

[Paper] https://arxiv.org/pdf/2104.08041.pdf
[Paper review] N/A
[Code] N/A

2. Transformer-based OCR

2.1 SATRN (Self-Attention Text Recognition Network)

On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention

[Paper] https://openaccess.thecvf.com/content_CVPRW_2020/papers/w34/Lee_On_Recognizing_Texts_of_Arbitrary_Shapes_With_2D_Self-Attention_CVPRW_2020_paper.pdf
[Paper review] https://oranz.tistory.com/5
[Code] https://github.com/clovaai/satrn

2.2 TrOCR

TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models (2021)

[Paper] https://arxiv.org/pdf/2109.10282.pdf
[Code] https://github.com/microsoft/unilm/tree/master/trocr

2.3 Transformer-Based Text Detection in the Wild
[Paper] https://openaccess.thecvf.com/content/CVPR2021W/VOCVALC/papers/Raisi_Transformer-Based_Text_Detection_in_the_Wild_CVPRW_2021_paper.pdf

2.4 ViTSTR

Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
[Paper] https://arxiv.org/pdf/2105.08582.pdf
[Code] https://github.com/roatienza/deep-text-recognition-benchmark
