Note: This summary covers deep learning methods only; traditional methods are out of scope.
1. Text recognition steps
1.1 Detection: find the regions that contain text (proposals).
1.2 Classification: recognize the text in each region.
2. Text detection
Text detection methods follow two main lines: two-step and one-step.
2.1 Two-step method: Faster R-CNN.
2.2 One-step method: YOLO. One-step methods are faster than two-step methods, at some cost in accuracy.
Text detection can also be divided by the orientation of the text.
2.3 Horizontal text detection: the text box has four degrees of freedom, as in ordinary object detection. A good algorithm for horizontal text detection is CTPN (ECCV 2016) from Qiao Yu's team.
2.4 Tilted text detection: the text box is an irregular quadrilateral with eight degrees of freedom. My preferred methods for tilted text detection are EAST and SegLink (both CVPR 2017). Typical pipeline: detect the text box, correct the skew with the Radon or Hough transform or a similar method, split the image into single lines using a projection histogram, and finally run single-line OCR on each line.
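The projection-histogram step of the pipeline above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the image is already binarized and deskewed, and the function name `split_lines` and the `min_ink` threshold are hypothetical names chosen for this example.

```python
import numpy as np

def split_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges of text lines in a binary image.

    binary: 2-D array, nonzero = ink (text pixel), 0 = background.
    Assumes the text is already deskewed so that lines are horizontal.
    """
    # Horizontal projection: count ink pixels in every row.
    profile = (np.asarray(binary) != 0).sum(axis=1)
    has_ink = (profile >= min_ink).tolist()

    lines, start = [], None
    for y, ink in enumerate(has_ink):
        if ink and start is None:
            start = y                  # a text line begins
        elif not ink and start is not None:
            lines.append((start, y))   # a text line ends (bottom is exclusive)
            start = None
    if start is not None:              # line touching the bottom edge
        lines.append((start, len(has_ink)))
    return lines

# Toy page: two "text lines" separated by blank rows.
page = np.zeros((8, 10), dtype=np.uint8)
page[1:3, 2:8] = 1
page[5:7, 1:9] = 1
line_ranges = split_lines(page)
```

Each returned range can be used to crop one line, for example `page[top:bottom]`, before passing it to a single-line recognizer. On real scans a higher `min_ink` is usually needed to ignore noise.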
3. Text recognition
Only methods that do not require character segmentation are considered.
3.1 Fixed length, with each character treated as independent: for example, multi-digit number recognition.
3.2 Variable length: RNN/LSTM/GRU + CTC. The CRNN work from Xiang Bai's team explains this quite clearly.
3.3 Variable length with an attention mechanism (CNN + RNN + attention): divided into hard attention (outputs a hard location directly, so it cannot be trained by plain backpropagation), soft attention (can be trained by plain backpropagation), and gradient-based attention.
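For the CTC route in 3.2, the way a per-frame output becomes a variable-length string can be sketched with greedy (best-path) decoding: take the most likely class in every frame, merge consecutive repeats, then drop the blank symbol. This is a minimal sketch under assumptions: the name `ctc_greedy_decode`, the class string `classes`, and the choice of index 0 as the blank are illustrative, and real systems may use beam search or a lexicon instead.

```python
import numpy as np

def ctc_greedy_decode(logits, classes, blank=0):
    """Greedy CTC decoding.

    logits:  (T, C) array of per-frame class scores.
    classes: string of length C; classes[blank] is a placeholder for the blank.
    """
    best = np.asarray(logits).argmax(axis=1).tolist()  # best class per frame
    chars, prev = [], None
    for k in best:
        # Emit a character only when the class changes and is not the blank.
        if k != prev and k != blank:
            chars.append(classes[k])
        prev = k
    return "".join(chars)

# Toy example: index 0 is the blank, indices 1..10 are the digits 0..9.
classes = "-0123456789"
path = [2, 2, 0, 3, 3, 0, 3, 0]       # frames: 1 1 - 2 2 - 2 -
logits = np.eye(len(classes))[path]   # one-hot scores that follow `path`
decoded = ctc_greedy_decode(logits, classes)
```

The blank between the two runs of class 3 is what lets the decoder keep a genuinely repeated character: without it, the repeated frames would be merged into one.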
Reference: https://www.zhihu.com/question/20191727
A summary of the algorithms used for image text recognition (OCR)