OCR recognition, Python 3.5 version

Source: Internet
Author: User

I have only just started with this and know very little, so I simply followed a tutorial.

Requirement: recognize the text in an image
Environment: Windows

Development language: Python 3.5

Tools: 1. pyocr
2. PIL
3. tesseract-ocr

Steps:

1. Install pyocr

With network access, run this command directly:
pip install pyocr

Without network access, download it from https://pypi.python.org/pypi/pyocr/0.4.1 and install it manually.

2. Install PIL (I never got this to install successfully; there seems to be no build for 3.5, only for 2.x. This step can be skipped.)
With network access, run this command directly:
pip install PIL

Without network access, download it from http://www.pythonware.com/products/pil/index.htm and install it manually.
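
Before moving on, it can help to confirm which of the two Python packages are actually importable. The following is a minimal sketch that uses only the standard library; the helper name check_modules is made up for this example and is not part of pyocr or PIL.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: True/False} depending on whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "pyocr" and "PIL" are the import names used by the script in this article.
    for name, found in check_modules(["pyocr", "PIL"]).items():
        print(name, "OK" if found else "missing")
```

If PIL is reported as missing, note that the maintained fork Pillow provides the same PIL import name and installs with "pip install Pillow". It may already have been pulled in as a dependency of pyocr, which would explain why skipping this step still works.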

3. Install tesseract-ocr

http://jaist.dl.sourceforge.net/project/tesseract-ocr-alt/tesseract-ocr-setup-3.02.02.exe

This is an EXE installer; run it after downloading. The default options are recommended, and the default install directory is C:\Program Files (x86)\Tesseract-OCR
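
As far as I can tell, pyocr finds the tesseract executable by searching the PATH environment variable, so the install directory has to be listed there. The sketch below appends a directory to PATH for the current process only. The function name add_to_path is hypothetical, and DEFAULT_DIR assumes the default install location mentioned above.

```python
import os
import shutil

# Assumed install location, taken from the default mentioned above.
DEFAULT_DIR = r"C:\Program Files (x86)\Tesseract-OCR"


def add_to_path(directory, environ=os.environ):
    """Append `directory` to PATH in `environ` unless it is already listed."""
    parts = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.append(directory)
        environ["PATH"] = os.pathsep.join(parts)
    return environ["PATH"]


if __name__ == "__main__":
    add_to_path(DEFAULT_DIR)
    # shutil.which returns None when the executable cannot be found on PATH.
    print("tesseract found at:", shutil.which("tesseract"))
```

This only affects the running Python process; to make the change permanent, edit PATH in the Windows system settings instead.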

# coding=utf-8
__author__ = 'YJJ'

# https://github.com/tesseract-ocr
import sys
import importlib
# reload(sys)
importlib.reload(sys)  # leftover from the Python 2 idiom; harmless in Python 3
# sys.setdefaultencoding('utf-8')

import os
os.environ['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.UTF8'
try:
    from pyocr import pyocr
    from PIL import Image
except ImportError:
    print('Module import error, please install using pip; pytesseract depends on the following libraries:')
    print('http://www.lfd.uci.edu/~gohlke/pythonlibs/#pil')
    print('http://code.google.com/p/tesseract-ocr/')
    raise SystemExit

# Copy the list of OCR tools that pyocr detected on this machine.
tools = pyocr.get_available_tools()[:]
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
print("Using '%s'" % (tools[0].get_name()))
# The lang= value of this first call was lost in the source, so it is omitted here.
print(tools[0].image_to_string(Image.open("D:\\123.png")))
print(tools[0].image_to_string(Image.open("D:\\3434.png"), lang='chi_sim'))
# print(tools[0].image_to_string(Image.open('d:\\3535.png'), lang='chi_sim'))
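
The script above hard-codes the image paths and repeats the same call for each file. One possible way to tidy this up is a small helper that takes the tool, the image and the language as parameters. This is only a sketch: ocr_image and main are names invented here, and the helper assumes the tool object exposes image_to_string(image, lang=...) as used in the script.

```python
def ocr_image(tool, image, lang=None):
    """Run OCR with any object exposing image_to_string(image, lang=...)."""
    text = tool.image_to_string(image, lang=lang)
    return text.strip()


def main():
    # Imported here so the helper above can be loaded without pyocr installed.
    from pyocr import pyocr
    from PIL import Image

    tools = pyocr.get_available_tools()
    if not tools:
        raise SystemExit("No OCR tool found")
    # Same file and language as in the script above.
    print(ocr_image(tools[0], Image.open("D:\\3434.png"), lang="chi_sim"))
```

Calling main() reproduces the second recognition from the script; the helper itself can also be tried out with a stand-in tool object.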

Test files (place the images on the D: drive):

123.png

3434.png

Output:

Using 'Tesseract (sh)'
7364
Beg I only another u going r 1th generation

Problems that you may encounter throughout the process

1. If the console prints "No OCR tool found", tesseract-ocr was not installed successfully. (Sometimes the error goes away after restarting the software; strangely, that is what happened to me.) Step into get_available_tools with the debugger to see which OCR libraries it detects on this machine. There are three kinds:

libtesseract,
tesseract,
cuneiform.

This article uses the second one, tesseract.

For Tesseract installation details, please go to.
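
Instead of relying on tools[0], the wanted tool can be picked from the list by name. The sketch below is one way to do that; pick_tool is a hypothetical helper, and the default name "Tesseract (sh)" is an assumption based on the "Using ..." line in the output shown earlier.

```python
def pick_tool(tools, wanted="Tesseract (sh)"):
    """Return the first tool whose get_name() equals `wanted`, ignoring case."""
    for tool in tools:
        if tool.get_name().lower() == wanted.lower():
            return tool
    return None


def main():
    # Imported here so pick_tool can be loaded without pyocr installed.
    from pyocr import pyocr

    tools = pyocr.get_available_tools()
    print("Detected:", [t.get_name() for t in tools])
    tool = pick_tool(tools)
    print("Selected:", tool.get_name() if tool else "none")
```

Printing the detected names first also shows quickly whether pyocr sees any tool at all, which is the situation behind the "No OCR tool found" message.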

2. When recognizing images that contain Chinese, you will hit an "allow_blob_division" error.

You need to download the Chinese language data for tesseract-ocr from https://sourceforge.net/projects/tesseract-ocr-alt/files/tesseract-ocr-3.02.chi_sim.tar.gz/download. The archive contains Tesseract language data files; chi_sim.traineddata is the Simplified Chinese one. Put that file in the C:\Program Files (x86)\Tesseract-OCR\tessdata directory. For the detailed procedure, see https://www.cnblogs.com/syqlp/p/5462459.html
