Python practice === Use Python to recognize Chinese characters in images
Modules to be installed:
PIL (on Python 3 this is provided by the Pillow package: pip install Pillow)
pytesseract
Tools to be downloaded:
http://download.csdn.net/download/bo_mask/10196285
Download and unzip the package, then install it to the default path, as shown in Figure 1:
Then copy the extracted chi_sim.traineddata file into the tessdata folder under the installation path, C:\Program Files (x86)\Tesseract-OCR\tessdata, as shown in Figures 2 and 3:
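A quick way to confirm that the language file ended up in the right folder is to check for it from Python. The following is only a minimal sketch: the helper name missing_language_files is made up for illustration, and TESSDATA_DIR assumes the default installation path mentioned above, so adjust it if you installed somewhere else.

```python
from pathlib import Path

# Assumed default install location (taken from the path above); change it to match your machine.
TESSDATA_DIR = Path(r"C:\Program Files (x86)\Tesseract-OCR\tessdata")


def missing_language_files(tessdata_dir, langs=("chi_sim",)):
    """Return the names of the .traineddata files that are NOT present in tessdata_dir."""
    tessdata_dir = Path(tessdata_dir)
    return [
        f"{lang}.traineddata"
        for lang in langs
        if not (tessdata_dir / f"{lang}.traineddata").is_file()
    ]


if __name__ == "__main__":
    missing = missing_language_files(TESSDATA_DIR)
    if missing:
        print("Missing language files:", ", ".join(missing))
    else:
        print("chi_sim.traineddata is in place.")
```

If the script reports a missing file, copy chi_sim.traineddata into the tessdata folder again before continuing.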
Once everything above is installed, one last step remains: editing the configuration file (in the pytesseract package this setting lives in pytesseract.py), as shown in Figure 4:
Open the file, comment out the original tesseract_cmd line, and add a new one:
# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
# tesseract_cmd = 'tesseract'
tesseract_cmd = u'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'  # Use the path of your own installation, corresponding to the path in Figure 1
Save! Environment configuration complete ~
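If you would rather not edit the installed library file, pytesseract also lets you set the same value from your own script through pytesseract.pytesseract.tesseract_cmd. The sketch below is one possible way to do that, not the method used in this article: the helper names locate_tesseract and configure_pytesseract are hypothetical, and the fallback path assumes the default Windows installation directory.

```python
import os
import shutil

# Assumed default Windows install path; replace it with your own installation path.
DEFAULT_WINDOWS_PATH = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


def locate_tesseract(fallback=DEFAULT_WINDOWS_PATH):
    """Return the path of the tesseract executable, or None if it cannot be found.

    The system PATH is checked first; the fallback is used only if it exists as a file.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    return fallback if os.path.isfile(fallback) else None


def configure_pytesseract(fallback=DEFAULT_WINDOWS_PATH):
    """Point pytesseract at the located executable and return the path that was used."""
    import pytesseract  # imported here so locate_tesseract works without pytesseract installed

    path = locate_tesseract(fallback)
    if path is None:
        raise FileNotFoundError("tesseract executable not found; check the installation path")
    pytesseract.pytesseract.tesseract_cmd = path
    return path

# Typical use at the top of a script:
#     configure_pytesseract()
```

Setting the path at runtime has the advantage that it survives a reinstall or upgrade of pytesseract, whereas an edited library file would be overwritten.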
For example, save the image to be recognized as 111.png:
# test.py
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open('111.png'), lang='chi_sim')
print(text)
Execution result:
(-.-|... ....
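The raw string returned by image_to_string can contain stray spaces between Chinese characters, so a small cleanup step is sometimes useful before using the text. The following is a rough sketch under stated assumptions: the helper name tidy_ocr_text is invented for illustration, and it assumes that the basic CJK Unified Ideographs range (U+4E00 to U+9FFF) is enough to cover the characters in your images.

```python
import re

# Spaces or tabs that sit directly between two CJK characters.
# Assumption: U+4E00..U+9FFF is sufficient for the simplified Chinese text being recognized.
_CJK_GAP = re.compile("(?<=[\u4e00-\u9fff])[ \t]+(?=[\u4e00-\u9fff])")


def tidy_ocr_text(text):
    """Remove spaces between adjacent Chinese characters and trim the ends.

    Line breaks and spaces next to non-Chinese text (for example English words) are kept.
    """
    return _CJK_GAP.sub("", text).strip()

# Possible use together with the script above:
#     print(tidy_ocr_text(text))
```

This only tidies spacing; it does not correct characters that were recognized wrongly.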
Summary: