This article shares a Python implementation of text recognition in images (OCR). We hope it helps anyone who needs it.
We take the recognition of a poem as an example.
Here is the picture we are going to recognize.
After running the code we get the result below. A few characters are not recognized correctly, but most of them are.
The wind-fast day high ape Sao-yun abdominal fang less white birds flying chicken boundless falling wood, not long-time blind to go to thousands of sad autumn often 1 at first, more than a century of disease alone magic emanation difficult bitter hate the amount of a new stop Shu unitary sail
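To put a rough number on "most characters are recognized", we can compare the OCR output with the known text of the poem. The following is only a sketch using the standard library's difflib. The `recognized` string is a made-up example with one wrong character, not the real output of the run above, and `similarity` is a helper name we introduce here.

```python
from difflib import SequenceMatcher


def similarity(expected, recognized):
    """Return a 0..1 score of how closely the OCR text matches the expected text."""
    return SequenceMatcher(None, expected, recognized).ratio()


# First line of the poem, used as the reference text.
expected = "风急天高猿啸哀"
# Hypothetical OCR output with one misrecognized character (illustration only).
recognized = "风急天高猿骚哀"

print(round(similarity(expected, recognized), 3))  # prints 0.857
```

Six of the seven characters match, so the score is 12/14. A score of 1.0 would mean the two strings are identical.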
A single line of code can recognize the text in the picture, but some preparation is needed first.
The sections below cover installing the required libraries and the recognition engine. Only once these are in place can Python do the recognition in one line of code.
1. Installing pytesseract and PIL
Both packages can be installed with pip. Note that PIL is published on PyPI under the name Pillow, so that is the name to install.
(1) Command-line installation
pip install Pillow
pip install pytesseract
(2) If you use the PyCharm editor, you can install the packages directly from PyCharm.
Follow the steps on the Settings page of PyCharm.
This installs pytesseract. To install PIL, repeat the same steps, but in the third step search for Pillow instead and click Install.
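Before going further, it can help to confirm that both packages are visible to the interpreter you are actually running. Here is a small sketch using only the standard library. The helper name `missing_packages` and the `REQUIRED` mapping are our own, not part of either package.

```python
import importlib.util

# Import name -> name to pass to pip (PIL is installed as Pillow).
REQUIRED = {"PIL": "Pillow", "pytesseract": "pytesseract"}


def missing_packages(required=REQUIRED):
    """Return the pip names of the packages that cannot be imported."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Still to install: pip install " + " ".join(missing))
else:
    print("Pillow and pytesseract are both installed.")
```

If PyCharm and the command line use different interpreters, run this check in each one, since a package installed for one is not visible to the other.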
With the libraries installed, run the following code:
from PIL import Image
import pytesseract
text = pytesseract.image_to_string(Image.open('denggao.jpeg'), lang='chi_sim')
print(text)
Running it raises an error. The reason: the recognition engine, Tesseract-OCR, is not installed.
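pytesseract is only a wrapper: it calls the external tesseract program, so the error appears whenever that program cannot be found. A quick way to check is to look for the executable on PATH, sketched below with the standard library. The helper name `find_engine` is our own.

```python
import shutil


def find_engine(names=("tesseract", "tesseract.exe")):
    """Return the full path of the first engine executable found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    return None


engine = find_engine()
if engine is None:
    print("Tesseract-OCR was not found on PATH; install it or configure its location.")
else:
    print("Tesseract-OCR found at:", engine)
```

A result of None here does not always mean the engine is missing. It may simply be installed in a folder that is not on PATH, in which case pytesseract has to be told where it is.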
2. Installing the recognition engine Tesseract-OCR
After unpacking and installing Tesseract-OCR, do the following to enable Chinese recognition, because Tesseract-OCR does not support Chinese by default.
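Language support in Tesseract comes from data files named after the language code, so lang='chi_sim' needs a file called chi_sim.traineddata in the engine's tessdata folder. The sketch below checks for that file. The folder path is an assumption based on the install location used in this article, and `has_language` is a helper name we introduce here.

```python
import os

# Assumption: tessdata folder under the install location used in this article.
TESSDATA_DIR = "C:/Program Files (x86)/Tesseract-OCR/tessdata"


def has_language(tessdata_dir, lang):
    """Return True if the data file for the given language code is present."""
    return os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))


if has_language(TESSDATA_DIR, "chi_sim"):
    print("Simplified Chinese data is present.")
else:
    print("chi_sim.traineddata was not found in", TESSDATA_DIR)
```

Adjust TESSDATA_DIR if Tesseract-OCR was installed somewhere else.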
(2) After Tesseract-OCR is installed, a little configuration is still needed.
Find pytesseract.py under C:\Users\huxiu\AppData\Local\Programs\Python\Python35\Lib\site-packages\pytesseract, open it, and make the following change:
# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
# tesseract_cmd = 'tesseract'
tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
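An edit inside site-packages is lost whenever pytesseract is reinstalled or upgraded. As an alternative, the same variable can be set from your own script through pytesseract.pytesseract.tesseract_cmd. The sketch below does that. The second path in `CANDIDATES` is a hypothetical alternative location, and the helper names are our own.

```python
import os

# Assumption: possible install locations. The first is the path used in this
# article; the second is a hypothetical alternative.
CANDIDATES = [
    "C:/Program Files (x86)/Tesseract-OCR/tesseract.exe",
    "C:/Program Files/Tesseract-OCR/tesseract.exe",
]


def first_existing(paths):
    """Return the first path that points to an existing file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(paths=CANDIDATES):
    """Point pytesseract at the engine without editing pytesseract.py."""
    import pytesseract  # imported here so the helper above works without it

    exe = first_existing(paths)
    if exe is not None:
        pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

Call configure_pytesseract() once at the top of the script, before the first call to image_to_string. If it returns None, none of the candidate paths exist and the list needs adjusting.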
You can also open pytesseract.py quickly from within PyCharm.
At this point all configuration is complete. Running the following code converts the image of Du Fu's poem "Ascent" into text.
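For repeated use, the recognition call can be wrapped in a small function that first checks the image exists, so a wrong file name gives a clear message. This is only a sketch. The function name `recognize_text` is our own, and it assumes Pillow, pytesseract, the engine and the chi_sim data are all set up as described above.

```python
import os


def recognize_text(image_path, lang="chi_sim"):
    """Return the text recognized in the image at image_path."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError("Image not found: " + image_path)

    # Imported here so the path check works even before the libraries are installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=lang)
```

With the picture from this article in the working directory, print(recognize_text('denggao.jpeg')) would print the recognized poem.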