Recognizing text in images with Python

Source: Internet
Author: User

This article shows how to implement image text recognition in Python. We hope it helps anyone who needs it.

We take recognizing a poem as an example. Here is the image we are going to recognize.

Running the code gives the result below. A few characters are recognized incorrectly, but most of them are identified correctly.

The wind-fast day high ape Sao-yun abdominal fang less white birds flying chicken boundless falling wood, not long-time blind to go to thousands of sad autumn often 1 at first, more than a century of disease alone magic emanation difficult bitter hate the amount of a new stop Shu unitary sail

A single line of code does the recognition, but some preparation is needed first.

    • We need two libraries: pytesseract and PIL

    • We also need to install the recognition engine Tesseract-OCR

The following sections cover these installations. Once everything is in place, image text recognition in Python really does come down to one line of code.

One: Install pytesseract and PIL

Both packages can be installed with pip. Note that PIL is published on PyPI under the name Pillow, its maintained successor, which is still imported as PIL.
1. Command-line installation:
pip install Pillow
pip install pytesseract
2. If you use the PyCharm editor, you can install the packages directly from PyCharm.
Follow the steps on the Settings page of PyCharm.

This installs pytesseract. To install PIL, search for it (as Pillow) in the third step above and click Install.
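Before moving on, it can be useful to confirm that both libraries are visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library. The helper name missing_modules is made up for this illustration, and it checks import names (PIL, pytesseract), not pip package names.

```python
import importlib.util

# Import names used in the code, not pip names: Pillow is imported as "PIL".
REQUIRED_MODULES = ["PIL", "pytesseract"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("Both libraries are installed.")
```

If the script lists a module as missing even though pip reported success, pip and the interpreter (for example the one PyCharm is configured to use) probably point at different Python installations.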

With both libraries installed, run the following code:

from PIL import Image
import pytesseract

# 'chi_sim' selects the Simplified Chinese language data
text = pytesseract.image_to_string(Image.open('denggao.jpeg'), lang='chi_sim')
print(text)

Running it reports an error. The cause is that the recognition engine Tesseract-OCR is not installed.
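pytesseract is only a wrapper that calls the tesseract executable, so the error appears whenever that executable cannot be found. As a rough sketch, the standard library can tell you in advance whether the engine is reachable on PATH. The function name tesseract_available is invented for this example, and recent pytesseract releases report the same condition as pytesseract.TesseractNotFoundError.

```python
import shutil


def tesseract_available(command="tesseract"):
    """Return True if an executable with the given name can be found on PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract-OCR found at:", shutil.which("tesseract"))
    else:
        print("Tesseract-OCR is not on PATH; install it or configure its location.")
```

A False result does not necessarily mean the engine is absent. It may be installed in a folder that is not on PATH, which is the case the configuration step later in the article deals with.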

Two: Install the recognition engine Tesseract-OCR

    • 1. Download the installation package below and run the installer
      Tesseract-OCR installation package and Chinese language pack

Tesseract-OCR does not support Chinese recognition by default. After unpacking and installing Tesseract-OCR, do the following to add Chinese support.
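Tesseract keeps its language data as files named <lang>.traineddata inside a tessdata folder, so Chinese support amounts to having chi_sim.traineddata there. The sketch below checks for that file. The helper name has_language_data is hypothetical, and the folder in the usage part is an assumption derived from the install path used later in this article, so adjust it to your machine.

```python
import os


def has_language_data(tessdata_dir, lang="chi_sim"):
    """Return True if <lang>.traineddata exists in the given tessdata folder."""
    return os.path.isfile(os.path.join(tessdata_dir, lang + ".traineddata"))


if __name__ == "__main__":
    # Assumed location: the tessdata folder next to tesseract.exe.
    tessdata = "C:/Program Files (x86)/Tesseract-OCR/tessdata"
    if has_language_data(tessdata):
        print("Chinese language data is installed.")
    else:
        print("chi_sim.traineddata was not found in", tessdata)
```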

    • 2. After Tesseract-OCR is installed, a little configuration is still needed
      Find pytesseract.py under C:\Users\huxiu\AppData\Local\Programs\Python\Python35\Lib\site-packages\pytesseract, open it, and make the following change:

# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
# tesseract_cmd = 'tesseract'
tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'

You can also open pytesseract.py quickly from within PyCharm.
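As an alternative to editing the installed library file, the same variable can be set from your own script at run time through pytesseract.pytesseract.tesseract_cmd. The following is a sketch under that approach, not the method used in this article. The function name configure_tesseract is made up for illustration, and the default path is the one from the article, which may differ on your system.

```python
import os

# Install location taken from the article; adjust it to your own installation.
TESSERACT_EXE = "C:/Program Files (x86)/Tesseract-OCR/tesseract.exe"


def configure_tesseract(exe_path=TESSERACT_EXE):
    """Point pytesseract at exe_path. Return True if the file exists and was set."""
    if not os.path.isfile(exe_path):
        return False
    # Imported here so the path check above works even without pytesseract.
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = exe_path
    return True


if __name__ == "__main__":
    if configure_tesseract():
        print("pytesseract will use:", TESSERACT_EXE)
    else:
        print("No tesseract executable at:", TESSERACT_EXE)
```

Setting the path in your script has the advantage that the change survives reinstalling or upgrading pytesseract, which would overwrite an edited pytesseract.py.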

At this point all of the configuration is complete. Running the code shown earlier now converts the image of Du Fu's poem "Denggao" (Climbing High) into text.
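For reuse, the recognition call can be wrapped in a small function. This is only a sketch: image_to_text and tidy_ocr_text are names invented here, the file name denggao.jpeg is the one from this article, and the clean-up step (dropping blank lines and surrounding spaces) is an optional addition that the original code does not perform.

```python
def tidy_ocr_text(text):
    """Strip surrounding whitespace from each line and drop empty lines."""
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def image_to_text(image_path, lang="chi_sim"):
    """Run Tesseract-OCR on an image file and return the tidied text."""
    # Imported inside the function so tidy_ocr_text can be used on its own.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        raw = pytesseract.image_to_string(image, lang=lang)
    return tidy_ocr_text(raw)


# Example call, assuming the image file is in the current directory:
# print(image_to_text("denggao.jpeg"))
```

The clean-up does not correct misrecognized characters. It only makes the output easier to compare with the original poem line by line.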
