Python3 sample code for image text recognition

Source: Internet
Author: User

This is my fifth day of teaching myself Python 3. Today I wanted to use Python to recognize text in images. I didn't expect it to be so easy: the recognition itself takes essentially one line of code.

    from PIL import Image
    import pytesseract
    text = pytesseract.image_to_string(Image.open('denggao.jpeg'), lang='chi_sim')
    print(text)

Let's take recognizing a poem as an example. First, here is the image we want to recognize.

 

When we run the code, a few characters are misrecognized, but most of them are identified correctly. The result is:

The wind is anxious, the sky is high, the ape is sad, the birds are few white birds, the phoenix is boundless, the wood falls under the Xiao, not a long amount of work blind, the peak is a thousand miles of sorrow and autumn often at first glance, more than a hundred years of illness alone login difficult, hate, the amount of frome float, new stop yunfan
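To put a number on "a few characters are misrecognized", one option is to compare the OCR output with a known-good transcription. The sketch below uses the standard-library difflib module for this. The function name `similarity` is an illustrative choice, and the second string is a made-up misreading of the poem's first line, not real tesseract output.

```python
from difflib import SequenceMatcher


def similarity(expected: str, recognized: str) -> float:
    """Return a similarity ratio between 0 and 1 for two strings."""
    return SequenceMatcher(None, expected, recognized).ratio()


# First line of Du Fu's "Deng Gao" and a hypothetical OCR result
# in which the last character was misread.
expected = "风急天高猿啸哀"
recognized = "风急天高猿啸衰"

print(f"similarity: {similarity(expected, recognized):.2f}")
```

A ratio of 1.0 means the two strings are identical; lower values mean more characters differ.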

Before that one line of code can recognize images, we need to make some preparations.

  1. We need two libraries: pytesseract and PIL (PIL is provided by the Pillow package).
  2. We also need to install the recognition engine tesseract-ocr.

Next, let's go through the installation of these components, because the one line of recognition code only works once all of them are installed.

1. Install pytesseract and PIL

Install these two packages using pip

-1. Command-line installation

    pip install Pillow
    pip install pytesseract

(The PIL module is provided by the Pillow package, so Pillow is the name pip needs.)
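A quick way to confirm that both installs worked is to ask Python whether the modules can be found. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is a hypothetical one introduced for illustration.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Pillow is imported as "PIL"; pytesseract is imported under its own name.
missing = missing_modules(["PIL", "pytesseract"])
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("Both libraries are installed.")
```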

-2. If you use the PyCharm editor, you can install the packages directly from PyCharm.

Follow these steps on the Settings page of PyCharm:

 

In this way, you can successfully install pytesseract. To install PIL, search for Pillow (the package that provides PIL) in step 3 and click Install.

At this point the libraries are installed, so we run the following code:

    from PIL import Image
    import pytesseract
    text = pytesseract.image_to_string(Image.open('denggao.jpeg'), lang='chi_sim')
    print(text)

An error is reported, because the recognition engine tesseract-ocr is not installed yet.
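One way to see whether the engine is reachable before calling pytesseract is to look for the executable on PATH. This is a minimal sketch with the standard library; the helper name `find_engine` is hypothetical, and it assumes the executable is called `tesseract`. Note that a missing result only means the engine is not on PATH, which can also happen when it is installed in a folder that PATH does not include.

```python
import shutil


def find_engine(command="tesseract"):
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(command)


path = find_engine()
if path is None:
    print("tesseract was not found on PATH.")
else:
    print("tesseract found at:", path)
```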

2. Install the recognition engine tesseract-ocr

1. Download the following installation packages and click Install:
Tesseract-ocr installation package and Chinese language pack

Unzip and install tesseract-ocr, then perform the following operations to enable Chinese recognition, because tesseract-ocr does not support Chinese by default.
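Tesseract keeps each language in a file named after the language code with a .traineddata extension, inside its tessdata folder. Under that assumption, the sketch below checks whether the Simplified Chinese data used in this article (chi_sim) is in place. The helper name `has_language` is hypothetical, and the folder path is an assumption based on the install location that appears later in this article; adjust it to your own installation.

```python
from pathlib import Path


def has_language(tessdata_dir, lang="chi_sim"):
    """Check whether <lang>.traineddata exists in the given tessdata folder."""
    return (Path(tessdata_dir) / f"{lang}.traineddata").is_file()


# Assumed location; adjust to wherever tesseract-ocr was installed.
tessdata = "C:/Program Files (x86)/Tesseract-OCR/tessdata"
print("chi_sim available:", has_language(tessdata))
```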

2. After tesseract-ocr is installed, we still need to configure it.

Find pytesseract.py in C:\Users\huxiu\AppData\Local\Programs\Python\Python35\Lib\site-packages\pytesseract and make the following change:

    # CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
    #tesseract_cmd = 'tesseract'
    tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
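As an alternative to editing the installed library file, pytesseract also lets a script set the same variable at runtime through `pytesseract.pytesseract.tesseract_cmd`, which has the advantage of surviving a reinstall of the package. The sketch below wraps that assignment in a hypothetical helper, `configure_tesseract`, that first checks the path exists; this is a suggestion, not the approach the article itself takes.

```python
import os


def configure_tesseract(path):
    """Point pytesseract at a specific tesseract executable."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"tesseract executable not found: {path}")
    # Imported here so the path check runs even if pytesseract is missing.
    import pytesseract
    pytesseract.pytesseract.tesseract_cmd = path


# Example call, using the path from this article (adjust as needed):
# configure_tesseract("C:/Program Files (x86)/Tesseract-OCR/tesseract.exe")
```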

You can also use PyCharm to quickly open pytesseract.py.

All configuration is now complete. Run the code from section 1 again to convert the image of Du Fu's poem into text.
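For reuse, the recognition step can be wrapped in a small function. The following is a sketch rather than the article's own code: the name `recognize_text` is hypothetical, the default language `chi_sim` matches the snippet used throughout, and the file check is an added convenience so that a wrong path gives a clear message.

```python
from pathlib import Path


def recognize_text(image_path, lang="chi_sim"):
    """Run tesseract OCR on an image file and return the recognized text."""
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    # Imported lazily so the path check works before the libraries are installed.
    from PIL import Image
    import pytesseract
    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)
```

With everything installed, `print(recognize_text('denggao.jpeg'))` should behave like the four-line script above.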

That is all for this article. I hope it is helpful for your learning.
