Python Image Recognition Applet

Source: Internet
Author: User

I have read several Python image recognition programs on the Internet, so I tried writing one myself as a test.

Running environment: CentOS Linux + Python 2.7 + PIL + Tesseract 3.0 + pytesser

 

Environment setup:

I will not cover installing Python on Linux. This section mainly covers how to install pytesser, PIL and Tesseract.

1. Check whether the following libraries are installed on the system:

libpng, libjpeg, libtiff, zlib

# yum list | grep libpng

# yum list | grep libjpeg

# yum list | grep libtiff

# yum list | grep zlib

Install any that are missing:

# yum install libpng

# yum install libjpeg

# yum install libtiff

# yum install zlib

When building from source you will usually also need the corresponding -devel packages, which provide the header files.
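The same check can be sketched from Python. This is only an illustration, assuming the short linker names "png", "jpeg", "tiff" and "z" for the four libraries; the helper name check_libs is made up for this example. It looks for the runtime shared libraries only and says nothing about header files.

```python
from ctypes.util import find_library

# Short linker names assumed for libpng, libjpeg, libtiff and zlib.
REQUIRED_LIBS = ["png", "jpeg", "tiff", "z"]

def check_libs(names=REQUIRED_LIBS):
    """Return a dict mapping each library name to what find_library reports.

    The value is None when the shared library could not be located.
    """
    return dict((name, find_library(name)) for name in names)

if __name__ == "__main__":
    for name, found in sorted(check_libs().items()):
        print("%-5s %s" % (name, found or "NOT FOUND"))
```

Any library reported as NOT FOUND is a candidate for the yum install step.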

 

2. Install tesseract:

Download the latest version of Tesseract from http://code.google.com/p/tesseract-ocr/downloads/list. I downloaded version 3.0.

Decompress the package:

# tar -zxvf tesseract-3.00.tar.gz

Enter the decompressed folder:

# cd tesseract-3.00

Installation:

# ./configure --prefix=/opt/tesseract    # --prefix specifies the installation directory, here /opt/tesseract

# make

# make install

After installation, add Tesseract to the PATH by editing .profile or .bash_profile in your home directory. Here we edit .bash_profile and append the following to PATH:

:/opt/tesseract/bin

Reload the file so that the change takes effect:

# . .bash_profile
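To confirm that the PATH change worked, a small check along the following lines may help. It is a sketch that runs on both Python 2.7 and Python 3; find_on_path is a hypothetical helper written for this example, not part of Tesseract or pytesser.

```python
import os

def find_on_path(program, path=None):
    """Return the full path of `program` if an executable file with that
    name exists in one of the PATH directories, otherwise None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

if __name__ == "__main__":
    location = find_on_path("tesseract")
    print(location or "tesseract is not on PATH")
```

With the installation directory used above, the expected location would be /opt/tesseract/bin/tesseract.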

3. Install PIL:

Download the PIL release that matches your Python version from the PIL homepage: http://www.pythonware.com/products/pil/

For my Python 2.7 the download is: http://effbot.org/downloads/Imaging-1.1.7.tar.gz

Decompress the package:

# tar -zxvf Imaging-1.1.7.tar.gz

Enter the decompressed folder:

# cd Imaging-1.1.7

Installation:

# python setup.py install
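A quick way to check that PIL can be imported after the install is sketched below. The function name load_pil_image_module is invented for this example. It tries the top-level import style first and the package style second, and the VERSION attribute is read with getattr because it may be absent in some builds.

```python
def load_pil_image_module():
    """Return PIL's Image module, or None if PIL cannot be imported."""
    try:
        import Image  # top-level module, the style used with PIL 1.1.7
        return Image
    except ImportError:
        pass
    try:
        from PIL import Image  # package-style import
        return Image
    except ImportError:
        return None

if __name__ == "__main__":
    module = load_pil_image_module()
    if module is None:
        print("PIL is not importable; check the installation step")
    else:
        print("PIL loaded, version: %s" % getattr(module, "VERSION", "unknown"))
```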

4. Install pytesser:

Download pytesser from http://pytesser.googlecode.com/files/pytesser_v0.0.1.zip. Currently there is only one version.

Decompress the package:

# unzip pytesser_v0.0.1.zip

I recommend creating a folder first and decompressing the package inside it, because unzip extracts the files straight into the current directory, which makes them hard to manage.
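The "extract into its own folder" advice can also be followed from Python with the standard zipfile module. This is a minimal sketch; the helper name unzip_into_folder and the destination folder name pytesser are assumptions made for the example.

```python
import os
import zipfile

def unzip_into_folder(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir, creating it if needed.

    Returns the sorted list of member names that were extracted.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    archive = zipfile.ZipFile(zip_path)
    try:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())
    finally:
        archive.close()

if __name__ == "__main__":
    # Assumes the archive was downloaded to the current directory.
    if os.path.exists("pytesser_v0.0.1.zip"):
        print(unzip_into_folder("pytesser_v0.0.1.zip", "pytesser"))
```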

5. Test:

Create img_to_text.py in the pytesser directory with the following content:

from pytesser import *  # import the pytesser module
import sys

def img_to_text(filename):
    img = Image.open(filename)  # read the image file
    img.load()  # without load(), PIL sometimes reports that the object cannot be found
    if len(img.split()) == 4:  # four bands means the image has an alpha channel
        # Split the image into its four channels: R (red), G (green), B (blue)
        # and A (transparent alpha). PIL does not support the A channel in BMP
        # images, and the image must be converted to BMP before recognition.
        r, g, b, a = img.split()
        img = Image.merge("RGB", (r, g, b))  # drop the A channel and reassemble the image
    return image_to_string(img)  # pytesser's image_to_string() converts the image to text using the Tesseract engine

if __name__ == '__main__':
    print img_to_text(sys.argv[1])  # the image path is passed on the command line
    print "OK"

 

Test:

I ran the script on several price images taken from online shopping sites:

 

Source image:

The $ symbol cannot be recognized, but it does not affect the number.

Test other images:

 

After the image is converted to BMP and recognized again, the ¥ symbol is recognized. The $ symbol, however, is still not recognized even after conversion to BMP.

 

However, to get the price you can simply take the substring that starts at the third character, that is, string[2:].
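The slicing idea can be sketched as follows. The sample string is hypothetical, not real output from the test images, and both function names are made up for this example. The second function is an alternative that was not in the original write-up: it uses a regular expression to pick out the first number instead of relying on a fixed position.

```python
import re

def extract_price_by_slice(ocr_text):
    """Drop the first two characters and strip surrounding whitespace."""
    return ocr_text[2:].strip()

def extract_price_by_regex(ocr_text):
    """Return the first run of digits (with optional decimals), or None."""
    match = re.search(r"\d+(?:\.\d+)?", ocr_text)
    return match.group(0) if match else None

if __name__ == "__main__":
    raw = "Y 1299.00\n\n"  # hypothetical OCR output with a misread currency symbol
    print(extract_price_by_slice(raw))  # prints 1299.00
    print(extract_price_by_regex(raw))  # prints 1299.00
```

The slice is the simpler of the two, but it only works when the unwanted prefix is exactly two characters long.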

 

Reference: http://www.daniweb.com/software-development/python/threads/253957

Using pytesser on Linux: http://wenyue.me/blog/282

 

PS: Tesseract provides language data for multiple languages, which can be downloaded from the Tesseract download page linked earlier in this article.
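Once extra language data is installed, the tesseract command-line tool selects it with the -l option. The sketch below only builds the argument list; the helper name tesseract_command and the file names are hypothetical, and pytesser itself may need to be edited to pass the option through.

```python
def tesseract_command(image_path, output_base, lang=None):
    """Build the argument list for the tesseract command-line tool.

    tesseract writes the recognized text to output_base + ".txt".
    """
    command = ["tesseract", image_path, output_base]
    if lang:
        command += ["-l", lang]  # language data to use, for example "eng"
    return command

if __name__ == "__main__":
    # Hypothetical file names; pass the list to subprocess.call() to run it.
    print(" ".join(tesseract_command("price.bmp", "price", lang="eng")))
```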
