1. Download the required packages first
OCR tool: tesseract-ocr 3.01 source code
Language pack: tesseract-ocr-3.01.eng.tar.gz (for cracking verification codes, English is enough)
Image processing tool: Leptonica 1.68
PNG support library: libpng
JPEG support library: libjpeg
TIFF support library: libtiff
2. Installation Steps
1-Installing libpng, libjpeg, libtiff
Run the following commands in the source directory of each library:
./configure
make
sudo make install
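Since the same three commands are repeated for each library, they can be wrapped in a small helper. This is only a sketch: the names build_and_install, run and DRY_RUN are made up for illustration and are not part of any of these libraries. With DRY_RUN=1 the commands are printed instead of executed, which is handy for checking the sequence first.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: configure, build and install one source tree.
# The subshell keeps the caller's working directory unchanged.
build_and_install() {
    src_dir=$1
    (
        cd "$src_dir" || exit 1
        run ./configure &&
        run make &&
        run sudo make install
    )
}
```

It would be called once per unpacked source directory, for example build_and_install /path/to/libpng-source (the path is a placeholder).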
2-Installing Leptonica
Here are the commands:
./configure
make
sudo make install
If make fails with an error like this:
pngio.c:119: error: 'Z_DEFAULT_COMPRESSION' undeclared here (not in a function)
I checked the wiki and found that this is a bug in pngio.c. I could not find a zlib1g package on Mac, so I modified leptonica/src/pngio.c and inserted code after #include "png.h".
The code to insert is:
#ifdef HAVE_LIBZ
#include "zlib.h"
#endif
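The edit can also be applied from the command line. The sketch below assumes that pngio.c contains a line beginning with #include "png.h"; the function name patch_pngio is made up for illustration. It does nothing if the zlib include is already present.

```shell
#!/bin/sh
# Hypothetical helper: insert the zlib include block after #include "png.h".
patch_pngio() {
    file=$1
    # Already patched: leave the file alone.
    if grep -q '#include "zlib.h"' "$file"; then
        return 0
    fi
    # Copy every line; after the first png.h include, add the three new lines.
    awk '
        { print }
        /^#include "png.h"/ && !inserted {
            print "#ifdef HAVE_LIBZ"
            print "#include \"zlib.h\""
            print "#endif"
            inserted = 1
        }
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example: patch_pngio leptonica/src/pngio.c, followed by make again.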
3-Installing tesseract-ocr
Here are the commands:
./autogen.sh
./configure
make
sudo make install
If an error occurs, use the following commands instead:
./autogen.sh
export LIBLEPT_HEADERSDIR=/usr/local/include
./configure --with-extra-libraries=/usr/local/lib
sudo make install
4-Installing the language pack
Extract tesseract-ocr-3.01.eng.tar.gz to /usr/local/share/tesseract.
3. Try OCR
- Macbook-pro:work my$ tesseract pin.jpg out -l eng
- Tesseract Open Source OCR Engine v3.01 with Leptonica
- Macbook-pro:work my$ more out.txt
- Bvcs
At this point, tesseract is working properly.
What remains is to write a piece of code that calls the command line to perform the image recognition.
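As a rough sketch of such a wrapper, the function below runs tesseract on one image and prints the recognized text. The name ocr_image and its arguments are made up for illustration. It relies only on the same command form used above (tesseract image outputbase -l lang), which writes its result to outputbase.txt.

```shell
#!/bin/sh
# Hypothetical wrapper: print the text recognized in one image.
# Usage: ocr_image IMAGE [LANG]   (LANG defaults to eng)
ocr_image() {
    image=$1
    lang=${2:-eng}
    # Work in a private temporary directory so no out.txt is left behind.
    tmp_dir=$(mktemp -d "${TMPDIR:-/tmp}/ocr.XXXXXX") || return 1
    if tesseract "$image" "$tmp_dir/out" -l "$lang" >/dev/null 2>&1; then
        cat "$tmp_dir/out.txt"
        status=$?
    else
        status=1
    fi
    rm -rf "$tmp_dir"
    return "$status"
}
```

For example, code=$(ocr_image pin.jpg) stores the recognized text in a shell variable, and a non-zero return status signals that tesseract failed.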
Tesseract's bundled language pack does not guarantee that verification code images will be recognized correctly. Recognition can be made more accurate by collecting a certain number of verification code images and training on them. For how to do this, see the official documentation and tools:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
Reposted from (slightly modified): http://holybless.iteye.com/blog/1338717
Installing tesseract-ocr on Mac