1. Install from the package repository on Ubuntu
sudo apt-get install tesseract-ocr
2. Compile and install from source
A. Build environment: gcc, gcc-c++, make (most machines already have these, so this step can often be skipped)
yum install gcc gcc-c++ make
B. Install the packages required to compile tesseract-ocr
yum install autoconf automake libtool
(on Ubuntu, use apt-get install with the same package names)
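Before moving on, it can help to confirm that the build tools from steps A and B are actually on the PATH. The following is a minimal sketch; `missing_tools` is a hypothetical helper name, and the tool names passed to it in the usage line are assumptions about what the packages above provide on your distribution.

```shell
# missing_tools TOOL...: print each named tool that is not found on the PATH.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

For example, `missing_tools gcc g++ make autoconf automake libtoolize` lists whichever of those commands still need to be installed.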
C. Install the image libraries; choose the packages according to the image formats you need to read
yum install libjpeg-devel libpng-devel libtiff-devel zlib-devel
On Ubuntu:
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg62-dev
sudo apt-get install libtiff4-dev
D. Download the Leptonica package: http://www.leptonica.org/source/leptonica-1.71.tar.gz
wget http://www.leptonica.org/source/leptonica-1.71.tar.gz
tar -zxvf ...
./configure
make
make install
Pay attention to the Leptonica version; each Tesseract release needs a minimum version:
Tesseract 3.01 requires at least Leptonica v1.67.
Tesseract 3.02 requires at least Leptonica v1.69. (Both are available in Ubuntu 12.04 Precise Pangolin.)
Tesseract 3.03 requires at least Leptonica v1.70. (Both are available in Ubuntu 14.04 Trusty Tahr.)
If the versions do not match, errors like the following occur:
Tesseract Open Source OCR Engine v3.02.02 with Leptonica
Error in findTiffCompression: function not present
Error in pixReadStreamTiff: function not present
Error in pixReadStream: tiff: no pix returned
Error in pixRead: pix not read
Unsupported image type.
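The version pairs listed above can be checked before building. This is a rough sketch that encodes only the three requirements given in this article; the function names are hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils).

```shell
# min_leptonica TESSERACT_VERSION: print the minimum Leptonica version,
# using only the three pairs listed in this article.
min_leptonica() {
    case "$1" in
        3.01*) echo 1.67 ;;
        3.02*) echo 1.69 ;;
        3.03*) echo 1.70 ;;
        *) return 1 ;;   # release not covered by the list
    esac
}

# version_ge A B: succeed when version A is at least version B.
# Assumes sort -V (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# leptonica_ok TESSERACT_VERSION LEPTONICA_VERSION
# Returns 0 if new enough, 1 if too old, 2 if the release is unknown.
leptonica_ok() {
    required=$(min_leptonica "$1") || return 2
    version_ge "$2" "$required"
}
```

For example, `leptonica_ok 3.02.02 1.71 && echo "version is new enough"` covers the combination built in this article.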
E. Download the tesseract-3.02 source package: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
wget http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.02.tar.gz
./autogen.sh
./configure
make
make install
ldconfig
F. Download the tesseract-3.02 English language pack: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.02.eng.tar.gz. After decompressing it, copy all files under tesseract-ocr/tessdata to /usr/local/share/tessdata.
Test
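The copy in step F can be sketched as a small function. `install_tessdata` is a hypothetical name; the default destination is the directory named above, and writing there will normally need root.

```shell
# install_tessdata SRC [DEST]: copy every file under SRC into DEST.
# DEST defaults to /usr/local/share/tessdata, the path used in this article.
install_tessdata() {
    src=$1
    dest=${2:-/usr/local/share/tessdata}
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    # "src/." copies the directory's contents rather than the directory itself.
    mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}
```

After unpacking the language pack, `install_tessdata tesseract-ocr/tessdata` would place the English data where this build looks for it.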
tesseract phototest.tif phototest -l eng
This should generate a phototest.txt text file in the current directory, containing the text shown in phototest.tif.
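The test above can be wrapped in a check that also confirms the output file was written. This is only a sketch: `ocr_check` and the `TESSERACT_BIN` override are hypothetical names introduced here, not part of Tesseract itself; the command line it runs is the same one shown above.

```shell
# ocr_check IMAGE BASE [LANG]: run OCR, then confirm BASE.txt exists and is non-empty.
# TESSERACT_BIN is a hypothetical override; it defaults to "tesseract".
ocr_check() {
    image=$1
    base=$2
    lang=${3:-eng}
    bin=${TESSERACT_BIN:-tesseract}
    command -v "$bin" >/dev/null 2>&1 || { echo "$bin not found" >&2; return 127; }
    [ -f "$image" ] || { echo "missing image: $image" >&2; return 1; }
    "$bin" "$image" "$base" -l "$lang" || return 1
    [ -s "$base.txt" ]
}
```

Running `ocr_check phototest.tif phototest && cat phototest.txt` should then print the recognized text.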
Install tesseract-ocr under Linux