Install Tesseract-OCR on Linux

Source: Internet
Author: User
Preparations:
Build environment: gcc, gcc-c++ and make (usually already present on the machine, in which case this step can be skipped)
    yum install gcc gcc-c++ make
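Whether the build environment is already present can be checked before running yum. The following is a minimal sketch, not part of the original instructions: missing_tools is a hypothetical helper name, and it only tests that each command is on the PATH, not which version is installed.

```shell
#!/bin/sh
# missing_tools: print every argument that is not an executable on the PATH.
# (Hypothetical helper, for illustration only.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example: list which of the build tools still need to be installed.
# (g++ is the compiler command provided by the gcc-c++ package.)
# missing_tools gcc g++ make
```

If the example prints nothing, all three tools were found and the yum step can be skipped.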
Dependencies: autoconf, automake, libtool, libjpeg-devel, libpng-devel, libtiff-devel, zlib-devel, leptonica (newer than 1.67)

1. autoconf, automake, libtool, libjpeg-devel, libpng-devel, libtiff-devel and zlib-devel can be installed through yum:
    yum install autoconf automake libtool
    yum install libjpeg-devel libpng-devel libtiff-devel zlib-devel
2. leptonica must be compiled and installed from source.
References:
http://paramountideas.com/tesseract-ocr-30-and-leptonica-installation-centos-55-and-opensuse-113
http://www.leptonica.org/source/README.html
Download the leptonica package: http://www.leptonica.org/source/leptonica-1.68.tar.gz
Unpack the archive and switch to the leptonica-1.68 root directory, then run:
    ./configure
    make
    make install
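Because the dependency list asks for a leptonica newer than 1.67, it can help to compare version strings before building. This is only a sketch: version_gt is a hypothetical helper name, and it assumes a sort that supports the -V (version sort) option, as GNU coreutils does. How the installed leptonica version is obtained is left open here.

```shell
#!/bin/sh
# version_gt A B: succeed when version string A is strictly newer than B.
# Assumes `sort -V` (version sort) is available, e.g. GNU coreutils.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Example: the 1.68 package used above satisfies "newer than 1.67".
# version_gt 1.68 1.67 && echo "leptonica version is new enough"
```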
Tesseract installation:
Install tesseract once the dependencies are in place.
Download the tesseract-3.01 source package: http://tesseract-ocr.googlecode.com/files/tesseract-3.01.tar.gz
Unpack the archive and switch to the tesseract-3.01 root directory.
(If make fails with an error similar to "strngs.h:1: error: stray '\357' in program", re-save the file tesseract-3.01/ccutil/strngs.h in ANSI encoding and compile again.)
    ./autogen.sh
    ./configure
    make
    make install
    ldconfig
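The stray '\357' error mentioned above can often be fixed from the command line instead of re-saving the file in an editor. The sketch below assumes the error is caused by a UTF-8 byte order mark at the start of the header (bytes EF BB BF, where EF is octal 357); that is the usual cause of this message, but the original article does not confirm it. strip_bom is a hypothetical helper name.

```shell
#!/bin/sh
# strip_bom FILE: remove a leading UTF-8 byte order mark (bytes EF BB BF), if present.
# (Hypothetical helper; assumes the stray '\357' comes from such a mark.)
strip_bom() {
    file="$1"
    # Read the first three bytes of the file as a hex string.
    first3=$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$first3" = "efbbbf" ]; then
        # tail -c +4 prints the file starting at its 4th byte.
        tail -c +4 "$file" > "$file.nobom" && mv "$file.nobom" "$file"
    fi
}

# Example (run from the tesseract-3.01 root directory), then run make again:
# strip_bom ccutil/strngs.h
```

Files without the mark are left untouched, so the helper is safe to run more than once.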
Tesseract English language pack installation:
Download the tesseract-3.01 English language pack: http://tesseract-ocr.googlecode.com/files/tesseract-ocr-3.01.eng.tar.gz
Unpack the archive and copy all files under tesseract-ocr/tessdata to /usr/local/share/tessdata.
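The copy step can be sketched as a small shell function. install_tessdata is a hypothetical helper name. The paths in the example are the ones given above, and writing to /usr/local normally requires root privileges.

```shell
#!/bin/sh
# install_tessdata SRC DEST: copy everything under SRC into DEST,
# creating DEST first if it does not exist. (Hypothetical helper.)
install_tessdata() {
    src="$1"
    dest="$2"
    mkdir -p "$dest" && cp -R "$src"/. "$dest"/
}

# Example, with the language pack downloaded to the current directory:
# tar -xzf tesseract-ocr-3.01.eng.tar.gz
# install_tessdata tesseract-ocr/tessdata /usr/local/share/tessdata
```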
Installation is complete.
Test:
Switch to the unpacked tesseract-3.01 root directory (it contains a bundled phototest.tif that can be used for testing).
Command line:
    tesseract phototest.tif phototest -l eng
Output:
    Tesseract Open Source OCR Engine v3.01 with Leptonica
    Page 0
A phototest.txt text file should now exist in the current directory, containing the text shown in phototest.tif.
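To confirm the result without opening the file by hand, a check along the following lines can be used. It is a sketch only: ocr_output_ok is a hypothetical helper name, and it verifies just that the text file exists and is not empty, not that the recognized text is correct.

```shell
#!/bin/sh
# ocr_output_ok BASE: succeed if BASE.txt exists and is not empty.
# (Hypothetical helper; does not judge recognition quality.)
ocr_output_ok() {
    [ -s "$1.txt" ]
}

# Example, run from the tesseract-3.01 root directory:
# tesseract phototest.tif phototest -l eng && ocr_output_ok phototest && cat phototest.txt
```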
 
From snowman's blog