Paper has increasingly fallen out of favour in many places. The paperless office has been talked about for more than 40 years, and office environments are trying to limit the mountains of paper they produce. In the past few years the concept of the paperless office has shifted significantly: with the help of computer software, documents containing large amounts of important management data and information can be stored more conveniently in electronic form. Archiving is not the only benefit of scanning a document. Optical character recognition (OCR) technology is critical for accessing paper-based information and integrating it into digital workflows. Choosing the right OCR tool depends on your specific needs. Online OCR services, for example, are useful to some people, but they may raise privacy concerns and impose file size limits. OCR software is a niche product, so there are relatively few open source alternatives to the commercial heavyweights. OCR software also requires advanced algorithms to translate scanned images into actual text correctly, because images contain not only text but also layouts, graphics, and tables that may span multiple pages.
Excellent open source OCR software includes:
Tesseract
Tesseract-OCR is an image recognition library originally developed by Hewlett-Packard and now open source. Its development has more recently been supported by Google, and at the time of writing it had been updated to version 2.04.
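As a rough illustration of how a library like this can be wired into a digital workflow, the sketch below calls the `tesseract` command-line program from Python. It assumes that Tesseract is installed and on the PATH and that the common `tesseract <image> <output base> -l <language>` invocation applies. The helper names `build_tesseract_command` and `ocr_image` are made up for this example and are not part of Tesseract.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for a basic Tesseract run.

    Tesseract adds the ".txt" extension to output_base itself.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on one image and return the recognized text.

    Assumes the `tesseract` executable is installed and on the PATH.
    """
    image_path = Path(image_path)
    output_base = image_path.with_suffix("")  # e.g. scan.tif -> scan
    subprocess.run(
        build_tesseract_command(image_path, output_base, lang),
        check=True,  # raise CalledProcessError if Tesseract reports failure
    )
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")
```

With a scanned page saved under the placeholder name `scan.tif`, calling `ocr_image("scan.tif")` would write `scan.txt` next to the image and return its contents.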
OCRopus
OCRopus (TM) is an advanced document analysis and OCR system with pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-language support.
Cuneiform
Cuneiform is an OCR text recognition system originally developed by Cognitive Technologies to run under Windows. This project is a port of that software to Linux systems.
GOCR
GOCR is an open source optical character recognition program.
OCRFeeder
OCRFeeder is an open source OCR suite for the GNOME desktop. It converts paper documents or images into electronic documents.
Lios
Linux-intelligent-ocr-solution (Lios) is another open source OCR solution for Linux. It converts printed documents into editable text.
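Since the right tool depends on what is available on a given machine, a small check like the following can help when scripting around the tools listed above. It is only a sketch: the executable names in `CANDIDATE_ENGINES` are assumptions (a distribution may package a tool under another name), and `find_available_engines` is a hypothetical helper, not part of any of these projects.

```python
import shutil

# Executable names are assumptions; packages may install them differently.
CANDIDATE_ENGINES = ["tesseract", "cuneiform", "gocr"]


def find_available_engines(candidates=CANDIDATE_ENGINES, which=shutil.which):
    """Return a dict mapping each engine found on the PATH to its full path.

    The `which` parameter exists so the lookup can be replaced in tests.
    """
    found = {}
    for name in candidates:
        path = which(name)  # None when the executable is not on the PATH
        if path is not None:
            found[name] = path
    return found
```

Calling `find_available_engines()` with no arguments checks the three default names against the current PATH.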
RELATED LINKS
OnlineOCR
NewOCR
Free OCR
tesseract OCR: image recognition class library
OCRKit: commercial image text recognition tool
Cuneiform for Linux: OCR text recognition system
OCRopus: OCR recognition
GOCR: OCR optical recognition program
Eye: text recognition tool
WeOCR: web text recognition software
gscan2pdf: PDF text recognition tool
Pytesser: Python module for image text recognition
Ocrstyle: picture text recognition
OCRFeeder: open source OCR suite
GNU Ocrad: optical character recognition
PyOCR: OCR library for Python
YAGF: OCR tool
TopOCR
SimpleOCR
OCR using Microsoft Office Document Imaging
OCR using Microsoft OneNote