Open Source OCR (Optical Character Recognition)

Source: Internet
Author: User
Tags: tesseract, ocr

Paper is falling out of favour in many places. The paperless office has been discussed for more than 40 years, and offices are steadily cutting back on the mountains of paper they produce. In the past few years the concept has shifted significantly: with the help of software, documents holding large amounts of important management data and information can be stored more conveniently in electronic form. The benefits of scanning a document go beyond archiving. Optical character recognition (OCR) technology is critical for extracting paper-based information and integrating it into digital workflows.

Choosing the right OCR tool depends on your specific needs. Online OCR services are useful to some people, for example, but they may raise privacy concerns and impose file size limits. OCR software is a niche product, so there are fewer open source alternatives than there are commercial heavyweight products. OCR software also requires advanced algorithms to translate scanned images into text correctly, because images contain not only text but also layouts, graphics, and tables that may span multiple pages.

Excellent open source OCR software includes:

Tesseract

Tesseract is an OCR library originally developed by Hewlett-Packard and later released as open source. Its development is now supported by Google. At the time of writing, tesseract-ocr had been updated to version 2.04.
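As an illustration of how Tesseract might fit into a digital workflow, here is a minimal Python sketch that calls the `tesseract` command-line program. It assumes that the `tesseract` executable is installed and on the PATH, and that language data for the chosen language is available. The function names `build_tesseract_command` and `ocr_image` are hypothetical helpers made up for this example and are not part of Tesseract. Supported image formats depend on the installed version, and older releases may accept only certain formats such as TIFF.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract <image> <output_base> -l <lang>.

    Hypothetical helper for this example, not part of Tesseract itself.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run the tesseract CLI on one image and return the recognized text.

    Raises RuntimeError if the tesseract executable is not on the PATH.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")

    with tempfile.TemporaryDirectory() as tmp:
        output_base = Path(tmp) / "result"
        cmd = build_tesseract_command(image_path, output_base, lang)
        # check=True raises CalledProcessError if tesseract exits with an error.
        subprocess.run(cmd, check=True, capture_output=True)
        # Tesseract appends ".txt" to the output base name.
        return (Path(tmp) / "result.txt").read_text(encoding="utf-8")
```

A call such as `ocr_image("scan.tif")` would then return the recognized text as a string. Running the recognition locally in this way also avoids the privacy and file size concerns that can come with online OCR services.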

OCRopus

OCRopus(TM) is an advanced document analysis and OCR system featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-language support.

Cuneiform

Cuneiform is an OCR system originally developed by Cognitive Technologies to run under Windows. This project is a port of the software to Linux.

GOCR

GOCR is an open source optical character recognition program.

OCRFeeder

OCRFeeder is an open source OCR suite for the GNOME desktop. It converts paper documents or images into electronic documents.

Lios

Linux-intelligent-ocr-solution (Lios) is another open source OCR solution for Linux. It converts printed documents into editable text.

