Ocrodjvu is a wrapper around OCR systems; its main purpose is to run OCR on DjVu files.
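As a rough illustration of what "wrapper" means in practice, the sketch below assembles an ocrodjvu command line from Python. The helper name `build_ocrodjvu_command` and the file names are hypothetical. The short options `-e` (engine), `-l` (language) and `-o` (bundled output file) are assumed from the ocrodjvu manual page, so check them against `ocrodjvu --help` for the installed version.

```python
import subprocess


def build_ocrodjvu_command(input_path, output_path, engine="tesseract", language="eng"):
    """Return an argument list for running ocrodjvu on one DjVu file.

    Hypothetical helper: the option names (-e, -l, -o) are assumptions
    based on the ocrodjvu manual page and may differ between versions.
    """
    return [
        "ocrodjvu",
        "-e", engine,       # OCR engine to delegate to
        "-l", language,     # recognition language
        "-o", output_path,  # write a bundled DjVu file with the text layer
        input_path,
    ]


def run_ocrodjvu(input_path, output_path, **options):
    """Run ocrodjvu; raises CalledProcessError if it exits with an error."""
    command = build_ocrodjvu_command(input_path, output_path, **options)
    subprocess.run(command, check=True)
```

Calling `run_ocrodjvu("scan.djvu", "scan-ocr.djvu")` would only work on a machine where ocrodjvu and the chosen engine are installed; building the argument list separately makes it possible to inspect the command before running it.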
Ocrodjvu 0.7.6: with Tesseract ≥ 3.00, the bounding boxes of individual characters are now extracted with higher precision. An HTML5 parser can optionally be used.
OCR (optical character recognition) is the process by which an electronic device, such as a scanner or digital camera, examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates those shapes into computer text using character recognition methods. In other words, a printed document is scanned, and the resulting image file is analyzed and processed to obtain its text and layout information. Improving recognition accuracy, whether through error correction or the use of auxiliary information, is the central problem in OCR, and it gave rise to the term ICR (intelligent character recognition). The main indicators of an OCR system's performance are its rejection rate, false recognition rate, recognition speed, user interface friendliness, product stability, ease of use, and feasibility.
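The rejection rate and false recognition rate can be made concrete with a small calculation. The sketch below assumes one common convention, which the text above does not spell out: both rates are measured per character against the total number of characters in the test material. The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class OcrCounts:
    """Per-character tallies from comparing OCR output with a reference text."""
    total: int          # characters in the reference text
    rejected: int       # characters the engine declined to recognize
    misrecognized: int  # characters recognized as the wrong character


def _check(counts):
    if counts.total <= 0:
        raise ValueError("total must be positive")


def rejection_rate(counts):
    """Fraction of all characters that the engine rejected (assumed definition)."""
    _check(counts)
    return counts.rejected / counts.total


def false_recognition_rate(counts):
    """Fraction of all characters that were recognized incorrectly (assumed definition)."""
    _check(counts)
    return counts.misrecognized / counts.total


counts = OcrCounts(total=1000, rejected=20, misrecognized=30)
print(f"rejection rate: {rejection_rate(counts):.1%}")
print(f"false recognition rate: {false_recognition_rate(counts):.1%}")
```

Some evaluations instead divide misrecognized characters by the number of accepted characters only; whichever convention is used should be stated alongside the figures.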