The simplest screen OCR character recognition

Source: Internet
Author: User

MODI (Microsoft Office Document Imaging), which ships with Office 2003, can perform OCR text recognition. It needs no sample training, is easy to use, and has a very good recognition rate.
It works best on fairly regular, standard text. The drawback is that Office 2003 and the Microsoft Document Imaging component must be installed.

This method is suitable for ordinary text recognition and is not suitable for recognizing verification codes (CAPTCHAs).
For verification code recognition, please refer to the tutorial: http://www.yhhe.net/ape/book/fap/f2/ix.html

This program uses the COMx plug-in; for details, please refer to the post: http://www.yhhe.net/bbs/dispbbs.asp?BoardID=4&ID=179&replyID=

Here is the demo source code:

Download the Simulation Wizard: http://www.yhhe.net/Fairy_Ape.exe
Open the Simulation Wizard, paste the following code into the source editor, and press F5 to run it.

img = image.new(); -- Create a picture object
img:capture(0, 100, 200, 300, 400); -- Grab the screen: x=100, y=200, width=300, height=400
img:save(_lasdir .. "test.bmp"); -- Save the picture to the script directory (_lasdir)

-- Import the COMx plug-in
import("std");
import2("COMx", "http://www.yhhe.net/ape/import/comx/comx.dll");

-- Create a MODI object (requires the Microsoft Document Imaging component of Office 2003)
mdoc = COMx.CreateObject("MODI.Document");
if (not mdoc) then
    win.messagebox("Please install Office 2003 and the Microsoft Document Imaging component", "Screen OCR text recognition")
    return false
end

-- Load the picture
mdoc:Create(_lasdir .. "test.bmp");
-- Run OCR; the parameters are the language ID, whether to auto-rotate, and whether to auto-straighten
mdoc:OCR(0x804, _false, _false);

local mi = mdoc.Images(0);
-- Quick access to all of the text
win.messagebox(mi.Layout.Text, "mdoc.Images(0).Layout.Text");

-- Get the details of a single word
local word = mi.Layout.Words(0)
local str = "Id:" .. word.Id .. "\r\n"

str = str .. "Line Id:" .. word.LineId .. "\r\n";
str = str .. "Region Id:" .. word.RegionId .. "\r\n";
str = str .. "Font Id:" .. word.FontId .. "\r\n";
str = str .. "Recognition confidence:" .. word.RecognitionConfidence .. "\r\n";
str = str .. "Text:" .. word.Text;

win.messagebox(str, "mdoc.Images(0).Layout.Words(0)")
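
The per-word report built above can be wrapped in a small helper so the same text can be produced for any word. The following is only a minimal sketch in plain Lua. `describe_word` is a hypothetical name, and the `sample` table is a stand-in for a MODI word object, not real OCR output. It assumes the real object exposes the same fields the demo reads (Id, LineId, RegionId, FontId, RecognitionConfidence, Text) and that they convert cleanly with `tostring`.

```lua
-- Hypothetical helper: format the fields that the demo reads from a word object.
function describe_word(word)
    local lines = {
        "Id: " .. tostring(word.Id),
        "Line Id: " .. tostring(word.LineId),
        "Region Id: " .. tostring(word.RegionId),
        "Font Id: " .. tostring(word.FontId),
        "Recognition confidence: " .. tostring(word.RecognitionConfidence),
        "Text: " .. tostring(word.Text),
    }
    -- Join with CRLF so the result displays as separate lines in a message box.
    return table.concat(lines, "\r\n")
end

-- Stand-in data for illustration only (not real OCR output).
local sample = {
    Id = 0, LineId = 0, RegionId = 0, FontId = 1,
    RecognitionConfidence = 999, Text = "Hello",
}
print(describe_word(sample))
```

In the demo script, the returned string could be passed to win.messagebox in place of the hand-built str.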

The first parameter of the mdoc:OCR function specifies a language ID.
The language IDs available in Simplified Chinese Office are:
Auto select: 0x800
English: 9
Simplified Chinese: 0x804

The language IDs available in Traditional Chinese Office are:
Auto select: 0x800
English: 9
Traditional Chinese: 0x404
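
To avoid scattering bare numbers through a script, the IDs listed above can be kept in one lookup table. This is a minimal sketch in plain Lua. The table name `MODI_LANG`, its keys, and the helper `modi_lang_id` are illustrative assumptions, not part of MODI or the Simulation Wizard. As noted above, which IDs actually work depends on the installed Office language version.

```lua
-- Language IDs from the list above, keyed by a readable (illustrative) name.
MODI_LANG = {
    auto = 0x800,
    english = 9,
    chinese_simplified = 0x804,
    chinese_traditional = 0x404,
}

-- Hypothetical helper: return the ID for a name, or raise a clear error.
function modi_lang_id(name)
    local id = MODI_LANG[name]
    if id == nil then
        error("unknown OCR language name: " .. tostring(name), 2)
    end
    return id
end

print(modi_lang_id("chinese_simplified"))  -- 2052, i.e. 0x804
```

The result can then be passed as the first argument of mdoc:OCR instead of a literal such as 0x804.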

Choosing the correct language ID improves the recognition rate.
If mdoc:OCR finds no text in the picture, it raises an error and the script stops running.
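
One possible way to keep a script alive when OCR fails is to run the call in protected mode. The sketch below uses the standard Lua `pcall`. It assumes the script host exposes `pcall` and that the COMx failure surfaces as an ordinary Lua error, neither of which is confirmed by the source. `try_ocr` is a hypothetical wrapper, `blank_doc` is a stand-in object rather than a real MODI document, and plain `false` is used here where the demo passes the host's `_false`.

```lua
-- Hypothetical wrapper: run OCR in protected mode so a failure does not
-- stop the script. Returns true on success, or false plus the error message.
function try_ocr(doc, lang_id)
    local ok, err = pcall(function()
        doc:OCR(lang_id, false, false)
    end)
    if ok then
        return true
    end
    return false, tostring(err)
end

-- Stand-in object for illustration: its OCR method always fails,
-- imitating a picture in which no text is found.
local blank_doc = {}
function blank_doc:OCR(lang_id, orient, straighten)
    error("no text found")
end

local ok, err = try_ocr(blank_doc, 0x804)
if not ok then
    print("OCR failed: " .. err)
end
```

With a wrapper like this, the script can show a message or try a different capture region instead of terminating.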
