OCR stands for Optical Character Recognition. The following introduction comes from a web search:
OCR (Optical Character Recognition) refers to the process in which an electronic device (such as a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates those shapes into computer text using a character recognition method. In other words, for printed characters, the text in a paper document is optically converted into a black-and-white dot-matrix image file, and recognition software then converts the text in that image into a text format that word processing software can edit further. How to improve recognition accuracy by using auxiliary information is the most important issue for OCR, and the term ICR (Intelligent Character Recognition) arose from it. The main indicators for measuring the performance of an OCR system are: rejection rate, error rate, recognition speed, user interface friendliness, product stability, ease of use, and feasibility.
To put it simply, OCR recognizes the text in printed material. The most common use case is probably this: you take a picture with the device's camera, the picture contains text, and OCR recognizes the characters in the photo and converts them to text. By the way, if you want to "do bad things", you can also use OCR to recognize simple image verification codes. Heh, but the verification codes on many sites are rather "cunning" these days, so recognizing them accurately is not that easy.
OK, next let's look at how to use OCR in a UAP app.
The APIs used for OCR are mainly in the Windows.Media.Ocr namespace. They are used as follows:
1. Call the static IsLanguageSupported method to check whether text in a given language can be recognized, such as Traditional Chinese or Simplified Chinese (presumably it cannot recognize oracle bone script or seal script).
2. Call the OcrEngine.TryCreateFromLanguage method to create an OcrEngine instance for the specified language, or call the TryCreateFromUserProfileLanguages method to create one from the languages configured by the user. Both methods are static and can be called directly on the class.
3. Call the RecognizeAsync method on the OcrEngine instance to start recognition. When recognition completes, an OcrResult object is returned asynchronously, and its Text property holds the recognized text. RecognizeAsync takes a Windows.Graphics.Imaging.SoftwareBitmap instance as its parameter, which is the image to recognize; it can be obtained through the BitmapDecoder class.
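The three steps above can be sketched as one small helper. This is only a minimal sketch, assuming a UWP project: the OcrHelper class, the RecognizeTextAsync name, and the choice to fall back to the user profile languages when the requested language is unavailable are all hypothetical and not part of the original sample.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Globalization;
using Windows.Graphics.Imaging;
using Windows.Media.Ocr;

// Hypothetical helper for illustration; not part of the original sample.
public static class OcrHelper
{
    // Returns the recognized text, or null if no OCR engine could be created.
    public static async Task<string> RecognizeTextAsync(SoftwareBitmap bitmap, string languageTag)
    {
        OcrEngine engine = null;
        var lang = new Language(languageTag);

        // Step 1: check whether the requested language is supported.
        if (OcrEngine.IsLanguageSupported(lang))
        {
            // Step 2: create an engine for that language (may still return null).
            engine = OcrEngine.TryCreateFromLanguage(lang);
        }

        // Step 2 (alternative): fall back to the languages in the user's profile.
        if (engine == null)
        {
            engine = OcrEngine.TryCreateFromUserProfileLanguages();
        }

        if (engine == null)
        {
            return null;
        }

        // Step 3: run recognition and read the Text property of the result.
        OcrResult result = await engine.RecognizeAsync(bitmap);
        return result.Text;
    }
}
```

A call would then look like: string text = await OcrHelper.RecognizeTextAsync(swbmp, "zh-CN");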
Here is an example to illustrate. In this example, you select a picture that contains text, and the app recognizes the text in the image. The code is as follows:
// Required namespaces:
using Windows.Graphics.Imaging;
using Windows.Media.Ocr;
using Windows.Storage;
using Windows.Storage.Pickers;
using Windows.Storage.Streams;
using Windows.UI.Xaml.Media.Imaging;

// Inside an async event handler (for example, a button Click handler):
FileOpenPicker picker = new FileOpenPicker();
picker.FileTypeFilter.Add(".jpg");
picker.FileTypeFilter.Add(".jpeg");
// Select a file
StorageFile imgFile = await picker.PickSingleFileAsync();
if (imgFile != null)
{
    using (IRandomAccessStream inStream = await imgFile.OpenReadAsync())
    {
        // Show the picture
        BitmapImage bmp = new BitmapImage();
        // The original width value was garbled in the source; 500 is an assumption
        bmp.DecodePixelWidth = 500;
        bmp.SetSource(inStream.CloneStream());
        this.img.Source = bmp;
        // Decode the picture
        BitmapDecoder decoder = await BitmapDecoder.CreateAsync(BitmapDecoder.JpegDecoderId, inStream);
        // Get the image
        SoftwareBitmap swbmp = await decoder.GetSoftwareBitmapAsync();
        // Prepare for recognition
        Windows.Globalization.Language lang = new Windows.Globalization.Language("zh-CN");
        // Determine whether Simplified Chinese recognition is supported
        if (OcrEngine.IsLanguageSupported(lang))
        {
            OcrEngine engine = OcrEngine.TryCreateFromLanguage(lang);
            if (engine != null)
            {
                OcrResult result = await engine.RecognizeAsync(swbmp);
                if (result != null)
                {
                    tbResult.Text = result.Text;
                }
            }
        }
        else
        {
            Windows.UI.Popups.MessageDialog dialog = new Windows.UI.Popups.MessageDialog("Recognition of Simplified Chinese is not supported.");
            await dialog.ShowAsync();
        }
    }
}
Simplified Chinese recognition is currently supported. The accuracy cannot reach 100%, but roughly 97% should be attainable. Here is the recognition result:
In the result above, the three characters "son", "Son", and "a few" were not recognized correctly; the accuracy is passable.
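To see exactly which characters were misrecognized, it can help to look at the result line by line instead of reading only the Text property. The following is a minimal sketch, assuming the OcrResult returned by RecognizeAsync; the OcrResultDump class and the WriteLines method are hypothetical names used only for illustration, and the output goes to the debugger output window.

```csharp
using System.Diagnostics;
using Windows.Foundation;
using Windows.Media.Ocr;

// Hypothetical helper for illustration; not part of the original sample.
public static class OcrResultDump
{
    // Writes every recognized line, and each word with its bounding box,
    // to the debug output so misrecognized characters are easier to locate.
    public static void WriteLines(OcrResult result)
    {
        foreach (OcrLine line in result.Lines)
        {
            Debug.WriteLine(line.Text);

            foreach (OcrWord word in line.Words)
            {
                Rect r = word.BoundingRect;
                Debug.WriteLine($"  {word.Text} at ({r.X}, {r.Y}), size {r.Width} x {r.Height}");
            }
        }
    }
}
```

In the example above, this could be called right after tbResult.Text is set, as OcrResultDump.WriteLines(result);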
Sample source download: http://files.cnblogs.com/files/tcjiaan/OcrApp.zip
Okay, that's all for this time; we'll chat more another day.
"WIN10 Application Development" OCR recognition