This article describes how to implement OCR in PHP. Original article: http://www.webyang.net/Html/web/article_161.html
According to Baidu's definition, OCR (Optical Character Recognition) is the process by which an electronic device (such as a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then uses character recognition methods to translate those shapes into computer text. In other words, for printed characters, the text in a paper document is optically converted into a black-and-white dot-matrix image file, and recognition software then converts the text in that image into a text format that word-processing software can edit and process further.
As an engineer, in real-world programming you sometimes need to get at the text inside an image, and that calls for OCR. Since I develop in PHP, PHP was my first choice. I found a PHP OCR extension and tested it, but it turned out to be unusable (address: http://sourceforge.net/projects/phpocr.berlios). I also read demos that many people have posted online. The basic principle is to break the image down into a matrix of 0s and 1s and then convert it into the corresponding string based on its features. In my tests this was not workable. From what I can see, PHP is rarely used for OCR and is not well suited to it: these algorithms need high performance, and the language is too slow. OCR is better attempted in C or MATLAB; there are many OCR-oriented algorithms for MATLAB.
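The "0, 1 matrix" step mentioned above can be illustrated with a minimal sketch. It assumes the image has already been reduced to a grid of grayscale values (0 = black, 255 = white); the function names, the threshold of 128, and the sample data are all hypothetical, and this covers only the binarization step, not the feature matching that would follow.

```php
<?php
/**
 * Turn a grid of grayscale values (0 = black, 255 = white) into a 0/1 matrix.
 * Pixels darker than $threshold become 1 ("ink"), everything else becomes 0.
 * The default threshold of 128 is an arbitrary, illustrative choice.
 */
function binarize(array $gray, int $threshold = 128): array
{
    $matrix = [];
    foreach ($gray as $row) {
        $bits = [];
        foreach ($row as $value) {
            $bits[] = ($value < $threshold) ? 1 : 0;
        }
        $matrix[] = $bits;
    }
    return $matrix;
}

/** Render the matrix as lines of '0' and '1' characters for inspection. */
function matrixToString(array $matrix): string
{
    $lines = [];
    foreach ($matrix as $row) {
        $lines[] = implode('', $row);
    }
    return implode("\n", $lines);
}

// Hypothetical 3x3 sample: a dark vertical stroke on a light background.
$sample = [
    [250,  20, 240],
    [245,  10, 250],
    [255,  30, 235],
];
echo matrixToString(binarize($sample)), "\n";
```

In a real script the grayscale values would first have to be read from the image file, for example with the GD extension. Recognizing characters from the resulting matrix is the hard and slow part, which is why the pure-PHP demos did not work out.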
Unfortunately, I have not studied much and do not know C. Then I happened to discover that Baidu has an OCR API: http://apistore.baidu.com/apiworks/servicedetail/146.html.
I wrote a quick demo to try it out:
<?php
header("Content-type: text/html; charset=utf-8");

function curl($img) {
    $ch = curl_init();
    $url = 'http://apis.baidu.com/apistore/idlocr/ocr'; // Baidu OCR API
    $header = array(
        'Content-Type: application/x-www-form-urlencoded',
        'apikey: 69c2ace1ef297ce88869f0751cb1b618',
    );

    // Read the image, base64-encode it, then URL-encode it for the form body
    $data_temp = file_get_contents($img);
    $data_temp = urlencode(base64_encode($data_temp));
    // Encapsulate the necessary parameters
    $data = "fromdevice=pc&clientip=127.0.0.1&detecttype=LocateRecognize&languagetype=CHN_ENG&imagetype=1&image=" . $data_temp;
    curl_setopt($ch, CURLOPT_HTTPHEADER, $header); // add the apikey to the header
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data); // add the parameters
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_URL, $url);
    $res = curl_exec($ch); // execute the HTTP request
    if ($res === false) {
        echo "cURL Error: " . curl_error($ch);
        curl_close($ch);
        return null;
    }
    curl_close($ch);
    // Decode the JSON response into an associative array
    return json_decode($res, true);
}

$wordArr = curl('4.jpg');
if (is_array($wordArr) && $wordArr['errNum'] == 0) {
    var_dump($wordArr);
} else {
    echo "recognition error: " . (is_array($wordArr) ? $wordArr['errMsg'] : 'no valid response');
}
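The demo assembles the POST body by hand. As an alternative sketch, PHP's built-in http_build_query() can do the URL-encoding of every field in one step. The parameter names and values below are taken from the demo; the function name buildOcrRequestBody is hypothetical, and whether the API accepts any other values is not something this sketch assumes.

```php
<?php
/**
 * Build the application/x-www-form-urlencoded body for the OCR request.
 * Parameter names and values are copied from the demo; the function
 * name itself is a hypothetical helper, not part of any API.
 */
function buildOcrRequestBody(string $imageBytes): string
{
    $fields = [
        'fromdevice'   => 'pc',
        'clientip'     => '127.0.0.1',
        'detecttype'   => 'LocateRecognize',
        'languagetype' => 'CHN_ENG',
        'imagetype'    => 1,
        // http_build_query() URL-encodes the base64 text ('+', '/', '=') for us
        'image'        => base64_encode($imageBytes),
    ];
    // Pass '&' explicitly so the result does not depend on php.ini settings
    return http_build_query($fields, '', '&');
}
```

The result could then be passed to CURLOPT_POSTFIELDS in place of the hand-built string, for example buildOcrRequestBody(file_get_contents('4.jpg')).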
I tested a few images, and the accuracy was quite high.
Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.