Java: image text recognition with the Baidu API (English supported)

Source: Internet
Author: User

PS:

Based on Java 1.8
Dependency management: Maven
Before you start, you need the API_KEY and SECRET_KEY of your project. Every call to the API depends on them, because they are used to generate the access_token.
How to get these parameters: apply for a "general text recognition" project at the Baidu Developer Center, and the parameters will be issued to you.
With the preparation complete, we can start on image recognition.

1. Preparing the POM file

<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.46</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient -->
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.5</version>
</dependency>

2. Getting the access_token

package com.wsk.netty.check;

// JSONObject comes from the org.json library, which must also be on the
// classpath (it is not one of the dependencies listed in the POM above).
import org.json.JSONObject;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.Map;

/**
 * Gets the token.
 *
 * @Author: WuShukai
 * @Date: 2018/2/12 10:04
 */
public class AuthService {

    /**
     * Gets the access token.
     *
     * @return return example:
     * {
     * "access_token": "24.460da4889caad24cccdb1fea17221975.2592000.1491995545.282335-1234567",
     * "expires_in": 2592000
     * }
     */
    public static String getAuth() {
        // API Key from the official website; replace with the one you registered
        String clientId = "**";
        // Secret Key from the official website; replace with the one you registered
        String clientSecret = "**";
        return getAuth(clientId, clientSecret);
    }

    /**
     * Gets the API access token.
     * The token has an expiration date; you need to manage it yourself and
     * re-acquire it once it has expired.
     *
     * @param ak API Key obtained from the Baidu Cloud website
     * @param sk Secret Key obtained from the Baidu Cloud website
     * @return access_token, for example:
     * "24.460da4889caad24cccdb1fea17221975.2592000.1491995545.282335-1234567"
     */
    private static String getAuth(String ak, String sk) {
        // Address of the token endpoint
        String authHost = "https://aip.baidubce.com/oauth/2.0/token?";
        String getAccessTokenUrl = authHost
                // 1. grant_type is a fixed parameter
                + "grant_type=client_credentials"
                // 2. API Key obtained from the official website
                + "&client_id=" + ak
                // 3. Secret Key obtained from the official website
                + "&client_secret=" + sk;
        try {
            URL realUrl = new URL(getAccessTokenUrl);
            // Open the connection to the URL
            HttpURLConnection connection = (HttpURLConnection) realUrl.openConnection();
            connection.setRequestMethod("GET");
            connection.connect();
            // Get all response header fields
            Map<String, List<String>> map = connection.getHeaderFields();
            // Iterate over all response header fields
            for (String key : map.keySet()) {
                System.err.println(key + "--->" + map.get(key));
            }
            // Define a BufferedReader input stream to read the response of the URL
            BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
            StringBuilder result = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                result.append(line);
            }
            // Print the returned result
            System.err.println("result:" + result);
            JSONObject jsonObject = new JSONObject(result.toString());
            return jsonObject.getString("access_token");
        } catch (Exception e) {
            System.err.printf("Get token failed!");
            e.printStackTrace(System.err);
        }
        return null;
    }

    public static void main(String[] args) {
        getAuth();
    }
}
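The Javadoc above notes that the token expires and has to be managed by the caller. One possible way to do that is sketched below. The class CachedToken, its constructor parameters and the injected clock are all hypothetical names introduced for illustration; the article itself simply requests a new token every time the program starts.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper (not part of the article): reuses a token until its lifetime has elapsed. */
class CachedToken {
    private final Supplier<String> fetcher;  // how to get a fresh token, e.g. AuthService::getAuth
    private final long lifetimeMillis;       // e.g. expires_in (seconds) * 1000, minus a safety margin
    private final LongSupplier clock;        // current time in milliseconds; injectable for testing
    private String token;
    private long fetchedAt;

    CachedToken(Supplier<String> fetcher, long lifetimeMillis, LongSupplier clock) {
        this.fetcher = fetcher;
        this.lifetimeMillis = lifetimeMillis;
        this.clock = clock;
    }

    /** Returns the cached token, fetching a new one if none is cached or the old one is too old. */
    synchronized String get() {
        long now = clock.getAsLong();
        if (token == null || now - fetchedAt >= lifetimeMillis) {
            token = fetcher.get();
            fetchedAt = now;
        }
        return token;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        long[] now = {0L};
        // Fake fetcher and fake clock so the demo runs without network access.
        CachedToken cache = new CachedToken(() -> "token-" + (++calls[0]), 1000L, () -> now[0]);
        System.out.println(cache.get()); // token-1 (first call fetches)
        System.out.println(cache.get()); // token-1 (still valid, reused)
        now[0] = 1000L;                  // pretend the lifetime has elapsed
        System.out.println(cache.get()); // token-2 (fetched again)
    }
}
```

In real use the fetcher would presumably be AuthService::getAuth and the clock System::currentTimeMillis, with the lifetime taken from the expires_in value in the token response.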

3. Writing a utility class that converts an image to Base64 and then URL-encodes it

package com.wsk.netty.check;

import java.io.IOException;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

/**
 * Converts an image to Base64 and URL-encodes the result.
 *
 * @Author: WuShukai
 * @Date: 2018/2/12 10:43
 */
public class BaseImg64 {

    /**
     * Converts a local image to a Base64 string and URL-encodes it.
     *
     * @param imgPath path of the local image
     * @return the image, Base64-encoded and then URL-encoded
     * @throws IOException if the image cannot be read
     */
    public static String getImageStrFromPath(String imgPath) throws IOException {
        // Read the whole image into a byte array
        byte[] data = Files.readAllBytes(Paths.get(imgPath));
        // Base64-encode the byte array. java.util.Base64 (available since Java 8)
        // is used instead of the internal sun.misc.BASE64Encoder, which inserts
        // line breaks into its output and is not available in newer JDKs.
        String base64 = Base64.getEncoder().encodeToString(data);
        // URL-encode the Base64 string and return it
        return URLEncoder.encode(base64, "UTF-8");
    }
}
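To see why the Base64 string is URL-encoded afterwards, it helps to look at a tiny input whose Base64 form contains the characters "+", "/" and "=", all of which have a special meaning in a form-encoded request body. The sketch below is only an illustration: the class name ImageParamDemo and the two sample bytes are made up, and only standard JDK classes are used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Base64;

/** Illustration only: Base64-encode some bytes, then URL-encode the result. */
class ImageParamDemo {

    static String toImageParam(byte[] raw) {
        String base64 = Base64.getEncoder().encodeToString(raw);
        try {
            return URLEncoder.encode(base64, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(Base64.getEncoder().encodeToString(raw)); // +/8=
        System.out.println(toImageParam(raw));                       // %2B%2F8%3D
    }
}
```

Without the second step, a "+" in the body would be decoded by the server as a space, which would corrupt the image data.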

4. Writing the methods that call the Baidu API to obtain the recognition results

package com.wsk.netty.check;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Image text recognition.
 *
 * @Author: WuShukai
 * @Date: 2018/2/12 10:25
 */
public class Check {

    private static final String POST_URL =
            "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic?access_token=" + AuthService.getAuth();

    /**
     * Recognizes the text in a local image.
     *
     * @param path path of the local image
     * @return recognition result in JSON format
     * @throws URISyntaxException if the URI cannot be parsed
     * @throws IOException        on I/O errors
     */
    public static String checkFile(String path) throws URISyntaxException, IOException {
        File file = new File(path);
        if (!file.exists()) {
            // An IOException subclass fits the declared exceptions better than
            // the NullPointerException thrown in the original version.
            throw new FileNotFoundException("Picture does not exist");
        }
        String image = BaseImg64.getImageStrFromPath(path);
        String param = "image=" + image;
        return post(param);
    }

    /**
     * Recognizes the text in an image given by its URL.
     *
     * @param url image URL
     * @return recognition result in JSON format
     */
    public static String checkUrl(String url) throws IOException, URISyntaxException {
        String param = "url=" + url;
        return post(param);
    }

    /**
     * Performs text recognition with the given parameter (either url or image).
     *
     * @param param request body; selects recognition by URL or by image data
     * @return recognition result
     * @throws URISyntaxException if the URI cannot be parsed
     * @throws IOException        on I/O errors
     */
    private static String post(String param) throws URISyntaxException, IOException {
        // Start building the POST request.
        // DefaultHttpClient is deprecated in HttpClient 4.3+ but still present in 4.5.5.
        HttpClient httpClient = new DefaultHttpClient();
        HttpPost post = new HttpPost();
        URI url = new URI(POST_URL);
        post.setURI(url);
        // Set the request header. It must be application/x-www-form-urlencoded,
        // because a very long string is passed and it cannot be sent in fragments.
        post.setHeader("Content-Type", "application/x-www-form-urlencoded");
        StringEntity entity = new StringEntity(param);
        post.setEntity(entity);
        HttpResponse response = httpClient.execute(post);
        System.out.println(response.toString());
        if (response.getStatusLine().getStatusCode() == 200) {
            String str;
            try {
                // Read the JSON string returned by the server
                str = EntityUtils.toString(response.getEntity());
                System.out.println(str);
                return str;
            } catch (Exception e) {
                e.printStackTrace();
                return null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String path = "E:\\find.png";
        try {
            long now = System.currentTimeMillis();
            checkFile(path);
            checkUrl("https://gss3.bdstatic.com/-Po3dSag_xI4khGkpoWK1HF6hhy/baike/c0%3Dbaike80%2C5%2C5%2C80%2C26/sign=08c05c0e8444ebf8797c6c6db890bc4f/fc1f4134970a304e46bfc5f7d2c8a786c9175c19.jpg");
            System.out.println("Time consuming: " + (System.currentTimeMillis() - now) / 1000 + "s");
        } catch (URISyntaxException | IOException e) {
            e.printStackTrace();
        }
    }
}
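Since the request is sent as application/x-www-form-urlencoded, the body is a list of key=value pairs joined with "&", and in that format values are normally percent-encoded. The sketch below shows one way to build such a body with the JDK only. The class FormBody and the example URL are hypothetical; whether the Baidu endpoint strictly requires the url value to be encoded is not stated in the article, so treat this as a general illustration of the format rather than a required change.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds an application/x-www-form-urlencoded request body. */
class FormBody {

    static String build(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&'); // separator between pairs
            }
            body.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        return body.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", "https://example.com/a b.png?x=1"); // made-up image URL
        System.out.println(build(fields));
        // url=https%3A%2F%2Fexample.com%2Fa+b.png%3Fx%3D1
    }
}
```

If a helper like this were used for the image parameter, it should receive the plain Base64 string; passing a value that has already been URL-encoded would encode it twice.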

5. Recognition results (only local image recognition was tested)

Chinese:

Results:

Conclusion

This was tested with Postman, because the returned JSON is hard to read in the IDEA console.
As can be seen, the request took about 1 s. The recognition rate is high, but there are some gaps: for example, the fifth item of the recognition results contains only "I am a", and a large part of the text in the original picture was not recognized.
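If the raw JSON is hard to read in the console, the recognized lines can be pulled out and printed one per line. The sketch below assumes the response contains a words_result array whose elements each have a words field; the sample string is made up, not real output from the service. It uses a regular expression only to stay dependency-free; a JSON library such as the fastjson dependency from the POM would be the more robust choice, and this version does not unescape sequences such as \" or \uXXXX inside the values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the "words" values from an OCR response string. */
class WordsExtractor {

    // Matches "words": "<value>" and captures <value>, allowing backslash escapes inside it.
    private static final Pattern WORDS =
            Pattern.compile("\"words\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    static List<String> extract(String json) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = WORDS.matcher(json);
        while (matcher.find()) {
            lines.add(matcher.group(1));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample in the assumed response shape.
        String sample = "{\"log_id\": 1, \"words_result_num\": 2, "
                + "\"words_result\": [{\"words\": \"hello\"}, {\"words\": \"world\"}]}";
        for (String line : extract(sample)) {
            System.out.println(line);
        }
        // prints:
        // hello
        // world
    }
}
```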

English:

Results:

Conclusion

For pictures containing only English, the results are quite satisfactory in both elapsed time and accuracy.

Combination of Chinese and English:

Results:

Conclusion

The results are also quite satisfactory. Baidu's recognition performs very well.

Detailed documentation: HTTP://AI.BAIDU.COM/DOCS#/OCR-API/E1BD77F3

Tess4J image text recognition tutorial: http://blog.csdn.net/wsk1103/article/details/54173282

