Playing with Baidu speech recognition: it really is that simple

Source: Internet
Author: User

Over the next two days the company holds its annual FedEx Day. As I understand it, this is a big festival for the technical community, where people brainstorm and ideas fly.

For this event, each person (or each group of two or three) needs to come up with an idea for improving or enriching our products, based on where they stand today. I took out my phone and browsed around, and the term "speech recognition" jumped out at me. In the finger era of manual input, you keep switching between Chinese, English and digits, and a lapse of attention or a shaky hand leads to typos. Then you hammer the DELETE key and type it all over again, staring at the big screen with the feeling that it will never come out right. Voice input lets you say goodbye to that annoyance. Today's speech recognition is accurate and easy to use, and it paves the way for freeing up your hands. You can ask Siri to check the weather or set an alarm, you can use the iFlytek voice input method to dictate text wherever you would normally type, and you can ask the voice assistant built into phones of various brands to tell you a joke ...

Today I wanted to get my hands dirty first: to understand speech recognition technology and see which good APIs are available to call. Since I searched with Baidu, the first thing that greeted me was Baidu Voice, and its banner, "permanently free intelligent voice open platform", won me over.

Exploring further, I found two approaches: one does speech recognition through the REST API, and the other does it in an app on the Android mobile platform.

First, getting your admission ticket

1. Register as a developer

Go to http://yuyin.baidu.com/ and log in with your Baidu account. If clicking the "Application Management" tab shows a prompt like the following:

it means you still need to complete registration verification. Once you submit it, you are a Baidu developer.

2. Create an App

When you're done, click "Application Management". If the page shows no list of apps, you need to add one: click "Create a new app" in the top right corner of the page, choose the app category and give it a name, and the app is created. Once it is created, you'll see something like this.

In "View Key", we can see the app ID, API key and secret key we need to use later.

3. Open Service

Once the app is created, we need to turn on speech recognition services before we can use speech recognition. Click "Activate Service" on the application card and select speech recognition.

You now have your admission ticket.

Second, speech recognition based on the REST API

Baidu Voice supports three platforms: Android, iOS and the REST API. I'll start with the REST API, since it doesn't require setting up an Android or iOS development environment.

Go to http://yuyin.baidu.com/asr/download to download the REST API sample and documentation.

The samples come in Java, Linux C and PHP versions, and a test.pcm audio file is included.

I chose the Java version and imported it into Eclipse. The code is very simple: a single test class.

    package com.baidu.speech.serviceapi;

    import java.io.BufferedReader;
    import java.io.DataOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Note: javax.xml.bind ships with Java 8 and earlier; it was removed from the JDK in Java 11.
    import javax.xml.bind.DatatypeConverter;

    import org.json.JSONObject;

    public class Sample {

        private static final String serverURL = "http://vop.baidu.com/server_api";
        private static String token = "";
        private static final String testFileName = "C:\\Users\\Administrator\\workspace\\SpeechRecognition\\src\\test.pcm";
        // put your own params here
        private static final String apiKey = "***";    // the API key from the application card created earlier
        private static final String secretKey = "***"; // the secret key from the same application card
        private static final String cuid = "***";      // unique device identifier; I'm on a PC, so I use the NIC's MAC address

        public static void main(String[] args) throws Exception {
            getToken();
            method1();
            method2();
        }

        // Exchange the API key and secret key for an access token.
        private static void getToken() throws Exception {
            String getTokenURL = "https://openapi.baidu.com/oauth/2.0/token?grant_type=client_credentials"
                    + "&client_id=" + apiKey + "&client_secret=" + secretKey;
            HttpURLConnection conn = (HttpURLConnection) new URL(getTokenURL).openConnection();
            token = new JSONObject(printResponse(conn)).getString("access_token");
        }

        // Method 1: send the audio Base64-encoded inside a JSON body.
        private static void method1() throws Exception {
            File pcmFile = new File(testFileName);
            HttpURLConnection conn = (HttpURLConnection) new URL(serverURL).openConnection();

            // construct params
            JSONObject params = new JSONObject();
            params.put("format", "pcm");
            params.put("rate", 8000);
            params.put("channel", "1");
            params.put("token", token);
            params.put("cuid", cuid);
            params.put("len", pcmFile.length());
            params.put("speech", DatatypeConverter.printBase64Binary(loadFile(pcmFile)));

            // add request header
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "application/json; charset=utf-8");

            conn.setDoInput(true);
            conn.setDoOutput(true);

            // send request
            DataOutputStream wr = new DataOutputStream(conn.getOutputStream());
            wr.writeBytes(params.toString());
            wr.flush();
            wr.close();

            printResponse(conn);
        }

        // Method 2: send the raw audio as the request body, with cuid and token in the query string.
        private static void method2() throws Exception {
            File pcmFile = new File(testFileName);
            HttpURLConnection conn = (HttpURLConnection) new URL(serverURL
                    + "?cuid=" + cuid + "&token=" + token).openConnection();

            // add request header
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "audio/pcm; rate=8000");

            conn.setDoInput(true);
            conn.setDoOutput(true);

            // send request
            DataOutputStream wr = new DataOutputStream(conn.getOutputStream());
            wr.write(loadFile(pcmFile));
            wr.flush();
            wr.close();

            printResponse(conn);
        }

        // Print the JSON response (indented by 4 spaces) and return it as a string.
        private static String printResponse(HttpURLConnection conn) throws Exception {
            if (conn.getResponseCode() != 200) {
                // request error
                return "";
            }
            InputStream is = conn.getInputStream();
            BufferedReader rd = new BufferedReader(new InputStreamReader(is));
            String line;
            StringBuffer response = new StringBuffer();
            while ((line = rd.readLine()) != null) {
                response.append(line);
                response.append('\r');
            }
            rd.close();
            System.out.println(new JSONObject(response.toString()).toString(4));
            return response.toString();
        }

        // Read the whole audio file into memory.
        private static byte[] loadFile(File file) throws IOException {
            InputStream is = new FileInputStream(file);

            long length = file.length();
            byte[] bytes = new byte[(int) length];

            int offset = 0;
            int numRead = 0;
            while (offset < bytes.length
                    && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
                offset += numRead;
            }

            if (offset < bytes.length) {
                is.close();
                throw new IOException("Could not completely read file " + file.getName());
            }

            is.close();
            return bytes;
        }
    }
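The sample encodes the audio with `javax.xml.bind.DatatypeConverter`, which is no longer part of the JDK from Java 11 onward. A minimal sketch of the same Base64 step using `java.util.Base64` is shown here; the class name `SpeechEncoding` and the method name `encodeSpeech` are made up for illustration and are not part of the Baidu sample.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the downloaded sample.
final class SpeechEncoding {

    /** Encodes raw audio bytes as standard Base64 (padded, no line breaks). */
    static String encodeSpeech(byte[] audio) {
        return Base64.getEncoder().encodeToString(audio);
    }

    public static void main(String[] args) {
        // Stand-in bytes; in the sample this would be the result of loadFile(pcmFile).
        byte[] demo = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encodeSpeech(demo)); // prints aGVsbG8=
    }
}
```

In `method1`, the value passed to `params.put("speech", ...)` could then come from `SpeechEncoding.encodeSpeech(loadFile(pcmFile))` instead of `DatatypeConverter.printBase64Binary(...)`.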

  

The class runs just like any ordinary Java class, and it prints the following to the console:

    {
        "access_token": "***66a6adc3bb14***99.2592000.1462845194.282335-7***",
        "refresh_token": "25.344b6***6a9748d8b25a***360000.1775613194.282335-7***",
        "scope": "public audio_voice_assistant_get wise_adapt lebo_resource_base lightservice_public hetu_basic lightcms_map_poi kaidian_kaidian",
        "session_key": "9mzdC***7bvkta4huievyxrxoupy***rss8h4936rrxxd***v4pmq1y+6ovkac+18rrxrtst",
        "session_secret": "*352***e9a7a664ef***775e",
        "expires_in": 2592000
    }
    {
        "result": ["Baidu Voice provides technical support,"],
        "err_msg": "success.",
        "sn": "160625465371460253194",
        "corpus_no": "6271739712934435529",
        "err_no": 0
    }
    {
        "result": ["Baidu Voice provides technical support,"],
        "err_msg": "success.",
        "sn": "613862746801460253195",
        "corpus_no": "6271739717258680030",
        "err_no": 0
    }
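The token response reports `expires_in` as 2592000. Assuming this is a lifetime in seconds, as is usual for OAuth 2.0 responses, that works out to 30 days, so there may be no need to request a new token on every run. The following is only a sketch of how a token could be cached; `CachedToken` and its methods are hypothetical names, not something the Baidu sample provides.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical holder for an access token and its expiry time.
final class CachedToken {
    private final String value;
    private final Instant expiresAt;

    CachedToken(String value, long expiresInSeconds, Instant issuedAt) {
        this.value = value;
        this.expiresAt = issuedAt.plusSeconds(expiresInSeconds);
    }

    /** True if the token is still valid at 'now', leaving a safety margin before expiry. */
    boolean isUsableAt(Instant now, Duration safetyMargin) {
        return now.plus(safetyMargin).isBefore(expiresAt);
    }

    String value() {
        return value;
    }

    /** Converts an expires_in value (assumed to be seconds) to whole days. */
    static long lifetimeInDays(long expiresInSeconds) {
        return Duration.ofSeconds(expiresInSeconds).toDays();
    }

    public static void main(String[] args) {
        System.out.println(lifetimeInDays(2592000L)); // prints 30
    }
}
```

With something like this, `getToken()` would only need to call the token endpoint when `isUsableAt(Instant.now(), ...)` returns false.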

As the results show, the speech contained in test.pcm is "Baidu Voice provides technical support." I then used the Windows Sound Recorder to record some speech of my own in WAV format. The first attempt returned error 3301, which the documentation describes as a recognition error. When I opened the audio file I found that nothing had actually been recorded, so I recorded it again and retried. This time there was no error, but the recognized text did not match what I had said, probably because there was too much noise.
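Noise is the author's guess. Another possibility, which is only an assumption here and not something the source confirms, is a format mismatch: the sample code declares the audio as 8000 Hz, single-channel PCM, while a WAV file carries a header and may have been recorded at a different sample rate or in stereo. A small sketch using the standard `javax.sound.sampled` API can read the WAV header and extract the raw samples for checking; the class `WavInspector` and its members are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper for checking a recording before sending it.
final class WavInspector {

    /** Raw samples of a WAV file plus the format declared in its header. */
    static final class Pcm {
        final byte[] samples;
        final int sampleRate;
        final int channels;
        final int bitsPerSample;

        Pcm(byte[] samples, int sampleRate, int channels, int bitsPerSample) {
            this.samples = samples;
            this.sampleRate = sampleRate;
            this.channels = channels;
            this.bitsPerSample = bitsPerSample;
        }
    }

    /** Parses the WAV header and returns the audio data without the header. */
    static Pcm read(InputStream wav) throws IOException, UnsupportedAudioFileException {
        // AudioSystem needs mark/reset support to inspect the header.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new BufferedInputStream(wav))) {
            AudioFormat format = in.getFormat();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new Pcm(out.toByteArray(), Math.round(format.getSampleRate()),
                    format.getChannels(), format.getSampleSizeInBits());
        }
    }

    /** True if the recording matches what the sample code declares: 8000 Hz, one channel. */
    static boolean matchesSampleSettings(Pcm pcm) {
        return pcm.sampleRate == 8000 && pcm.channels == 1;
    }
}
```

If `matchesSampleSettings` returns false for a recording, either the recording settings or the `rate` and `channel` values in the request would need to change; which values the service accepts should be checked against the Baidu documentation.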

Third, speech recognition on the Android platform

Clearly the REST API alone wasn't enough fun, and I wanted to see how it works on a mobile device. The mobile platforms are Android and iOS; given my own setup I chose Android, although admittedly I am not familiar with either.

I searched online for a ready-to-use Android environment (http://blog.sina.com.cn/s/blog_6de000c20101rpva.html#cmt_2623882) and downloaded Eclipse, the SDK, ADT and so on in one go. Relying on my memory of an earlier encounter with the Android platform, I just about managed to set up the environment.

As with the REST API, you also need to download the Android SDK and its documentation. The SDK directory contains the following:

The functions of each module are as follows:


Import the demo project into Eclipse, configure a virtual device and start the emulator. (In practice I found that importing libbdEASRAndroid.so and libBDVoiceRecognitionClient_MFE_V1.so into the classpath caused errors, so I removed those two files.) The result when running looks like this:

Click the middle button on the toolbar below to go to all apps and find the app "Speech Recorder":

Click to enter the app:

At the moment, tapping "Record" makes the app crash. I haven't figured out why yet and will look into it later (if you have run into this, please leave a comment) ~ ~ ~

Overall, Baidu Voice is easy to get started with and the documentation is fairly detailed, but its recognition of audio I recorded myself still leaves room for improvement (possibly because the audio file was too noisy).

This was just a first look, to get familiar with the supported platforms and how the API is called. I'll dig into this properly during FedEx Day over the next two days.

If you found this article helpful, please click the "recommend" button; your recommendation is my biggest motivation to write! If you want to keep following my articles, please scan the QR code and follow Jackiezheng's official account. I will push my articles to you and share the high-quality articles I read every day.

  
