Reference: http://blog.sina.com.cn/s/blog_923fdd9b0101flx1.html
Implementing speech recognition via the Google speech API
A recent project needed a speech recognition feature, and it took me a few days of fiddling to get it working. At the start I had no clue where to begin, and the information online was a mess: either very old implementation methods or scattered code snippets. So I decided to share my experience with everyone.
To implement speech recognition on iOS, the steps are:
Record to PCM -> convert to WAV -> convert to FLAC -> send the request to Google -> wait for the JSON response and parse it.
First, to use the Google API for speech recognition, you need to know the following:
1. How to send a POST request (you can use an open-source library such as ASIHTTPRequest or AFNetworking; both encapsulate network requests and are very simple to use);
2. The audio formats PCM, WAV, and FLAC, and the relationship between the three: the Google API only accepts FLAC audio, other formats are not recognized; iOS cannot record FLAC, and cannot record WAV either, only PCM, so the audio has to be converted step by step;
3. How the AVAudioRecorder class is used and how to configure it.
Recording on iOS uses the AVAudioRecorder class, which is initialized like this:
- (id)initWithURL:(NSURL *)url settings:(NSDictionary *)settings error:(NSError **)outError;
url: where the sound is stored after recording completes.
settings: the recording parameters. Only one key is critical to mention here, AVFormatIDKey; it determines the format of the recorded sound. We want to record LPCM, uncompressed raw data, so that we can convert it afterwards, so use the value kAudioFormatLinearPCM. The other keys are described in the help documentation.
NSMutableDictionary *recordSetting = [[NSMutableDictionary alloc] init];
[recordSetting setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey];
[recordSetting setValue:[NSNumber numberWithFloat:16000.0] forKey:AVSampleRateKey];
[recordSetting setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey];
[recordSetting setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
[recordSetting setValue:[NSNumber numberWithInt:AVAudioQualityHigh] forKey:AVEncoderAudioQualityKey];
[recordSetting setValue:@(NO) forKey:AVLinearPCMIsBigEndianKey];
Once this object is set up, you're ready to start recording. With the LPCM audio data in hand we can start our first conversion, to WAV. A WAV file is essentially the raw PCM samples with a RIFF header describing them in front. The transcoding is implemented in C; I packaged that part of the code in the files below.
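To illustrate the idea, here is a minimal sketch of the core of a PCM-to-WAV conversion: writing the canonical 44-byte RIFF header that precedes the raw samples. It assumes a little-endian host (true for all iOS devices) and the parameters would match the recorder settings above (16 kHz, mono, 16-bit); this is not the packaged code from the project, just an illustration.

```c
#include <stdint.h>
#include <string.h>

/* Fill a 44-byte canonical WAV header for uncompressed PCM data.
   data_size is the size of the raw PCM payload in bytes.
   Integers are copied as-is, so this assumes a little-endian host. */
static void write_wav_header(uint8_t h[44], uint32_t data_size,
                             uint16_t channels, uint32_t sample_rate,
                             uint16_t bits_per_sample)
{
    uint32_t byte_rate    = sample_rate * channels * bits_per_sample / 8;
    uint16_t block_align  = channels * bits_per_sample / 8;
    uint32_t riff_size    = 36 + data_size;  /* total file size minus 8 */
    uint32_t fmt_size     = 16;              /* PCM "fmt " chunk is 16 bytes */
    uint16_t audio_format = 1;               /* 1 = uncompressed linear PCM */

    memcpy(h,      "RIFF", 4);
    memcpy(h + 4,  &riff_size, 4);
    memcpy(h + 8,  "WAVE", 4);

    memcpy(h + 12, "fmt ", 4);
    memcpy(h + 16, &fmt_size, 4);
    memcpy(h + 20, &audio_format, 2);
    memcpy(h + 22, &channels, 2);
    memcpy(h + 24, &sample_rate, 4);
    memcpy(h + 28, &byte_rate, 4);
    memcpy(h + 32, &block_align, 2);
    memcpy(h + 34, &bits_per_sample, 2);

    memcpy(h + 36, "data", 4);
    memcpy(h + 40, &data_size, 4);
}
```

Writing this header followed by the raw PCM bytes yields a playable WAV file.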
After converting the file to WAV, you also need to convert the WAV to FLAC before it can be uploaded to the Google API for speech recognition. Fortunately someone on GitHub has wrapped FLAC as an open-source library for iOS: https://github.com/jhurt/FLACiOS
Download this source and remove its OGG support, otherwise it won't compile. Build it, then pick up the .a and the framework from the Products directory and add both files to your project.
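FLACiOS wraps the standard libFLAC C API, so once the library is linked, encoding the WAV samples looks roughly like the following sketch. This is an assumption about how you would drive libFLAC directly (buffer handling and most error checks omitted); note the 16-bit samples must first be widened to FLAC__int32, and the sample rate must match the rate=16000 declared in the request header later.

```c
#include <FLAC/stream_encoder.h>

/* Encode interleaved PCM samples (16-bit values widened to FLAC__int32)
   into a FLAC file matching the recorder settings above.
   num_samples is the number of samples per channel. */
int encode_flac(const char *out_path,
                const FLAC__int32 *samples, unsigned num_samples)
{
    FLAC__StreamEncoder *enc = FLAC__stream_encoder_new();
    if (enc == NULL)
        return -1;

    FLAC__stream_encoder_set_channels(enc, 1);          /* mono */
    FLAC__stream_encoder_set_bits_per_sample(enc, 16);
    FLAC__stream_encoder_set_sample_rate(enc, 16000);   /* must match the POST header */

    if (FLAC__stream_encoder_init_file(enc, out_path, NULL, NULL)
            != FLAC__STREAM_ENCODER_INIT_STATUS_OK) {
        FLAC__stream_encoder_delete(enc);
        return -1;
    }

    FLAC__bool ok = FLAC__stream_encoder_process_interleaved(enc, samples, num_samples);
    FLAC__stream_encoder_finish(enc);
    FLAC__stream_encoder_delete(enc);
    return ok ? 0 : -1;
}
```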
Once the audio is processed, send the request to the Google speech API. I used ASIHTTPRequest; you can use another library to send it, since ASI is a bit old by now, I just happened to use it. Here filePath is the path of the converted FLAC file:
#define GOOGLE_AUDIO_URL @"http://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=zh-CN"

NSURL *url = [NSURL URLWithString:GOOGLE_AUDIO_URL];
ASIFormDataRequest *request = [ASIFormDataRequest requestWithURL:url];
[request addRequestHeader:@"Content-Type" value:@"audio/x-flac;rate=16000"];
[request appendPostDataFromFile:filePath];
[request setRequestMethod:@"POST"];
request.completionBlock = ^{
    NSLog(@"json:%@", request.responseString);
    NSData *data = request.responseData;
    id ret = nil;
    ret = [NSJSONSerialization JSONObjectWithData:data options:NSJSONReadingMutableContainers error:nil];
    NSLog(@"ret %@", ret);
    results(ret);
};
request.failedBlock = ^{
    UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Error" message:@"Network request error" delegate:nil cancelButtonTitle:@"OK" otherButtonTitles:nil, nil];
    [alert show];
    NSLog(@"Network request error: %@", request.error);
};
[request startSynchronous];
The following parses the JSON returned by Google:
if (dic == nil || [dic count] == 0) {
    return;
}
NSArray *array = [dic objectForKey:@"hypotheses"];
if ([array count]) {
    NSDictionary *dic_hypotheses = [array objectAtIndex:0];
    NSString *sContent = [NSString stringWithFormat:@"%@", [dic_hypotheses objectForKey:@"utterance"]];
    self.textField.text = sContent;
}
Here is a test project I wrote containing all the code:
http://pan.baidu.com/s/1kTMBBk7 (download it directly; it can be used as-is)