Speech recognition and synthesis using the iFlytek open platform
Society and its products keep evolving toward convenience and intelligence, and mobile apps are no exception. Apps now shorten their workflows and polish the user experience to make themselves easier to use, which improves adoption and user stickiness. But what about the intelligence side?
A relatively simple way to add intelligence is to offer speech recognition wherever the user has to type, and speech synthesis (direct voice playback) wherever results are displayed. For drivers and other hands-busy users, that is exactly the kind of smart feature that matters.
Several good open platforms now offer speech recognition we can use directly; two of the best known are the iFlytek open platform and the Baidu Speech open platform. I personally prefer iFlytek, because its advantage is higher accuracy on long passages of text, which happens to fit my needs. This post therefore focuses on using the iFlytek SDK.
iFlytek development steps
The official development guide is written to be thorough rather than concise, and its sample is not the simplest possible demo. So this post trims things down appropriately, aiming for the most minimal demo.
1 Apply for an AppID
Log in to the open platform and create an app from the user menu (third-party login also works). Fill in the relevant information on the app-creation page; a link to download the SDK will then appear. If it does not, go directly to the SDK download center and follow the three-step wizard to generate a new download. There is not much more to describe here.
2 Overview of the iFlytek SDK package
The downloaded SDK archive extracts into three folders. The first, doc, unsurprisingly holds the development documentation and related information. The two that matter most are the lib folder, which contains the iFlytek SDK framework file (this is what we import into our project), and a folder with a demo iOS project.
1 Adding a static library
Create a new iOS project, copy the iflyMSC.framework file from the lib folder into the project directory, then in the target's [Build Phases] → [Link Binary With Libraries] click [Add Other…] and select it.
2 Confirm SDK Path
In the build settings, search for "search paths" to find [Framework Search Paths], and check whether the SDK's path is absolute. If it is relative, there is no problem. The point of this step is to make sure the SDK path is relative, so the project still builds after its folder is moved.
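For example (assuming the framework was copied into a lib folder next to the .xcodeproj; the folder name is just an illustration), a relative entry keeps working when the project moves, while an absolute one does not:

```
Relative (good):     $(PROJECT_DIR)/lib
Absolute (fragile):  /Users/me/Desktop/Demo/lib
```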
3 Adding frameworks
Add each of the frameworks shown in the official diagram to the project, one by one. The list in the official API docs was written for Xcode 7 and earlier; starting with Xcode 7, some dynamic libraries changed their suffix (from .dylib to .tbd), so on Xcode 7 and later add the renamed variants instead. For example:
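The official diagram did not survive in this post, but for a typical iFlytek iOS integration the linked frameworks are roughly the following. Treat this list as an assumption and verify it against the doc folder of your own SDK download:

```
iflyMSC.framework
Foundation.framework, UIKit.framework
AVFoundation.framework, AudioToolbox.framework
SystemConfiguration.framework, CoreTelephony.framework
CoreLocation.framework, AddressBook.framework
QuartzCore.framework, CoreGraphics.framework
libz.tbd, libc++.tbd   (libz.dylib and libc++.dylib before Xcode 7)
```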
Speech recognition
Speech recognition comes in two flavors for different situations: one with an interface prompt and one without. We start with the version that has an interface prompt.
Speech recognition with interface prompts
1 Importing header files
Import all the classes in the iFlytek SDK:

#import <iflyMSC/iflyMSC.h>
2 Logging in to the iFlytek server
Before using iFlytek's speech services, the user must be authenticated, i.e., the app must log in to the iFlytek server. Add the two lines of code below to viewDidLoad. The ID is the AppID assigned when we created the app on the open platform; it also appears in the downloaded SDK.
NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@", @"565E4DD9"];
[IFlySpeechUtility createUtility:initString];
3 Creating a speech recognition object with interface hints
Declare an iFlytek speech recognition object so we can call methods on it later:
@property (nonatomic, strong) IFlyRecognizerView *iflyRecognizerView; // recognizer with a built-in UI
4 Initializing the recognizer with an interface
The speech recognition object declared above still has to be initialized; this is also done in viewDidLoad.
#pragma mark - Initialize the speech recognizer with a UI
_iflyRecognizerView = [[IFlyRecognizerView alloc] initWithCenter:self.view.center];
_iflyRecognizerView.delegate = self;
// Set dictation mode
[_iflyRecognizerView setParameter:@"iat" forKey:[IFlySpeechConstant IFLY_DOMAIN]];
// ASR_AUDIO_PATH: file name for saving the recording; set to nil if not needed.
// The default directory is Documents.
[_iflyRecognizerView setParameter:@"" forKey:[IFlySpeechConstant ASR_AUDIO_PATH]];
5 Implementing the delegate method
The recognition result is delivered through a delegate callback: implement the onResult:isLast: method of the IFlyRecognizerViewDelegate protocol.
Attention! This one is onResult:, not onResults: — the latter is the result callback of the speech recognizer without an interface prompt.
- (void)onResult:(NSArray *)resultArray isLast:(BOOL)isLast
{
    NSMutableString *result = [[NSMutableString alloc] init];
    NSDictionary *dic = [resultArray objectAtIndex:0];
    for (NSString *key in dic) {
        [result appendFormat:@"%@", key];
    }
    // ISRDataHelper.h and ISRDataHelper.m come from the downloaded demo:
    // drag them into the project and import the header before using them.
    NSString *resu = [ISRDataHelper stringFromJson:result];
    // Show the result on a label in the interface
    _text.text = [NSString stringWithFormat:@"%@%@", _text.text, resu];
}
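For context on what stringFromJson: has to deal with: a raw iat dictation result is a JSON document along roughly the following lines (field names follow iFlytek's dictation result format; treat the exact shape as an assumption and check the docs shipped with your SDK version). Each ws entry is a word segment, each cw a candidate word, and concatenating the w fields of the segments yields the recognized sentence:

```
{
  "sn": 1, "ls": true, "bg": 0, "ed": 0,
  "ws": [
    { "bg": 0, "cw": [ { "sl": 0, "w": "今天", "gi": 0 } ] },
    { "bg": 0, "cw": [ { "w": "天气" } ] }
  ]
}
```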
The result is returned as a JSON string by default, which you need to parse yourself. Conveniently, iFlytek's demo provides a parser class for exactly this, the ISRDataHelper used above; just reuse it.

6 Triggering speech recognition

Drag a button onto the view and start listening for speech in its action method:
// Start the recognition session
[_iflyRecognizerView start];
Run the app now and speech recognition works; the result should look like the screenshot below.
Speech recognition without an interface hint
Speech recognition without an interface prompt suits running recognition in the background; which flavor to use depends on your scenario. The UI-less approach looks simpler and cleaner, and is more customizable.
1 Importing header files
// Import all the classes in the iFlytek SDK
#import <iflyMSC/iflyMSC.h>
2 Logging in to the iFlytek server
As before, the app must log in to the iFlytek server for authentication before using the speech services. Add the two lines of code below to viewDidLoad; the ID is the AppID of the app we created on the open platform, also found in the downloaded SDK.
NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@", @"565E4DD9"];
[IFlySpeechUtility createUtility:initString];
3 Creating a speech recognition object without an interface hint
Declare an iFlytek speech recognition object so we can call methods on it later:
@property (nonatomic, strong) IFlySpeechRecognizer *iflySpeechRecognizer; // recognizer without a UI
4 Object Initialization
The recognition object declared above now needs to be initialized, again from viewDidLoad. Since the initialization involves quite a few voice settings, it gets its own method, which viewDidLoad then calls.
- (void)initRecognizer
{
    // Singleton, no UI instance
    if (_iflySpeechRecognizer == nil) {
        _iflySpeechRecognizer = [IFlySpeechRecognizer sharedInstance];
        [_iflySpeechRecognizer setParameter:@"" forKey:[IFlySpeechConstant PARAMS]];
        // Set dictation mode
        [_iflySpeechRecognizer setParameter:@"iat" forKey:[IFlySpeechConstant IFLY_DOMAIN]];
    }
    _iflySpeechRecognizer.delegate = self;

    if (_iflySpeechRecognizer != nil) {
        IATConfig *instance = [IATConfig sharedInstance];
        // Maximum recording duration
        [_iflySpeechRecognizer setParameter:instance.speechTimeout forKey:[IFlySpeechConstant SPEECH_TIMEOUT]];
        // Trailing end-point silence
        [_iflySpeechRecognizer setParameter:instance.vadEos forKey:[IFlySpeechConstant VAD_EOS]];
        // Leading front-point silence
        [_iflySpeechRecognizer setParameter:instance.vadBos forKey:[IFlySpeechConstant VAD_BOS]];
        // Network timeout
        [_iflySpeechRecognizer setParameter:@"20000" forKey:[IFlySpeechConstant NET_TIMEOUT]];
        // Sample rate, 16K recommended
        [_iflySpeechRecognizer setParameter:instance.sampleRate forKey:[IFlySpeechConstant SAMPLE_RATE]];
        if ([instance.language isEqualToString:[IATConfig chinese]]) {
            // Set the language
            [_iflySpeechRecognizer setParameter:instance.language forKey:[IFlySpeechConstant LANGUAGE]];
            // Set the dialect/accent
            [_iflySpeechRecognizer setParameter:instance.accent forKey:[IFlySpeechConstant ACCENT]];
        } else if ([instance.language isEqualToString:[IATConfig english]]) {
            [_iflySpeechRecognizer setParameter:instance.language forKey:[IFlySpeechConstant LANGUAGE]];
        }
        // Whether to return punctuation
        [_iflySpeechRecognizer setParameter:instance.dot forKey:[IFlySpeechConstant ASR_PTT]];
    }
}
5 Implementing the delegate method
The recognition result is delivered through a delegate callback: implement the onResults:isLast: method of the IFlySpeechRecognizerDelegate protocol.
Attention! This one is onResults:, not onResult: — the latter is the result callback of the speech recognizer with an interface prompt.
- (void)onResults:(NSArray *)results isLast:(BOOL)isLast
{
    NSMutableString *result = [[NSMutableString alloc] init];
    NSDictionary *dic = [results objectAtIndex:0];
    for (NSString *key in dic) {
        [result appendFormat:@"%@", key];
    }
    NSString *resu = [ISRDataHelper stringFromJson:result];
    _text.text = [NSString stringWithFormat:@"%@%@", _text.text, resu];
}
6 Triggering speech recognition
Add a button; in its action method, configure the recognizer and start listening:
if (_iflySpeechRecognizer == nil) {
    [self initRecognizer];
}
[_iflySpeechRecognizer cancel];
// Use the microphone as the audio source
[_iflySpeechRecognizer setParameter:IFLY_AUDIO_SOURCE_MIC forKey:@"audio_source"];
// Return dictation results as JSON
[_iflySpeechRecognizer setParameter:@"json" forKey:[IFlySpeechConstant RESULT_TYPE]];
// Save the recording under the SDK working path; if no working path is set,
// the default is Library/cache
[_iflySpeechRecognizer setParameter:@"asr.pcm" forKey:[IFlySpeechConstant ASR_AUDIO_PATH]];
[_iflySpeechRecognizer setDelegate:self];
BOOL ret = [_iflySpeechRecognizer startListening];
Speech synthesis
The speech synthesis workflow is almost identical to that of speech recognition.
1 Importing header files
// Import all the classes in the iFlytek SDK
#import <iflyMSC/iflyMSC.h>
#import "PcmPlayer.h"
#import "TTSConfig.h"
2 Logging in to the iFlytek server
Again, the app must log in to the iFlytek server for authentication before using the speech services. Add the two lines of code below to viewDidLoad; the ID is the AppID of the app we created on the open platform, also found in the downloaded SDK.
NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@", @"565E4DD9"];
[IFlySpeechUtility createUtility:initString];
3 Creating a speech synthesis object
Declare an iFlytek speech synthesis object so we can call methods on it later:
typedef NS_OPTIONS(NSInteger, SynthesizeType) {
    NomalType = 5,   // normal synthesis
    UriType   = 6,   // URI synthesis
};

@property (nonatomic, strong) IFlySpeechSynthesizer *iflySpeechSynthesizer; // speech synthesis object
@property (nonatomic, strong) PcmPlayer *audioPlayer;                       // plays the synthesized audio
@property (nonatomic, assign) SynthesizeType synType;                       // which kind of synthesis
@property (nonatomic, assign) BOOL hasError;                                // whether an error occurred
4 Object Initialization
The objects used for speech synthesis were declared above; now initialize them, again in viewDidLoad.
TTSConfig *instance = [TTSConfig sharedInstance];
if (instance == nil) {
    return;
}
// Synthesis service singleton
if (_iflySpeechSynthesizer == nil) {
    _iflySpeechSynthesizer = [IFlySpeechSynthesizer sharedInstance];
}
_iflySpeechSynthesizer.delegate = self;
// Speed, 1-100
[_iflySpeechSynthesizer setParameter:instance.speed forKey:[IFlySpeechConstant SPEED]];
// Volume, 1-100
[_iflySpeechSynthesizer setParameter:instance.volume forKey:[IFlySpeechConstant VOLUME]];
// Pitch, 1-100
[_iflySpeechSynthesizer setParameter:instance.pitch forKey:[IFlySpeechConstant PITCH]];
// Sample rate
[_iflySpeechSynthesizer setParameter:instance.sampleRate forKey:[IFlySpeechConstant SAMPLE_RATE]];
// Voice (speaker) name
[_iflySpeechSynthesizer setParameter:instance.vcnName forKey:[IFlySpeechConstant VOICE_NAME]];
5 Triggering speech synthesis
Add a text field and a button; the button's action method reads aloud the text in the field:
if ([self.voiceText.text isEqualToString:@""]) {
    return;
}
if (_audioPlayer != nil && _audioPlayer.isPlaying == YES) {
    [_audioPlayer stop];
}
_hasError = NO;
[NSThread sleepForTimeInterval:0.05];
_iflySpeechSynthesizer.delegate = self;
[_iflySpeechSynthesizer startSpeaking:self.voiceText.text];