Objective
Most current applications rely mainly on text input, but voice input is becoming more and more widely used. This article describes a 24-point iOS game written with the Olami SDK, where the numbers are entered by voice.
The Olami SDK is documented at the following URL:
https://cn.olami.ai/wiki/?mp=sdk&content=sdk/ios/reference.html
This page describes the SDK's functions and delegate definitions in detail.
App implementation
The 24-point app is used below to explain, step by step, how to use this SDK.
This app can be downloaded from https://github.com/lym-ay/OlamiRecognizerMath24
- Download the Olami SDK from the URL above. It contains two files: the Olami static library and its header file. Add both to the Xcode project, as sketched below.
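Once the library and header are in the project, the view controller imports the header and adopts the delegate protocol. A minimal sketch, assuming the header file name and the instance variable based on the class and protocol names used in the rest of this article:

#import <UIKit/UIKit.h>
#import "OlamiRecognizer.h"   // header shipped with the SDK (file name assumed)

// The view controller adopts OlamiRecognizerDelegate so it can receive the onResult callback.
@interface ViewController : UIViewController <OlamiRecognizerDelegate> {
    OlamiRecognizer *olamiRecognizer;   // the recognizer used throughout this article
}
@end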
1. Initialize the Olami speech recognition object and set its delegate
olamiRecognizer = [[OlamiRecognizer alloc] init];
olamiRecognizer.delegate = self;
2. Call the setAuthorization function to authorize
[olamiRecognizer setAuthorization:@"d13bbcbef2a4460dbf19ced850eb5d83"
                              api:@"asr"
                        appSecret:@"3b08b349c0924a79869153bea334dd86"
                            cusid:OLACUSID];
The parameters of this function are described in the OlamiRecognizer API reference:
https://cn.olami.ai/wiki/?mp=sdk&content=sdk/ios/reference.html
Some of the parameter values are only available after registering on the Olami development platform at https://olami.ai
3. Set the language
[olamiRecognizer setLocalization:LANGUAGE_SIMPLIFIED_CHINESE];
The language must be set before recording, otherwise no result will be returned. Currently only Simplified Chinese (LANGUAGE_SIMPLIFIED_CHINESE) is supported.
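Putting steps 1 through 3 together, the setup can run once when the view loads. A sketch, assuming the sample app does this in viewDidLoad (the placement is an assumption; the calls themselves are exactly the ones shown above):

- (void)viewDidLoad {
    [super viewDidLoad];
    // Step 1: create the recognizer and become its delegate
    olamiRecognizer = [[OlamiRecognizer alloc] init];
    olamiRecognizer.delegate = self;
    // Step 2: authorize with the keys registered on the Olami development platform
    [olamiRecognizer setAuthorization:@"d13bbcbef2a4460dbf19ced850eb5d83"
                                  api:@"asr"
                            appSecret:@"3b08b349c0924a79869153bea334dd86"
                                cusid:OLACUSID];
    // Step 3: the language must be set before recording starts
    [olamiRecognizer setLocalization:LANGUAGE_SIMPLIFIED_CHINESE];
}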
4. Start recording
Call the start() interface to start recording:
[olamiRecognizer start];
5. Get the recognized text and semantics and process them
After calling the stop() function, or after recording stops automatically, you receive both the recognized text and its semantic analysis.
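For example, a single button can toggle between the two calls. A sketch; the button action, the title handling, and the isRecording flag are assumptions, only start and stop come from the SDK as used in this article:

@property (nonatomic, assign) BOOL isRecording;   // assumed flag tracking the recording state

// Hypothetical toggle action for a record button.
- (IBAction)recordButtonTapped:(UIButton *)sender {
    if (!self.isRecording) {
        [olamiRecognizer start];   // step 4: begin recording
        [sender setTitle:@"Stop" forState:UIControlStateNormal];
    } else {
        [olamiRecognizer stop];    // ends recording and triggers the onResult callback
        [sender setTitle:@"Record" forState:UIControlStateNormal];
    }
    self.isRecording = !self.isRecording;
}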
Implement the OlamiRecognizerDelegate onResult function to receive the result. The result arrives as a JSON string in the callback; parse the string to get the numbers you need. For example, if the user says "2 3 4 5, count 24 points" into the microphone, the result looks like this:
{ "Data":{ "ASR":{"result": "2345 counts 24 points", "speech_status": 0, "final": true, "status": 0 }, "Nli":[ { "Desc_obj":{"status": 0 }, "Semantic":[ { "app":"Math24", "input":"2345 counts 24.", "Slots":[ { "Num_detail":{"recommend_value": "", "type": "number" }, "name":"Number3", "value": "4" }, { "Num_detail":{"recommend_value": "", "type": "number" }, "name":"Number4", "value": "5" }, { "Num_detail":{"recommend_value": "", "type": "number" }, "name":"Number1", "value": "2" }, { "Num_detail":{"recommend_value": "", "type": "number" }, "name":"Number2", "value": "3" } ], "modifier":[ "Play_calculate" ], "Customer":"58df685e84ae11f0bb7b4893" } ], "type":"Math24" } ]}, "Status":"OK"}
These results are returned according to a set of rules defined with the OSL syntax description language. The result format is described at https://cn.olami.ai/wiki/?mp=api_nlu&content=api_nlu3.html
At this point you may wonder: how does the app know what I mean? This is where OSL comes in. The Olami Syntax Language (OSL) is the syntax markup language used by the Olami platform for natural language processing. The Natural Language Interaction (NLI) management system uses OSL instead of complex programming, which makes it easy to use, easy to learn, and flexible. Detailed instructions are available at this URL:
https://cn.olami.ai/wiki/?mp=osl&content=osl1.html
Before writing this app, we write a set of grammars following OSL's rules. The Olami server understands these grammars, performs the semantic analysis, and returns the result, which is the JSON string shown above. The Olami official website also provides ready-made modules for several domains that can be used directly; https://cn.olami.ai/wiki/?mp=nli&content=nli1.html describes how to use these modules. The 24-point app is written against one of these existing modules.
6. Description of the onResult function
One of the most important functions in the entire program is onResult.
- (void)onResult:(NSData *)result {
    NSError *error;
    __weak typeof(self) weakSelf = self;
    NSDictionary *dic = [NSJSONSerialization JSONObjectWithData:result
                                                        options:NSJSONReadingMutableContainers
                                                          error:&error];
    if (error) {
        NSLog(@"error is %@", error.localizedDescription);
    } else {
        NSString *jsonStr = [[NSString alloc] initWithData:result encoding:NSUTF8StringEncoding];
        NSLog(@"jsonStr is %@", jsonStr);
        NSString *ok = [dic objectForKey:@"status"];
        if ([ok isEqualToString:@"ok"]) {
            NSDictionary *dicData = [dic objectForKey:@"data"];
            NSDictionary *asr = [dicData objectForKey:@"asr"];
            if (asr) { // If the asr node is present, the current input came from voice
                [weakSelf processASR:asr];
            }
            NSDictionary *nli = [[dicData objectForKey:@"nli"] objectAtIndex:0];
            NSDictionary *desc = [nli objectForKey:@"desc_obj"];
            int status = [[desc objectForKey:@"status"] intValue];
            if (status != 0) { // 0 means normal, non-zero means something went wrong
                NSString *descResult = [desc objectForKey:@"result"];
                dispatch_async(dispatch_get_main_queue(), ^{
                    _resultTextView.text = descResult;
                });
            } else {
                NSDictionary *semantic = [[nli objectForKey:@"semantic"] objectAtIndex:0];
                [weakSelf processSemantic:semantic];
            }
        } else {
            dispatch_async(dispatch_get_main_queue(), ^{
                _resultTextView.text = @"Please say 4 numbers within 10.";
            });
        }
    }
}
From this function, three helper functions are called to handle the three most important nodes of the JSON result.
- (void)processASR:(NSDictionary *)asrDic {
    NSString *result = [asrDic objectForKey:@"result"];
    if (result.length == 0) {
        // If the result is empty, pop up an alert (on the main queue) and dismiss it after one second
        dispatch_async(dispatch_get_main_queue(), ^{
            UIAlertController *alertController =
                [UIAlertController alertControllerWithTitle:@"No voice received, please re-enter!"
                                                    message:nil
                                             preferredStyle:UIAlertControllerStyleAlert];
            [self presentViewController:alertController animated:YES completion:^{
                dispatch_time_t time = dispatch_time(DISPATCH_TIME_NOW, 1 * NSEC_PER_SEC);
                dispatch_after(time, dispatch_get_main_queue(), ^{
                    [alertController dismissViewControllerAnimated:YES completion:nil];
                });
            }];
        });
    } else {
        dispatch_async(dispatch_get_main_queue(), ^{
            // Strip the spaces inside the recognized text before showing it
            NSString *str = [result stringByReplacingOccurrencesOfString:@" " withString:@""];
            _inputTextView.text = str;
        });
    }
}
processASR handles the asr node: it extracts the speech recognition result and displays it in the first text view.
- (void)processSemantic:(NSDictionary *)semanticDic {
    NSArray *slots = [semanticDic objectForKey:@"slots"];
    [_slotValue removeAllObjects];
    if (slots.count != 0) {
        for (NSDictionary *dic in slots) {
            NSString *val = [dic objectForKey:@"value"];
            [_slotValue addObject:val];
        }
    }
    NSArray *modifiers = [semanticDic objectForKey:@"modifier"];
    if (modifiers.count != 0) {
        for (NSString *s in modifiers) {
            [self processModify:s];
        }
    }
}
processSemantic handles the semantic node, which contains the slot values. A slot in the OSL syntax description language can be understood as a semantic variable used to pass and extract information; it is explained in detail at https://cn.olami.ai/wiki/?mp=osl&content=osl_slot.html. The numbers the 24-point program needs to calculate are obtained from these slots.
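The _slotValue container that collects these values is not declared in the snippets above; presumably it is a mutable array created before the first result arrives, for example (an assumption about the sample project):

@property (nonatomic, strong) NSMutableArray<NSString *> *slotValue;   // backs the _slotValue ivar

// e.g. at the end of viewDidLoad:
_slotValue = [NSMutableArray array];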
- (void)processModify:(NSString *)str {
    if ([str isEqualToString:@"Play_want"] ||
        [str isEqualToString:@"Play_want_ask"] ||
        [str isEqualToString:@"Needmore"] ||
        [str isEqualToString:@"Needmore_ask"]) {
        // Ask the user to provide the four numbers
        dispatch_async(dispatch_get_main_queue(), ^{
            _resultTextView.text = @"Please say 4 numbers within 10.";
        });
    } else if ([str isEqualToString:@"Rules"]) {
        dispatch_async(dispatch_get_main_queue(), ^{
            _resultTextView.text = @"Combine the four numbers so the result equals 24.";
        });
    } else if ([str isEqualToString:@"Play_calculate"]) {
        NSString *answer = [[Math24 shareInstance] calculate:_slotValue];
        dispatch_async(dispatch_get_main_queue(), ^{
            _resultTextView.text = answer;
        });
    } else if ([str isEqualToString:@"Attention"]) {
        dispatch_async(dispatch_get_main_queue(), ^{
            _resultTextView.text = @"The four numbers must all be within 10.";
        });
    }
}
processModify handles the modifier node of the JSON string. In the OSL syntax description language, a modifier is, alongside the slot, a built-in mechanism for passing information. It is generally used to mark the semantic purpose of a grammar, and can be understood as a comment on the semantics that tells the app developer what intention the matched grammar represents. A detailed description is available at https://cn.olami.ai/wiki/?mp=osl&content=osl_regex.html#11. Through the modifier we know what the user intends, for example to ask a question or to calculate the result.
As the code above shows, the 24-point grammar defines 7 modifiers, whose purpose can be guessed from their names. Modifiers are customized in the OSL grammar, returned in the JSON string, and then handled in the program.
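The Math24 class that calculate: refers to is not shown in this article (it is part of the GitHub project linked above). For illustration only, here is a minimal, hypothetical sketch of what such a 24-point solver could look like; the class name Math24Solver, the sharedInstance accessor, and the return strings are assumptions and are not taken from the real project:

#import <Foundation/Foundation.h>
#include <math.h>

// Hypothetical brute-force 24-point solver, sketched to mirror the
// [[Math24 shareInstance] calculate:_slotValue] call used in processModify.
@interface Math24Solver : NSObject
+ (instancetype)sharedInstance;
- (NSString *)calculate:(NSArray *)numbers;   // numbers: the slot values as strings
@end

@implementation Math24Solver

+ (instancetype)sharedInstance {
    static Math24Solver *instance;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{ instance = [Math24Solver new]; });
    return instance;
}

- (NSString *)calculate:(NSArray *)numbers {
    if (numbers.count != 4) return @"Please say 4 numbers within 10.";
    NSMutableArray<NSNumber *> *values = [NSMutableArray array];
    NSMutableArray<NSString *> *exprs = [NSMutableArray array];
    for (id n in numbers) {
        [values addObject:@([[n description] doubleValue])];
        [exprs addObject:[n description]];
    }
    NSString *solution = [self solve:values exprs:exprs];
    return solution ? [NSString stringWithFormat:@"%@ = 24", solution]
                    : @"These four numbers cannot make 24.";
}

// Repeatedly pick two numbers, combine them with +, -, *, /, and recurse on the
// smaller set until one number is left; succeed if it is (almost) 24.
- (NSString *)solve:(NSArray<NSNumber *> *)values exprs:(NSArray<NSString *> *)exprs {
    if (values.count == 1) {
        return fabs([values[0] doubleValue] - 24.0) < 1e-6 ? exprs[0] : nil;
    }
    NSString *ops = @"+-*/";
    for (NSUInteger i = 0; i < values.count; i++) {
        for (NSUInteger j = 0; j < values.count; j++) {
            if (i == j) continue;
            double a = [values[i] doubleValue], b = [values[j] doubleValue];
            NSMutableArray<NSNumber *> *restV = [values mutableCopy];
            NSMutableArray<NSString *> *restE = [exprs mutableCopy];
            // Remove the larger index first so the smaller index stays valid.
            [restV removeObjectAtIndex:MAX(i, j)]; [restV removeObjectAtIndex:MIN(i, j)];
            [restE removeObjectAtIndex:MAX(i, j)]; [restE removeObjectAtIndex:MIN(i, j)];
            double candidates[4] = { a + b, a - b, a * b, fabs(b) < 1e-9 ? NAN : a / b };
            for (int k = 0; k < 4; k++) {
                if (isnan(candidates[k])) continue;
                NSString *expr = [NSString stringWithFormat:@"(%@ %C %@)",
                                  exprs[i], [ops characterAtIndex:k], exprs[j]];
                NSString *found = [self solve:[restV arrayByAddingObject:@(candidates[k])]
                                        exprs:[restE arrayByAddingObject:expr]];
                if (found) return found;
            }
        }
    }
    return nil;
}
@end

Usage would look like NSString *answer = [[Math24Solver sharedInstance] calculate:@[@"2", @"3", @"4", @"5"]];, which returns a parenthesized expression whose value is 24 (whichever one the depth-first search finds first), or a failure message if no combination works.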
That completes the iOS program that uses the Olami SDK to take four numbers by voice input and compute 24 points.