On HoloLens, voice is one of the three basic input methods and is used in many kinds of interaction. HoloLens supports three types of voice input:
- Voice commands
- Dictation
- Grammar recognition
Voice commands
Voice commands will be familiar to anyone who has developed Windows Phone or Windows Store apps. The developer gives the app a set of keywords and a behavior for each one. When the user says a keyword, the corresponding preset action is invoked. Voice commands on HoloLens follow the same model.
KeywordRecognizer
Namespace: UnityEngine.Windows.Speech
Classes: KeywordRecognizer, PhraseRecognizedEventArgs, SpeechError, SpeechSystemStatus
Usage is simple: initialize a KeywordRecognizer instance with the keywords you want to register, and subscribe to its OnPhraseRecognized event to handle the recognized command.
using System.Collections.Generic;
using System.Linq;
using UnityEngine;
using UnityEngine.Windows.Speech;

public class KeywordManager : MonoBehaviour
{
    KeywordRecognizer keywordRecognizer;
    Dictionary<string, System.Action> keywords = new Dictionary<string, System.Action>();

    void Start()
    {
        // Initialize keywords: map each phrase to the action it triggers
        keywords.Add("Activate", () =>
        {
            // Behavior that you want to perform
        });

        keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());
        keywordRecognizer.OnPhraseRecognized += KeywordRecognizer_OnPhraseRecognized;

        // Start recognition
        keywordRecognizer.Start();
    }

    private void KeywordRecognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        System.Action keywordAction;
        // If a registered keyword was recognized, invoke its action
        if (keywords.TryGetValue(args.text, out keywordAction))
        {
            keywordAction.Invoke();
        }
    }
}
Grammar recognition
Grammar recognition, as in Windows Store apps, relies on an SRGS file that defines a set of grammar rules for speech recognition. For more information, please read: https://msdn.microsoft.com/zh-cn/library/hh378349(v=office.14).aspx
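To make the idea of a grammar file concrete, here is a minimal sketch of what an SRGS XML grammar might look like. The rule name colorCommand, the phrases, and the semantic key color are all hypothetical and are not from the original article; they only illustrate the general shape of a grammar with one rule and a few alternatives.

```xml
<?xml version="1.0" encoding="utf-8"?>
<grammar version="1.0" xml:lang="en-US" root="colorCommand"
         tag-format="semantics/1.0"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- Hypothetical root rule: matches "set color to red", "... green" or "... blue" -->
  <rule id="colorCommand" scope="public">
    <item>set color to</item>
    <one-of>
      <!-- Each alternative stores the chosen value under the semantic key "color" -->
      <item>red <tag>out.color = "red";</tag></item>
      <item>green <tag>out.color = "green";</tag></item>
      <item>blue <tag>out.color = "blue";</tag></item>
    </one-of>
  </rule>
</grammar>
```

With a grammar along these lines, the values assigned in the tag elements would be expected to surface as semantic meanings on the recognition result, so the app can react to the meaning instead of parsing the spoken text itself.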
GrammarRecognizer
Namespace: UnityEngine.Windows.Speech
Classes: GrammarRecognizer, PhraseRecognizedEventArgs, SpeechError, SpeechSystemStatus
After you have created the SRGS file, put it into the StreamingAssets folder:
<project_root>/Assets/StreamingAssets/SRGS/myGrammar.xml
Using the recognizer is also simple. The code is as follows:
using UnityEngine;
using UnityEngine.Windows.Speech;

public class GrammarManager : MonoBehaviour
{
    private GrammarRecognizer grammarRecognizer;

    void Start()
    {
        // Initialize with the SRGS file from the StreamingAssets folder
        grammarRecognizer = new GrammarRecognizer(Application.streamingAssetsPath + "/SRGS/myGrammar.xml");
        grammarRecognizer.OnPhraseRecognized += GrammarRecognizer_OnPhraseRecognized;

        // Start recognition
        grammarRecognizer.Start();
    }

    private void GrammarRecognizer_OnPhraseRecognized(PhraseRecognizedEventArgs args)
    {
        // Semantic meanings defined by the grammar rules that matched
        SemanticMeaning[] meanings = args.semanticMeanings;
        // Perform actions
    }
}
Dictation
Dictation is speech to text, which is also a feature available to Windows Store apps. It plays a bigger role on HoloLens than on other platforms: because of how the device is operated, typing on a keyboard is very inconvenient, whereas voice has no such problem and can greatly improve input efficiency.
DictationRecognizer
Namespace: UnityEngine.Windows.Speech
Classes: DictationRecognizer, SpeechError, SpeechSystemStatus
The dictation feature converts the user's speech to text. It also reports hypotheses (interim guesses at the text while the user is still speaking) and lets you register for its events. The Start() and Stop() methods enable and disable dictation. When you are finished with the recognizer, call Dispose() to release the resources it holds. If you do not, the garbage collector will eventually reclaim them, but at an additional performance cost.
A complete usage example follows:
using UnityEngine;
using UnityEngine.Windows.Speech;

public class DictationManager : MonoBehaviour
{
    private DictationRecognizer dictationRecognizer;

    void Start()
    {
        dictationRecognizer = new DictationRecognizer();

        // Register events
        dictationRecognizer.DictationResult += DictationRecognizer_DictationResult;
        dictationRecognizer.DictationHypothesis += DictationRecognizer_DictationHypothesis;
        dictationRecognizer.DictationComplete += DictationRecognizer_DictationComplete;
        dictationRecognizer.DictationError += DictationRecognizer_DictationError;

        // Start dictation recognition
        dictationRecognizer.Start();
    }

    private void DictationRecognizer_DictationResult(string text, ConfidenceLevel confidence)
    {
        // Custom behavior
    }

    private void DictationRecognizer_DictationHypothesis(string text)
    {
        // Custom behavior
    }

    private void DictationRecognizer_DictationComplete(DictationCompletionCause cause)
    {
        // Custom behavior
    }

    private void DictationRecognizer_DictationError(string error, int hresult)
    {
        // Custom behavior
    }

    void OnDestroy()
    {
        // Unsubscribe from events and free resources
        dictationRecognizer.DictationResult -= DictationRecognizer_DictationResult;
        dictationRecognizer.DictationComplete -= DictationRecognizer_DictationComplete;
        dictationRecognizer.DictationHypothesis -= DictationRecognizer_DictationHypothesis;
        dictationRecognizer.DictationError -= DictationRecognizer_DictationError;
        dictationRecognizer.Dispose();
    }
}
Note: Dictation recognition times out automatically in the following situations:
- If no sound is heard within the first 5 seconds after dictation starts, it times out
- If a result has been recognized but no sound is heard for the next 20 seconds, it also times out
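The timeouts above can be adjusted and detected in code. The following is a minimal sketch, assuming the InitialSilenceTimeoutSeconds and AutoSilenceTimeoutSeconds properties of DictationRecognizer; the class name DictationTimeoutExample and the values 10 and 30 are hypothetical, chosen only for illustration.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical example class, not part of the original article
public class DictationTimeoutExample : MonoBehaviour
{
    private DictationRecognizer dictationRecognizer;

    void Start()
    {
        dictationRecognizer = new DictationRecognizer();

        // Illustrative values only: allow more silence before timing out
        dictationRecognizer.InitialSilenceTimeoutSeconds = 10f; // silence allowed before any speech
        dictationRecognizer.AutoSilenceTimeoutSeconds = 30f;    // silence allowed after a result

        dictationRecognizer.DictationComplete += OnDictationComplete;
        dictationRecognizer.Start();
    }

    private void OnDictationComplete(DictationCompletionCause cause)
    {
        // TimeoutExceeded indicates the session ended because of silence
        if (cause == DictationCompletionCause.TimeoutExceeded)
        {
            Debug.Log("Dictation timed out; call Start() again to keep listening.");
        }
    }

    void OnDestroy()
    {
        dictationRecognizer.DictationComplete -= OnDictationComplete;
        dictationRecognizer.Dispose();
    }
}
```

Checking the completion cause lets the app tell a silence timeout apart from a normal stop, so it can decide whether to prompt the user or restart dictation.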
Using these features together
If you want to use voice commands, grammar recognition, and dictation in the same app, be sure to shut down the current kind of speech recognition completely before starting another. If multiple KeywordRecognizer instances are running, you can shut them all down at once with the following code:
PhraseRecognitionSystem.Shutdown();
To restore all recognizers to their previous state, call the following code after dictation recognition has finished:
PhraseRecognitionSystem.Restart();
You can also simply start a KeywordRecognizer again, which restarts the PhraseRecognitionSystem and has the same effect.
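The hand-over between phrase recognition and dictation could be sketched as follows. This is only an illustration under stated assumptions: the class name RecognizerSwitcher and its BeginDictation method are hypothetical, and restarting the phrase recognizers from the DictationComplete handler is one possible design, not something the original article prescribes.

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Hypothetical helper, not part of the original article
public class RecognizerSwitcher : MonoBehaviour
{
    private DictationRecognizer dictationRecognizer;

    // Call this when the app needs free-form text input
    public void BeginDictation()
    {
        // Stop all keyword and grammar recognizers before dictation starts
        if (PhraseRecognitionSystem.Status == SpeechSystemStatus.Running)
        {
            PhraseRecognitionSystem.Shutdown();
        }

        if (dictationRecognizer == null)
        {
            dictationRecognizer = new DictationRecognizer();
            dictationRecognizer.DictationComplete += OnDictationComplete;
        }
        dictationRecognizer.Start();
    }

    private void OnDictationComplete(DictationCompletionCause cause)
    {
        // Dictation has ended: bring the phrase recognizers back to their previous state
        PhraseRecognitionSystem.Restart();
    }

    void OnDestroy()
    {
        if (dictationRecognizer != null)
        {
            dictationRecognizer.DictationComplete -= OnDictationComplete;
            dictationRecognizer.Dispose();
        }
    }
}
```

Keeping the switch in one place makes it harder to start dictation while keyword or grammar recognizers are still running.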
Summary
Speech recognition has been a strong feature of the Windows platform since Windows 8, and it has an even bigger role on HoloLens. It is one of the most basic ways to interact with the device, it is well supported at the system level, and it is very useful, including in combination with Cortana.
HoloLens Development Notes: Voice Input in Unity