It is generally agreed that voice is an important means of human-computer interaction, even though speaking commands to a phone in public can feel a little awkward. In resource-constrained IoT scenarios, however (no external mouse, keyboard, or display), being able to control and interact with devices by voice is genuinely practical. Following the previous article, "Windows IoT Serials 4 - How to Use Cortana Voice Assistant on a Raspberry Pi", this article describes in detail how to add speech recognition and voice interaction to a Raspberry Pi running the Windows 10 IoT Core system.
1. Hardware Preparation
- Raspberry Pi 2 / Raspberry Pi 3, 5V/2A power supply, TF card (8 GB or larger)
- Microphone: Microsoft LifeCam HD-3000 (the camera has an integrated microphone). Other microphones can also be used, such as the Blue Snowball iCE condenser microphone (cardioid) or the Sound Tech CM-1000 USB Table Top Conference Meeting Microphone.
- Controlled objects: two LED lights are used here as an example. You can add controlled objects according to your actual needs; for example, after adding a relay module you can control mains-powered equipment.
- Audio output device (optional): on Windows 10 IoT Core, the Raspberry Pi only supports audio output through the 3.5mm jack; HDMI audio output is not supported. So you can connect an ordinary headset or speaker to the 3.5mm jack.
- Display device (optional): a monitor that connects to the HDMI port, or a VGA monitor attached through an active HDMI-to-VGA adapter.
Note that the audio output device and the display device are optional and not required.
2. Hardware connection
Here the two LEDs are connected to the Raspberry Pi's GPIO5 and GPIO6 pins, while the microphone is plugged into a USB port on the Raspberry Pi. If you have an audio output device (such as a headset or speaker) and a display device (monitor), connect them to the Raspberry Pi's 3.5mm audio jack and HDMI port.
3. Program Preparation
The development environment used by this application is Windows 10 plus Visual Studio Community; note that the Visual Studio installation needs to include the Universal Windows App Development Tools component.
3.1 Creating new projects and adding resources
When creating the new project, select the Universal (Windows Universal) template and name the project RPiVoiceControl, as shown in the figure below.
Because the GPIO pins are used to control the LEDs, you need to add the Windows IoT Extensions for the UWP reference to the project, as shown in the figure below.
Because the microphone is needed, the Microphone capability must be ticked in the project's Package.appxmanifest file, as shown in the figure below.
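For reference, ticking the Microphone capability adds an entry along these lines to Package.appxmanifest:

<Capabilities>
  <DeviceCapability Name="microphone" />
</Capabilities>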
In addition, because resources such as speech recognition, the GPIO pins for the LEDs, and UI controls are used, the following namespaces need to be imported in the application:
using System;
using System.Diagnostics;
// ... a number of other using directives are omitted here ...
using Windows.Devices.Gpio;             // LED (GPIO)
using Windows.Media.SpeechRecognition;  // speech recognition
using Windows.Media.SpeechSynthesis;
using Windows.Storage;
using Windows.ApplicationModel;
3.2 New Voice Command definition file
Add a new XML file named Grammar.xml to the project to define the voice commands. The voice commands used in this project conform to the Speech Recognition Grammar Specification (SRGS) Version 1.0 standard; for details of the specification you can refer to this MSDN document: Create Grammars Using SRGS XML (Microsoft.Speech).
After that, open the file and add the following voice commands to it.
<?xml version="1.0" encoding="utf-8"?>
<grammar
  version="1.0"
  xml:lang="en-US"
  root="automationCommands"
  xmlns="http://www.w3.org/2001/06/grammar"
  tag-format="semantics/1.0">

  <rule id="root">
    <item>
      <ruleref uri="#automationCommands"/>
      <tag>out.command=rules.latest();</tag>
    </item>
  </rule>
Some rules are omitted here; please refer to the full code of the project on GitHub.
  <rule id="deviceActions">
    <one-of>
      <item>
        light <tag> out="light"; </tag>
      </item>
      <item>
        led <tag> out="led"; </tag>
      </item>
    </one-of>
  </rule>
</grammar>
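To illustrate how the omitted rules fit together, the following is a hypothetical fragment (not the project's actual rules) showing how an on/off command could be tagged in the same semantics/1.0 format; the real Grammar.xml on GitHub defines similar rules for the command, the target room, and the activation phrase:

  <rule id="deviceCommands">
    <one-of>
      <item>
        turn on <tag> out="on"; </tag>
      </item>
      <item>
        turn off <tag> out="off"; </tag>
      </item>
    </one-of>
  </rule>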
3.3 Program Interface Design
If you are not planning to attach a display to the Raspberry Pi, you can skip this step. If you want to check the program's status while it is running, you can add a few simple controls. Here, only two Ellipse controls indicating the state of the two LEDs, two TextBlock controls indicating the state of the program, and one MediaElement control are added; the code is as follows (adjust the sizes and margins to suit your display).
<Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
    <StackPanel HorizontalAlignment="Center" VerticalAlignment="Center">
        <Ellipse x:Name="BedroomLED" Fill="LightGray" Stroke="White" Width="100" Height="100" Margin="10"/>
        <Ellipse x:Name="KitchenRoomLED" Fill="LightGray" Stroke="White" Width="100" Height="100" Margin="10"/>
        <TextBlock x:Name="GpioStatus" Text="Waiting to initialize GPIO..." Margin="10,50,10,10" TextAlignment="Center" FontSize="26.667"/>
        <TextBlock x:Name="VoiceStatus" Text="Waiting to initialize Microphone" Margin="10,50,10,10" TextAlignment="Center" TextWrapping="Wrap"/>
        <MediaElement x:Name="MediaElement"></MediaElement>
    </StackPanel>
</Grid>
3.4 Background Code
In the code-behind, you first need to define the resource objects used by the application, such as the GPIO pins, brushes, and timers. Some of the code is shown below:
private const int BEDROOMLED_PINNUMBER = 5;
private GpioPin BedroomLED_GpioPin;
private GpioPinValue BedroomLED_GpioPinValue;
private DispatcherTimer BedroomTimer;

private const int KITCHENLED_PINNUMBER = 6;
private GpioPin KitchenLED_GpioPin;
private GpioPinValue KitchenLED_GpioPinValue;
private DispatcherTimer KitchenTimer;

private SolidColorBrush RedBrush = new SolidColorBrush(Windows.UI.Colors.Red);
private SolidColorBrush GrayBrush = new SolidColorBrush(Windows.UI.Colors.LightGray);
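Alongside these, the speech recognizer itself, the grammar file name, and the semantic tag keys used later are also declared as fields. The values below are assumptions for illustration; the tag keys must match the semantic tags actually defined in Grammar.xml (see the project on GitHub for the real values):

private SpeechRecognizer recognizer;

// Name of the SRGS grammar file added to the project
private const string SRGS_FILE = "Grammar.xml";

// Keys of the semantic properties produced by the grammar (illustrative values)
private const string TAG_TARGET = "target";
private const string TAG_CMD = "cmd";
private const string TAG_DEVICE = "device";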
Then, in the MainPage constructor, add the initialization of these resources; some of the code is as follows:
public MainPage()
{
    this.InitializeComponent();
    Unloaded += MainPage_Unloaded;

    // Initialize recognizer
    InitializeSpeechRecognizer();

    InitBedroomGPIO();
    InitKitchenGPIO();

    BedroomTimer = new DispatcherTimer();
    BedroomTimer.Interval = TimeSpan.FromMilliseconds(500);
    BedroomTimer.Tick += BedroomTimer_Tick;

    KitchenTimer = new DispatcherTimer();
    KitchenTimer.Interval = TimeSpan.FromMilliseconds(500);
    KitchenTimer.Tick += KitchenTimer_Tick;
}
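The InitBedroomGPIO and InitKitchenGPIO functions are not listed in full in this article. A minimal sketch of what such an initialization typically looks like with the Windows.Devices.Gpio API is shown below; InitKitchenGPIO would be identical except for the pin number and the fields it uses:

private void InitBedroomGPIO()
{
    // Get the default GPIO controller; it is null on devices without GPIO
    GpioController gpio = GpioController.GetDefault();
    if (gpio == null)
    {
        GpioStatus.Text = "There is no GPIO controller on this device.";
        return;
    }

    // Open the pin, drive it high (LED off for this wiring) and switch it to output mode
    BedroomLED_GpioPin = gpio.OpenPin(BEDROOMLED_PINNUMBER);
    BedroomLED_GpioPinValue = GpioPinValue.High;
    BedroomLED_GpioPin.Write(BedroomLED_GpioPinValue);
    BedroomLED_GpioPin.SetDriveMode(GpioPinDriveMode.Output);

    GpioStatus.Text = "GPIO pin initialized correctly.";
}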
In the InitializeSpeechRecognizer function, the state-changed and result-generated event handlers are attached and the voice command (grammar) file is loaded; part of the code is as follows:
private async void InitializeSpeechRecognizer()
{
    // Initialize recognizer
    recognizer = new SpeechRecognizer();

    // Set event handlers
    recognizer.StateChanged += RecognizerStateChanged;
    recognizer.ContinuousRecognitionSession.ResultGenerated += RecognizerResultGenerated;

    // Load the grammar file constraint
    string fileName = String.Format(SRGS_FILE);
    StorageFile grammarContentFile = await Package.Current.InstalledLocation.GetFileAsync(fileName);
    SpeechRecognitionGrammarFileConstraint grammarConstraint = new SpeechRecognitionGrammarFileConstraint(grammarContentFile);

    // Add the grammar constraint and compile it
    recognizer.Constraints.Add(grammarConstraint);
    SpeechRecognitionCompilationResult compilationResult = await recognizer.CompileConstraintsAsync();
    Debug.WriteLine("Status: " + compilationResult.Status.ToString());

    // If compilation succeeds, start continuous recognition
    if (compilationResult.Status == SpeechRecognitionResultStatus.Success)
    {
        Debug.WriteLine("Result: " + compilationResult.ToString());
        await recognizer.ContinuousRecognitionSession.StartAsync();
    }
    else
    {
        Debug.WriteLine("Status: " + compilationResult.Status);
    }
}
After that, handlers for the RecognizerResultGenerated and RecognizerStateChanged events are added; they handle the speech recognition results and the recognizer state changes respectively. Some of the code is as follows:
private async void RecognizerResultGenerated(SpeechContinuousRecognitionSession session, SpeechContinuousRecognitionResultGeneratedEventArgs args)
{
    // Check for the different semantic tags and initialize the variables
    string location = args.Result.SemanticInterpretation.Properties.ContainsKey(TAG_TARGET) ?
                      args.Result.SemanticInterpretation.Properties[TAG_TARGET][0].ToString() :
                      "";
    string cmd = args.Result.SemanticInterpretation.Properties.ContainsKey(TAG_CMD) ?
                 args.Result.SemanticInterpretation.Properties[TAG_CMD][0].ToString() :
                 "";
    string device = args.Result.SemanticInterpretation.Properties.ContainsKey(TAG_DEVICE) ?
                    args.Result.SemanticInterpretation.Properties[TAG_DEVICE][0].ToString() :
                    "";

    // Show the recognized command on the UI thread
    await Windows.ApplicationModel.Core.CoreApplication.MainView.CoreWindow.Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
    {
        VoiceStatus.Text = "Target: " + location + ", Command: " + cmd + ", Device: " + device;
    });

    switch (device)
    {
        case "HiActivationCmd":    // activation phrase
            SaySomething("HiActivationCmd", "on");
            break;
        case "light":
            LightControl(cmd, location);
            break;
        default:
            break;
    }
}
// Recognizer state changed
private async void RecognizerStateChanged(SpeechRecognizer sender, SpeechRecognizerStateChangedEventArgs args)
{
    await Windows.ApplicationModel.Core.CoreApplication.MainView.CoreWindow.Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
    {
        VoiceStatus.Text = "Speech recognizer state: " + args.State.ToString();
    });
}
Next, define the SaySomething function, which generates spoken feedback so that the user can hear voice responses from the Raspberry Pi. Some of the code is as follows:
private async void SaySomething(string myDevice, string state, int speechCharacterVoice = 0)
{
    if (myDevice == "HiActivationCmd")
        PlayVoice($"Hi Jack, what can I do for you?");
    else
        PlayVoice($"OK Jack, {myDevice} {state}", speechCharacterVoice);

    await Windows.ApplicationModel.Core.CoreApplication.MainView.CoreWindow.Dispatcher.RunAsync(Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
    {
        VoiceStatus.Text = $"OK ===== {myDevice} --- {state} =======";
    });
}
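The PlayVoice function is not listed in this article. A minimal sketch is shown below, assuming it uses Windows.Media.SpeechSynthesis together with the MediaElement control declared in the XAML; the actual implementation in the project may differ:

private async void PlayVoice(string text, int speechCharacterVoice = 0)
{
    using (SpeechSynthesizer synthesizer = new SpeechSynthesizer())
    {
        // Optionally pick one of the installed voices by index
        var voices = SpeechSynthesizer.AllVoices;
        if (speechCharacterVoice >= 0 && speechCharacterVoice < voices.Count)
        {
            synthesizer.Voice = voices[speechCharacterVoice];
        }

        // Synthesize the text and play it through the MediaElement on the UI thread
        SpeechSynthesisStream stream = await synthesizer.SynthesizeTextToStreamAsync(text);
        await Windows.ApplicationModel.Core.CoreApplication.MainView.CoreWindow.Dispatcher.RunAsync(
            Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
        {
            MediaElement.SetSource(stream, stream.ContentType);
            MediaElement.Play();
        });
    }
}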
Finally, in the two timers' Tick event handlers, the handling of the LED lights is added; part of the code is as follows:
private void BedroomTimer_Tick(object sender, object e)
{
    // Toggle the LED output and update the on-screen indicator
    if (BedroomLED_GpioPinValue == GpioPinValue.High)
    {
        BedroomLED_GpioPinValue = GpioPinValue.Low;
        BedroomLED_GpioPin.Write(BedroomLED_GpioPinValue);
        BedroomLED.Fill = RedBrush;
    }
    else
    {
        BedroomLED_GpioPinValue = GpioPinValue.High;
        BedroomLED_GpioPin.Write(BedroomLED_GpioPinValue);
        BedroomLED.Fill = GrayBrush;
    }
}
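The LightControl function called from the result handler is also omitted above. A minimal sketch is shown below, assuming cmd is "on"/"off" and location is "bedroom"/"kitchen"; refer to the project on GitHub for the actual logic:

private async void LightControl(string cmd, string location)
{
    bool turnOn = (cmd == "on");

    // GpioPin.Write can be called from any thread
    if (location == "bedroom")
    {
        BedroomLED_GpioPinValue = turnOn ? GpioPinValue.Low : GpioPinValue.High;
        BedroomLED_GpioPin.Write(BedroomLED_GpioPinValue);
    }
    else if (location == "kitchen")
    {
        KitchenLED_GpioPinValue = turnOn ? GpioPinValue.Low : GpioPinValue.High;
        KitchenLED_GpioPin.Write(KitchenLED_GpioPinValue);
    }

    // The on-screen indicators must be updated on the UI thread
    await Windows.ApplicationModel.Core.CoreApplication.MainView.CoreWindow.Dispatcher.RunAsync(
        Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
    {
        if (location == "bedroom")
            BedroomLED.Fill = turnOn ? RedBrush : GrayBrush;
        else if (location == "kitchen")
            KitchenRoomLED.Fill = turnOn ? RedBrush : GrayBrush;
    });

    // Spoken feedback, e.g. "OK Jack, bedroom light on"
    SaySomething(location + " light", cmd);
}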
4. Application debugging
In Visual Studio, set the build platform to ARM and the debug target to Remote Machine; in the Debug tab, enter the IP address of the Raspberry Pi, and then click Debug, as shown in the figure below.
After the program is run, the user can interact with the Raspberry Pi via voice commands.
First, the user can say "Hi Jack" to the device and will hear a reply confirming that the application is running correctly.
Second, the user can control the two LEDs with "Turn on/off Bedroom Light" and "Turn on/off Kitchen Light", while the state of the lamps and the state of the speech recognizer can be seen on the application's interface, as shown in the figure below.
A photo of the actual setup with the application running is shown below:
5. Code download
The code for this project has been published on GitHub at the following link: https://github.com/shijiong/RPiVoiceControl. You are welcome to download it.