C++ Speech Recognition Interface Quick Start (Microsoft Speech SDK)
For my recent graduation project I used Microsoft's C++ speech recognition interface. I went through a lot of material, ran into many problems, and took plenty of detours. I am writing down my experience here, partly to consolidate what I learned and partly to give back to the community. I hope that after reading this post you can get a C++ speech recognition implementation working in about five minutes. (Platform used: Windows 8 + Visual Studio 2013.)
First, install the SDK
Install MicrosoftSpeechPlatformSDK.msi and accept the default installation path.
Download path:
download.csdn.net/detail/michaelliang12/9510691
Second, create a new project and configure the environment
Settings:
1. Properties – Configuration Properties – C/C++ – General – Additional Include Directories: C:\Program Files\Microsoft SDKs\Speech\v11.0\Include (the exact path depends on where the SDK was installed)
2. Properties – Configuration Properties – Linker – Input – Additional Dependencies: sapi.lib;
Third, speech recognition code
The speech interface has two parts: text-to-speech and speech-to-text.
1. Text To Speech
Header files that need to be added:
    #include <sapi.h>                   // SAPI speech header
    #pragma comment(lib, "sapi.lib")    // link against the SAPI library
Function:
    // speakContent is an LPCTSTR string; call this function to convert text to speech.
    // Speak() takes a wide string, so this assumes a Unicode build (LPCTSTR == LPCWSTR).
    void CBodyBasics::MSSSpeak(LPCTSTR speakContent)
    {
        ISpVoice *pVoice = NULL;

        // Initialize COM
        if (FAILED(::CoInitialize(NULL)))
            MessageBox(NULL, (LPCWSTR)L"COM interface initialization failed!", (LPCWSTR)L"Hint", MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2);

        // Get the SpVoice interface
        HRESULT hr = CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL, IID_ISpVoice, (void **)&pVoice);
        if (SUCCEEDED(hr))
        {
            pVoice->SetVolume((USHORT)100);   // set the volume, range 0 to 100
            pVoice->SetRate(2);               // set the speaking rate, range -10 to 10
            hr = pVoice->Speak(speakContent, 0, NULL);
            pVoice->Release();
            pVoice = NULL;
        }

        // Release COM resources
        ::CoUninitialize();
    }
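The ranges given in the comments above (0 to 100 for the volume, -10 to 10 for the rate) are easy to overshoot when the values come from user input or a settings file. Here is a minimal sketch of clamping them before they reach SetVolume and SetRate. The helper names clampVolume and clampRate are my own and are not part of the SDK, and the block is plain C++ so it can be tried without the SDK installed.

```cpp
#include <algorithm>

// Hypothetical helpers (not part of SAPI): keep values inside the ranges
// that this post gives for ISpVoice::SetVolume and ISpVoice::SetRate.

// Volume is passed as an unsigned short in the range 0..100.
unsigned short clampVolume(int requested)
{
    return static_cast<unsigned short>(std::min(100, std::max(0, requested)));
}

// Rate is passed as a signed value in the range -10..10.
long clampRate(int requested)
{
    return static_cast<long>(std::min(10, std::max(-10, requested)));
}
```

Inside the function above, the calls would then read pVoice->SetVolume(clampVolume(volume)) and pVoice->SetRate(clampRate(rate)), where volume and rate are whatever values the caller supplies.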
2. Speech to Text
This part is more involved, because it has to monitor the microphone in real time, which brings in the Windows message mechanism.
(1) First set the project properties:
Properties – Configuration Properties – C/C++ – Preprocessor – Preprocessor Definitions: _WIN32_DCOM;
(2) Header files to be added:
    #include <sapi.h>                      // SAPI speech header
    #pragma comment(lib, "sapi.lib")       // link against the SAPI library
    #include <sphelper.h>                  // speech recognition helper header
    #include <atlstr.h>                    // needed for CString
    #pragma once
    const int WM_RECORD = WM_USER + 100;   // custom message for recognition events
(3) Define the variables in the program's .h header file
    // Interface variables
    CComPtr<ISpRecognizer>  m_cpRecoEngine;    // speech recognition engine (recognizer) interface
    CComPtr<ISpRecoContext> m_cpRecoCtxt;      // recognition context interface
    CComPtr<ISpRecoGrammar> m_cpCmdGrammar;    // recognition grammar interface
    CComPtr<ISpStream>      m_cpInputStream;   // stream interface
    CComPtr<ISpObjectToken> m_cpToken;         // speech feature (token) interface
    CComPtr<ISpAudio>       m_cpAudio;         // audio interface (used to save the original default input stream)
    ULONGLONG ullGrammerId;
(4) Create the speech recognition initialization function (call it when the program first starts running; in the sample code at the end of this post it is placed in the handler for the dialog initialization message WM_INITDIALOG)
    // Speech recognition initialization function
    void CBodyBasics::MSSListen()
    {
        // Initialize COM
        if (FAILED(::CoInitialize(NULL)))
            MessageBox(NULL, (LPCWSTR)L"COM interface initialization failed!", (LPCWSTR)L"Hint", MB_ICONWARNING | MB_CANCELTRYCONTINUE | MB_DEFBUTTON2);

        HRESULT hr = m_cpRecoEngine.CoCreateInstance(CLSID_SpSharedRecognizer);   // create a shared recognition engine
        if (SUCCEEDED(hr))
        {
            hr = m_cpRecoEngine->CreateRecoContext(&m_cpRecoCtxt);                // create the recognition context
            hr = m_cpRecoCtxt->SetNotifyWindowMessage(m_hWnd, WM_RECORD, 0, 0);   // deliver recognition events as WM_RECORD

            // Set the events we are interested in
            const ULONGLONG ullInterest = SPFEI(SPEI_SOUND_START) | SPFEI(SPEI_SOUND_END) | SPFEI(SPEI_RECOGNITION);
            hr = m_cpRecoCtxt->SetInterest(ullInterest, ullInterest);

            hr = SpCreateDefaultObjectFromCategoryId(SPCAT_AUDIOIN, &m_cpAudio);
            m_cpRecoEngine->SetInput(m_cpAudio, TRUE);

            // Create the grammar rules
            // Dictation mode:
            // hr = m_cpRecoCtxt->CreateGrammar(GIDDICTATION, &m_cpDictationGrammar);
            // if (SUCCEEDED(hr))
            // {
            //     hr = m_cpDictationGrammar->LoadDictation(NULL, SPLO_STATIC);   // load the dictionary
            // }

            // Command and control (C&C) mode; the grammar file uses XML format
            ullGrammerId = 1000;
            hr = m_cpRecoCtxt->CreateGrammar(ullGrammerId, &m_cpCmdGrammar);

            // Load the grammar (the file name is converted from ANSI to Unicode)
            WCHAR wszXMLFile[20] = L"";
            MultiByteToWideChar(CP_ACP, 0, (LPCSTR)"CmdCtrl.xml", -1, wszXMLFile, 20);   // last argument: buffer size in WCHARs
            hr = m_cpCmdGrammar->LoadCmdFromFile(wszXMLFile, SPLO_DYNAMIC);

            MessageBox(NULL, (LPCWSTR)L"Speech recognition started!", (LPCWSTR)L"Hint", MB_CANCELTRYCONTINUE);

            // Activate the grammar for recognition
            // hr = m_cpDictationGrammar->SetDictationState(SPRS_ACTIVE);   // dictation
            hr = m_cpCmdGrammar->SetRuleState(NULL, NULL, SPRS_ACTIVE);     // C&C
            hr = m_cpRecoEngine->SetRecoState(SPRST_ACTIVE);
        }
        else
        {
            MessageBox(NULL, (LPCWSTR)L"Failed to start the speech recognition engine!", (LPCWSTR)L"Warning", MB_OK);
            exit(0);
        }

        // Release COM resources. This balances the CoInitialize call above. If nothing else
        // on this thread keeps COM initialized, move this call to program shutdown, because
        // the recognizer objects must outlive it.
        ::CoUninitialize();
        // Left commented out: deactivating the rules here would stop recognition right away.
        // hr = m_cpCmdGrammar->SetRuleState(NULL, NULL, SPRS_INACTIVE);   // C&C
    }
(5) Define the message handler
This code has to sit together with the other message-handling code; in this example it goes at the end of the DlgProc() function in the sample code at the bottom of the post. All the other code blocks in this post can be copied as they are; only the message response module below needs to be changed.
    // Message handling
    USES_CONVERSION;
    CSpEvent event;

    if (m_cpRecoCtxt)
    {
        while (event.GetFrom(m_cpRecoCtxt) == S_OK)
        {
            switch (event.eEventId)
            {
            case SPEI_RECOGNITION:
                {
                    // Speech was recognized
                    m_bGotReco = TRUE;
                    static const WCHAR wszUnrecognized[] = L"<Unrecognized>";
                    CSpDynamicString dstrText;

                    // Get the recognition result
                    if (FAILED(event.RecoResult()->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE, TRUE, &dstrText, NULL)))
                    {
                        dstrText = wszUnrecognized;
                    }

                    BSTR SRout;
                    dstrText.CopyToBSTR(&SRout);
                    CString Recstring;
                    Recstring.Empty();
                    Recstring = SRout;
                    SysFreeString(SRout);   // the BSTR copy is owned by us, so free it

                    // ****** Message response module ******
                    // The strings compared here must match phrases in CmdCtrl.xml.
                    if (Recstring == _T("Texting"))
                    {
                        MessageBox(NULL, (LPCWSTR)L"Good", (LPCWSTR)L"Hint", MB_OK);
                        MSSSpeak(LPCTSTR(_T("OK, send text now!")));
                    }
                    else if (Recstring == _T("Li lei"))
                    {
                        MSSSpeak(LPCTSTR(_T("Long time No See")));
                    }
                }
                break;
            }
        }
    }
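As the list of commands grows, the if / else-if chain in the message response module gets long. One alternative, which is my own suggestion and not what the sample project does, is a small lookup table from recognized phrase to action. The sketch below is plain C++ with a hypothetical class name CommandTable, so it can be compiled and tried without the SDK.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical table-driven replacement for the if / else-if chain:
// maps a recognized phrase to the action that should run for it.
class CommandTable
{
public:
    // Registers (or replaces) the action for a phrase.
    void add(const std::wstring& phrase, std::function<void()> action)
    {
        actions_[phrase] = std::move(action);
    }

    // Runs the action for the recognized text.
    // Returns false when the phrase has no registered action.
    bool dispatch(const std::wstring& recognized) const
    {
        auto it = actions_.find(recognized);
        if (it == actions_.end())
            return false;
        it->second();
        return true;
    }

private:
    std::map<std::wstring, std::function<void()>> actions_;
};
```

In the handler, the table would be filled once at startup, with each action calling MSSSpeak with the appropriate reply, and the response module would shrink to a single dispatch call on the recognized string.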
(6) Modify the grammar file
Editing the CmdCtrl.xml file improves recognition of specific words; words listed in the grammar, such as people's names, are recognized much better. (Also, when you run the EXE on its own, this file must be in the same folder as the EXE. If it is missing, no error is reported, but the words from the grammar file are recognized poorly.)
    <?xml version="1.0" encoding="utf-8"?>
    <GRAMMAR LANGID="804">
      <DEFINE>
        <ID NAME="VID_SubName1" VAL="4001"/>
        <ID NAME="VID_SubName2" VAL="4002"/>
        <ID NAME="VID_SubName3" VAL="4003"/>
        <ID NAME="VID_SubName4" VAL="4004"/>
        <ID NAME="VID_SubName5" VAL="4005"/>
        <ID NAME="VID_SubName6" VAL="4006"/>
        <ID NAME="VID_SubName7" VAL="4007"/>
        <ID NAME="VID_SubName8" VAL="4008"/>
        <ID NAME="VID_SubName9" VAL="4009"/>
        <ID NAME="VID_SubNameRule" VAL="3001"/>
        <!-- The value below was garbled in the source text; 1000 is assumed. -->
        <ID NAME="VID_TopLevelRule" VAL="1000"/>
      </DEFINE>
      <RULE ID="VID_TopLevelRule" TOPLEVEL="ACTIVE">
        <O>
          <L>
            <P>i want</P>
            <P>run</P>
            <P>execute</P>
          </L>
        </O>
        <RULEREF REFID="VID_SubNameRule"/>
      </RULE>
      <RULE ID="VID_SubNameRule">
        <L PROPID="VID_SubNameRule">
          <P VAL="VID_SubName1">Texting</P>
          <P VAL="VID_SubName2">Yes</P>
          <P VAL="VID_SubName3">Good</P>
          <!-- The phrase text for VID_SubName4 is missing in the source text. -->
          <P VAL="VID_SubName4"></P>
          <P VAL="VID_SubName5">Li lei</P>
          <P VAL="VID_SubName6">Han Meimei</P>
          <P VAL="VID_SubName7">Chinese interface</P>
          <P VAL="VID_SubName8">English interface</P>
          <P VAL="VID_SubName9">english</P>
        </L>
      </RULE>
    </GRAMMAR>
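Adding a word to the grammar means touching two places: a new ID entry in the DEFINE section and a matching P entry in the VID_SubNameRule list. If the word list changes often, the two sets of entries can be generated from one list of phrases. The sketch below is only an illustration under my own assumptions: the function names buildIdEntries and buildPhraseEntries are hypothetical, the numbering follows the VID_SubName1 / 4001 pattern of the file above, and the phrases are assumed to contain no XML special characters (&, <, >).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds the <ID> entries for the <DEFINE> section,
// numbering them VID_SubName1, VID_SubName2, ... starting at firstValue.
std::wstring buildIdEntries(std::size_t count, int firstValue)
{
    std::wstring xml;
    for (std::size_t i = 0; i < count; ++i)
    {
        xml += L"<ID NAME=\"VID_SubName" + std::to_wstring(i + 1) + L"\" VAL=\""
             + std::to_wstring(firstValue + static_cast<int>(i)) + L"\"/>\n";
    }
    return xml;
}

// Hypothetical helper: builds the matching <P> entries of the phrase list.
// Assumes the phrases contain no XML special characters.
std::wstring buildPhraseEntries(const std::vector<std::wstring>& phrases)
{
    std::wstring xml;
    for (std::size_t i = 0; i < phrases.size(); ++i)
    {
        xml += L"<P VAL=\"VID_SubName" + std::to_wstring(i + 1) + L"\">";
        xml += phrases[i];
        xml += L"</P>\n";
    }
    return xml;
}
```

The generated text would still have to be pasted into, or written out as part of, a complete CmdCtrl.xml with the surrounding GRAMMAR, DEFINE and RULE elements.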