Implementing Voice Dictation in C#

Source: Internet
Author: User

This is an original article; reprinting is prohibited.

This article shares how to use C# to connect to the iFlytek voice dictation service for simple and efficient voice dictation.

Implementing voice dictation has two main parts: recording and speech recognition. Recording captures audio data from the device's sound card and saves it as an audio file. Speech recognition then converts that audio file to text by calling the voice dictation service.

Related class library files

1. Open-source recording library NAudio.dll

http://pan.baidu.com/s/1dFth2nv

2. Voice dictation library msc.dll

Apply for the SDK on the iFlytek Open Platform.

The recording part can use NAudio.dll, an open-source .NET audio processing library. It is a managed library, so it is convenient to use from C#. You could also read from the sound card directly through the relevant Microsoft system APIs, but that is not covered here.

Core code for the recording part:

    // Initialize
    string filePath = AppDomain.CurrentDomain.BaseDirectory + "Temp.wav";
    WaveIn m_waveSource = new WaveIn();
    // Recording format: 16 kHz, 16-bit, mono
    m_waveSource.WaveFormat = new NAudio.Wave.WaveFormat(16000, 16, 1);
    m_waveSource.DataAvailable += new EventHandler<WaveInEventArgs>(waveSource_DataAvailable);
    m_waveSource.RecordingStopped += new EventHandler<StoppedEventArgs>(waveSource_RecordingStopped);
    // Write to the file path built above, using the same format as the input device
    WaveFileWriter m_waveFile = new WaveFileWriter(filePath, m_waveSource.WaveFormat);

    // Start recording
    m_waveSource.StartRecording();

    // Save each captured buffer to the file
    private void waveSource_DataAvailable(object sender, WaveInEventArgs e)
    {
        if (m_waveFile != null)
        {
            m_waveFile.Write(e.Buffer, 0, e.BytesRecorded);
            m_waveFile.Flush();
        }
    }

    // Stop recording
    m_waveSource.StopRecording();
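When sizing buffers for this recording format, it helps to know how much data it produces. The following is a minimal sketch of that arithmetic; the `PcmMath` class and its method names are illustrative and are not part of NAudio.

```csharp
using System;

// Illustrative helper (not part of NAudio): size arithmetic for uncompressed PCM audio.
public static class PcmMath
{
    // Bytes produced per second of audio: samples/s * bytes per sample * channels.
    public static int BytesPerSecond(int sampleRate, int bitsPerSample, int channels)
    {
        return sampleRate * (bitsPerSample / 8) * channels;
    }

    // Bytes needed to hold the given number of milliseconds of audio.
    public static int BytesForDuration(int sampleRate, int bitsPerSample, int channels, int milliseconds)
    {
        long bytesPerSecond = BytesPerSecond(sampleRate, bitsPerSample, channels);
        return (int)(bytesPerSecond * milliseconds / 1000);
    }
}
```

For the 16 kHz, 16-bit, mono format used here, `PcmMath.BytesPerSecond(16000, 16, 1)` is 32000, so a 200 ms block occupies 6400 bytes.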

Once recording is complete, the audio can be sent for dictation. The voice dictation library in the SDK, msc.dll, is a native library, so it cannot be referenced directly as a managed library from C#. You can either import its functions with DllImport or wrap it in a managed library; only the first method is described here.

msc.dll exposes a C-language interface, so when declaring its functions in C#, keep in mind that C variable types differ from C# types in many ways. For example, the SDK performs many memory-address operations and therefore uses many pointer-type parameters, while pointers play a much smaller role in C#. There are two solutions: declare unsafe code in C# so that pointers can be used as in C/C++, or use IntPtr and ref parameters to achieve "compatibility".
Related interface declarations:

    using System;
    using System.Runtime.InteropServices;

    // Log in to the service; must succeed before any other call
    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern int MSPLogin(string usr, string pwd, string @params);

    // Start a dictation session; returns a native pointer to the session ID
    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern IntPtr QISRSessionBegin(string grammarList, string _params, ref int errorCode);

    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern int QISRGrammarActivate(string sessionID, string grammar, string type, int weight);

    // Write one block of audio data held in unmanaged memory
    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern int QISRAudioWrite(string sessionID, IntPtr waveData, uint waveLen, int audioStatus, ref int epStatus, ref int recogStatus);

    // Fetch a piece of the result; returns a native string pointer, or IntPtr.Zero if none
    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern IntPtr QISRGetResult(string sessionID, ref int rsltStatus, int waitTime, ref int errorCode);

    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern int QISRSessionEnd(string sessionID, string hints);

    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern int QISRGetParam(string sessionID, string paramName, string paramValue, ref uint valueLen);

    // Log out when the service is no longer needed
    [DllImport("msc.dll", CallingConvention = CallingConvention.StdCall)]
    public static extern int MSPLogout();
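Several of these declarations exchange data through IntPtr, so the caller has to move bytes between managed and unmanaged memory by hand. The sketch below shows one way to do that with the standard `Marshal` class. The article does not show its `Ptr2Str` helper, so this version, the `NativeStrings` class, the `ToNative` helper, and the explicit `encoding` parameter are all assumptions for illustration; the encoding that msc.dll actually returns should be checked against the SDK documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helpers; not the article's own implementation.
public static class NativeStrings
{
    // Reads a NUL-terminated byte string from unmanaged memory and decodes it.
    // Assumption: the caller knows which encoding the native library uses.
    public static string Ptr2Str(IntPtr ptr, Encoding encoding)
    {
        if (ptr == IntPtr.Zero)
        {
            return string.Empty;
        }
        var bytes = new List<byte>();
        for (int offset = 0; ; offset++)
        {
            byte b = Marshal.ReadByte(ptr, offset);
            if (b == 0)
            {
                break; // reached the terminating NUL
            }
            bytes.Add(b);
        }
        return encoding.GetString(bytes.ToArray());
    }

    // Copies managed bytes into a new unmanaged buffer and appends a NUL byte.
    // The caller owns the buffer and must release it with Marshal.FreeHGlobal.
    public static IntPtr ToNative(byte[] data)
    {
        IntPtr ptr = Marshal.AllocHGlobal(data.Length + 1);
        Marshal.Copy(data, 0, ptr, data.Length);
        Marshal.WriteByte(ptr, data.Length, 0);
        return ptr;
    }
}
```

Memory obtained from `Marshal.AllocHGlobal` is not garbage collected, so each allocation needs a matching `Marshal.FreeHGlobal`, ideally in a `finally` block.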

Workflow:
1. Call MSPLogin(...) to log in. You only need to log in once, but login must succeed before any other interface is called;
2. Call QISRSessionBegin(...) to start a dictation session;
3. Call QISRAudioWrite(...) to write the audio data block by block;
4. Call QISRGetResult(...) in a loop to retrieve the dictation results;
5. Call QISRSessionEnd(...) to end this dictation session;
6. Call MSPLogout() to log out when you no longer need the service, to avoid unnecessary trouble.
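The ordering of the six steps, including the cleanup calls that must run even when an earlier step fails, can be sketched without msc.dll by putting the calls behind an interface. Everything named here (`IDictationService`, `RecordingStub`, `DictationFlow`, `maxPolls`) is hypothetical and exists only to show the sequence. A real implementation would forward each method to the corresponding imported function, and would normally log in once per application rather than once per dictation.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical abstraction over the imported functions.
public interface IDictationService
{
    void Login();
    void SessionBegin();
    void AudioWrite(byte[] block, bool isLast);
    string GetResult(out bool complete);
    void SessionEnd();
    void Logout();
}

// Stub that only records which calls were made, in order.
public sealed class RecordingStub : IDictationService
{
    public readonly List<string> Calls = new List<string>();
    public void Login() { Calls.Add("Login"); }
    public void SessionBegin() { Calls.Add("SessionBegin"); }
    public void AudioWrite(byte[] block, bool isLast) { Calls.Add(isLast ? "AudioWrite(last)" : "AudioWrite"); }
    public string GetResult(out bool complete) { Calls.Add("GetResult"); complete = true; return "text"; }
    public void SessionEnd() { Calls.Add("SessionEnd"); }
    public void Logout() { Calls.Add("Logout"); }
}

public static class DictationFlow
{
    public static string Run(IDictationService service, IList<byte[]> blocks, int maxPolls)
    {
        var text = new StringBuilder();
        service.Login();                                   // step 1
        try
        {
            service.SessionBegin();                        // step 2
            try
            {
                for (int i = 0; i < blocks.Count; i++)     // step 3
                {
                    service.AudioWrite(blocks[i], i == blocks.Count - 1);
                }
                for (int polls = 0; polls < maxPolls; polls++)   // step 4, bounded
                {
                    bool complete;
                    text.Append(service.GetResult(out complete));
                    if (complete) break;
                }
            }
            finally { service.SessionEnd(); }              // step 5, always runs
        }
        finally { service.Logout(); }                      // step 6, always runs
        return text.ToString();
    }
}
```

Nesting the `try`/`finally` blocks means a failed write still ends the session and logs out, which is the main reason to structure the calls this way.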
Core code:

    public string AudioToString(string inFile)
    {
        int ret = 0;
        string text = String.Empty;
        FileStream fileStream = new FileStream(inFile, FileMode.OpenOrCreate);
        byte[] array = new byte[this.BUFFER_NUM];
        IntPtr intPtr = Marshal.AllocHGlobal(this.BUFFER_NUM);
        int audioStatus = 2;   // status used for every block except the last
        int epStatus = -1;
        int recogStatus = -1;
        int rsltStatus = -1;
        try
        {
            // Write the audio file to the service block by block
            while (fileStream.Position != fileStream.Length)
            {
                int waveLen = fileStream.Read(array, 0, this.BUFFER_NUM);
                Marshal.Copy(array, 0, intPtr, array.Length);
                ret = iFlyASR.QISRAudioWrite(this.m_sessionID, intPtr, (uint)waveLen, audioStatus, ref epStatus, ref recogStatus);
                if (ret != 0)
                {
                    throw new Exception("QISRAudioWrite err,errCode=" + ret);
                }
                // Partial results may already be available while writing
                if (recogStatus == 0)
                {
                    IntPtr intPtr2 = iFlyASR.QISRGetResult(this.m_sessionID, ref rsltStatus, 0, ref ret);
                    if (intPtr2 != IntPtr.Zero)
                    {
                        text += this.Ptr2Str(intPtr2);
                    }
                }
                // Placeholder name: the original delay value is illegible in the source
                Thread.Sleep(this.WRITE_INTERVAL_MS);
            }
            fileStream.Close();

            // Tell the service that the audio has ended
            audioStatus = 4;
            ret = iFlyASR.QISRAudioWrite(this.m_sessionID, intPtr, 1u, audioStatus, ref epStatus, ref recogStatus);
            if (ret != 0)
            {
                throw new Exception("QISRAudioWrite write last audio err,errCode=" + ret);
            }

            // Poll for the remaining results
            int timesCount = 0;
            while (true)
            {
                IntPtr intPtr2 = iFlyASR.QISRGetResult(this.m_sessionID, ref rsltStatus, 0, ref ret);
                if (intPtr2 != IntPtr.Zero)
                {
                    text += this.Ptr2Str(intPtr2);
                }
                if (ret != 0)
                {
                    break;
                }
                // Placeholder names: the original delay and retry limit are illegible in the source
                Thread.Sleep(this.POLL_INTERVAL_MS);
                if (rsltStatus == 5 || timesCount++ >= this.MAX_POLL_COUNT)
                {
                    break;
                }
            }
            return text;
        }
        finally
        {
            // Release the file handle and the unmanaged buffer on every path
            fileStream.Close();
            Marshal.FreeHGlobal(intPtr);
        }
    }
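The block-splitting part of this routine can also be written as a standalone helper that tags each block with a status value. The sketch below is a variation, not the article's method: it marks the final data block with the value 4 instead of sending an extra one-byte write afterwards, and whether the service accepts that is an assumption. The values 2 and 4 are taken from the code above; the `AudioBlocks` class and the constant names are illustrative.

```csharp
using System;
using System.Collections.Generic;

// Illustrative helper; names are not taken from the SDK.
public static class AudioBlocks
{
    // Status values as used in the article's code.
    public const int AudioContinue = 2;
    public const int AudioLast = 4;

    // Splits data into blocks of at most blockSize bytes. Every block is paired
    // with AudioContinue except the final one, which is paired with AudioLast.
    public static IEnumerable<KeyValuePair<byte[], int>> Split(byte[] data, int blockSize)
    {
        if (blockSize <= 0)
        {
            throw new ArgumentOutOfRangeException("blockSize");
        }
        for (int offset = 0; offset < data.Length; offset += blockSize)
        {
            int length = Math.Min(blockSize, data.Length - offset);
            byte[] block = new byte[length];
            Array.Copy(data, offset, block, 0, length);
            bool isLast = offset + length >= data.Length;
            yield return new KeyValuePair<byte[], int>(block, isLast ? AudioLast : AudioContinue);
        }
    }
}
```

Each yielded block has exactly the number of valid bytes, so the length passed to the write call can be `block.Length` and a short final read never sends stale data.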

Design the UI interaction yourself, or integrate this code into your own application to give it a pair of listening ears!

