Develop Chinese speech applications on the .NET platform



Voice is the most natural way for humans to interact with each other, and voice interaction is currently the highest goal of software user interface development. Microsoft has been actively promoting speech technology and has released a speech development platform, the Speech SDK, to help developers build speech applications.

As .NET technology gains wide acceptance, more and more programmers are moving to the .NET platform for development. However, the newly released .NET Speech SDK does not support Chinese speech. Currently, the highest version of the Speech SDK that supports Chinese is SAPI 5.1 on Windows. This article shows how to use SAPI 5.1 to develop Chinese speech applications on the .NET platform.


1. Analysis and installation of the SAPI 5.1 SDK

2. Import COM objects to .NET

3. Use C# to develop a Chinese TTS application example

4. Conclusion

5. References

1. Analysis and installation of the SAPI 5.1 SDK

The SAPI SDK is a free speech application development kit provided by Microsoft. It contains the Speech Application Programming Interface (SAPI), Microsoft's continuous speech recognition engine (MCSR), and Microsoft's text-to-speech (TTS) engine. The current version, 5.1, supports recognition in three languages (English, Chinese, and Japanese) and synthesis in two languages (English and Chinese). SAPI also includes interfaces for low-level control and highly adaptive direct speech management, the training wizard, events, grammar compilation, resources, speech recognition (SR) management, and TTS management. Its structure is shown in Figure (1):

Figure (1)

The Speech Engine interacts with SAPI through the DDI layer (Device Driver Interface), and applications communicate with SAPI through the API layer. By using these APIs, you can quickly develop applications for speech recognition or speech synthesis.

The SAPI 5.1 SDK can be downloaded from the Microsoft website. You need the Speech SDK 5.1 setup package (68 MB) and the 5.1 Language Pack (81.5 MB).

2. Import COM objects to .NET

SAPI 5.1 is based on the Windows platform and is called through COM interfaces. To use SAPI 5.1 on the .NET platform, we can use the .NET Framework tool tlbimp.exe to import the SAPI SDK's COM objects into .NET. Tlbimp.exe generates a managed wrapper class that managed clients can use. The wrapper manages the reference count of the actual COM object, and when the wrapper is garbage-collected, it releases the COM object it wraps.
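To make the wrapper's lifetime concrete, here is a minimal sketch that creates the SAPI voice object by late binding and then releases it explicitly instead of waiting for garbage collection. It assumes SAPI 5.1 is installed and registered under the ProgID "SAPI.SpVoice"; the class name ComLifetimeSketch and the sample text are made up for illustration, and this is not how the rest of the article calls SAPI.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

class ComLifetimeSketch
{
    [STAThread]
    static void Main()
    {
        // Assumption: SAPI 5.1 registers its voice object under this ProgID.
        Type voiceType = Type.GetTypeFromProgID("SAPI.SpVoice");
        if (voiceType == null)
        {
            Console.WriteLine("SAPI does not appear to be installed.");
            return;
        }

        // The runtime returns a managed wrapper around the COM object.
        object voice = Activator.CreateInstance(voiceType);
        try
        {
            // Late-bound call to Speak(Text, Flags); 0 means default (synchronous) flags.
            voiceType.InvokeMember("Speak", BindingFlags.InvokeMethod, null,
                                   voice, new object[] { "Hello", 0 });
        }
        finally
        {
            // Optional: release the wrapped COM object now rather than at collection time.
            Marshal.ReleaseComObject(voice);
        }
    }
}
```

Explicit release is usually unnecessary in a small application; it mainly matters when the COM object holds a scarce resource such as an audio device.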

The following shows how to import the COM Object of SAPI:

D:\Program Files\Common Files\Microsoft Shared\Speech> tlbimp sapi.dll /out:DotNetSpeech.dll

After the SDK is installed, you can find sapi.dll in the D:\Program Files\Common Files\Microsoft Shared\Speech\ directory. This DLL defines the SAPI COM objects, and tlbimp converts it into a .NET assembly, DotNetSpeech.dll. The conversion produces many warnings, but they do not affect our development and can be ignored. Finally, we can use ildasm to view the objects in DotNetSpeech.dll.
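As an alternative to browsing with ildasm, the generated assembly can also be inspected from code. The following is a small sketch using reflection; the class name ListInteropTypes and the helper PublicTypeNames are hypothetical, and the default path assumes the interop assembly was written as DotNetSpeech.dll in the current directory.

```csharp
using System;
using System.Reflection;

class ListInteropTypes
{
    // Returns the full names of the public types in an assembly, sorted.
    public static string[] PublicTypeNames(Assembly assembly)
    {
        Type[] types = assembly.GetExportedTypes();
        string[] names = new string[types.Length];
        for (int i = 0; i < types.Length; i++)
        {
            names[i] = types[i].FullName;
        }
        Array.Sort(names);
        return names;
    }

    static void Main(string[] args)
    {
        // Assumption: the interop assembly sits in the current directory unless a path is given.
        string path = args.Length > 0 ? args[0] : "DotNetSpeech.dll";
        Assembly interop = Assembly.LoadFrom(path);
        foreach (string name in PublicTypeNames(interop))
        {
            Console.WriteLine(name);
        }
    }
}
```

The printed list should include the voice and file stream types used later in this article, all under the namespace that tlbimp derived from the output file name.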

3. Use C# to develop a Chinese TTS application example

The following example shows how to use C# to develop a speech application. The development environment is:

Operating System: Windows 2000 Chinese version + SP3

.NET Framework: 1.0.3705 (English version)

Visual Studio .NET 7.0.9466 (English version)

First, create a C# Windows Application project named SpeechApp and add the DotNetSpeech object library. In Solution Explorer, on the right side of the development environment, right-click "References" and choose "Add Reference". In the dialog box that appears, browse to the generated DotNetSpeech.dll.

Figure (2)

Open the Form1.cs code file and add the namespace at the beginning of the code (note that it is case sensitive).

using DotNetSpeech;

The SAPI SDK is now imported, and we can write the application code. This example shows how to read text aloud through the speakers and how to convert text into a voice signal (a WAV audio file). The program interface is shown in Figure (3). The code for the two buttons is as follows:

// Read the text aloud
private void buttonSynthesis_Click(object sender, System.EventArgs e)
{
    try
    {
        // Speak asynchronously so the user interface stays responsive
        SpeechVoiceSpeakFlags spFlags = SpeechVoiceSpeakFlags.SVSFlagsAsync;
        SpVoice voice = new SpVoice();
        voice.Speak(this.textBoxText.Text, spFlags);
    }
    catch (Exception)
    {
        MessageBox.Show("An error occurred!", "SpeechApp", MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
}

// Generate the audio file (WAV)
private void buttonTTStoWave_Click(object sender, System.EventArgs e)
{
    try
    {
        SpeechVoiceSpeakFlags spFlags = SpeechVoiceSpeakFlags.SVSFlagsAsync;
        SpVoice voice = new SpVoice();

        // Ask the user where to save the WAV file
        SaveFileDialog sfd = new SaveFileDialog();
        sfd.Filter = "All files (*.*)|*.*|WAV files (*.wav)|*.wav";
        sfd.Title = "Save to a wave file";
        sfd.FilterIndex = 2;
        sfd.RestoreDirectory = true;

        if (sfd.ShowDialog() == DialogResult.OK)
        {
            // Redirect the voice output from the speakers to a file stream
            SpeechStreamFileMode spFileMode = SpeechStreamFileMode.SSFMCreateForWrite;
            SpFileStream spFileStream = new SpFileStream();
            spFileStream.Open(sfd.FileName, spFileMode, false);
            voice.AudioOutputStream = spFileStream;
            voice.Speak(this.textBoxText.Text, spFlags);
            // Wait for the asynchronous synthesis to finish before closing the file
            voice.WaitUntilDone(System.Threading.Timeout.Infinite);
            spFileStream.Close();
        }
    }
    catch (Exception)
    {
        MessageBox.Show("An error occurred!", "SpeechApp", MessageBoxButtons.OK, MessageBoxIcon.Error);
    }
}

Next, configure the current language of the Speech SDK engine in the Control Panel. Open "Control Panel" and then the "Speech" item. Here you can see the language currently used for recognition or synthesis, and you can also configure the related hardware devices and control the speaking rate, as shown in Figure (4).

On the "Text To Speech" tab, in the "Voice selection" combo box, select Simplified Chinese (Microsoft Simplified Chinese). Chinese text can then be synthesized.
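The Control Panel setting changes the default voice for every application. As a hedged alternative, a voice can also be selected in code for a single SpVoice instance. The sketch below assumes the interop assembly was generated as DotNetSpeech.dll (so the namespace is DotNetSpeech) and that a Simplified Chinese voice from the 5.1 Language Pack is installed; the class name ChineseVoiceSketch and the sample text are made up for illustration.

```csharp
using System;
using DotNetSpeech;

class ChineseVoiceSketch
{
    [STAThread]
    static void Main()
    {
        SpVoice voice = new SpVoice();

        // "Language=804": 804 is the hexadecimal locale ID for Simplified Chinese.
        ISpeechObjectTokens tokens = voice.GetVoices("Language=804", "");
        if (tokens.Count == 0)
        {
            Console.WriteLine("No Simplified Chinese voice is installed.");
            return;
        }

        // Use the first matching voice for this SpVoice instance only.
        SpObjectToken chinese = tokens.Item(0);
        Console.WriteLine("Using voice: " + chinese.GetDescription(0));
        voice.Voice = chinese;

        // Rate ranges from -10 (slowest) to 10 (fastest); 0 is the normal speed.
        voice.Rate = 0;

        // Speak synchronously so the program does not exit before the speech finishes.
        voice.Speak("你好", SpeechVoiceSpeakFlags.SVSFDefault);
    }
}
```

Selecting the voice in code keeps the application independent of the user's Control Panel setting, at the cost of handling the case where no matching voice is installed.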

Return to VS.NET, press F5 to compile and run the application, enter Chinese characters in the text box, put on headphones, and click "read aloud" to experience the new generation of intelligent man-machine interfaces :)

4. Conclusion

Microsoft provides a powerful platform for voice-based man-machine interfaces, and the .NET environment makes this development more convenient and faster. Download the SAPI 5.1 SDK and let's go!

5. References

[1] Documentation (sapi.chm) provided with the Speech SDK

[2] MSDN

Author: Chen benfeng
