In recent years, Microsoft has paid increasing attention to bringing speech technology into mainstream use. That focus has produced products such as Speech Server (used to build voice-enabled telephone systems) and Voice Command (which lets users control Windows Mobile devices with spoken commands). It is therefore no surprise that Microsoft's speech group was busy throughout the development of Windows Vista. The strategy of pairing powerful speech technology with powerful APIs continues in Windows Vista.
System.Speech.Synthesis
Let's look at an example of using speech synthesis from a managed application. As the canonical example of UI output, I will start with an application that simply says "Hello, world", as shown in the following code.
using System;
using System.Speech.Synthesis;

namespace tts_lele_sample_1
{
    class Program
    {
        static void Main(string[] args)
        {
            SpeechSynthesizer synth = new SpeechSynthesizer();
            // Speak is the method name in the shipped System.Speech API;
            // the original listing called it SpeakText.
            synth.Speak("Hello, world!");
        }
    }
}
This example is a plain console application, freshly created in Visual C#, with three lines of code added. The first added line simply imports the System.Speech.Synthesis namespace. The second declares and instantiates a SpeechSynthesizer, which is exactly what its name says: a speech synthesizer. The third is the call that makes the synthesizer speak the string. That is all you need to invoke the synthesizer.
By default, the SpeechSynthesizer class uses the synthesizer selected as the default in the Speech control panel. However, it can use any synthesizer that is compatible with the SAPI DDI (device driver interface).
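Before selecting a voice by name, it can help to see which voices are actually installed. The following is a minimal sketch, assuming the shipped System.Speech assembly (.NET Framework 3.0 or later on Windows) is referenced; the class name ListVoices is hypothetical, and the output depends entirely on the machine.

```csharp
using System;
using System.Speech.Synthesis;

class ListVoices
{
    static void Main()
    {
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            // GetInstalledVoices returns one InstalledVoice per voice
            // registered on this machine.
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine("{0} ({1}, {2}), enabled: {3}",
                    info.Name, info.Culture, info.Gender, voice.Enabled);
            }
        }
    }
}
```

The Name values printed here are the strings that SelectVoice expects, so a name that does not appear in this list is likely to fail when selected.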
The next example selects specific voices by name:
SpeechSynthesizer synth = new SpeechSynthesizer();
synth.SelectVoice("Microsoft Sam");
synth.Speak("I'm Sam.");
synth.Speak("You may have heard me speaking to you in Windows XP.");
synth.Speak("Anna will make me redundant.");
synth.SelectVoice("Microsoft Anna");
synth.Speak("I am the new voice in Windows.");
synth.Speak("Sam belongs to a previous generation.");
synth.Speak("I sound great.");
synth.SelectVoice("Microsoft Lili");
// Chinese in the original listing (viewing it requires the MS Mincho and SimSun fonts)
synth.Speak("I was developed in Beijing and I used the voice of a professional announcer. " +
    "Everyone who has heard me says that I am the best in Chinese speech synthesis!");
/* "I was developed in Beijing, using recordings of a professional news reader.
   Everybody who hears me talk says that I am the best synthesized Chinese
   voice they have ever heard!" */
The example above shows how this is done, using the legacy Sam voice from Windows 2000 and Windows XP and the new Anna and Microsoft Lili voices from Windows Vista. (Note that this example and all other System.Speech.Synthesis examples use the same code framework as the first example and replace only the body of Main.) The example shows three calls to the SelectVoice method, each passing the name of the desired synthesizer. It also demonstrates the Windows Vista Chinese synthesizer, Lili. Lili can also speak English well.
In these two examples, the synthesis API is used much like a console API: the application simply sends characters, and they are rendered immediately, one after another. For more complex output, however, it is better to think of synthesis as the equivalent of document rendering. The input to the synthesizer is a document that contains not only the content to be rendered but also the effects and settings to apply at specific points in that content.
The SpeechSynthesizer class can consume an XML document format called Speech Synthesis Markup Language (SSML). It plays much the same role that an XHTML document plays in describing the rendering style and structure to apply to specific pieces of content on a Web page. The W3C SSML recommendation (www.w3.org/TR/speech-synthesis) is very readable, so I do not intend to describe SSML in depth in this article. Suffice it to say that an application can simply load an SSML document directly into the synthesizer and have it rendered. The following example loads and renders an SSML file:
SpeechSynthesizer synth = new SpeechSynthesizer();
PromptBuilder savedPrompt = new PromptBuilder();
// Append the SSML file at this path to the prompt
savedPrompt.AppendSsml("C:\\prompt.ssml");
synth.Speak(savedPrompt);
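For readers who have not seen SSML, here is a minimal sketch of what such a document might contain, passed to the synthesizer as an inline string instead of a file. The SSML text and the class name InlineSsml are illustrative assumptions, not taken from the original article; SpeakSsml and SetOutputToDefaultAudioDevice are methods of the shipped SpeechSynthesizer class.

```csharp
using System.Speech.Synthesis;

class InlineSsml
{
    static void Main()
    {
        // A small, hypothetical SSML 1.0 document: emphasized text,
        // a half-second pause, then a slower closing sentence.
        string ssml =
            "<speak version=\"1.0\" " +
            "xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"en-US\">" +
            "Here is an <emphasis level=\"strong\">important</emphasis> message." +
            "<break time=\"500ms\"/>" +
            "<prosody rate=\"slow\">Thank you.</prosody>" +
            "</speak>";

        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            synth.SetOutputToDefaultAudioDevice();
            // Render the SSML string directly, without a PromptBuilder.
            synth.SpeakSsml(ssml);
        }
    }
}
```

How faithfully the emphasis, pause, and rate changes are rendered depends on the installed voice.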
An alternative to authoring an SSML file by hand is the PromptBuilder class in System.Speech.Synthesis. PromptBuilder can express almost anything that an SSML document can, and it is easier to use. The general model for complex synthesis is to build the prompt exactly as you want it with PromptBuilder, and then render it with the synthesizer's Speak or SpeakAsync method.
The following code shows this model:
// This prompt is quite complicated,
// so I'm going to build it first, and then render it.
PromptBuilder myPrompt = new PromptBuilder();

// Start the main speaking style
PromptStyle mainStyle = new PromptStyle();
mainStyle.Rate = PromptRate.Medium;
mainStyle.Volume = PromptVolume.Loud; // assumed value; the source listing is garbled here
myPrompt.StartStyle(mainStyle);

// Alert the listener
myPrompt.AppendAudio(new Uri(
    "file://C:\\Windows\\media\\policy.wav"), "Attention!");
myPrompt.AppendText("Here are some important messages.");

// Here's the first important message
// (pronunciation given in the International Phonetic Alphabet)
myPrompt.AppendTextWithPronunciation("WinFX", "wɪnɛfɛks");
myPrompt.AppendText("is a great platform.");

// And the second one
// The original listing used SayAs.Acronym; SpellOut is the closest
// value in the shipped SayAs enumeration.
myPrompt.AppendTextWithHint("ASP", SayAs.SpellOut);
myPrompt.AppendText(
    "is an acronym for Active Server Pages. Whereas an ASP is a snake.");
myPrompt.AppendBreak();

// Let's emphasise how important these messages are
PromptStyle interimStyle = new PromptStyle();
interimStyle.Emphasis = PromptEmphasis.Strong;
myPrompt.StartStyle(interimStyle);
myPrompt.AppendText("Please remember these two things.");
myPrompt.EndStyle();

// Then we can revert to the main speaking style
myPrompt.AppendBreak();
myPrompt.AppendText("Thank you.");
myPrompt.EndStyle();

// Now let's get the synthesizer to render this message
SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak(myPrompt);
The example above illustrates many of PromptBuilder's powerful features. The first point to note is that it generates a document with a hierarchical structure: one speaking style is nested inside another. At the beginning of the document, I start the style that is to be used throughout. About halfway through, I switch to a different style to add emphasis. When I end that style, the document automatically reverts to the previous one.
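One way to see this hierarchy is to print the SSML that a PromptBuilder generates. The sketch below is a reduced, hypothetical version of the prompt (the class name NestedStyles and the sample text are assumptions); it needs no audio device because it only writes the markup to the console.

```csharp
using System;
using System.Speech.Synthesis;

class NestedStyles
{
    static void Main()
    {
        PromptBuilder prompt = new PromptBuilder();

        // Outer style: applies until the matching EndStyle call.
        PromptStyle outer = new PromptStyle();
        outer.Rate = PromptRate.Slow;
        prompt.StartStyle(outer);
        prompt.AppendText("Outer style.");

        // Inner style: nested inside the outer one.
        PromptStyle inner = new PromptStyle();
        inner.Emphasis = PromptEmphasis.Strong;
        prompt.StartStyle(inner);
        prompt.AppendText("Inner style.");
        prompt.EndStyle(); // closes the inner style

        prompt.AppendText("Outer style again.");
        prompt.EndStyle(); // closes the outer style

        // Print the generated SSML so the nesting is visible.
        Console.WriteLine(prompt.ToXml());
    }
}
```

In the printed markup, the element produced for the inner style should appear inside the element produced for the outer style, which is why ending the inner style returns the prompt to the outer one.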
This example also shows a number of other convenient features. The AppendAudio function splices a WAV file into the output, with a text equivalent that is spoken if the WAV file cannot be found. The AppendTextWithPronunciation function lets you specify the exact pronunciation of a word. A speech synthesis engine already knows how to pronounce most common words in a language, through a combination of a lexicon and algorithms that infer the pronunciation of unknown words. However, this does not work for every word, such as specialist terms or brand names. For example, "WinFX" would probably be pronounced as "winfeks". Instead, I use the International Phonetic Alphabet to describe "WinFX" as "wɪnɛfɛks", where "ɪ" is Unicode character 0x026A (the "i" sound in "fish", as opposed to the "i" in "five") and "ɛ" is Unicode character 0x025B (the General American "e" sound in "bed").
Generally, a synthesis engine can distinguish between acronyms and ordinary capitalized words. Occasionally, however, you will find an acronym that the engine's heuristics incorrectly interpret as a word. In that case, you can use the AppendTextWithHint function to mark the text as an acronym. PromptBuilder has many more nuances. My examples are not comprehensive, but they are illustrative.
Another benefit of separating the specification of content from its rendering at run time is that you are then free to decouple the application from the specific content it renders. You can use PromptBuilder to persist a prompt as SSML, to be loaded by another part of the application or by a completely different application. The following code uses PromptBuilder to write an SSML file:
// StreamWriter requires: using System.IO;
using (StreamWriter promptWriter = new StreamWriter("C:\\prompt.ssml"))
{
    promptWriter.Write(myPrompt.ToXml());
}
Another way to decouple content production is to render the entire prompt to an audio file for later playback:
SpeechSynthesizer synth = new SpeechSynthesizer();
// Send synthesized audio to a WAV file instead of the speakers
synth.SetOutputToWaveFile("C:\\message.wav");
synth.Speak(myPrompt);
// Detach from the file so that it is closed
synth.SetOutputToNull();
Whether to use SSML markup or the PromptBuilder class is largely a matter of style. Use whichever you feel more comfortable with.
One final note about SSML and PromptBuilder: the capabilities of each synthesizer are slightly different. Therefore, the specific behaviors you request through either mechanism should be regarded as advisory requests, which the engine will apply if it is capable of doing so.
Reproduced from http://blog.csdn.net/xwygn/article/details/6672784