An Explanation of Text-to-Speech (TTS) in Android SDK 1.6


TextToSpeech (TTS) is an important new feature in Android 1.6. It converts specified text to audio output in different languages, and it can easily be embedded in games or applications to enhance the user experience.
Before looking at the TTS API and applying it in a real project, it helps to have a basic understanding of the TTS engine.

An overview of TTS resources:

The TTS engine supports the main languages available on the Android platform: English, French, German, Italian, and Spanish. (Chinese is not yet supported.) TTS can convert text to speech in any of these five languages. For a single language, the output also depends on the region: for English, for example, TTS can produce both American and British accents. (The author speculates that the large number of Chinese dialects is one reason Chinese has not been added.)

Because this involves a large amount of data, the TTS engine loads its resources ahead of time. Based on a set of parameters (their use is described later), it extracts the matching resources from its data store and loads them into the running system.

Most devices running Android use this engine to provide TTS, but some devices have limited storage, which becomes a bottleneck: the language resources may not be installed. For this reason, the platform provides a check so that an application or game can adapt to each device and avoid having a missing TTS resource affect the rest of the application. It is safer to let users decide whether they have enough space, or enough need, to install the resources. The standard check looks like this:

Intent checkIntent = new Intent();
checkIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
startActivityForResult(checkIntent, MY_DATA_CHECK_CODE);

If the check succeeds, the result code is CHECK_VOICE_DATA_PASS, which indicates that the device is ready to speak once an android.speech.tts.TextToSpeech object has been created. If the check fails, you can prompt the user to install the missing data so that the device can speak in multiple languages. The ACTION_INSTALL_TTS_DATA intent takes the user to the TTS download page in Android Market; once the download completes, installation happens automatically. The complete code is as follows:

private TextToSpeech mTts;

protected void onActivityResult(
        int requestCode, int resultCode, Intent data) {
    if (requestCode == MY_DATA_CHECK_CODE) {
        if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
            // Success, create the TTS instance
            mTts = new TextToSpeech(this, this);
        } else {
            // Missing data, install it
            Intent installIntent = new Intent();
            installIntent.setAction(
                    TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
            startActivity(installIntent);
        }
    }
}

The TextToSpeech constructor takes a reference to the Context to use (here, the current Activity) and an OnInitListener (here, also the Activity). The listener notifies the application when the TTS engine has finished loading and is ready to use.

Set language parameters as needed:

At the Google I/O conference, the Android team demonstrated this feature by speaking translation results aloud in the five supported languages. Loading a language is very simple:

mTts.setLanguage(Locale.US);

The code above sets the TTS instance to speak American English. The parameter is not the name of a language but a Locale, which combines a language code with a country code. The advantage is that the choice of language can account for regional differences: English, the most widely used of these languages, varies from region to region. To find out whether the system supports a given Locale, call isLanguageAvailable() and handle its return value, which describes the level of support. Checking this throughout development makes an application that relies on speech more robust. Some examples:

mTts.isLanguageAvailable(Locale.UK)
mTts.isLanguageAvailable(Locale.FRANCE)
mTts.isLanguageAvailable(new Locale("spa", "ESP"))

If the return value is TextToSpeech.LANG_COUNTRY_AVAILABLE, both the language and the country described by the Locale are supported by the current TTS system. Once a TextToSpeech object has been created, you can also call isLanguageAvailable() instead of starting the ACTION_CHECK_TTS_DATA intent: if no installed resource matches the given Locale, it returns TextToSpeech.LANG_MISSING_DATA. The following two examples return a different status:

mTts.isLanguageAvailable(Locale.CANADA_FRENCH)
mTts.isLanguageAvailable(new Locale("spa"))

Both statements return TextToSpeech.LANG_AVAILABLE. The first asks whether the system supports Canadian French; because the resource library has no French variant for that region, the result means that the language (French) is supported but the regional variant is not. The second Locale names only a language, so the match is made on the language alone.
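The graded results described above (country match, language-only match, no match) can be illustrated with a small plain-Java model. This is a sketch under assumptions, not the Android implementation: the VoiceCatalog class, its "language_COUNTRY" keys, and the list of installed voices are hypothetical. The constants mirror the values of the corresponding TextToSpeech.LANG_* constants, and other results of the real API (such as LANG_NOT_SUPPORTED) are not modeled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Plain-Java model of graded locale matching; not the Android implementation. */
class VoiceCatalog {
    // Values mirror the TextToSpeech.LANG_* constants of the same names.
    static final int LANG_COUNTRY_AVAILABLE = 1;
    static final int LANG_AVAILABLE = 0;
    static final int LANG_MISSING_DATA = -1;

    // Hypothetical installed voices, keyed as "language_COUNTRY" (for example "en_GB").
    private final Set<String> voices;

    VoiceCatalog(String... installed) {
        voices = new HashSet<String>(Arrays.asList(installed));
    }

    /** Returns how closely the installed voices match the requested locale. */
    int isLanguageAvailable(Locale locale) {
        String lang = locale.getLanguage();
        String country = locale.getCountry();
        boolean languageKnown = false;
        for (String voice : voices) {
            if (voice.startsWith(lang + "_")) {
                languageKnown = true;
            }
        }
        if (!languageKnown) {
            return LANG_MISSING_DATA;       // no voice for this language at all
        }
        if (!country.isEmpty() && voices.contains(lang + "_" + country)) {
            return LANG_COUNTRY_AVAILABLE;  // language and country both match
        }
        return LANG_AVAILABLE;              // language matches, country does not
    }

    public static void main(String[] args) {
        VoiceCatalog catalog = new VoiceCatalog("en_US", "en_GB", "fr_FR");
        System.out.println(catalog.isLanguageAvailable(Locale.UK));            // prints 1
        System.out.println(catalog.isLanguageAvailable(Locale.CANADA_FRENCH)); // prints 0
    }
}
```

With voices for en_US, en_GB, and fr_FR, Locale.UK matches on language and country, while Locale.CANADA_FRENCH matches on the language only.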

 

In addition, we recommend using Locale.getDefault() to select the language that matches the user's default locale setting.

Speaking text with speak():

With the steps above, the TextToSpeech instance is initialized and configured. Taking an alarm clock application as an example, the speak() method is all that is needed to make the application talk:

String myText1 = "Did you sleep well?";
String myText2 = "I hope so, because it's time to wake up.";
mTts.speak(myText1, TextToSpeech.QUEUE_FLUSH, null);
mTts.speak(myText2, TextToSpeech.QUEUE_ADD, null);

How the TTS engine works:

Each application can create its own TextToSpeech instance. The utterances that an instance asks to have spoken are placed in a queue, and the TTS engine manages that queue and performs the speech synthesis.

Glossary:

Synthesize [ˈsɪnθəsaɪz]: to produce sounds, music, or speech using electronic equipment.

Utterance [ˈʌtərəns]: something spoken; in TTS, a single piece of text submitted to be spoken.

Each TextToSpeech instance manages the priority and order of its own queued requests. When speak() is called with the TextToSpeech.QUEUE_FLUSH mode, whatever the instance is currently speaking is interrupted: the pending queue is cleared and the new utterance is spoken. An utterance submitted with the TextToSpeech.QUEUE_ADD mode is appended to the end of the current queue.
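The difference between the two queue modes can be shown with a small plain-Java model. This is only a sketch: UtteranceQueue is a hypothetical class, it tracks pending text only, and it does not model interrupting audio that is already playing. The two constants use the same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Plain-Java model of one instance's utterance queue; not the Android engine. */
class UtteranceQueue {
    // Same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final Deque<String> pending = new ArrayDeque<String>();

    /** Enqueues text; QUEUE_FLUSH first drops everything still waiting. */
    void speak(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            pending.clear();
        }
        pending.addLast(text);
    }

    /** Returns the utterances in the order they would be spoken. */
    List<String> pendingUtterances() {
        return new ArrayList<String>(pending);
    }

    public static void main(String[] args) {
        UtteranceQueue queue = new UtteranceQueue();
        queue.speak("one", QUEUE_ADD);
        queue.speak("two", QUEUE_ADD);
        queue.speak("three", QUEUE_FLUSH); // discards "one" and "two"
        queue.speak("four", QUEUE_ADD);
        System.out.println(queue.pendingUtterances()); // prints [three, four]
    }
}
```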

Associating a stream type with speech:

In Android, every audio stream that is played has a stream type, defined in android.media.AudioManager, and the stream type determines how the audio is played. You can think of the stream type as a playback attribute of the speech: it ties the speech to the settings the user has configured on the device for that category of sound, so classifying speech by stream type lets sounds of the same category be managed together. Continuing the alarm clock example, replace the null passed as the last argument of speak() with a meaningful value. That parameter is a HashMap of optional key/value pairs. To play the speech on the system's alarm stream, change the previous example slightly:

HashMap<String, String> myHashAlarm = new HashMap<String, String>();
myHashAlarm.put(TextToSpeech.Engine.KEY_PARAM_STREAM,
        String.valueOf(AudioManager.STREAM_ALARM));
mTts.speak(myText1, TextToSpeech.QUEUE_FLUSH, myHashAlarm);
mTts.speak(myText2, TextToSpeech.QUEUE_ADD, myHashAlarm);
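Because the optional parameters travel as strings, the stream type has to be converted with String.valueOf() on the way in and parsed again on the receiving side. The following plain-Java sketch illustrates that round trip. It is an assumption-laden model rather than Android source: SpeakParams and its streamType() helper are hypothetical, and the constants are copied by value from TextToSpeech.Engine and AudioManager as far as I know them. Falling back to the music stream when the key is absent reflects the documented default stream for TTS.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of reading optional speak() parameters; not Android source. */
class SpeakParams {
    // String values of TextToSpeech.Engine.KEY_PARAM_STREAM and KEY_PARAM_UTTERANCE_ID.
    static final String KEY_PARAM_STREAM = "streamType";
    static final String KEY_PARAM_UTTERANCE_ID = "utteranceId";
    // Integer values of AudioManager.STREAM_MUSIC and AudioManager.STREAM_ALARM.
    static final int STREAM_MUSIC = 3;
    static final int STREAM_ALARM = 4;

    /** Returns the requested stream type, or STREAM_MUSIC when absent or malformed. */
    static int streamType(Map<String, String> params) {
        if (params == null) {
            return STREAM_MUSIC;
        }
        String raw = params.get(KEY_PARAM_STREAM);
        if (raw == null) {
            return STREAM_MUSIC;
        }
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            return STREAM_MUSIC;
        }
    }

    public static void main(String[] args) {
        Map<String, String> alarmParams = new HashMap<String, String>();
        alarmParams.put(KEY_PARAM_STREAM, String.valueOf(STREAM_ALARM));
        System.out.println(streamType(alarmParams)); // prints 4
        System.out.println(streamType(null));        // prints 3
    }
}
```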

Speech completion callbacks:

speak() is an asynchronous call: it returns before the text has been spoken, whether QUEUE_FLUSH or QUEUE_ADD is used. To find out when an utterance has finished, register a listener; this lets you run additional operations once speech is complete. In the following example, the OnUtteranceCompletedListener interface is used to call another method after the second speak() call has finished:

mTts.setOnUtteranceCompletedListener(this);
myHashAlarm.put(TextToSpeech.Engine.KEY_PARAM_STREAM,
        String.valueOf(AudioManager.STREAM_ALARM));
mTts.speak(myText1, TextToSpeech.QUEUE_FLUSH, myHashAlarm);
myHashAlarm.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID,
        "End of wakeup Message ID");
// myHashAlarm now contains two optional parameters
mTts.speak(myText2, TextToSpeech.QUEUE_ADD, myHashAlarm);

The listener is defined as follows, much like a listener for button clicks or other view events. The utterance ID placed in the HashMap passed to speak() is handed to the listener so that it can tell which utterance has completed:

public void onUtteranceCompleted(String uttId) {
    // Compare string contents with equals(); == only compares references
    if ("End of wakeup Message ID".equals(uttId)) {
        playAnnoyingMusic();
    }
}

"Baking" speech into an audio file:

The word "baking" may bring freshly baked bread to mind; here it means rendering speech once so that it can be reused. Software development should reuse resources as much as possible, especially on mobile platforms where resources are limited. To use TTS more efficiently, you can save the audio stream produced by the TTS engine as a permanent audio file on storage (the SD card). Speech that has to be played repeatedly can then be played back quickly without being synthesized again, which saves time. The following example uses the synthesizeToFile() method to save the synthesized speech to the path given as a parameter.

HashMap<String, String> myHashRender = new HashMap<String, String>();
String wakeUpText = "Are you up yet?";
String destFileName = "/sdcard/myappcache/wakeup.wav";
myHashRender.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, wakeUpText);
mTts.synthesizeToFile(wakeUpText, myHashRender, destFileName);

Once you are notified that synthesis has completed, you can play the file with android.media.MediaPlayer just like any other audio resource. However, that steps outside the TextToSpeech workflow. Instead, you can use addSpeech() to associate the text with the audio file in the TTS instance.

mTts.addSpeech(wakeUpText, destFileName);

In the current TTS instance, any call to speak() with the same text reuses the generated audio file. If the file is missing, for example because the SD card or other storage device has been removed, the TTS engine synthesizes the text again.

mTts.speak(wakeUpText, TextToSpeech.QUEUE_ADD, myHashAlarm);
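The reuse-or-resynthesize behavior described above can be sketched in plain Java. This is a hypothetical model, not the engine's code: SpeechCache and resolve() are invented names, and the returned strings only describe which path would be taken.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of text-to-file mappings with a synthesis fallback. */
class SpeechCache {
    private final Map<String, File> recordings = new HashMap<String, File>();

    /** Associates a text with a previously rendered audio file. */
    void addSpeech(String text, String filename) {
        recordings.put(text, new File(filename));
    }

    /** Describes what would be played for the given text. */
    String resolve(String text) {
        File file = recordings.get(text);
        if (file != null && file.isFile()) {
            return "play file: " + file.getPath();
        }
        // No mapping, or the file is gone (for example, the SD card was removed).
        return "synthesize: " + text;
    }

    public static void main(String[] args) {
        SpeechCache cache = new SpeechCache();
        cache.addSpeech("Are you up yet?", "/sdcard/myappcache/wakeup.wav");
        // Reports the file if it exists on this machine, otherwise the fallback.
        System.out.println(cache.resolve("Are you up yet?"));
    }
}
```

Checking for the file at speak time, rather than when the mapping is added, is what lets the model recover when storage disappears later.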

Releasing TTS resources:

Once the application no longer needs TTS, call shutdown() in the Activity's onDestroy() method to release the resources used by the TTS instance.
