Speech Synthesis
Basic Principles of Speech Synthesis
Speech synthesis is an "analysis-storage-synthesis" process. Generally, you select an appropriate synthesis unit (the primitive, that is, the smallest unit of speech that the synthesis system processes) and store the units, either as encoded parameters or as waveforms, to form a voice database. During synthesis, the units needed for the target utterance are retrieved from the voice database and restored to a speech signal.
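The store-then-concatenate idea can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the unit names ("ni", "hao"), the sample values, and the function names build_database and synthesize are all hypothetical, and a real system would store recorded audio or encoded parameters and smooth the joins.

```python
from typing import Dict, List

# Hypothetical toy voice database: unit name -> stored waveform samples.
VoiceDatabase = Dict[str, List[float]]


def build_database(recordings: Dict[str, List[float]]) -> VoiceDatabase:
    """Analysis + storage: keep one copy of the waveform for each unit."""
    return {unit: list(samples) for unit, samples in recordings.items()}


def synthesize(units: List[str], db: VoiceDatabase) -> List[float]:
    """Synthesis: look up each requested unit and concatenate its samples."""
    missing = [u for u in units if u not in db]
    if missing:
        raise KeyError(f"units not in voice database: {missing}")
    signal: List[float] = []
    for unit in units:
        signal.extend(db[unit])
    return signal


# Placeholder sample values, not real audio.
db = build_database({"ni": [0.1, 0.2], "hao": [0.3, 0.4, 0.5]})
signal = synthesize(["ni", "hao"], db)
print(signal)
```

The sketch fails loudly when a unit is missing, which mirrors the practical point that the system can only synthesize what its voice database covers.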
Main Types of Speech Synthesis
Based on the choice of units and their storage format, synthesis methods fall into two broad groups: waveform synthesis and parameter synthesis.
Waveform synthesis is simpler than parameter synthesis, and its speech quality and intelligibility are better. However, it requires relatively large storage, so the vocabulary it can synthesize is correspondingly limited.
Summary
Research on and applications of speech synthesis now focus mainly on converting text into speech, that is, TTS (Text-To-Speech) systems. I have not worked on a speech synthesis project directly, but my team has a TTS project and I picked up some basics from it. From a non-specialist's perspective, these are the important steps in building speech synthesis (you can also buy a solution from a third party):
1. Determine the language and vocabulary
The methods for synthesizing Chinese and English are obviously different. Every language has a very large vocabulary, so synthesizing an unlimited vocabulary is not feasible; in practice, systems usually cover common words. Generally, you need to either purchase a voice database or record a high-quality one yourself.
2. Select the synthesis units
This is a trade-off. Phonemes, diphones, half-syllables, syllables, words, phrases, and sentences (listed in ascending order of size) can all serve as the units of a synthesis system. Generally, the smaller the unit, the less storage is required, but the more concatenation rules are needed and the worse the synthesis quality. Sometimes you can add some larger units to handle special cases. For example, synthesizing from single phonemes alone can seriously degrade the quality of the transitions between sounds, so adding some diphones or half-syllables helps preserve the continuity of the speech.
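To make the trade-off concrete, here is a small sketch that counts, for one utterance, how many units must be spliced and how many joins that implies at three granularities. The utterance, its pinyin-style split into initials and finals, and the names units_at and summarize are my own illustrative assumptions, not a linguistic analysis.

```python
# Hypothetical utterance: words -> syllables -> (initial, final) in pinyin letters.
utterance = [
    [["n", "i"], ["h", "ao"]],
    [["sh", "i"], ["j", "ie"]],
]


def units_at(level, words):
    """Flatten the utterance into a list of units at the requested granularity."""
    if level == "word":
        return ["".join("".join(syl) for syl in word) for word in words]
    if level == "syllable":
        return ["".join(syl) for word in words for syl in word]
    if level == "subsyllable":
        return [part for word in words for syl in word for part in syl]
    raise ValueError(f"unknown level: {level}")


def summarize(words):
    """Count units, joins between consecutive units, and distinct units per level."""
    summary = {}
    for level in ("subsyllable", "syllable", "word"):
        units = units_at(level, words)
        summary[level] = {
            "units": len(units),
            "joins": max(len(units) - 1, 0),
            "distinct": len(set(units)),
        }
    return summary


for level, stats in summarize(utterance).items():
    print(level, stats)
```

Smaller units mean more joins per utterance (each join needs a rule and is a place where quality can drop), while larger units mean a larger inventory to record and store as the vocabulary grows.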
3. Create the synthesis rules
This is a very important step that directly affects the quality of the synthesized speech. The rules are based mainly on linguistic characteristics: you need to know how each word is pronounced, when the neutral tone, stress, and tone changes apply, and what happens when two words are joined, including stress, lengthened sounds, and coarticulation across syllables. In many cases, most of the manpower goes into building fine-grained synthesis rules and algorithms. The rules also affect the choice of synthesis units, and modifying the unit voice database afterward is very time-consuming.
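As one example of what such a rule looks like, Mandarin has a well-known tone change: when two third-tone syllables are adjacent, the first is pronounced with the second tone. Below is a deliberately simplified sketch of that single rule. The numbered-pinyin input format and the function name apply_third_tone_sandhi are my assumptions, and real systems also take word and phrase boundaries into account for longer runs of third tones.

```python
from typing import List


def apply_third_tone_sandhi(syllables: List[str]) -> List[str]:
    """Return a copy in which every tone-3 syllable that is directly followed
    by an (original) tone-3 syllable is changed to tone 2.

    Simplification: phrase structure is ignored, so in a run of three or more
    third tones every syllable except the last one is changed.
    """
    result = list(syllables)
    for i in range(len(syllables) - 1):
        if syllables[i].endswith("3") and syllables[i + 1].endswith("3"):
            result[i] = syllables[i][:-1] + "2"
    return result


# Syllables are written as pinyin plus a tone digit (5 = neutral tone).
print(apply_third_tone_sandhi(["ni3", "hao3"]))
print(apply_third_tone_sandhi(["ni3", "men5"]))
```

Even this one rule shows why rule building is labor-intensive: the pronunciation of a syllable depends on its neighbors, so the rules and the stored units have to be designed together.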
4. Synthesize in two steps
First, the input words and phrases are converted into phonetic symbols according to the synthesis rules, annotated with pronunciation features (light or heavy stress, slow or fast tempo, and linking between words). Then a search algorithm (the Viterbi algorithm) selects suitable units and splices them into a complete word or phrase.
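The second step can be sketched as a Viterbi search over candidate units. This is a minimal sketch under my own assumptions, not the method of any particular system: the cost model (a target cost per candidate plus a join cost between neighbors), the function name viterbi_select, and the toy candidates and cost values are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def viterbi_select(
    candidates: Sequence[Sequence[str]],
    target_cost: Callable[[int, str], float],
    join_cost: Callable[[str, str], float],
) -> Tuple[List[str], float]:
    """Pick one candidate per position so that the sum of target costs and
    join costs along the whole sequence is minimal."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate unit")

    # best[i][j]: lowest cost of any path that ends at candidate j of position i.
    best = [[target_cost(0, u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor candidate on that cheapest path.
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(candidates)):
        row_cost, row_back = [], []
        for unit in candidates[i]:
            options = [
                best[i - 1][k] + join_cost(prev, unit)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_best = min(range(len(options)), key=options.__getitem__)
            row_cost.append(options[k_best] + target_cost(i, unit))
            row_back.append(k_best)
        best.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path, total


# Toy data: two positions, two recorded variants each (hypothetical costs).
TARGET = {"a1": 1.0, "a2": 0.0, "b1": 0.0, "b2": 1.0}
JOIN = {("a1", "b1"): 0.0, ("a2", "b1"): 5.0, ("a1", "b2"): 0.0, ("a2", "b2"): 0.5}

path, total = viterbi_select(
    [["a1", "a2"], ["b1", "b2"]],
    target_cost=lambda i, u: TARGET[u],
    join_cost=lambda prev, u: JOIN[(prev, u)],
)
print(path, total)
```

In the toy data, "a2" is the best match on its own, but it joins badly with "b1", so the search prefers "a1" followed by "b1". That is the reason for searching over the whole sequence instead of choosing each unit greedily.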
5. Advantages of Chinese synthesis
Chinese does have one advantage for unlimited-vocabulary synthesis: Chinese sentences are made up of words, and words are basically made up of monosyllabic characters. Although one sound can correspond to many characters, as long as the machine reads out whole phrases or sentences, listeners naturally distinguish the homophones from context. Chinese has only about 1300 syllables, so even without going down to smaller units such as initials and finals, the syllable inventory is not too large to serve as the base unit library.
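A rough back-of-the-envelope estimate shows why a syllable library of this size is manageable. Only the count of about 1300 syllables comes from the text above. The average syllable duration (300 ms), the audio format (16 kHz, 16-bit mono, uncompressed), and the single recording per syllable are my assumptions, so treat the result as an order of magnitude only.

```python
def syllable_library_bytes(
    syllable_count: int = 1300,     # from the text: about 1300 Chinese syllables
    avg_duration_ms: int = 300,     # assumption: average syllable length
    sample_rate_hz: int = 16_000,   # assumption: 16 kHz sampling
    bytes_per_sample: int = 2,      # assumption: 16-bit mono PCM
) -> int:
    """Estimate uncompressed waveform storage for one recording per syllable."""
    return syllable_count * avg_duration_ms * sample_rate_hz * bytes_per_sample // 1000


total = syllable_library_bytes()
print(f"{total / 1_000_000:.2f} MB")  # prints "12.48 MB"
```

Storing several variants of each syllable (different contexts or prosody) would multiply this figure, but it stays far below what a word-level or phrase-level library would need.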
6. Current state of Chinese synthesis
The current leader in Chinese synthesis has technology that is far ahead of that of foreign companies.