A Simple Implementation of Chinese TTS (Based on Linux): Speech Synthesis

Source: Internet
Author: User

In essence, speech synthesis here means looking up each Chinese character's position in the character set and using that position to fetch the matching pronunciation data from the voice library.

Positioning method:

Each Chinese character is stored as two bytes. The high (first) byte gives the area code QM, and the low (second) byte gives the position code WM.
(QM - 176) * 94 + WM - 160 is the position of the character in the Chinese character set, counting from 1.
The offset of the voice data for that character in the voice library is (position - 1) * 32000 + 46, where 32000 is the size in bytes of each pronunciation record and 46 is the length of the WAV file header.
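
As a worked check of the formula, here is a minimal sketch in C. The names gbk_position, voice_offset, RECORD_LEN and HEADER_LEN are hypothetical and chosen for illustration; the values 32000 and 46 are the record size (maxlen) and header length used by the code listing in this article.

```c
#define RECORD_LEN 32000L  /* bytes of pronunciation data per character (maxlen in the listing) */
#define HEADER_LEN 46L     /* WAV header length of the voice library */

/* Position of a GBK character in the character set (1-based),
   or -1 if the bytes fall outside the range the voice library covers. */
int gbk_position(unsigned char qm, unsigned char wm)
{
    if (qm < 176 || qm > 215 || wm < 161 || wm > 254)
        return -1;
    return (qm - 176) * 94 + wm - 160;
}

/* Byte offset of the character's pronunciation data in the voice library, or -1. */
long voice_offset(unsigned char qm, unsigned char wm)
{
    int position = gbk_position(qm, wm);
    if (position < 0)
        return -1;
    return (position - 1) * RECORD_LEN + HEADER_LEN;
}
```

For example, the first character of the set, 啊 (bytes 0xB0 0xA1, so QM = 176 and WM = 161), has position 1 and offset 46; the next character, 0xB0 0xA2, has position 2 and offset 32046.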

Once the pronunciation data for each character has been read from the voice library using this method, the pieces are concatenated into a single audio file in WAV format.

The locating and merging code is as follows:
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define MAXLEN 32000   /* size in bytes of one character's pronunciation data */
#define HEADLEN 46     /* length of the WAV file header */

/*
The str parameter is a string of Chinese characters encoded in GBK.
Return value:
-1: the voice library file could not be opened.
-2: the merged audio file could not be opened or created.
Others: the function succeeded; the value is the number of characters synthesized.
*/
int wav(const char *str)
{
    FILE *fpf, *fpt;        // voice library file, merged output file
    int qm, wm;             // area code and position code of the character
    uint32_t count = 0;     // number of characters written to the merged file
    uint32_t size;          // size value patched into the WAV header
    char head[HEADLEN];     // WAV file header
    char buffer[MAXLEN];    // pronunciation data buffer

    if ((fpf = fopen("ddd.wav", "rb")) == NULL)       // open the voice library file
        return -1;

    if ((fpt = fopen("china.wav", "wb+")) == NULL) {  // create the merged audio file for playback
        fclose(fpf);
        return -2;
    }

    fread(head, sizeof(head), 1, fpf);    // read the voice library's file header
    fwrite(head, sizeof(head), 1, fpt);   // copy it to the merged file

    size_t len = strlen(str);
    for (size_t i = 0; i + 1 < len; i += 2)
    {
        qm = (unsigned char)str[i];       // area code (high byte)
        wm = (unsigned char)str[i + 1];   // position code (low byte)

        if (qm < 176 || qm > 215)         // area code outside the supported Chinese character range
            continue;

        if (wm < 161 || wm > 254)         // position code outside the Chinese character range
            continue;

        int position = (qm - 176) * 94 + wm - 160;
        long offset = (long)(position - 1) * MAXLEN + HEADLEN;  // locate the pronunciation data
        fseek(fpf, offset, SEEK_SET);
        if (fread(buffer, sizeof(buffer), 1, fpf) != 1)   // read the pronunciation data
            continue;
        fwrite(buffer, sizeof(buffer), 1, fpt);           // append it to the merged file
        count++;                                          // one more record in the merged file
    }

    // Patch the size fields in the WAV header of the merged file. For details, see the WAV format table.
    // Both fields are 32 bits wide, so a 4-byte type is written instead of long.
    // Writing the value directly assumes a little-endian host such as x86.
    size = count * MAXLEN;                // length of the audio data
    fseek(fpt, 42, SEEK_SET);
    fwrite(&size, sizeof(size), 1, fpt);

    size += HEADLEN - 8;                  // RIFF chunk size: file size minus the first 8 bytes
    fseek(fpt, 4, SEEK_SET);
    fwrite(&size, sizeof(size), 1, fpt);

    fclose(fpf);                          // close the files
    fclose(fpt);
    return (int)count;
}
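
The size fields in a WAV header are 32-bit little-endian integers, while long is 8 bytes on most 64-bit Linux systems. The following is a minimal sketch of writing such a field byte by byte, so that the result depends neither on the width of long nor on the host byte order. The helper names put_le32 and patch_le32 are hypothetical and not part of the original code.

```c
#include <stdio.h>
#include <stdint.h>

/* Store a 32-bit value as 4 little-endian bytes, as the WAV format requires. */
void put_le32(unsigned char out[4], uint32_t value)
{
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    out[2] = (unsigned char)((value >> 16) & 0xFF);
    out[3] = (unsigned char)((value >> 24) & 0xFF);
}

/* Overwrite the 32-bit field at byte offset pos of an open file.
   Returns 0 on success, -1 on failure. */
int patch_le32(FILE *fp, long pos, uint32_t value)
{
    unsigned char bytes[4];
    put_le32(bytes, value);
    if (fseek(fp, pos, SEEK_SET) != 0)
        return -1;
    return fwrite(bytes, 1, sizeof(bytes), fp) == sizeof(bytes) ? 0 : -1;
}
```

With the file layout used in the function above, a call such as patch_le32(FPT, 42, data_len) would set the data length field at offset 42, where data_len is an assumed variable holding the number of audio bytes written.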

 

Others:
As the wav function shows, the input string must be encoded in GBK,
so if the system uses a different encoding, the input has to be converted to GBK first.
Even when the encoding is correct, the Chinese characters still have to be extracted from the user input, because the wav function walks the string two bytes at a time.
For this reason, I wrote a short piece of code to filter out non-Chinese characters.
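void trans(char *str)
{
    int i = 0, j = 0;
    while (str[i] != '\0')
    {
        // A byte with the high bit set starts a two-byte GBK character.
        if ((unsigned char)str[i] >= 0x80 && str[i + 1] != '\0')
        {
            str[j++] = str[i++];
            str[j++] = str[i++];
        }
        else
            i++;  // skip single-byte (ASCII) characters

    } // end while
    str[j] = '\0';
}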
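
The encoding conversion mentioned above could be done with the POSIX iconv interface, which glibc provides on Linux. This is a minimal sketch that assumes the input is UTF-8 and that the system's iconv recognizes the encoding name "GBK"; utf8_to_gbk is a hypothetical helper, not part of the original code.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert a NUL-terminated UTF-8 string to GBK.
   Returns the number of bytes written (excluding the terminator), or -1 on error. */
long utf8_to_gbk(const char *in, char *out, size_t outsize)
{
    if (outsize == 0)
        return -1;

    iconv_t cd = iconv_open("GBK", "UTF-8");  /* arguments: to-encoding, from-encoding */
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;         /* iconv takes char **, but does not modify the input */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsize - 1;   /* keep room for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                  /* invalid input or output buffer too small */

    *outp = '\0';
    return (long)(outp - out);
}
```

The converted buffer can then be passed through trans to drop any remaining single-byte characters before it is handed to the wav function.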

 

 

Articles in this series:
Introduction to a simple implementation of Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/07/15/923543.aspx

Implementation principle of a simple Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/07/16/927041.aspx

Implementation of the voice library for a simple Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/08/22/1105742.aspx

Postscript to a simple implementation of Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/08/24/1109908.aspx

 

 

 

 
