Ultimately, speech synthesis looks up each character's data in the voice database by the character's position in the Chinese character set.
Positioning method:
Each Chinese character is stored in two bytes. The high (first) byte gives the area code QM, and the low (second) byte gives the position code WM.
(QM - 176) * 94 + WM - 160 is the position of the character in the Chinese character set.
The offset of the voice data for the character is (position - 1) * 32000 + 46, where 32000 is the size in bytes of one pronunciation record and 46 is the length of the WAV file header.
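As a worked example of the positioning method, the sketch below computes the offset for a single character. The helper name voice_offset is hypothetical and does not appear in the original code; the range check assumes the same bounds the merging code uses (area code 176 to 215, position code 161 to 254).

```c
/* Hypothetical helper: byte offset of a character's voice data in the
   voice library, or -1 if the byte pair is outside the supported range. */
long voice_offset(unsigned char qm, unsigned char wm)
{
    if (qm < 176 || qm > 215 || wm < 161 || wm > 254)
        return -1;
    long position = (long)(qm - 176) * 94 + (wm - 160); /* 1-based index */
    return (position - 1) * 32000L + 46;                /* skip the 46-byte header */
}
```

For instance, the character 中 is 0xD6 0xD0 in GBK, so QM = 214 and WM = 208. Its position is 38 * 94 + 48 = 3620 and its offset is 3619 * 32000 + 46 = 115808046. The first character of the set, 啊 (0xB0 0xA1), has position 1 and offset 46, immediately after the header.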
Once the pronunciation data for each character has been read from the voice database with this positioning method, the pieces are merged into a single audio file in WAV format.
The locating and merging code is as follows:
#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define maxlen 32000 // bytes of pronunciation data per character
/*
The str parameter is a string of Chinese characters encoded in GBK.
Return value:
-1: the voice library file could not be opened.
-2: the merged audio file could not be opened or created.
Otherwise: the number of characters written to the merged file.
*/
int wav(char *str)
{
    FILE *fpf, *fpt; // voice library file, merged output file
    int qm, wm; // area code and position code of the Chinese character
    int re; // function return value
    int32_t fileleng = 0; // character count, later the data length; the WAV size fields are 32 bits wide
    if ((fpf = fopen("ddd.wav", "rb+")) == NULL) // open the voice library file
        return -1;
    if ((fpt = fopen("china.wav", "wb+")) == NULL) // open or create the merged audio file for playback
    {
        fclose(fpf);
        return -2;
    }
    char head[46]; // WAV file header
    char buffer[maxlen]; // pronunciation data buffer
    memset(buffer, 0, maxlen); // clear the buffer
    fread(head, sizeof(head), 1, fpf); // read the header of the voice library file
    fwrite(head, sizeof(head), 1, fpt); // write it to the merged file
    int l = strlen(str);
    char *s = str;
    for (int i = 0; i + 1 < l; i = i + 2) // two bytes per character; never read past the end
    {
        qm = (unsigned char)*(s + i); // area code of the character
        wm = (unsigned char)*(s + 1 + i); // position code of the character
        if (qm < 176 || qm > 215) // skip if the area code is outside the character set
            continue;
        if (wm < 161 || wm > 254) // skip if the position code is outside the character set
            continue;
        int position = (qm - 176) * 94 + wm - 160;
        long offset = (long)(position - 1) * maxlen + 46; // locate the data
        fseek(fpf, offset, SEEK_SET);
        fread(buffer, sizeof(buffer), 1, fpf); // read the pronunciation data
        fwrite(buffer, sizeof(buffer), 1, fpt); // append it to the merged file
        fileleng++; // one more character in the merged file
    } // end for
    re = fileleng;
    fileleng = fileleng * maxlen; // size of the audio data in bytes
    // The two writes below assume a little-endian host, as the WAV format requires.
    fseek(fpt, 42, SEEK_SET);
    fwrite(&fileleng, sizeof(fileleng), 1, fpt); // update the data size field; see the WAV format table
    fileleng += 38; // RIFF size = data size + 46-byte header - 8
    fseek(fpt, 4, SEEK_SET);
    fwrite(&fileleng, sizeof(fileleng), 1, fpt); // update the RIFF size field; see the WAV format table
    fclose(fpf); // close the files
    fclose(fpt);
    return re;
}
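The size fields in a WAV header are 32-bit little-endian values, so writing a native integer with fwrite only produces a valid header when the host's integer width and byte order happen to match. A minimal sketch of a portable alternative follows; put_le32 is a hypothetical helper and not part of the original code.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: write v at byte offset pos of fp as a 32-bit
   little-endian value. Returns 0 on success, -1 on failure. */
int put_le32(FILE *fp, long pos, uint32_t v)
{
    unsigned char b[4];
    b[0] = (unsigned char)(v & 0xFF);         /* least significant byte first */
    b[1] = (unsigned char)((v >> 8) & 0xFF);
    b[2] = (unsigned char)((v >> 16) & 0xFF);
    b[3] = (unsigned char)((v >> 24) & 0xFF);
    if (fseek(fp, pos, SEEK_SET) != 0)
        return -1;
    return fwrite(b, 1, 4, fp) == 4 ? 0 : -1;
}
```

The two header updates at offsets 42 and 4 of the merged file could each call this helper in place of the fseek and fwrite pair.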
Other notes:
As the wav function shows, the input string must be encoded in GBK,
so if the system uses a different encoding, the input has to be converted to GBK first.
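One way to do such a conversion on Linux is the iconv interface. The sketch below assumes the system input is UTF-8, which the original does not state; utf8_to_gbk is a hypothetical helper, and the conversion only succeeds if the C library has a GBK converter installed.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a UTF-8 string to GBK.
   Writes at most outcap - 1 bytes plus a terminating '\0'.
   Returns the number of GBK bytes written, or -1 on failure. */
int utf8_to_gbk(const char *in, char *out, size_t outcap)
{
    if (outcap == 0)
        return -1;
    iconv_t cd = iconv_open("GBK", "UTF-8"); /* arguments are (tocode, fromcode) */
    if (cd == (iconv_t)-1)
        return -1;                           /* no such converter available */
    char *inp = (char *)in;                  /* iconv takes a non-const pointer */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outcap - 1;             /* reserve room for the terminator */
    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                           /* invalid input or buffer too small */
    *outp = '\0';
    return (int)(outp - out);
}
```

The converted buffer can then be passed to wav() in place of the raw input.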
Even with the correct encoding, the Chinese characters still have to be extracted from the user input.
For this purpose, I wrote a short piece of code that filters out non-Chinese characters:
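void trans(char *str)
{
    int i = 0, j = 0;
    while (str[i] != '\0')
    {
        // A byte with the high bit set starts a two-byte GBK character.
        if ((unsigned char)str[i] >= 0x80 && str[i + 1] != '\0')
        {
            str[j++] = str[i++];
            str[j++] = str[i++];
        }
        else
            i++; // skip single-byte characters and a dangling lead byte
    } // end while
    str[j] = '\0';
}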
Articles in this series:
Introduction to a simple implementation of Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/07/15/923543.aspx
Implementation principle of a simple Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/07/16/927041.aspx
Implementation of a Chinese TTS voice library based on Linux
http://blog.csdn.net/dedodong/archive/2006/08/22/1105742.aspx
Postscript to a simple implementation of Chinese TTS (based on Linux)
http://blog.csdn.net/dedodong/archive/2006/08/24/1109908.aspx