As one of the audio file formats used in multimedia, the WAVE file adopts RIFF as the standard for its file header. RIFF is short for Resource Interchange File Format. The first four bytes of every WAVE file are "RIFF". Making sensible use of the WAVE file header can make speech decoding more efficient.
Generally speaking, voice encoding means compressing an 8 kHz sampled, 16-bit quantized linear PCM voice signal into a voice signal in another format; decoding converts the signal in that other format back into 8 kHz sampled, 16-bit quantized linear PCM. This conversion is usually complex, time-consuming, and labor-intensive. If the corresponding WAVE file header is prepended directly to the voice data in the other format, the conversion is not needed: the speech can be decoded with the Sound Recorder program provided by Microsoft.
The following section analyzes the WAVE file header formats of various speech encodings and compares them in Tables 1 to 7.
Table 1. WAVE file header format for 8 kHz sampled, 16-bit quantized linear PCM speech signals (44 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 10 00 00 00 H (PCM) | long int size1 = 0x10
14H | 2 | int | 01 00 H | int fmttag = 0x01
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 2 | int | bytes occupied by one sample frame | int blockalign = number of channels * quantization bits / 8
22H | 2 | int | quantization bits | int bitpersamples = 8 or 16
24H | 4 | char | "data" | char data_id[4] = "data"
28H | 4 | long int | sample data bytes | long int size2 = total length - 44
2CH | to end | char | sample data |
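The layout in Table 1 can be sketched with Python's standard `struct` module. This is a minimal sketch, not a definitive implementation: the function names `pcm_wave_header` and `wrap_raw_pcm` are hypothetical, the field names follow the table, and the defaults (8 kHz, mono, 16-bit) are assumptions taken from the table caption. The `<` prefix in the format string selects little-endian packing without padding, which matches the byte order shown in the Content column.

```python
import struct

def pcm_wave_header(data_len, samplespersec=8000, channel=1, bitpersamples=16):
    """Build the 44-byte header of Table 1 for data_len bytes of linear PCM."""
    blockalign = channel * bitpersamples // 8   # bytes per sample frame
    bytepersec = samplespersec * blockalign     # playback bytes per second
    return struct.pack(
        "<4sI8sIHHIIHH4sI",
        b"RIFF",
        data_len + 44 - 8,   # size0: total file length minus 8
        b"WAVEfmt ",         # "WAVE" followed by "fmt " (with trailing space)
        0x10,                # size1: length of the PCM format block
        0x01,                # fmttag: linear PCM
        channel,
        samplespersec,
        bytepersec,
        blockalign,
        bitpersamples,
        b"data",
        data_len,            # size2: total file length minus 44
    )

def wrap_raw_pcm(raw):
    """Return a complete WAVE file image for raw 8 kHz, 16-bit mono samples."""
    return pcm_wave_header(len(raw)) + raw
```

Writing the result of `wrap_raw_pcm(raw)` to a file in binary mode should give a file whose first 44 bytes follow Table 1, with the raw samples starting at offset 2CH.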
Table 2. WAVE file header format for 8 kHz sampled, 8-bit A-law quantized PCM voice signals (58 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 12 00 00 00 H (A-law) | long int size1 = 0x12
14H | 2 | int | 06 00 H | int fmttag = 0x06
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 2 | int | bytes occupied by one sample frame | int blockalign = 0x01
22H | 4 | long int | quantization bits | long int bitpersamples = 8
26H | 4 | char | "fact" | char wave_fact[4] = "fact"
2AH | 8 | char | 04 00 00 00 00 53 07 00 H (fixed) | char temp[8]
32H | 4 | char | "data" | char wave_data[4] = "data"
36H | 4 | long int | sample data bytes | long int size2 = total length - 58
Table 3. WAVE file header format for 8 kHz sampled, 8-bit u-law quantized PCM voice signals (58 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 12 00 00 00 H (u-law) | long int size1 = 0x12
14H | 2 | int | 07 00 H | int fmttag = 0x07
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 2 | int | bytes occupied by one sample frame | int blockalign = 0x01
22H | 4 | long int | quantization bits | long int bitpersamples = 8
26H | 4 | char | "fact" | char wave_fact[4] = "fact"
2AH | 8 | char | 04 00 00 00 00 53 07 00 H (fixed) | char temp[8]
32H | 4 | char | "data" | char wave_data[4] = "data"
36H | 4 | long int | sample data bytes | long int size2 = total length - 58
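Tables 2 and 3 differ only in the value of fmttag, so one routine can cover both. The sketch below is an illustration under stated assumptions rather than a reference implementation: `g711_wave_header` is a hypothetical name, blockalign is taken to be one byte per channel, and the 4-byte value after the "fact" length is filled with the number of samples per channel instead of the fixed bytes listed in the tables (that substitution is an assumption of this sketch).

```python
import struct

ALAW, ULAW = 0x06, 0x07   # fmttag values from Tables 2 and 3

def g711_wave_header(data_len, fmttag, samplespersec=8000, channel=1):
    """Build a 58-byte header for 8-bit A-law or u-law data of data_len bytes."""
    blockalign = channel                     # assumption: one 8-bit code per channel
    bytepersec = samplespersec * blockalign
    sample_count = data_len // blockalign    # assumption: value stored after "fact"
    return struct.pack(
        "<4sI8sIHHIIHI4sII4sI",
        b"RIFF",
        data_len + 58 - 8,   # size0: total file length minus 8
        b"WAVEfmt ",
        0x12,                # size1: length of the format block
        fmttag,
        channel,
        samplespersec,
        bytepersec,
        blockalign,
        8,                   # bitpersamples, stored in 4 bytes as in the tables
        b"fact",
        4,                   # length of the fact body
        sample_count,
        b"data",
        data_len,            # size2: total file length minus 58
    )
```

For example, `g711_wave_header(len(payload), ULAW) + payload` would give a file image whose "fact" and "data" identifiers sit at offsets 26H and 32H, as in Table 3.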
Table 4. WAVE file header format after ADPCM speech encoding (90 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 32 00 00 00 H (ADPCM) | long int size1 = 0x32
14H | 2 | int | 02 00 H | int fmttag = 0x02
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 2 | int | bytes occupied by one sample frame | int blockalign = number of channels * quantization bits / 8
22H | 2 | int | quantization bits | int bitpersamples = 4
24H | 34 | char | fixed bytes | char temp1[34]
46H | 4 | char | "fact" | char wave_fact[4] = "fact"
4AH | 8 | char | 04 00 00 00 04 93 06 00 H (fixed) | char temp2[8]
52H | 4 | char | "data" | char wave_data[4] = "data"
56H | 4 | long int | sample data bytes | long int size2 = total length - 90
5AH | to end | char | sample data |
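The Content columns list bytes in file order, low byte first, which is why "10 00 00 00 H" corresponds to size1 = 0x10. The short sketch below reads such hex dumps back into numbers. The helper names `le32` and `split_fact_bytes` are hypothetical. It also splits the eight bytes that follow "fact" into two 32-bit fields: the first is the length of the fact body (4), as with every RIFF chunk, and the second is the value stored in that body, which the tables treat as fixed.

```python
import struct

def le32(hex_text):
    """Read four hex-dumped bytes (file order) as a little-endian 32-bit integer."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 4:
        raise ValueError("expected exactly four bytes")
    return int.from_bytes(raw, "little")

def split_fact_bytes(hex_text):
    """Split the eight bytes after "fact" into (body length, stored value)."""
    return struct.unpack("<II", bytes.fromhex(hex_text))

print(hex(le32("10 00 00 00")))                       # size1 in Table 1: 0x10
print(hex(le32("32 00 00 00")))                       # size1 in Table 4: 0x32
print(split_fact_bytes("04 00 00 00 04 93 06 00"))    # Table 4: (4, 430852)
```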
Table 5. WAVE file header format after GSM speech encoding (60 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 14 00 00 00 H (GSM) | long int size1 = 0x14
14H | 2 | int | 31 00 H | int fmttag = 0x31
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 8 | char | 400000002004001h (set value) | char temp1[8]
28H | 8 | char | 66616374040020.h (fixed; the leading 66 61 63 74 is ASCII "fact") | char temp2[8]
30H | 4 | char | 40 E2 05 00 H (fixed) | char temp3[4]
34H | 4 | char | "data" | char wave_data[4] = "data"
38H | 4 | long int | sample data bytes | long int size2 = total length - 60
3CH | to end | char | sample data |
Table 6. WAVE file header format after SBC audio encoding (58 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 12 00 00 00 H (SBC) | long int size1 = 0x12
14H | 2 | int | 71 00 H | int fmttag = 0x71
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 2 | int | bytes occupied by one sample frame | int blockalign = 0x25
22H | 4 | long int | quantization bits | long int bitpersamples = 16
26H | 4 | char | "fact" | char wave_fact[4] = "fact"
2AH | 8 | char | 040011676280400h (set value) | char temp[8]
32H | 4 | char | "data" | char wave_data[4] = "data"
36H | 4 | long int | sample data bytes | long int size2 = total length - 58
Table 7. WAVE file header format after CELP speech encoding (58 bytes in total)
Offset | Bytes | Data type | Content | Header field definition
00H | 4 | char | "RIFF" | char riff_id[4] = "RIFF"
04H | 4 | long int | total file length - 8 | long int size0 = total length - 8
08H | 8 | char | "WAVEfmt " | char wave_fmt[8]
10H | 4 | long int | 12 00 00 00 H (CELP) | long int size1 = 0x12
14H | 2 | int | 70 00 H | int fmttag = 0x70
16H | 2 | int | number of channels | int channel = 1 or 2
18H | 4 | long int | sampling rate | long int samplespersec
1CH | 4 | long int | playback bytes per second | long int bytepersec
20H | 2 | int | bytes occupied by one sample frame | int blockalign = 0x0C
22H | 4 | long int | quantization bits | long int bitpersamples = 16
26H | 4 | char | "fact" | char wave_fact[4] = "fact"
2AH | 8 | char | 040011660520700h (fixed) | char temp[8]
32H | 4 | char | "data" | char wave_data[4] = "data"
36H | 4 | long int | sample data bytes | long int size2 = total length - 58
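Because the header length differs from one encoding to another (44, 58, 60, or 90 bytes in the tables above), a reader does not have to hard-code the offset of the sample data. A rough sketch of the alternative: skip the 12 bytes "RIFF", size0, "WAVE", then step from block to block using the 4-byte identifier and 4-byte length that start each one, until "data" is reached. The name `find_data_chunk` is hypothetical, and the code assumes the whole file is already in memory as a bytes object.

```python
import struct

def find_data_chunk(buf):
    """Return (offset, length) of the sample data inside a WAVE file image."""
    if buf[:4] != b"RIFF" or buf[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12                                   # first block starts after "WAVE"
    while pos + 8 <= len(buf):
        chunk_id, chunk_len = struct.unpack_from("<4sI", buf, pos)
        if chunk_id == b"data":
            return pos + 8, chunk_len          # data begins after id and length
        # Skip this block; RIFF blocks are padded to an even number of bytes.
        pos += 8 + chunk_len + (chunk_len & 1)
    raise ValueError("no data block found")
```

Applied to a file laid out as in Table 1 this should return an offset of 44, and for the layouts of Tables 2 and 3 an offset of 58, matching the header lengths given in the table captions.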
This article is reposted from: http://blog.csdn.net/solond/archive/2008/03/11/2169129.aspx