I recently set out to learn DirectSound, DirectMusic, and DirectShow, but I ran into many problems with wave files as soon as I got started, so I had to go back and learn the wave file format first.
Basic knowledge of wave files
We often see descriptions such as "44100Hz 16-bit stereo" or "22050Hz 8-bit mono".
44100Hz 16-bit stereo: 44100 samples per second; each sample is recorded in 16 bits (2 bytes), in two channels (stereo);
22050Hz 8-bit mono: 22050 samples per second; each sample is recorded in 8 bits (1 byte), in a single channel;
Of course, 16-bit mono or 8-bit stereo is also possible, and so on.
The range of human hearing is roughly 20Hz to 20000Hz. By the sampling theorem, a sound must be sampled at more than twice its highest frequency to be reproduced, so covering the whole audible range takes a sampling frequency above 40000Hz. That is why 44100Hz is the CD-quality sampling frequency. 22050Hz is also commonly used, although it can only capture frequencies up to about 11025Hz. Sampling frequencies well above 48000Hz bring little audible benefit on playback. This is similar in spirit to a movie using 24 frames per second.
Each sample records an amplitude, and the sampling precision depends on how much storage each sample is given:
1 byte (8 bits) can only distinguish 256 amplitude levels;
2 bytes (16 bits) can distinguish 65536 levels, which is the CD standard;
4 bytes (32 bits) can distinguish 4294967296 levels, which is more than playback normally needs.
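The counts above follow from the number of distinct values an n-bit integer can hold, which is 2 to the power n. A minimal Python sketch to check them (the function name `amplitude_levels` is only illustrative):

```python
def amplitude_levels(bits_per_sample: int) -> int:
    """Number of distinct amplitude values an integer sample of this width can hold."""
    return 2 ** bits_per_sample


for bits in (8, 16, 32):
    print(f"{bits:2d}-bit: {amplitude_levels(bits)} levels")
```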
With two channels (stereo), every sample is recorded once per channel, so the file is almost twice as large.
In this way, we can estimate the playing time of a WAV file from its file size, sampling frequency, sampling precision, and number of channels. For example, "Windows XP startup.wav" is 424,644 bytes long and is in the "22050Hz/16-bit/stereo" format (which can be seen under "Properties -> Summary").
Its data rate is 22050*16*2 = 705600 bits per second, which converts to 705600/8 = 88200 bytes per second;
424644 (total bytes)/88200 (bytes per second) ≈ 4.8145578 (seconds).
This is not quite accurate. In the standard PCM format, the wave file holds 44 bytes in addition to the sample data, and these should be subtracted:
(424644-44)/(22050*16*2/8) ≈ 4.8140590 (seconds). This is more accurate.
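The estimate above can be written as a small Python function. This is only a sketch: the name `wav_duration_seconds` and its parameters are made up for illustration, and the default of 44 header bytes assumes a plain PCM file with no extra blocks.

```python
def wav_duration_seconds(file_size, sample_rate, bits_per_sample, channels,
                         header_bytes=44):
    """Estimate the playing time of an uncompressed PCM wave file.

    header_bytes=44 assumes the simplest PCM layout; files with
    additional blocks have more non-sample bytes than that.
    """
    bytes_per_second = sample_rate * bits_per_sample * channels // 8
    return (file_size - header_bytes) / bytes_per_second


# Figures from the example file: 424,644 bytes, 22050Hz, 16-bit, stereo.
rough = wav_duration_seconds(424_644, 22_050, 16, 2, header_bytes=0)
better = wav_duration_seconds(424_644, 22_050, 16, 2)
print(f"ignoring the header: {rough:.7f} s")
print(f"subtracting 44 bytes: {better:.7f} s")
```

The two results differ by less than a millisecond here, so for a quick estimate the header can usually be ignored.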
Another concept for sound files is the "bit rate" (not to be confused with the sampling rate): the number of bits of audio data per second. For example, the bit rate of the file above is 705.6 kbps, or 705600 bps, where "b" stands for bit and "ps" for per second. Compressed audio files are usually described by their bit rate. For example, MP3 files regarded as approaching CD sound quality are commonly 128 kbps/44100Hz.
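For uncompressed PCM the bit rate is just the product of the three format numbers. A short Python sketch (the helper name `pcm_bit_rate` is hypothetical) that reproduces the figure above and compares uncompressed CD audio with a 128 kbps MP3:

```python
def pcm_bit_rate(sample_rate, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate * bits_per_sample * channels


example = pcm_bit_rate(22_050, 16, 2)
print(example, "bps =", example / 1000, "kbps")

# Uncompressed CD audio (44100Hz, 16-bit, stereo) versus a 128 kbps MP3.
cd = pcm_bit_rate(44_100, 16, 2)
print("CD PCM:", cd / 1000, "kbps, about", round(cd / 128_000, 1),
      "times the bit rate of a 128 kbps MP3")
```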
Wave File Format
Many of Microsoft's multimedia file formats (WAV, AVI, and so on) begin with a RIFF header. A wave file basically looks like this:
| RIFF header |
| fmt sub-block |
| data sub-block |
There are many encodings for wave files. The most common and simplest one is PCM.
Files in other encodings contain more blocks, but always at least the blocks above. PCM files need only the blocks above.
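To see which blocks a particular file contains, the blocks can be walked one after another. The Python sketch below assumes the usual RIFF conventions: a 12-byte RIFF header, then blocks that each start with a 4-byte identifier and a 4-byte little-endian size, with block bodies padded to an even length. The function name `list_chunks` is illustrative, and the sample file is generated in memory with the standard `wave` module purely as test data.

```python
import io
import struct
import wave


def list_chunks(riff_bytes: bytes):
    """Return (form_type, [(block_id, block_size), ...]) for a RIFF file in memory."""
    riff_id, riff_size, form_type = struct.unpack_from("<4sI4s", riff_bytes, 0)
    if riff_id != b"RIFF":
        raise ValueError("not a RIFF file")
    chunks = []
    pos = 12  # the first sub-block starts right after the 12-byte RIFF header
    end = min(8 + riff_size, len(riff_bytes))
    while pos + 8 <= end:
        chunk_id, chunk_size = struct.unpack_from("<4sI", riff_bytes, pos)
        chunks.append((chunk_id, chunk_size))
        # Skip the 8-byte block header and the body (padded to an even length).
        pos += 8 + chunk_size + (chunk_size & 1)
    return form_type, chunks


# Build a tiny 22050Hz 16-bit stereo PCM file in memory (test data only).
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)  # 2 bytes = 16 bits per sample
    w.setframerate(22050)
    w.writeframes(b"\x00" * 400)  # 100 silent stereo frames

form_type, chunks = list_chunks(buf.getvalue())
print(form_type, chunks)
```

For a plain PCM file this should report only the fmt and data blocks. A file in another encoding would show additional entries in the list.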
The following is a detailed table for a PCM-encoded file:
| Block | Field | Size (bytes) | Description |
|---|---|---|---|
| RIFF header (12 bytes) | ckid | 4 | "RIFF" identifier |
| | cksize | 4 | File size. This size does not include ckid and cksize themselves; the sizes of the sub-blocks below follow the same rule. |
| | fccType | 4 | Type. Here it is the "WAVE" identifier |
| fmt sub-block (24 bytes) | ckid | 4 | "fmt " identifier (the fourth character is a space) |
| | cksize | 4 | Block size; for PCM encoding this is 16, for other encodings it is not less than 16. |
| | wFormatTag | 2 | Encoding format; 1 indicates PCM encoding |
| | nChannels | 2 | Number of audio channels; 1 is mono, 2 is stereo |
| | nSamplesPerSec | 4 | Sampling frequency (number of samples per second); for example 44100 |
| | nAvgBytesPerSec | 4 | Data rate in bytes per second = sampling frequency * size of each sample (nBlockAlign) |
| | nBlockAlign | 2 | Size of each sample in bytes = sampling precision * number of audio channels / 8 (divided by 8 because the unit is bytes). This is also the minimum unit of byte alignment; for 16-bit stereo the value is 4. |
| | wBitsPerSample | 2 | Sampling precision; for example, for 16-bit the value is 16. |
| data sub-block (size varies) | ckid | 4 | "data" identifier |
| | cksize | 4 | Block size (size of the sample data in bytes) |
| | Sample data | varies | Two-channel data alternates: left, right, left, right, ...; 8-bit samples range from 0 to 255, 16-bit samples from -32768 to 32767 |
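The field layout above maps directly onto a `struct` format string in Python. The sketch below is a minimal illustration, not a general wave reader: it assumes the simplest PCM layout (fmt block size 16, data block immediately after it, 44 header bytes in total), and the names `parse_pcm_header` and `split_stereo_16bit` are made up for this example. The header it decodes is constructed by hand to match the 22050Hz/16-bit/stereo example, not read from a real file.

```python
import struct

# RIFF header (12 bytes) + fmt sub-block (24 bytes) + data block header (8 bytes)
HEADER_FORMAT = "<4sI4s" "4sIHHIIHH" "4sI"  # little-endian, 44 bytes in total


def parse_pcm_header(header: bytes) -> dict:
    """Decode the 44-byte header of a plain PCM wave file into named fields."""
    (riff_id, riff_size, form_type,
     fmt_id, fmt_size, format_tag, channels, sample_rate,
     avg_bytes_per_sec, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack(HEADER_FORMAT, header[:44])
    if (riff_id, form_type, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a plain 44-byte PCM header")
    return {
        "riff_size": riff_size, "fmt_size": fmt_size,
        "wFormatTag": format_tag, "nChannels": channels,
        "nSamplesPerSec": sample_rate, "nAvgBytesPerSec": avg_bytes_per_sec,
        "nBlockAlign": block_align, "wBitsPerSample": bits_per_sample,
        "data_size": data_size,
    }


def split_stereo_16bit(data: bytes):
    """Split interleaved 16-bit stereo sample data into left and right lists."""
    samples = struct.unpack(f"<{len(data) // 2}h", data)
    return list(samples[0::2]), list(samples[1::2])


# Hand-built header: 22050Hz, 16-bit, stereo, 424600 bytes of sample data.
header = struct.pack(HEADER_FORMAT,
                     b"RIFF", 424_644 - 8, b"WAVE",
                     b"fmt ", 16, 1, 2, 22_050, 88_200, 4, 16,
                     b"data", 424_600)
info = parse_pcm_header(header)
print(info)

# Two stereo frames: (left=100, right=-100), (left=200, right=-200).
left, right = split_stereo_16bit(struct.pack("<4h", 100, -100, 200, -200))
print(left, right)
```

The derived fields can be checked against the formulas in the table: nBlockAlign equals wBitsPerSample * nChannels / 8, and nAvgBytesPerSec equals nSamplesPerSec * nBlockAlign.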
Files in other encodings may also contain the following blocks:
fact, cue, label, note, labeled text, sampler, and instrument blocks, as well as list blocks. If a list block exists, it contains further sub-blocks.
With this background, reading, playing, and recording wave files becomes much easier.