Wave file operations (1): Basic knowledge about wave files and file formats

Source: Internet
Author: User
I recently set out to learn DirectSound, DirectMusic, and DirectShow, but I ran into many problems with wave files as soon as I got started, so I had to go back and learn about wave files first.
Basic knowledge of wave files

We often see this description: 44100Hz 16 bit stereo or 22050Hz 8 bit mono and so on.

44100Hz 16-bit stereo: 44100 samples per second; each sample is recorded in 16 bits (2 bytes), in two channels (stereo);
22050Hz 8-bit mono: 22050 samples per second; each sample is recorded in 8 bits (1 byte), in a single channel;

Of course, there can also be 16-bit single-channel or 8-bit stereo sound, and so on.

Human hearing ranges from roughly 20Hz to 20000Hz. To reproduce a given frequency, the sampling frequency must be at least twice that frequency, so covering the whole audible range takes a little more than 40000 samples per second. That is why a sampling frequency of 44100 is used for CD sound quality; 22050 is also commonly used and covers frequencies up to about 11000Hz. A sampling frequency much above 48000 brings little audible benefit to human ears. This is similar to the principle of 24 frames per second for a movie.
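As a rough illustration of the sampling-frequency reasoning, the sketch below assumes the usual rule that a sampled signal can represent frequencies only up to half its sampling frequency. The function name is hypothetical and chosen only for this example.

```python
def max_representable_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sampling frequency can represent (half the rate)."""
    return sample_rate_hz / 2


# Compare the common sampling frequencies against the ~20000Hz limit of hearing.
for rate in (22050, 44100, 48000):
    limit = max_representable_hz(rate)
    covers = "covers" if limit >= 20000 else "does not cover"
    print(f"{rate}Hz sampling -> up to {limit:.0f}Hz ({covers} the full audible range)")
```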

Each sample records an amplitude, and the sampling precision depends on how much storage each sample uses:
1 byte (8 bits) can distinguish only 256 amplitude levels;
2 bytes (16 bits) can distinguish 65536 levels, which is the CD standard;
4 bytes (32 bits) can distinguish 4294967296 levels, which is more than ordinary playback needs.
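To make the level counts concrete, here is a minimal sketch of mapping an amplitude to a stored sample value. It assumes amplitudes normalized to the range -1.0 to 1.0, and the convention noted in the format table later in this article: 8-bit samples are unsigned (0 to 255) and 16-bit samples are signed (-32768 to 32767). The `quantize` helper is hypothetical.

```python
def quantize(amplitude: float, bits: int) -> int:
    """Map an amplitude in [-1.0, 1.0] to an integer PCM sample value.

    Assumption: 8-bit samples are stored unsigned, wider samples signed.
    """
    levels = 2 ** bits          # 256 for 8 bits, 65536 for 16 bits
    half = levels // 2
    # Scale to the signed range and clamp so that +1.0 does not overflow.
    value = max(-half, min(half - 1, round(amplitude * half)))
    return value + half if bits == 8 else value


for bits in (8, 16, 32):
    print(bits, "bits ->", 2 ** bits, "levels")
print(quantize(0.0, 8), quantize(1.0, 16), quantize(-1.0, 16))
```

With only 256 levels, many nearby amplitudes collapse onto the same 8-bit value, which is the loss of precision described above.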

If the sound is dual-channel (stereo), twice as many samples are stored, and the file size is almost doubled.

In this way, we can estimate the playing time of a WAV file from its size, sampling frequency, sample size, and number of channels. For example, "Windows XP startup.wav" is 424,644 bytes long and is in the "22050Hz/16 bit/Stereo" format (which can be seen from its "Properties -> Summary").
Its data rate per second is 22050*16*2 = 705600 (bits), or 705600/8 = 88200 (bytes);
424644 (total bytes)/88200 (bytes per second) ≈ 4.8145578 (seconds).

This is not accurate enough. In the standard PCM format, 44 bytes of the wave file are header rather than sample data, and should be subtracted:
(424644-44)/(22050*16*2/8) ≈ 4.8140589 (seconds). This is more accurate.
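The same arithmetic can be sketched as a small function. The function name and the `header_size` parameter are assumptions for illustration; the default of 44 bytes only holds for the standard PCM layout described in this article.

```python
def pcm_duration_seconds(file_size: int, sample_rate: int, bits_per_sample: int,
                         channels: int, header_size: int = 44) -> float:
    """Estimate playing time of an uncompressed PCM wave file.

    Assumption: everything after header_size bytes is sample data.
    """
    bytes_per_second = sample_rate * bits_per_sample * channels // 8
    return (file_size - header_size) / bytes_per_second


# The example file from the text: 424,644 bytes, 22050Hz / 16 bit / stereo.
rough = pcm_duration_seconds(424644, 22050, 16, 2, header_size=0)
better = pcm_duration_seconds(424644, 22050, 16, 2)
print(f"ignoring the header: {rough:.7f} s")
print(f"subtracting 44 bytes: {better:.7f} s")
```

A file that carries extra blocks would need its real data size rather than this fixed 44-byte subtraction.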

Another concept for sound files is the "bit rate": the number of bits of audio data per second. It should not be confused with the sampling rate, which counts samples per second. For example, the bit rate of the above file is 705.6kbps or 705600bps, where "b" means bit and "ps" means per second. Compressed audio files are often described by their bit rate. For example, MP3 files that are described as approaching CD sound quality are commonly 128kbps/44100Hz.
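As a quick check of the bit rate figures, the sketch below computes the uncompressed PCM bit rate and compares CD-format PCM with a 128kbps compressed stream. The function name is hypothetical, and "kbps" is taken here as 1000 bits per second.

```python
def pcm_bit_rate(sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Bits of sample data per second for uncompressed PCM."""
    return sample_rate * bits_per_sample * channels


example = pcm_bit_rate(22050, 16, 2)   # the example file from the text
cd = pcm_bit_rate(44100, 16, 2)        # CD format: 44100Hz / 16 bit / stereo
print(example, "bps =", example / 1000, "kbps")
print(cd, "bps =", cd / 1000, "kbps")
print("CD PCM is", cd / 128_000, "times the size of a 128kbps stream")
```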

Wave File Format

Microsoft's multimedia files (WAV, AVI, and so on) all have a RIFF header. A wave file basically looks like this:

RIFF header
fmt sub-block
data sub-block

There are many encoding methods for wave files. The most common and simplest is PCM encoding.

Other encodings contain more blocks, but always at least the blocks above. A standard PCM file contains only the blocks above.

The following is a detailed table for a PCM-encoded file:

Block                          Field            Bytes  Description
RIFF header (12 bytes)         ckid             4      "RIFF" identifier
                               cksize           4      File size, not counting ckid and cksize themselves; the cksize of each sub-block below follows the same rule
                               fccType          4      Type; here the "WAVE" identifier
fmt sub-block (24 bytes)       ckid             4      "fmt " identifier (the fourth character is a space)
                               cksize           4      Block size; 16 for PCM encoding, not less than 16 for other encodings
                               wFormatTag       2      Encoding format; 1 indicates PCM encoding
                               nChannels        2      Number of audio channels; 1 is single channel, 2 is stereo
                               nSamplesPerSec   4      Sampling frequency (number of samples per second), for example 44100
                               nAvgBytesPerSec  4      Transmission rate = sampling frequency * size of each sample, in bytes
                               nBlockAlign      2      Size of each sample = sampling precision * number of audio channels / 8 (divided by 8 because the unit is bytes); this is also the minimum unit of byte alignment, for example 4 bytes for 16-bit stereo
                               wBitsPerSample   2      Sampling precision, for example 16 for 16-bit
data sub-block (size varies)   ckid             4      "data" identifier
                               cksize           4      Block size
                               Sample data      ?      For two channels the samples alternate: left, right, left, right, ...; 8-bit values range from 0 to 255, 16-bit values from -32768 to 32767
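The table can be exercised with a short sketch that packs and unpacks the 44-byte layout using Python's `struct` module. This is only an illustration under the assumption that the file has exactly the three blocks listed, in that order; `build_pcm_wav` and `parse_pcm_header` are hypothetical helper names, and all multi-byte fields are little-endian.

```python
import struct

# Canonical 44-byte PCM header, little-endian, in the order of the table:
# RIFF header, fmt sub-block, then the data sub-block's ckid and cksize.
HEADER = struct.Struct("<4sI4s" "4sIHHIIHH" "4sI")


def build_pcm_wav(sample_rate: int, bits: int, channels: int, samples: bytes) -> bytes:
    """Prepend a canonical PCM header to raw sample bytes."""
    block_align = bits * channels // 8
    return HEADER.pack(
        b"RIFF", 36 + len(samples), b"WAVE",          # cksize = file size - 8
        b"fmt ", 16, 1, channels, sample_rate,
        sample_rate * block_align, block_align, bits,
        b"data", len(samples),
    ) + samples


def parse_pcm_header(blob: bytes) -> dict:
    """Decode the header; only valid for files with exactly this layout."""
    (riff, riff_size, form, fmt_id, fmt_size, tag, channels, rate,
     avg_bytes, block_align, bits, data_id, data_size) = HEADER.unpack_from(blob)
    if (riff, form, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical 44-byte PCM wave header")
    return {
        "wFormatTag": tag, "nChannels": channels, "nSamplesPerSec": rate,
        "nAvgBytesPerSec": avg_bytes, "nBlockAlign": block_align,
        "wBitsPerSample": bits, "data_size": data_size,
    }


# Two 16-bit stereo frames: (left, right) = (1, -1) and (2, -2).
wav = build_pcm_wav(44100, 16, 2, struct.pack("<4h", 1, -1, 2, -2))
info = parse_pcm_header(wav)

# Sample data is interleaved: left, right, left, right, ...
samples = struct.unpack_from("<%dh" % (info["data_size"] // 2), wav, HEADER.size)
left, right = samples[0::2], samples[1::2]
print(info)
print("left:", left, "right:", right)
```

Reading a fixed 44 bytes is the simplest approach, but it fails on files that carry additional blocks, so real readers usually walk the blocks one by one instead.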

Other encodings may contain the following blocks:
fact, cue, label, note, labeled text, sampler, and instrument blocks, as well as list blocks. If list blocks exist, they contain further sub-blocks.
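Because extra blocks may appear, a reader cannot rely on fixed offsets. A minimal sketch of walking the blocks is shown below. It assumes the RIFF rule that every block starts with a 4-byte identifier and a 4-byte little-endian size, and that a block with an odd size is followed by one padding byte. The `iter_chunks` and `chunk` helpers, and the dummy contents of the test file, are hypothetical.

```python
import struct


def iter_chunks(blob: bytes):
    """Yield (ckid, payload) for each sub-block after the 12-byte RIFF header."""
    riff, _size, form = struct.unpack_from("<4sI4s", blob, 0)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(blob):
        ckid, cksize = struct.unpack_from("<4sI", blob, pos)
        yield ckid, blob[pos + 8: pos + 8 + cksize]
        # Blocks are word-aligned: an odd-sized payload has one pad byte after it.
        pos += 8 + cksize + (cksize & 1)


def chunk(ckid: bytes, payload: bytes) -> bytes:
    """Serialize one block, adding the pad byte when the payload size is odd."""
    pad = b"\x00" if len(payload) % 2 else b""
    return ckid + struct.pack("<I", len(payload)) + payload + pad


# Dummy 8-bit mono file with an odd-sized data block and a trailing list block.
fmt = struct.pack("<HHIIHH", 1, 1, 22050, 22050, 1, 8)
body = b"WAVE" + chunk(b"fmt ", fmt) + chunk(b"data", b"\x80\x81\x82") + chunk(b"LIST", b"INFO")
blob = b"RIFF" + struct.pack("<I", len(body)) + body

for ckid, payload in iter_chunks(blob):
    print(ckid, len(payload))
```

A reader built this way can pick out the fmt and data blocks and simply skip any block it does not recognize.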

With this background, reading, playing, and recording wave files becomes much easier.
