Audio Processing (I): Audio Files
Audio files
Audio files are data files that store sound after it has been converted to digital form. To understand audio data, you first need several important concepts.
1. Sampling: a sample is the smallest unit in which sound information is recorded. In general, each sample covers one or more channels, and each channel's value is stored in one or two bytes. The quantization depth (sample width) is therefore 8 bits or 16 bits. The more bits, the better the sound quality, just as an 11-digit phone number can distinguish far more subscribers than a 7-digit one;
2. Sampling frequency: the number of samples taken per second, in Hz. Common values for audio files are 11.025 kHz, 22.05 kHz, and 44.1 kHz. In analog-to-digital conversion, the more samples taken per second, the more accurately the sound is captured;
3. Bit rate: the number of encoded bits per second, usually quoted in kb/s. Calculation method: bit width x number of audio channels x sampling frequency (the result is in bits, not bytes);
4. Number of audio channels: normally 1 (single channel) or 2 (dual channel). When two channels exist, each sample contains audio data for both the left and the right channel, so the data of the two channels is interleaved;
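The bit rate formula and the interleaved channel layout described above can be sketched in C as follows. This is only an illustration: the function names bit_rate_bps and deinterleave_stereo16 are made up for this example, and the second function assumes 16-bit samples with the left channel stored first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit rate in bits per second:
 * bit width x number of channels x sampling frequency. */
uint32_t bit_rate_bps(uint32_t bits_per_sample, uint32_t channels,
                      uint32_t sample_rate_hz)
{
    return bits_per_sample * channels * sample_rate_hz;
}

/* Split interleaved 16-bit two-channel data (L0 R0 L1 R1 ...) into
 * separate left and right buffers. `frames` is the number of L/R pairs;
 * `left` and `right` must each have room for `frames` values. */
void deinterleave_stereo16(const int16_t *interleaved, size_t frames,
                           int16_t *left, int16_t *right)
{
    assert(interleaved != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < frames; ++i) {
        left[i]  = interleaved[2 * i];      /* even slots: left channel */
        right[i] = interleaved[2 * i + 1];  /* odd slots: right channel */
    }
}
```

For example, bit_rate_bps(16, 2, 44100) gives 1,411,200 bits per second, about 1411.2 kb/s, or 176,400 bytes per second after dividing by 8.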
(1) Wave format
WAVE is a sound file format developed by Microsoft for storing audio on the Windows platform; its file suffix is *.wav. It supports multiple compression algorithms, bit depths, sampling frequencies, and channel counts;
A standard WAV file uses a 44.1 kHz sampling frequency and 16-bit quantization, which matches the quality of a CD. The Wave format does not itself process the source data: if the source data is lossless, the encoded WAV file is also lossless; if the source data is lossy, the encoded WAV file is also lossy;
1. Composition of the Wave file:
Block | Field           | Size    | Value
RIFF  | Mark            | 4B      | "RIFF"
RIFF  | Data size       | 4B      | -
RIFF  | Format          | 4B      | "WAVE"
fmt   | Mark            | 4B      | "fmt "
fmt   | Struct size     | 4B      | 16/18
fmt   | Struct          | 16B/18B |
data  | Mark            | 4B      | "data"
data  | Sound data size | 4B      | -
data  | Data            |         | -
2. Detailed structure of the Wave file:
// DWORD and WORD are the Windows 32-bit and 16-bit unsigned integer types.

// RIFF standard media stream file header
struct Riff_Header {
    char  szRiffId[4];      // 'R', 'I', 'F', 'F'
    DWORD dwRiffSize;       // size of the file excluding the first eight bytes (total file size in bytes - 8)
    char  szRiffFormat[4];  // 'W', 'A', 'V', 'E'
};

// Format block
struct Fmt_Block {
    char  szFmtId[4];        // 'f', 'm', 't', ' '
    DWORD dwFmtSize;         // 16 or 18
    WORD  wFormatTag;        // encoding method, generally 0x0001 (PCM)
    WORD  wChannels;         // number of audio channels: 1 = single channel, 2 = dual channel
    DWORD dwSamplesPerSec;   // sampling frequency, in Hz
    DWORD dwAvgBytesPerSec;  // number of bytes per second
    WORD  wBlockAlign;       // data block alignment unit (bytes required for each sample)
    WORD  wBitsPerSample;    // bits required for each sample
    // WORD wBits;           // may not exist; determined by the dwFmtSize field
};

// Fact block; some wav files do not contain it
struct Fact_Block {
    char  szFactId[4];  // 'f', 'a', 'c', 't'
    DWORD dwFactSize;   //
};

// Data block
struct Data_Block {
    char  szDataId[4];  // 'd', 'a', 't', 'a'
    DWORD dwDataSize;   // audio data size
    // data ...
};
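Because the fact block is optional and the fmt block may be 16 or 18 bytes long, the data block cannot be assumed to sit at a fixed offset. A minimal sketch of one way to locate a block by walking the file is shown here. It assumes the whole file has already been read into memory, and the names read_le32 and wav_find_chunk are hypothetical helpers for this example, not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian value from four bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Find a block by its 4-character mark inside a RIFF/WAVE buffer.
 * Returns a pointer to the block body and stores its size in *size,
 * or returns NULL if the buffer is not RIFF/WAVE, the block is absent,
 * or a block claims more bytes than the buffer holds. */
const uint8_t *wav_find_chunk(const uint8_t *buf, size_t len,
                              const char *id, uint32_t *size)
{
    assert(buf != NULL && id != NULL && size != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return NULL;

    size_t pos = 12;  /* first block starts right after "WAVE" */
    while (pos <= len && len - pos >= 8) {
        uint32_t body = read_le32(buf + pos + 4);
        if (body > len - pos - 8)
            return NULL;  /* truncated or corrupt block */
        if (memcmp(buf + pos, id, 4) == 0) {
            *size = body;
            return buf + pos + 8;
        }
        /* RIFF pads block bodies to an even number of bytes. */
        pos += 8 + (size_t)body + (body & 1u);
    }
    return NULL;
}
```

Calling wav_find_chunk(buf, len, "data", &size) then returns the start of the audio data whether or not a fact block precedes it.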
Note:
1. In the RIFF block, dwRiffSize is the size of the entire file except the first eight bytes. The bytes 0x24 0xCD 0x01 0x00, read in little-endian order, give 0x0001CD24 = 118,052 bytes; the file properties show a file size of 118,060 bytes, which is 118,052 + 8;
2. dwFmtSize is 0x10 0x00 0x00 0x00, that is, 16. The remaining part of the fmt block is a waveform information structure defined by Microsoft:
/*
 * extended waveform format structure used for all non-PCM formats. this
 * structure is common to all non-PCM formats.
 */
typedef struct tWAVEFORMATEX
{
    WORD  wFormatTag;       /* format type */
    WORD  nChannels;        /* number of channels (i.e. mono, stereo...) */
    DWORD nSamplesPerSec;   /* sample rate */
    DWORD nAvgBytesPerSec;  /* for buffer estimation */
    WORD  nBlockAlign;      /* block size of data */
    WORD  wBitsPerSample;   /* number of bits per sample of mono data */
    WORD  cbSize;           /* the count in bytes of the size of */
                            /* extra information (after cbSize) */
} WAVEFORMATEX, *PWAVEFORMATEX, NEAR *NPWAVEFORMATEX, FAR *LPWAVEFORMATEX;
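3. Data block: dwDataSize is the size of the audio data. Its bytes are 0x00 0xCD 0x01 0x00 in little-endian order, that is, 0x0001CD00 = 118,016. This is 36 bytes less than 118,052, which is exactly the space taken by the "WAVE" format field (4 bytes), the fmt block (8 + 16 bytes), and the data block header (8 bytes). Assuming the file holds no other blocks, the audio data therefore runs to the end of the file and is not followed by invalid data;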
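The sizes quoted in these notes can be cross-checked with a few lines of C. The sketch below is only an illustration: le32_from_bytes and expected_riff_size are hypothetical names, and the second function assumes a file that contains nothing but one fmt block and one data block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value from four bytes, the byte order
 * used for every size field in a WAV file. */
uint32_t le32_from_bytes(const uint8_t *b)
{
    assert(b != NULL);
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Expected dwRiffSize for a file holding only an fmt block and a data block:
 * "WAVE" field (4) + fmt header (8) + fmt body + data header (8) + data body. */
uint32_t expected_riff_size(uint32_t fmt_size, uint32_t data_size)
{
    return 4u + (8u + fmt_size) + (8u + data_size);
}
```

Decoding the bytes 0x24 0xCD 0x01 0x00 with le32_from_bytes gives 118,052, and expected_riff_size(16, 118016) returns the same 118,052, the value quoted for dwRiffSize; adding the first eight bytes gives the 118,060-byte file size.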
(2) MP3 format
MP3 format (To be continued ...)