Audio Processing (I): Audio Files

Source: Internet
Author: User

Audio files

Audio files are data files that store sound after it has been converted to digital form. To understand audio data, you first need to understand several important concepts.

1. Sampling: A sample is the smallest unit of recorded sound information. Each sample covers one or more sound channels, and the value for each channel is stored in one or two bytes;

The number of quantization bits per sample (the sample width) is usually 8 or 16. The more bits, the finer the amplitude resolution and the better the sound quality, just as an 11-digit phone number can represent far more numbers than a 7-digit one;

2. Sampling frequency: the number of samples taken per second, in Hz. Common audio files use 11.025 kHz, 22.05 kHz, or 44.1 kHz. Because this is an analog-to-digital conversion, the more samples taken per second, the more accurately the sound is captured;

3. Bit rate: the number of encoded bits per second, usually expressed in kb/s. Calculation: sample width x number of audio channels x sampling frequency (the result is in bits, not bytes);

4. Number of audio channels: usually 1 (mono) or 2 (stereo). With two channels, each sample contains audio data for both the left and right channels, so the data of the two channels is interleaved;
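The bit rate formula and the interleaved channel layout described above can be sketched in C. This is only an illustration, not code from the original article: the function names pcm_bit_rate and deinterleave_stereo16 are hypothetical, and the samples are assumed to be 16-bit signed PCM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit rate in bits per second: sample width x channels x sampling frequency. */
uint32_t pcm_bit_rate(uint32_t bits_per_sample, uint32_t channels,
                      uint32_t sample_rate_hz)
{
    return bits_per_sample * channels * sample_rate_hz;
}

/* Split interleaved stereo data (L0 R0 L1 R1 ...) into two channel buffers.
 * frame_count is the number of left/right pairs in the input
 * (assumption: 16-bit signed samples). */
void deinterleave_stereo16(const int16_t *interleaved, size_t frame_count,
                           int16_t *left, int16_t *right)
{
    assert(interleaved != NULL && left != NULL && right != NULL);
    for (size_t i = 0; i < frame_count; i++) {
        left[i]  = interleaved[2 * i];      /* even positions: left channel */
        right[i] = interleaved[2 * i + 1];  /* odd positions: right channel */
    }
}
```

For 16-bit stereo at 44.1 kHz, pcm_bit_rate(16, 2, 44100) gives 1,411,200 bit/s, which is about 1411 kb/s, or 176,400 bytes per second.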

 

(1) Wave format

WAVE is a sound file format developed by Microsoft for storing audio on the Windows platform, with the file extension .wav. It supports multiple compression algorithms, sample widths, sampling frequencies, and channel counts;

A standard WAV file uses a 44.1 kHz sampling frequency and 16-bit quantization, which matches CD audio quality. The WAVE format itself does not process the source data: if the source data is lossless, the resulting WAV file is lossless; if the source data is lossy, the resulting WAV file is lossy as well;

1. Composition of the Wave file:

Field             Size      Value
RIFF mark         4B        "RIFF"
Data size         4B        -
Format            4B        "WAVE"
fmt mark          4B        "fmt "
Struct size       4B        16/18
Struct            16B/18B
data mark         4B        "data"
Sound data size   4B        -
Data              -

2. Detailed structure of the Wave file:

// RIFF standard media stream file header
// (DWORD is a 32-bit and WORD a 16-bit unsigned integer, as defined in <windows.h>)
struct Riff_Header {
    char  szRiffId[4];      // 'R', 'I', 'F', 'F'
    DWORD dwRiffSize;       // size of the rest of the file after these eight bytes, i.e. total file size in bytes - 8
    char  szRiffFormat[4];  // 'W', 'A', 'V', 'E'
};

// format block
struct Fmt_Block {
    char  szFmtId[4];        // 'f', 'm', 't', ' '
    DWORD dwFmtSize;         // the size is 16 or 18
    WORD  wFormatTag;        // encoding method, generally 0x0001 (PCM)
    WORD  wChannels;         // number of audio channels: 1 = mono, 2 = stereo
    DWORD dwSamplesPerSec;   // sampling frequency in Hz
    DWORD dwAvgBytesPerSec;  // number of bytes per second
    WORD  wBlockAlign;       // data block alignment unit (bytes required for each sample)
    WORD  wBitsPerSample;    // bits required for each sample
    // WORD wBits;           // may not exist; determined by the dwFmtSize field
};

// Fact_Block; some wav files do not contain it
struct Fact_Block {
    char  szFactId[4];  // 'f', 'a', 'c', 't'
    DWORD dwFactSize;
};

// data block
struct Data_Block {
    char  szDataId[4];  // 'd', 'a', 't', 'a'
    DWORD dwDataSize;   // audio data size
    // data ...
};
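The layout above can be walked chunk by chunk. The following is a minimal sketch, not the author's code: WavInfo, rd_u16le, rd_u32le, and wav_parse are hypothetical names, and the whole file is assumed to be already loaded into a memory buffer. Fields are read byte by byte instead of casting the buffer to the structs, so the result does not depend on struct padding or on the byte order of the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of the fields this sketch extracts. */
typedef struct {
    uint16_t format_tag;
    uint16_t channels;
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint32_t data_size;    /* size of the audio data in bytes */
    size_t   data_offset;  /* offset of the first audio byte in the buffer */
} WavInfo;

static uint16_t rd_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is not a usable WAVE file. */
int wav_parse(const uint8_t *buf, size_t len, WavInfo *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    int have_fmt = 0;
    size_t pos = 12;  /* the first chunk follows the "WAVE" field */
    while (pos + 8 <= len) {
        const uint8_t *chunk = buf + pos;
        uint32_t size = rd_u32le(chunk + 4);
        size_t body = pos + 8;
        if (size > len - body)
            return -1;  /* chunk runs past the end of the buffer */

        if (memcmp(chunk, "fmt ", 4) == 0 && size >= 16) {
            out->format_tag      = rd_u16le(chunk + 8);
            out->channels        = rd_u16le(chunk + 10);
            out->sample_rate     = rd_u32le(chunk + 12);
            out->bits_per_sample = rd_u16le(chunk + 22);
            have_fmt = 1;
        } else if (memcmp(chunk, "data", 4) == 0) {
            if (!have_fmt)
                return -1;  /* this sketch expects fmt before data */
            out->data_size = size;
            out->data_offset = body;
            return 0;
        }
        /* Other chunks such as "fact" are skipped; RIFF pads odd sizes. */
        pos = body + size + (size & 1u);
    }
    return -1;
}
```

Because unknown chunks are skipped by their size field, the same loop handles files with and without a fact block.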

Note:

1. In the RIFF block, dwRiffSize is the size of the entire file minus the first eight bytes. In the example file the stored bytes are 0x24 0xCD 0x01 0x00; they are little-endian, so the value is 0x0001CD24 = 118,052 bytes. The file properties show a file size of 118,060 bytes, which is 118,052 + 8;

2. dwFmtSize is 0x10 0x00 0x00 0x00, that is, 16. The rest of the fmt block follows the waveform format structure defined by Microsoft (with a size of 16, the trailing cbSize field is absent):

/*
 *  extended waveform format structure used for all non-PCM formats. this
 *  structure is common to all non-PCM formats.
 */
typedef struct tWAVEFORMATEX
{
    WORD        wFormatTag;         /* format type */
    WORD        nChannels;          /* number of channels (i.e. mono, stereo...) */
    DWORD       nSamplesPerSec;     /* sample rate */
    DWORD       nAvgBytesPerSec;    /* for buffer estimation */
    WORD        nBlockAlign;        /* block size of data */
    WORD        wBitsPerSample;     /* number of bits per sample of mono data */
    WORD        cbSize;             /* the count in bytes of the size of */
                                    /* extra information (after cbSize) */
} WAVEFORMATEX, *PWAVEFORMATEX, NEAR *NPWAVEFORMATEX, FAR *LPWAVEFORMATEX;
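For uncompressed PCM, nBlockAlign and nAvgBytesPerSec follow from the other fields. The sketch below shows that relationship; PcmFormat and make_pcm_format are hypothetical names, and the struct uses fixed-width types from <stdint.h> in place of WORD and DWORD so that it compiles without <windows.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the first 16 bytes of WAVEFORMATEX (no cbSize). */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
} PcmFormat;

/* Fill in a PCM format description and derive the dependent fields. */
PcmFormat make_pcm_format(uint16_t channels, uint32_t sample_rate,
                          uint16_t bits_per_sample)
{
    assert(channels > 0 && bits_per_sample % 8 == 0);
    PcmFormat f;
    f.wFormatTag      = 1;  /* 0x0001 = PCM */
    f.nChannels       = channels;
    f.nSamplesPerSec  = sample_rate;
    f.wBitsPerSample  = bits_per_sample;
    /* Bytes for one sample across all channels. */
    f.nBlockAlign     = (uint16_t)(channels * bits_per_sample / 8);
    /* Bytes of audio data per second. */
    f.nAvgBytesPerSec = sample_rate * f.nBlockAlign;
    return f;
}
```

For 16-bit stereo at 44.1 kHz this gives nBlockAlign = 4 and nAvgBytesPerSec = 176,400.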

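3. Data block: dwDataSize is the size of the audio data. The stored bytes 0x00 0xCD 0x01 0x00 give 0x0001CD00 = 118,016, which is 36 bytes less than 118,052. Those 36 bytes are the "WAVE" format field (4), the fmt block (8 + 16), and the data block header (8), so the audio data runs exactly to the end of the file;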
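The relationship between dwDataSize and dwRiffSize can be checked with a little arithmetic. The sketch below assumes a file that contains only a fmt block and a data block (no fact block or other chunks); expected_riff_size is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Expected dwRiffSize for a file holding only "fmt " and "data" chunks. */
uint32_t expected_riff_size(uint32_t fmt_size, uint32_t data_size)
{
    assert(fmt_size == 16 || fmt_size == 18);
    uint32_t format_field = 4;             /* the "WAVE" field */
    uint32_t fmt_chunk    = 8 + fmt_size;  /* mark + size field + body */
    /* mark + size field + audio data; RIFF pads odd-sized chunks by one byte */
    uint32_t data_chunk   = 8 + data_size + (data_size & 1u);
    return format_field + fmt_chunk + data_chunk;
}
```

With fmt_size = 16 and data_size = 118,016 the result is 4 + 24 + 118,024 = 118,052, the same value stored in dwRiffSize. A larger stored value would point to extra chunks or trailing bytes after the audio data.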

 

(2) MP3 format

MP3 format (To be continued ...)

 

 

 

 

 
