The difference between ADPCM and PCM, and the compression and decompression of wave files

Transferred from: http://blog.csdn.net/nogodoss/article/details/10399243


http://blog.csdn.net/kindyb/archive/2005/10/13/503024.aspx

First, an overview

This article describes how to convert an IMA-ADPCM file to a PCM file using the IMA-ADPCM compression and decompression algorithms. It covers three topics: the internal structure of PCM and IMA-ADPCM wave files, the IMA-ADPCM compression and decompression algorithms, and how to create your own audio compression file format.

Second, understanding the wave file

Wave is one of the most commonly used digital sound file formats in computing. It is the waveform audio file format that Microsoft defined for Windows, and it takes its name from its file extension, "*.wav".

Wave files come in many compression formats, and some programs today generate wave files that contain errors of one kind or another. These errors are not caused by the compression or decompression algorithm itself, but by the internal structure of the file not being organized correctly after compression or decompression. A correct and detailed understanding of the internal structure of the various wave files is therefore the basis for compressing and decompressing them successfully, and a prerequisite for creating your own audio compression file format.

The most basic wave file is in PCM (pulse code modulation) format. Such a file stores the sampled sound data directly, without any compression, in the data format that sound cards support natively. To have the sound card play sound data in any other, compressed format, you must first decompress the data into PCM and then pass it to the sound card.

1. Internal structure of the wave file

A wave file organizes its contents in RIFF (Resource Interchange File Format). A RIFF file can be thought of as a tree whose basic unit is the "chunk". At the top is a "RIFF" chunk, and each chunk below it consists of a "type identifier" (optional), a "marker", a "data size" and the "data". The structure of a chunk is shown in Table 1:

The "type identifier" mentioned above is used only in some chunks, such as the "WAVE" chunk, and indicates that other chunks are nested beneath it. When a type identifier is used, the chunk has no other fields (no marker, no data size, and so on); it serves only as an identifier when the file is read. The reader finds this type identifier first and then reads the chunks nested beneath it.

Every file begins with a RIFF chunk, and each file contains exactly one. Its structure is shown in Table 2:

Files in non-PCM formats contain at least one additional chunk, the "fact" chunk, which records the size of the data after decompression (the size of the data, not of the file). The "fact" chunk usually comes before the "data" chunk.
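As a rough illustration of the chunk layout described in this subsection, the sketch below decodes the marker and data size that open every chunk. It assumes the size is stored as a 32-bit little-endian value, as is standard for RIFF. The names ChunkHeader and read_chunk_header are hypothetical and do not come from the original article.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the marker and data size that open a chunk. */
typedef struct {
    char     id[5];  /* 4-character marker plus a terminating NUL */
    uint32_t size;   /* size in bytes of the data that follows */
} ChunkHeader;

/* Decode the 8-byte header found at p (size is little-endian). */
ChunkHeader read_chunk_header(const uint8_t *p)
{
    ChunkHeader h;
    memcpy(h.id, p, 4);
    h.id[4] = '\0';
    h.size = (uint32_t)p[4]
           | ((uint32_t)p[5] << 8)
           | ((uint32_t)p[6] << 16)
           | ((uint32_t)p[7] << 24);
    return h;
}
```

The chunk's data then starts 8 bytes after the beginning of the header and runs for h.size bytes.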

2. Understanding the WAVEFORMAT structure

The main difference between PCM and non-PCM files is how the sound data is organized, and this difference shows up in the format structures of the two. PCM and IMA-ADPCM are compared below.

The basic wave format structure, WAVEFORMATEX, is defined as follows:

    typedef struct
    {
        WORD  wFormatTag;       // encoding format, e.g. WAVE_FORMAT_PCM, WAVE_FORMAT_ADPCM
        WORD  nChannels;        // number of channels: 1 for mono, 2 for two-channel
        DWORD nSamplesPerSec;   // sampling frequency
        DWORD nAvgBytesPerSec;  // amount of data per second, in bytes
        WORD  nBlockAlign;      // block alignment
        WORD  wBitsPerSample;   // sample size of the wave file, in bits
        WORD  cbSize;           // this value is ignored for PCM
    } WAVEFORMATEX;

PCM uses this basic structure as it stands.

The IMAADPCMWAVEFORMAT structure is defined as follows:

    typedef struct
    {
        WAVEFORMATEX wfmt;
        WORD         nSamplesPerBlock;
    } IMAADPCMWAVEFORMAT;

For IMA-ADPCM, wfmt.cbSize cannot be ignored. Its value is normally 2, which indicates that this format structure is 2 bytes longer than the basic WAVEFORMATEX. Those two bytes are nSamplesPerBlock.

3. Internal organization of the "fact" chunk

In non-PCM files, a "fact" chunk is typically added after the WAVEFORMAT structure. It is laid out as follows:

    typedef struct
    {
        CHAR  id[4];         // the string "fact"
        DWORD chunkSize;
        DWORD dataFactSize;  // size of the data once converted to PCM format
    } FactChunk;

dataFactSize is the most important field in this chunk. For a sound file in a compressed format, it tells you how large the data will be once decompressed, which is very helpful in the calculations done during decompression.

4. Internal organization of the "data" chunk

The sound data is stored starting at the 9th byte of the "data" chunk; the first eight bytes hold the marker "data" and the data size (a DWORD). The data may be compressed or uncompressed.

In PCM the sound data is not compressed. In a mono file the samples are stored one after another in chronological order, the basic unit being a byte (8 bits) or a WORD (16 bits). In a two-channel file the samples of the two channels are stored alternately, also in chronological order, as shown in the figure:
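Assuming the usual WAVE convention that each left sample is followed by the matching right sample, separating a 16-bit two-channel stream into its two channels could be sketched as follows. The function name split_stereo is an illustrative choice, not something defined by the article.

```c
#include <stddef.h>
#include <stdint.h>

/* Split interleaved 16-bit PCM (L0 R0 L1 R1 ...) into two channel buffers.
   `frames` is the number of left/right pairs in the input. */
void split_stereo(const int16_t *interleaved, size_t frames,
                  int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < frames; i++) {
        left[i]  = interleaved[2 * i];      /* even positions: left channel */
        right[i] = interleaved[2 * i + 1];  /* odd positions: right channel */
    }
}
```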

IMA-ADPCM is a compressed format in which each 16-bit PCM sample is compressed to 4 bits. For mono IMA-ADPCM, the PCM data is compressed in chronological order and written to the file; each byte holds two samples, with the low four bits holding the first sample and the high four bits holding the second. Two-channel IMA-ADPCM is more involved: the first 8 samples of the left channel are compressed and packed into one DWORD, which is written to the "data" chunk, followed by the first 8 samples of the right channel, and so on in alternation. When fewer than 8 samples remain (at the end of the data), the missing samples are filled with 0. The schematic is as follows:
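In code form, the mono byte layout described above (first sample in the low four bits, second sample in the high four bits) might look like the sketch below. The helper names pack_nibbles and unpack_nibbles are hypothetical.

```c
#include <stdint.h>

/* Pack two 4-bit codes into one byte: first code low, second code high. */
uint8_t pack_nibbles(uint8_t first, uint8_t second)
{
    return (uint8_t)((first & 0x0F) | ((second & 0x0F) << 4));
}

/* Recover the two 4-bit codes from one byte, in playback order. */
void unpack_nibbles(uint8_t byte, uint8_t *first, uint8_t *second)
{
    *first  = byte & 0x0F;         /* low four bits: earlier sample */
    *second = (byte >> 4) & 0x0F;  /* high four bits: later sample */
}
```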

Special note:

In IMA-ADPCM the data in the "data" chunk is organized in blocks, which I call "segments". That is, the data is not compressed as one continuous sequence but segment by segment. This has an important advantage: when you need only part of the information in the file, you can decompress just the segments that contain it, without decompressing everything from the beginning of the file. This is a considerable advantage when handling large files, and it also helps safeguard the sound quality.

A block generally consists of a block header and data. The block header is a structure, defined for mono as follows:

    typedef struct
    {
        short sample0;   // first sample in the block (uncompressed)
        BYTE  index;     // index carried over from the previous block; 0 for the first block
        BYTE  reserved;  // not yet used
    } MonoBlockHeader;

With the block header information you can easily decompress the data in a block without knowing anything about the blocks before or after it. For two-channel data, the block header contains two MonoBlockHeader structures and is defined as follows:

    typedef struct
    {
        MonoBlockHeader leftbher;
        MonoBlockHeader rightbher;
    } StereoBlockHeader;

During decompression the left and right channels are processed separately, which is why two MonoBlockHeader structures are needed.
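A minimal sketch of reading such a header from the raw bytes of a block is given below. It assumes the fields are stored in the declared order with no padding and that sample0 is little-endian, as is usual for wave files. The function name parse_mono_header is hypothetical, and the structure is redeclared with fixed-width types so the example stands alone.

```c
#include <stdint.h>

typedef struct {
    int16_t sample0;   /* first sample of the block, uncompressed */
    uint8_t index;     /* index carried over from the previous block */
    uint8_t reserved;  /* not yet used */
} MonoBlockHeader;

/* Decode the 4-byte mono block header found at p. */
MonoBlockHeader parse_mono_header(const uint8_t *p)
{
    MonoBlockHeader h;
    h.sample0  = (int16_t)(uint16_t)(p[0] | (p[1] << 8));
    h.index    = p[2];
    h.reserved = p[3];
    return h;
}
```

For a two-channel block, the same function could be applied to bytes 0 to 3 for the left header and bytes 4 to 7 for the right header.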

Note 1: The index above is a parameter required by the decompression algorithm; see later.

Note 2: The block size usually takes one of the following values:

For mono, the block size is generally 512 bytes, so the number of samples a block can hold is (512 - sizeof(MonoBlockHeader)) * 8 / 4 + 1 = 1017, where the "+ 1" accounts for the first, uncompressed sample stored in the block header.

For two-channel, the block size is generally 1024 bytes; by the same calculation, the number of samples per channel is also 1017.
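The arithmetic in Note 2 can be checked with a small helper. This is only a sketch: it assumes one 4-byte header per channel and 4 bits per compressed sample, as described above, and the name samples_per_block is hypothetical.

```c
/* Number of samples per channel that fit in one IMA-ADPCM block. */
unsigned samples_per_block(unsigned block_bytes, unsigned channels)
{
    unsigned header_bytes = 4u * channels;  /* one block header per channel */
    unsigned data_bytes_per_channel = (block_bytes - header_bytes) / channels;
    /* two 4-bit samples per byte, plus the uncompressed sample in the header */
    return data_bytes_per_channel * 2u + 1u;
}
```

Both samples_per_block(512, 1) and samples_per_block(1024, 2) evaluate to 1017, matching the figures in the text.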

5. How to read wave files

Once you know how the data inside a wave file is organized, you can read the file directly using FILE or HFILE. However, because wave files are organized in RIFF format, it is more convenient to use the multimedia input/output (mmio) functions to find chunks and locate data within the file.
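For readers without access to the multimedia I/O functions, a portable alternative is to scan the chunks by hand. The sketch below walks a wave file that has already been loaded into memory. It assumes the standard RIFF layout (a 12-byte "RIFF"/size/"WAVE" header, little-endian sizes, chunks padded to an even length); the names le32 and find_chunk are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value. */
uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Find the chunk whose 4-character marker is `id`.
   Returns the offset of its data and stores its size in *size,
   or returns 0 if the chunk is missing or the file is not RIFF/WAVE. */
size_t find_chunk(const uint8_t *file, size_t len, const char *id,
                  uint32_t *size)
{
    if (len < 12 || memcmp(file, "RIFF", 4) != 0 ||
        memcmp(file + 8, "WAVE", 4) != 0)
        return 0;

    size_t pos = 12;                        /* first chunk inside "WAVE" */
    while (pos + 8 <= len) {
        uint32_t sz = le32(file + pos + 4);
        if (memcmp(file + pos, id, 4) == 0) {
            *size = sz;
            return pos + 8;                 /* data follows marker and size */
        }
        pos += 8 + (size_t)sz + (sz & 1u);  /* skip data and padding byte */
    }
    return 0;
}
```

Calling find_chunk with "fmt ", "fact" and "data" in turn locates the three chunks discussed in this section.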

Third, the IMA-ADPCM encoding and decoding algorithm

IMA-ADPCM, first developed by Intel, is a lossy compression algorithm aimed mainly at 16-bit sampled waveform data, with a compression ratio of 4:1. It is the same algorithm as the commonly mentioned DVI-ADPCM. (For 8-bit data the compression ratio is 3.2:1, and there are also non-standard IMA-ADPCM compression algorithms that reach 5:1 or even higher.) 4:1 compression is currently the most widely used mode.

ADPCM (Adaptive Differential Pulse Code Modulation) is designed mainly for continuous waveform data. What it stores is the change in the waveform from one sample to the next, which is enough to describe the whole waveform. The algorithm requires two one-dimensional arrays, steptab[] and index_adjust[], which are given after the code.

--------------------------------------------------------------------------------

IMA-ADPCM compression process

We first assume that the sound signal starts from zero, so two variables need to be initialized:

    int index = 0, prev_sample = 0;

In actual use, however, prev_sample takes the value of the first sample in each block. (See the detailed discussion of blocks.)

Suppose two functions have already been written:

    GetNextSamp()  -- obtains one 16-bit sample;
    SaveComCode()  -- saves one 4-bit compressed sample;

The following loop compresses the sound data stream sample by sample:

    while (there is data to process) {
        cur_sample = GetNextSamp();       // get the current sample from the PCM data
        diff = cur_sample - prev_sample;  // difference from the previous sample
        if (diff < 0)
        {
            diff = -diff;
            fg = 8;
        }
        else fg = 0;                      // fg holds the sign bit
        code = 4 * diff / steptab[index];
        if (code > 7) code = 7;           // a value from 0 to 7, derived from steptab[], describing the change in amplitude
        index += index_adjust[code];      // pick the next steptab index from the signal strength, so the next change is described more accurately
        if (index < 0) index = 0;         // keep index within range
        else if (index > 88) index = 88;
        prev_sample = cur_sample;
        SaveComCode(code | fg);           // add the sign bit and save
    }
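The compression loop can be turned into a self-contained routine. The sketch below follows the article's loop step by step. The tables are the standard IMA-ADPCM step and index-adjustment tables, which are assumed here to match the steptab[] and index_adjust[] the article refers to. The two helper functions are replaced by plain arrays, the function name ima_encode_codes is hypothetical, and each 4-bit code is written to its own byte for clarity rather than packed two per byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard IMA-ADPCM step sizes (89 entries, indices 0..88). */
static const int steptab[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Index adjustment for each magnitude code 0..7. */
static const int index_adjust[8] = { -1, -1, -1, -1, 2, 4, 6, 8 };

/* Encode n 16-bit samples into n 4-bit codes, one code per output byte. */
void ima_encode_codes(const int16_t *pcm, size_t n, uint8_t *codes)
{
    int index = 0;        /* current position in steptab */
    int prev_sample = 0;  /* the signal is assumed to start from zero */

    for (size_t i = 0; i < n; i++) {
        int cur_sample = pcm[i];
        int diff = cur_sample - prev_sample;  /* change since last sample */
        int fg = 0;                           /* sign bit */
        if (diff < 0) {
            diff = -diff;
            fg = 8;
        }
        int code = 4 * diff / steptab[index]; /* magnitude in step units */
        if (code > 7)
            code = 7;
        index += index_adjust[code];          /* adapt the step size */
        if (index < 0)
            index = 0;
        else if (index > 88)
            index = 88;
        prev_sample = cur_sample;
        codes[i] = (uint8_t)(code | fg);
    }
}
```

This mirrors the simplified loop in the text, which uses the raw previous sample as the reference. Encoders commonly track the value the decoder will reconstruct instead, so that both sides stay in step.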

--------------------------------------------------------------------------------
