C++ implementation of the wavread function for WAV audio parsing

Last Update:2014-09-29 Source: Internet

Author: User


This article has three parts. The first gives background on audio file types and the motivation for the article, the second describes MATLAB's wavread() function, and the third gives a C++ implementation of that function.

1. Background

1.1 Motivation

1) All WAV audio processing starts by parsing the WAV file. The parsed array is what the subsequent processing (FFT and so on) operates on.

2) MATLAB has a very convenient function for this, wavread('test.wav'): the input is a WAV audio file and the output is an array, as described in part 2.

3) A typical C++ routine reads the data in the raw format described in sections 1.2 and 1.3. Whatever the format, though, the representations can be converted into one another.

In view of this, this article shows how to fully reproduce MATLAB's wavread function in C++ with identical output. Along the way we also get a feel for how the data is actually stored in the file and how the different representations relate to each other.

1.2 Audio Type

RIFF stands for Resource Interchange File Format. It is the file structure that most multimedia files in the Windows environment follow. The type of data a RIFF file contains is indicated by the file's extension. Data that can be stored in RIFF files includes: Audio Video Interleaved data (.AVI), waveform data (.WAV), bitmap data (.RDI), MIDI data (.RMI), palettes (.PAL), multimedia movies (.RMN), animated cursors (.ANI), and bundles of other RIFF files (.BND).

The chunk is the basic unit that makes up a RIFF file. Its basic structure is as follows:

struct chunk {
    u32 id;         // Four ASCII characters identifying the data in the chunk,
                    // such as 'RIFF', 'LIST', 'fmt ', 'data', 'WAVE', 'AVI '.
    u32 size;       // Chunk size: the length of the data stored in the dat field.
                    // The sizes of the id and size fields are not included.
    u8  dat[size];  // Chunk content. Data is word-aligned: if its length is odd,
                    // a null byte is appended at the end.
};
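The chunk structure above can be read from a byte buffer roughly as follows. This is a minimal sketch, assuming the whole file has already been loaded into memory; the names `ChunkHeader`, `readChunkHeader` and `nextChunkOffset` are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory view of one RIFF chunk header (8 bytes on disk).
struct ChunkHeader {
    std::string id;      // four ASCII characters, e.g. "RIFF" or "fmt "
    std::uint32_t size;  // payload length in bytes, excluding id and size
};

// Reads the chunk header that starts at `offset`.
// Returns false if fewer than 8 bytes remain in the buffer.
bool readChunkHeader(const std::vector<std::uint8_t>& buf, std::size_t offset,
                     ChunkHeader& out) {
    if (offset > buf.size() || buf.size() - offset < 8) return false;
    out.id.assign(buf.begin() + offset, buf.begin() + offset + 4);
    // The size field is stored little-endian: lowest byte first.
    out.size = static_cast<std::uint32_t>(buf[offset + 4]) |
               static_cast<std::uint32_t>(buf[offset + 5]) << 8 |
               static_cast<std::uint32_t>(buf[offset + 6]) << 16 |
               static_cast<std::uint32_t>(buf[offset + 7]) << 24;
    return true;
}

// Offset of the chunk that follows: header, payload, plus one pad byte
// when the payload length is odd.
std::size_t nextChunkOffset(std::size_t offset, const ChunkHeader& h) {
    return offset + 8 + h.size + (h.size & 1u);
}
```

Calling `readChunkHeader` and then `nextChunkOffset` in a loop walks the chunks in order, which is how a reader can skip chunks it does not understand.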

1.3 WAV audio files

WAVE files are one of the sound waveform file formats used in multimedia and follow the RIFF (Resource Interchange File Format) standard. The first four bytes of every WAVE file are "RIFF". A WAVE file consists of two main parts: the file header and the data body. The file header is in turn divided into two parts: the RIFF/WAVE file identification segment and the sound data format description segment. The contents and layout of a WAVE file are shown later in this section.

There are two common kinds of sound file, corresponding to mono (11.025 kHz sample rate, 8-bit samples) and dual channel (44.1 kHz sample rate, 16-bit samples). The sample rate is the number of times per unit of time that the sound signal is sampled during analog-to-digital conversion. The sample value is the integrated value of the analog sound signal within each sampling period.

For an 8-bit mono sound file, each sample is an eight-bit unsigned integer (00H-FFH). For an 8-bit two-channel stereo file, each sampling instant occupies 16 bits: one byte for the left channel followed by one byte for the right channel.

The data chunk of a WAVE file contains samples in pulse-code modulation (PCM) format, and the file is organized sample by sample. In a stereo WAVE file, channel 0 is the left channel and channel 1 is the right channel. In a multichannel WAVE file, the samples of the channels alternate (they are interleaved).

Apart from the small file header at the front, which describes how the data is organized, a WAVE file consists of the raw sound samples. WAVE files can be compressed, but the uncompressed format is generally used. With a 44.1 kHz sample rate, 16-bit resolution and two channels, WAVE can store very high quality sound. These are the same parameters as CD audio, so sound experts and music enthusiasts will be familiar with them. Such files are also very large, however. For 44.1 kHz, 16-bit, dual-channel data, one minute of sound takes 44100 * 2 bytes * 2 channels * 60 s / 1024 / 1024 = 10.09 MB, so the format is not well suited to transfer over the network.
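The size estimate above is simple arithmetic and can be written as a small helper. This is only an illustrative sketch; the function names `pcmDataBytes` and `bytesToMiB` are made up for this example.

```cpp
#include <cstdint>

// Bytes of raw PCM data: rate * (bits / 8) * channels * seconds.
std::uint64_t pcmDataBytes(std::uint32_t sampleRate, std::uint16_t bitsPerSample,
                           std::uint16_t numChannels, std::uint32_t seconds) {
    return static_cast<std::uint64_t>(sampleRate) * (bitsPerSample / 8) *
           numChannels * seconds;
}

// Converts a byte count to units of 1024 * 1024 bytes.
double bytesToMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

For one minute of 44.1 kHz, 16-bit, two-channel audio, `pcmDataBytes(44100, 16, 2, 60)` gives 10584000 bytes, which is about 10.09 MB.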

The format of a WAVE file is laid out in detail below.

Endian	Field name	Size (bytes)	Description
Big	ChunkID	4	File header identifier, normally the four letters "RIFF"
Little	ChunkSize	4	Size of the entire data file, excluding the ID and size fields above
Big	Format	4	Normally the four letters "WAVE"
Big	Subchunk1ID	4	Format description chunk; this field is normally "fmt " (with a trailing space)
Little	Subchunk1Size	4	Size of this chunk, excluding its own ID and size fields
Little	AudioFormat	2	Audio format code
Little	NumChannels	2	Number of channels
Little	SampleRate	4	Sample rate
Little	ByteRate	4	Byte rate: number of bytes required per second
Little	BlockAlign	2	Block alignment unit, in bytes
Little	BitsPerSample	2	Resolution of the analog-to-digital conversion
Big	Subchunk2ID	4	The actual sound data chunk; this field is normally "data"
Little	Subchunk2Size	4	Size of this chunk, excluding its own ID and size fields
Little	Data	N	Audio sample data

The following is a detailed explanation of each field:

ChunkID	4 bytes	ASCII for "RIFF" (0x52494646)
ChunkSize	4 bytes	36 + Subchunk2Size, or 4 + (8 + Subchunk1Size) + (8 + Subchunk2Size). This is the size of everything that follows this field (the sizes of ChunkID and ChunkSize are not included)
Format	4 bytes	ASCII for "WAVE" (0x57415645)

Subchunk1ID	4 bytes	Start of a new chunk (the format description chunk). ASCII for "fmt ", where the last character is a space (0x666d7420)
Subchunk1Size	4 bytes	Size of this chunk's data (16 for PCM)
AudioFormat	2 bytes	PCM = 1 (linear sampling). Any other value indicates some form of compression
NumChannels	2 bytes	1 = mono, 2 = dual channel
SampleRate	4 bytes	Sample rate, such as 8000 or 44100
ByteRate	4 bytes	Equals SampleRate * NumChannels * BitsPerSample / 8
BlockAlign	2 bytes	Equals NumChannels * BitsPerSample / 8
BitsPerSample	2 bytes	Sampling resolution, that is, the number of bits used for each sample, usually 8 or 16
Subchunk2ID	4 bytes	Start of a new chunk holding the actual sound data. ASCII for "data" (0x64617461)
Subchunk2Size	4 bytes	Data size, that is, the size of the sample data that follows
Data	N bytes	The actual sound data
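The field table above maps directly onto code. The following is a minimal sketch that parses the header from a byte buffer, under the assumption that the file uses the canonical 44-byte layout (a 16-byte "fmt " chunk immediately followed by the "data" chunk). Real files may contain extra chunks, which this sketch simply rejects. `WavHeader`, `le16`, `le32` and `parseCanonicalHeader` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical holder for the header fields listed in the table.
struct WavHeader {
    std::uint16_t audioFormat;
    std::uint16_t numChannels;
    std::uint32_t sampleRate;
    std::uint32_t byteRate;
    std::uint16_t blockAlign;
    std::uint16_t bitsPerSample;
    std::uint32_t dataSize;  // Subchunk2Size
};

// Little-endian readers: the low byte comes first in the file.
static std::uint16_t le16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
static std::uint32_t le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) |
           static_cast<std::uint32_t>(p[1]) << 8 |
           static_cast<std::uint32_t>(p[2]) << 16 |
           static_cast<std::uint32_t>(p[3]) << 24;
}

// Parses the canonical 44-byte header. Returns false if the buffer is too
// short or any of the four ASCII tags is not where this layout expects it.
bool parseCanonicalHeader(const std::vector<std::uint8_t>& b, WavHeader& out) {
    if (b.size() < 44) return false;
    if (std::memcmp(&b[0], "RIFF", 4) != 0 || std::memcmp(&b[8], "WAVE", 4) != 0 ||
        std::memcmp(&b[12], "fmt ", 4) != 0 || std::memcmp(&b[36], "data", 4) != 0)
        return false;
    out.audioFormat   = le16(&b[20]);
    out.numChannels   = le16(&b[22]);
    out.sampleRate    = le32(&b[24]);
    out.byteRate      = le32(&b[28]);
    out.blockAlign    = le16(&b[32]);
    out.bitsPerSample = le16(&b[34]);
    out.dataSize      = le32(&b[40]);
    return true;
}
```

The ASCII tags are compared byte by byte (which is why the table marks them as big-endian), while all numeric fields are assembled low byte first.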

For the data chunk, the layout depends on the number of channels and the bit depth, as follows (each column represents 8 bits):

1) 8-bit mono:

Sample 1	Sample 2
Data 1	Data 2

2) 8-bit dual channel:

Sample 1		Sample 2
Channel 1 data 1	Channel 2 data 1	Channel 1 data 2	Channel 2 data 2

3) 16-bit mono:

Sample 1		Sample 2
Data 1 low byte	Data 1 high byte	Data 2 low byte	Data 2 high byte

4) 16-bit dual channel:

Sample 1
Channel 1 data 1 low byte	Channel 1 data 1 high byte	Channel 2 data 1 low byte	Channel 2 data 1 high byte
Sample 2
Channel 1 data 2 low byte	Channel 1 data 2 high byte	Channel 2 data 2 low byte	Channel 2 data 2 high byte
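The 16-bit layouts above can be unpacked as sketched below. This is an illustration only, assuming 16-bit PCM data already held in memory; `deinterleave16` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits interleaved 16-bit PCM bytes into one vector of samples per channel.
// A trailing partial frame, if any, is ignored.
std::vector<std::vector<std::int16_t>> deinterleave16(
        const std::vector<std::uint8_t>& data, std::size_t numChannels) {
    std::vector<std::vector<std::int16_t>> channels(numChannels);
    if (numChannels == 0) return channels;
    const std::size_t frameBytes = 2 * numChannels;
    const std::size_t frames = data.size() / frameBytes;
    for (std::size_t f = 0; f < frames; ++f) {
        for (std::size_t c = 0; c < numChannels; ++c) {
            const std::size_t i = f * frameBytes + 2 * c;
            // Low byte first, then high byte.
            int v = data[i] | (data[i + 1] << 8);
            // Patterns with the top bit set represent negative values.
            if (v >= 0x8000) v -= 0x10000;
            channels[c].push_back(static_cast<std::int16_t>(v));
        }
    }
    return channels;
}
```

With `numChannels` set to 1 this covers layout 3), and with 2 it covers layout 4), where `channels[0]` is the left channel and `channels[1]` the right.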

Let's look at a concrete example of a WAV audio file (in hexadecimal; the dump is truncated at both ends):

... 74 20 10 00 00 00 01 00 02 00 22 56 00 00 88 58 01 00 04 00 10 00 64 61 74 61 ...

The corresponding analysis is as follows.

Take the sample data as an example: a group of the form 'FFFF' (two bytes) is one complete value. In sample 3, for instance, the bytes 3C and 13 together make up one number. They are stored as 3C 13, but the right-hand byte is the high byte, so the value is 0x133C. Converting each hexadecimal digit to four binary digits gives 0001 0011 0011 1100 as the 16-bit number. The sign bit is 0, so the value is positive.

2. The wavread() function in MATLAB

(1) wavread('testwav.wav')

Readers can try this and look at the output. For example, for one of my sound files, 'testwav.wav', the last 10 values of the output are:

-0.0001 -0.0001 -0.0002 -0.0003 -0.0002 -0.0002 -0.0002 -0.0003 -0.0002 -0.0002

(2) wavread('testwav.wav', 'native')

Readers can try this as well. For my 'testwav.wav', the last 10 values of the output are:

-4 -2 -8 -9 -7 -8 -8 -11 -5 -7

The conversion between outputs (1) and (2) is: -0.0002 ≈ -7/32768, where 32768 = 2^15. This is normalization, and the divisor is 2^15 because the samples are encoded with 16 bits.
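The normalization just described can be sketched in C++ as follows. It assumes the samples are already available as signed 16-bit integers (the 'native' form); `normalize16` is a hypothetical name.

```cpp
#include <cstdint>
#include <vector>

// Scales signed 16-bit samples to doubles by dividing by 2^15 = 32768.
std::vector<double> normalize16(const std::vector<std::int16_t>& native) {
    std::vector<double> out;
    out.reserve(native.size());
    for (std::int16_t s : native) {
        out.push_back(s / 32768.0);
    }
    return out;
}
```

Applied to the 'native' value -7, this gives -7/32768, which is shown as -0.0002 when rounded to four decimal places.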

3. C++ implementation of wavread

With the background above, we come to the main topic: how to implement MATLAB's wavread('testwav.wav') function in C++ so that the output is identical.

3.1 Encoding conversion rules

Before going further, we need to understand how these sequences of data relate to each other. This section uses the data of the test.wav file as an example:

(1) The last 20 bytes of the WAVE file's data chunk, that is, of the raw sample data, are:

FC FF FE FF F8 FF F7 FF F9 FF F8 FF F8 FF F5 FF FB FF F9 FF

(2) The last 10 values parsed by MATLAB are:

-0.0001 -0.0001 -0.0002 -0.0003 -0.0002 -0.0002 -0.0002 -0.0003 -0.0002 -0.0002

The two sets of data are related through two's-complement encoding: (1) holds the raw 16-bit patterns as stored in the file, and (2) holds the signed values those patterns represent, after scaling.

To convert data (1) into data (2), interpret each 16-bit pattern in (1) as a signed two's-complement value, then divide that value by 32768 to obtain (2).

The rules for converting a 16-bit pattern to its signed value are:

(In binary form): if the sign bit is 0, the number is positive and the value is the pattern itself. If the sign bit is 1, the number is negative: keep the sign bit, invert the value bits, and add 1 to obtain the sign-and-magnitude form.

(In numerical form): if the pattern, read as an unsigned number, is below 0x8000, the value is the number itself. Otherwise the number is negative and the value = unsigned number - 2^16. Tip: to simplify the calculation, substitute 2^16 = 0xFFFF + 1.

To make this concrete, here is a worked example:

Step one (read 16 bits, that is, two bytes, at a time): each value lies between 0x0000 and 0xFFFF. Take F9 FF as an example. The right-hand byte is the high byte, so the number is 0xFFF9.

Step two (convert to the signed value): written out in binary, the number is 1111 1111 1111 1001 (each hexadecimal digit corresponds to four binary digits). The sign bit is 1, so the value is negative, and it equals 0xFFF9 - 0xFFFF - 1 = -7.

Step three (normalization): divide the signed value -7 by 32768 and round to four decimal places, which gives -0.0002, as expected.
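The three steps can be checked with a few small functions. This is a sketch for illustration, and the function names are hypothetical. It implements both the numerical rule (subtract 2^16) and the bitwise rule (invert and add 1) so that the two can be compared.

```cpp
#include <cstdint>

// Step one: combine two bytes, low byte first, into a 16-bit pattern.
std::uint16_t combineLittleEndian(std::uint8_t low, std::uint8_t high) {
    return static_cast<std::uint16_t>(low | (high << 8));
}

// Step two, numerical form: subtract 2^16 when the sign bit is set.
int toSignedNumeric(std::uint16_t pattern) {
    return pattern < 0x8000 ? pattern : static_cast<int>(pattern) - 0x10000;
}

// Step two, bitwise form: invert the bits and add 1 to get the magnitude,
// then attach the minus sign.
int toSignedBitwise(std::uint16_t pattern) {
    if ((pattern & 0x8000) == 0) return pattern;
    const std::uint16_t magnitude = static_cast<std::uint16_t>(~pattern + 1);
    return -static_cast<int>(magnitude);
}

// Step three: normalize by 2^15.
double normalizeSample(int sample) {
    return sample / 32768.0;
}
```

For the bytes F9 FF, `combineLittleEndian(0xF9, 0xFF)` yields 0xFFF9, both conversion functions yield -7, and `normalizeSample(-7)` is approximately -0.0002.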

Readers can use this method to check that the 3rd and 4th bytes from the right in (1) correspond to the 2nd value from the right in (2).

3.2 C++ implementation

The C++ implementation therefore reads the raw sample data 16 bits (two bytes) at a time, combines the two bytes into one number, converts that number to its signed value, and normalizes it. Pay attention to byte order (endianness) and sign when converting.

I have shared the full C++ code, which readers can find at: http://www.oschina.net/code/snippet_1768500_39013
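For readers who want something to experiment with right away, here is an independent minimal sketch of the procedure described in this section. It is not the code at the link above. It assumes uncompressed 16-bit PCM only, and the names `WavData`, `readLE`, `decodeWav` and `wavread` are hypothetical choices for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: samples[frame][channel], scaled by 1/32768.
struct WavData {
    std::uint32_t sampleRate = 0;
    std::uint16_t numChannels = 0;
    std::vector<std::vector<double>> samples;
};

// Reads an n-byte little-endian unsigned integer (n <= 4).
static std::uint32_t readLE(const std::uint8_t* p, int n) {
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i) v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return v;
}

// Decodes an in-memory WAV image. Only 16-bit PCM is handled in this sketch.
WavData decodeWav(const std::vector<std::uint8_t>& b) {
    if (b.size() < 12 || std::memcmp(&b[0], "RIFF", 4) != 0 ||
        std::memcmp(&b[8], "WAVE", 4) != 0)
        throw std::runtime_error("not a RIFF/WAVE file");
    WavData out;
    std::uint16_t format = 0, bits = 0;
    bool haveFmt = false;
    std::size_t pos = 12;  // first chunk after "RIFF", ChunkSize, "WAVE"
    while (pos + 8 <= b.size()) {
        const std::uint32_t size = readLE(&b[pos + 4], 4);
        const std::size_t body = pos + 8;
        if (size > b.size() - body) throw std::runtime_error("truncated chunk");
        if (std::memcmp(&b[pos], "fmt ", 4) == 0 && size >= 16) {
            format = static_cast<std::uint16_t>(readLE(&b[body], 2));
            out.numChannels = static_cast<std::uint16_t>(readLE(&b[body + 2], 2));
            out.sampleRate = readLE(&b[body + 4], 4);
            bits = static_cast<std::uint16_t>(readLE(&b[body + 14], 2));
            haveFmt = true;
        } else if (std::memcmp(&b[pos], "data", 4) == 0) {
            if (!haveFmt || format != 1 || bits != 16 || out.numChannels == 0)
                throw std::runtime_error("only 16-bit PCM is supported");
            const std::size_t nch = out.numChannels;
            const std::size_t frames = size / (2 * nch);
            out.samples.assign(frames, std::vector<double>(nch));
            for (std::size_t f = 0; f < frames; ++f) {
                for (std::size_t c = 0; c < nch; ++c) {
                    // Two bytes per sample, low byte first.
                    int v = static_cast<int>(readLE(&b[body + 2 * (f * nch + c)], 2));
                    if (v >= 0x8000) v -= 0x10000;     // signed value
                    out.samples[f][c] = v / 32768.0;   // normalization
                }
            }
            return out;
        }
        pos = body + size + (size & 1u);  // skip payload and pad byte
    }
    throw std::runtime_error("no data chunk found");
}

// Loads the file at `path` into memory and decodes it.
WavData wavread(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    const std::vector<std::uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                                          std::istreambuf_iterator<char>());
    return decodeWav(bytes);
}
```

Separating `decodeWav` from the file loading keeps the parsing logic testable on byte arrays such as the examples in section 3.1. Supporting 8-bit or compressed data would need additional branches that are left out here.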

Reference documents

1. http://www.cnblogs.com/liyiwen/archive/2010/04/19/1715715.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
