Audio format details (mixed)

Last Update:2018-12-04 Source: Internet

Author: User


Several format details (mixed)

WAVEFORMATEX

MEDIATYPE_Audio
FORMAT_WaveFormatEx

CLSID_AudioInputDeviceCategory

typedef struct _MediaType {
    GUID      majortype;
    GUID      subtype;
    BOOL      bFixedSizeSamples;
    BOOL      bTemporalCompression;
    ULONG     lSampleSize;
    GUID      formattype;
    IUnknown *pUnk;       // not used
    ULONG     cbFormat;
    BYTE     *pbFormat;
} AM_MEDIA_TYPE;

The main members are:
majortype: general description of the media type
subtype: more detailed description of the media type
formattype: identifies the structure used for the format block
The format types include:
FORMAT_None
FORMAT_DvInfo
FORMAT_MPEGVideo
FORMAT_MPEG2Video
FORMAT_VideoInfo
FORMAT_VideoInfo2
FORMAT_WaveFormatEx
GUID_NULL

The cbFormat member specifies the size, in bytes, of the format block that pbFormat points to.
The pbFormat pointer points to the format block.
pbFormat is declared as a generic byte pointer because the layout of the format block
depends on the media type. For audio, for example, the block holds a WAVEFORMATEX
structure.

The audio data format can then be read from that structure.
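
Before treating pbFormat as a WAVEFORMATEX, it is usual to check formattype, cbFormat, and the pointer itself. The following is a minimal portable sketch of that check, not DirectShow code: Guid, WaveFormat, MediaType, and as_wave_format are hypothetical stand-ins defined here so the example compiles without the Windows headers. In a real program the SDK types and the FORMAT_WaveFormatEx GUID would be used instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the Windows GUID type (same 16-byte layout). */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} Guid;

/* Stand-in for WAVEFORMATEX, packed to 18 bytes (assumption: byte packing). */
#pragma pack(push, 1)
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} WaveFormat;
#pragma pack(pop)

/* Stand-in holding only the AM_MEDIA_TYPE members used in this check. */
typedef struct {
    Guid     formattype;
    uint32_t cbFormat;
    uint8_t *pbFormat;
} MediaType;

/* Returns the format block viewed as a WaveFormat, or NULL if the
   media type does not carry a usable wave format block. */
const WaveFormat *as_wave_format(const MediaType *mt, const Guid *wave_guid)
{
    if (memcmp(&mt->formattype, wave_guid, sizeof(Guid)) != 0)
        return NULL;                 /* format block is some other structure */
    if (mt->pbFormat == NULL || mt->cbFormat < sizeof(WaveFormat))
        return NULL;                 /* block missing or too small */
    return (const WaveFormat *)mt->pbFormat;
}
```

The caller passes the GUID it expects in wave_guid, so the sketch does not hard-code any GUID value.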

// TWaveFormatEx structure:
TWaveFormatEx = packed record
  wFormatTag: Word;        {format type; WAVE_FORMAT_PCM = 1 for PCM}
  nChannels: Word;         {number of channels of waveform data; 1 for mono, 2 for stereo}
  nSamplesPerSec: DWORD;   {sample rate in samples per second, for example 8000}
  nAvgBytesPerSec: DWORD;  {average data transfer rate in bytes per second}
  nBlockAlign: Word;       {block alignment in bytes; the smallest unit of data}
  wBitsPerSample: Word;    {number of bits per sample, usually 8 or 16}
  cbSize: Word;            {size in bytes of extra format information appended after this record; ignored for PCM}
end;

nChannels: for PCM, nChannels should not exceed 2. Non-PCM formats may use more than 2 channels.
nSamplesPerSec: common values are 8 kHz, 11.025 kHz, 22.05 kHz, and 44.1 kHz.
nAvgBytesPerSec: number of bytes transferred per second = nSamplesPerSec * nBlockAlign
nBlockAlign: block alignment in bytes = nChannels * wBitsPerSample / 8
This is the size in bytes of one sample frame (one sample for every channel).
wBitsPerSample: number of bits per sample; for PCM this is 8 or 16.

For 8-bit stereo sampled at 11.025 kHz:
nChannels = 2
nSamplesPerSec = 11025 (samples per second)
nBlockAlign = 2 * 8 / 8 = 2 (bytes per sample frame)
nAvgBytesPerSec = 11025 * 2 = 22050
wBitsPerSample = 8
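
The two derived fields can be computed from the other three. A minimal sketch for uncompressed PCM follows; the PcmDerived struct and the pcm_derive function are hypothetical names introduced for illustration and are not part of any Windows API.

```c
#include <stdint.h>

/* Hypothetical helper struct holding only the two derived fields. */
typedef struct {
    uint16_t nBlockAlign;      /* bytes per sample frame */
    uint32_t nAvgBytesPerSec;  /* bytes per second */
} PcmDerived;

/* Derives nBlockAlign and nAvgBytesPerSec for uncompressed PCM. */
PcmDerived pcm_derive(uint16_t nChannels, uint32_t nSamplesPerSec,
                      uint16_t wBitsPerSample)
{
    PcmDerived d;
    d.nBlockAlign = (uint16_t)(nChannels * wBitsPerSample / 8);
    d.nAvgBytesPerSec = nSamplesPerSec * d.nBlockAlign;
    return d;
}
```

Calling pcm_derive(2, 11025, 8) reproduces the values of the example: a block alignment of 2 and an average rate of 22050 bytes per second.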

The table below shows how the bytes of each sample are laid out for the different formats.

Format           Sample 1                                             Sample 2 ... N
8-bit mono       channel 0                                            channel 0
8-bit stereo     channel 0 (left), channel 1 (right)                  channel 0 (left), channel 1 (right)
16-bit mono      channel 0 (low byte), channel 0 (high byte)          channel 0 (low byte), channel 0 (high byte)
16-bit stereo    channel 0 (low byte), channel 0 (high byte),         same as Sample 1
                 channel 1 (low byte), channel 1 (high byte)
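
As an illustration of the 16-bit stereo layout, the sketch below splits an interleaved byte buffer into separate left and right sample arrays. The function names sample16_le and split_stereo16 are hypothetical. The code assumes the low byte comes first, as in the table, and that the caller supplies output arrays with room for `frames` samples each.

```c
#include <stddef.h>
#include <stdint.h>

/* Combines a low byte and a high byte into one signed 16-bit sample. */
int16_t sample16_le(const uint8_t *p)
{
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Splits interleaved 16-bit stereo frames into left and right arrays. */
void split_stereo16(const uint8_t *data, size_t frames,
                    int16_t *left, int16_t *right)
{
    for (size_t i = 0; i < frames; i++) {
        const uint8_t *frame = data + i * 4;  /* 4 bytes per frame */
        left[i]  = sample16_le(frame);        /* channel 0: bytes 0 and 1 */
        right[i] = sample16_le(frame + 2);    /* channel 1: bytes 2 and 3 */
    }
}
```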

---------

Waveform-audio buffer header
typedef struct wavehdr_tag {
    LPSTR     lpData;           // pointer to the buffer holding the PCM sample data
    DWORD     dwBufferLength;   // length of the buffer, in bytes
    DWORD     dwBytesRecorded;  // number of bytes recorded into the buffer
    DWORD_PTR dwUser;           // user data
    DWORD     dwFlags;          // flags describing the buffer state
    DWORD     dwLoops;          // loop count
    struct wavehdr_tag *lpNext; // reserved
    DWORD_PTR reserved;         // reserved
} WAVEHDR;

lpData points to the sample data in PCM format.

If the sample size is 8 bits, the dynamic range is 20 * log10(256), which is about 48 dB.
If the sample size is 16 bits, the dynamic range is 20 * log10(65536), which is about 96 dB.
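
Both figures follow from the same relation. As a short worked derivation, with n denoting the number of bits per sample (a symbol introduced here for illustration):

```latex
\[
\mathrm{DR}(n) = 20\log_{10}\!\left(2^{n}\right) = 20\,n\,\log_{10} 2 \approx 6.02\,n \ \text{dB}
\]
\[
\mathrm{DR}(8) \approx 48.2\ \text{dB}, \qquad \mathrm{DR}(16) \approx 96.3\ \text{dB}
\]
```

Each additional bit therefore adds roughly 6 dB of dynamic range.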

Amplitude in decibels: 20 * log10(A1/A2), where A1 and A2 are the amplitudes of two sounds.
For audio samples:
8-bit: 20 * log10(lpData[0] / 256)
16-bit: 20 * log10(s / 65536), where s is the sample formed from lpData[0] (low byte) and lpData[1] (high byte)
For stereo data, the left and right channel values must be extracted separately first.
Because the logarithm lies between -48 (8-bit) or -96 (16-bit) and 0, add 48 or 96 in the actual conversion to get a non-negative value.

Sample size    Data format         Maximum value    Minimum value
8-bit PCM      unsigned integer    255              0
16-bit PCM     signed integer      32767            -32768

8-bit audio stores the waveform as unsigned values centered on 128, so subtract 128 to get the signed amplitude.
16-bit samples are stored as signed integers, so the formula can be applied to them directly.
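
A minimal sketch of that difference: the 8-bit path removes the midpoint offset before taking the magnitude, while the 16-bit path uses the stored value as it is. The names amplitude8, peak8, and peak16 are hypothetical, and both buffers are assumed to hold mono data.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Signed amplitude of one 8-bit PCM sample: remove the 128 midpoint. */
int amplitude8(uint8_t s)
{
    return (int)s - 128;
}

/* Peak absolute amplitude of an 8-bit mono buffer (0..128). */
int peak8(const uint8_t *data, size_t n)
{
    int peak = 0;
    for (size_t i = 0; i < n; i++) {
        int a = abs(amplitude8(data[i]));
        if (a > peak)
            peak = a;
    }
    return peak;
}

/* Peak absolute amplitude of a 16-bit mono buffer (0..32768). */
int peak16(const int16_t *data, size_t n)
{
    int peak = 0;
    for (size_t i = 0; i < n; i++) {
        int a = abs((int)data[i]);  /* already signed, no offset needed */
        if (a > peak)
            peak = a;
    }
    return peak;
}
```

The resulting peak value can then be used as the amplitude in a decibel calculation.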
