Several format details (mixed)
WAVEFORMATEX
MEDIATYPE_Audio
FORMAT_WaveFormatEx
CLSID_AudioInputDeviceCategory
typedef struct _MediaType {
    GUID majortype;
    GUID subtype;
    BOOL bFixedSizeSamples;
    BOOL bTemporalCompression;
    ULONG lSampleSize;
    GUID formattype;
    IUnknown *pUnk; // not used
    ULONG cbFormat;
    BYTE *pbFormat;
} AM_MEDIA_TYPE;
The main members are:
majortype: a general description of the media type
subtype: a more detailed description of the media type
formattype: the format of the data block, which is one of:
FORMAT_None
FORMAT_DvInfo
FORMAT_MPEGVideo
FORMAT_MPEG2Video
FORMAT_VideoInfo
FORMAT_VideoInfo2
FORMAT_WaveFormatEx
GUID_NULL
The cbFormat member specifies the size, in bytes, of the format block that pbFormat points to.
The pbFormat pointer points to the format block.
Although pbFormat is declared as BYTE *, it is used like a void * pointer, because the layout of the format block differs depending on the media type. For audio, for example, it is filled with a WAVEFORMATEX structure, from which the data format can be retrieved.
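The check-then-read pattern described above can be sketched in portable C. This is only an illustration under stated assumptions: DemoGuid, DemoMediaType, DemoWaveFormat, DEMO_FORMAT_WAVE and demo_get_wave_format are hypothetical stand-ins invented here so that the example builds without the Windows headers, and the GUID value is a placeholder, not the real FORMAT_WaveFormatEx. The fields are read byte by byte as little-endian values, assuming the packed 18-byte layout of WAVEFORMATEX.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for GUID and AM_MEDIA_TYPE (only the members used here). */
typedef struct { uint8_t bytes[16]; } DemoGuid;

typedef struct {
    DemoGuid formattype; /* tells which structure pbFormat holds */
    uint32_t cbFormat;   /* size of the format block, in bytes */
    uint8_t *pbFormat;   /* the format block itself */
} DemoMediaType;

/* Hypothetical stand-in mirroring the WAVEFORMATEX members. */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} DemoWaveFormat;

enum { DEMO_WAVEFORMAT_BYTES = 18 }; /* packed size assumed for WAVEFORMATEX */

/* Placeholder value, NOT the real FORMAT_WaveFormatEx GUID. */
static const DemoGuid DEMO_FORMAT_WAVE = {{0x01}};

/* Little-endian readers, so no alignment or packing assumptions are needed. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out if the block holds a wave format, otherwise 0. */
int demo_get_wave_format(const DemoMediaType *mt, DemoWaveFormat *out)
{
    assert(mt != NULL && out != NULL);
    /* 1. The format type must say that the block is a wave format. */
    if (memcmp(&mt->formattype, &DEMO_FORMAT_WAVE, sizeof(DemoGuid)) != 0)
        return 0;
    /* 2. The block must exist and be large enough before it is read. */
    if (mt->pbFormat == NULL || mt->cbFormat < DEMO_WAVEFORMAT_BYTES)
        return 0;
    const uint8_t *p = mt->pbFormat;
    out->wFormatTag      = rd16(p + 0);
    out->nChannels       = rd16(p + 2);
    out->nSamplesPerSec  = rd32(p + 4);
    out->nAvgBytesPerSec = rd32(p + 8);
    out->nBlockAlign     = rd16(p + 12);
    out->wBitsPerSample  = rd16(p + 14);
    out->cbSize          = rd16(p + 16);
    return 1;
}
```

With the real headers, the same idea would compare formattype against FORMAT_WaveFormatEx and check cbFormat against sizeof(WAVEFORMATEX) before treating pbFormat as a WAVEFORMATEX pointer.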
// TWaveFormatEx structure:
TWaveFormatEx = packed record
  wFormatTag: Word;       {format type; default: WAVE_FORMAT_PCM = 1}
  nChannels: Word;        {number of channels in the waveform data; 1 for mono, 2 for stereo}
  nSamplesPerSec: DWORD;  {sample rate (samples per second); often 8000}
  nAvgBytesPerSec: DWORD; {average data transfer rate (bytes per second)}
  nBlockAlign: Word;      {block alignment in bytes; a block is the smallest unit of data}
  wBitsPerSample: Word;   {number of bits per sample, usually 16}
  cbSize: Word;           {size, in bytes, of any extra format information appended after the structure}
end;
nChannels: for PCM, nChannels cannot exceed 2; non-PCM formats may have more than 2 channels.
nSamplesPerSec: typically 8 kHz, 11.025 kHz, 22.05 kHz, or 44.1 kHz.
nAvgBytesPerSec: number of bytes transferred per second = nSamplesPerSec * nBlockAlign
nBlockAlign: block alignment in bytes = nChannels * wBitsPerSample / 8
This is the minimum number of bytes needed to hold one sample across all channels.
wBitsPerSample: for PCM, either 8 or 16, giving the number of bits per sample.
For example, for 8-bit stereo at 11.025 kHz:
nChannels = 2
nSamplesPerSec = 11025 (number of samples per second)
nBlockAlign = 2 * 8 / 8 = 2 (block alignment in bytes, the minimum number of bytes per sample)
nAvgBytesPerSec = 11025 * 2 = 22050
wBitsPerSample = 8
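The two derived members can be computed rather than typed in by hand. The following is a minimal sketch, assuming PCM with a whole number of bytes per sample; PcmFormat and make_pcm_format are hypothetical names used only for this illustration and are not part of any Windows header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the PCM-related members of WAVEFORMATEX. */
typedef struct {
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint16_t wBitsPerSample;
    uint16_t nBlockAlign;     /* derived */
    uint32_t nAvgBytesPerSec; /* derived */
} PcmFormat;

/* Fills in the derived members from channels, sample rate and bit depth. */
PcmFormat make_pcm_format(uint16_t channels, uint32_t rate, uint16_t bits)
{
    assert(channels > 0 && bits > 0 && bits % 8 == 0);
    PcmFormat f;
    f.nChannels = channels;
    f.nSamplesPerSec = rate;
    f.wBitsPerSample = bits;
    /* nBlockAlign = nChannels * wBitsPerSample / 8 */
    f.nBlockAlign = (uint16_t)(channels * bits / 8);
    /* nAvgBytesPerSec = nSamplesPerSec * nBlockAlign */
    f.nAvgBytesPerSec = rate * f.nBlockAlign;
    return f;
}
```

Calling make_pcm_format(2, 11025, 8) reproduces the worked example: nBlockAlign is 2 and nAvgBytesPerSec is 22050.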
The table below shows how the samples are laid out in memory for each format:
Format | Sample 1 | Sample 2 | ... N
8-bit mono | channel 0 | channel 0 |
8-bit stereo | channel 0 (left), channel 1 (right) | channel 0 (left), channel 1 (right) |
16-bit mono | channel 0 (low byte), channel 0 (high byte) | channel 0 (low byte), channel 0 (high byte) |
16-bit stereo | channel 0 (low byte), channel 0 (high byte), channel 1 (low byte), channel 1 (high byte) | same as Sample 1 |
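One way to read this layout in code is to compute the byte offset of a given sample and channel, then assemble the value. The sketch below assumes interleaved PCM with 16-bit samples stored low byte first, as in the layout above; pcm_read_sample is a hypothetical helper name, not a Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the stored value of one sample from an interleaved PCM buffer.
 *   frame:    index of the sample in time (0 = Sample 1)
 *   channel:  0 = left (or mono), 1 = right
 *   channels: 1 or 2
 *   bits:     8 (stored unsigned, 0..255) or 16 (stored signed, low byte first)
 */
int32_t pcm_read_sample(const uint8_t *data, uint32_t frame,
                        uint16_t channel, uint16_t channels, uint16_t bits)
{
    assert(data != NULL && channel < channels && (bits == 8 || bits == 16));
    uint32_t bytes_per_sample = bits / 8u;
    /* One frame holds one sample per channel, stored back to back. */
    uint32_t offset = (frame * channels + channel) * bytes_per_sample;
    if (bits == 8)
        return data[offset];
    /* Combine low and high byte, then reinterpret as a signed 16-bit value. */
    int32_t raw = data[offset] | (data[offset + 1] << 8);
    return raw >= 0x8000 ? raw - 0x10000 : raw;
}
```

For a 16-bit stereo buffer, pcm_read_sample(buf, 1, 1, 2, 16) returns the right-channel value of the second sample, read from bytes 6 and 7.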
---------
Waveform-audio buffer format
typedef struct wavehdr_tag {
    LPSTR lpData;               // pointer to the buffer that holds the PCM sample data
    DWORD dwBufferLength;       // length of the buffer, in bytes
    DWORD dwBytesRecorded;      // number of bytes actually recorded
    DWORD dwUser;               // user data
    DWORD dwFlags;              // flags
    DWORD dwLoops;              // loop count
    struct wavehdr_tag *lpNext; // reserved
    DWORD reserved;             // reserved
} WAVEHDR;
lpData points to the sample data in PCM format.
If the sample size is 8 bits, the dynamic range is 20 * log10(256) dB, which is about 48 dB.
If the sample size is 16 bits, the dynamic range is 20 * log10(65536) dB, which is about 96 dB.
Amplitude ratio in decibels: 20 * log10(A1/A2), where A1 and A2 are the amplitudes of the two sounds.
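As a quick check of the 48 dB and 96 dB figures, the dynamic range of an n-bit sample can be worked out from its 2^n distinct levels. The closed form below is a standard identity added here for illustration, not something stated in the original text:

```latex
\[
20\log_{10}\!\left(2^{n}\right) = 20\,n\,\log_{10} 2 \approx 6.02\,n\ \text{dB}
\quad\Longrightarrow\quad
n = 8:\ \approx 48.2\ \text{dB},
\qquad
n = 16:\ \approx 96.3\ \text{dB}
\]
```

Each extra bit therefore adds roughly 6 dB of dynamic range.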
For audio data:
8-bit: 20 * log10(lpData[0] / 256)
16-bit: 20 * log10(s / 65536), where s is the 16-bit sample formed from lpData[0] (low byte) and lpData[1] (high byte)
For mono versus stereo data, the values of the left and right channels must also be extracted accordingly.
Because the logarithm gives a value between -48 dB (or -96 dB) and 0 dB, add 48 (for 8-bit) or 96 (for 16-bit) in the actual conversion to get a non-negative level.
Sample size | Data format | Maximum value | Minimum value
8-bit PCM | unsigned int | 255 | 0
16-bit PCM | int | 32767 | -32768
8-bit audio stores the waveform as unsigned values centered on 128, so 128 must be subtracted to obtain the amplitude.
Because 16-bit samples are stored as signed int values, the formula can be applied to them directly.
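The difference between the two storage formats can be sketched as a small conversion step that runs before any decibel formula. This is an illustration only: it assumes that 8-bit silence sits at the midpoint 128, and pcm_abs_amplitude and pcm_full_scale are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute amplitude of one PCM sample.
 *   raw:  the value as stored (0..255 for 8-bit, -32768..32767 for 16-bit)
 *   bits: 8 or 16
 */
int32_t pcm_abs_amplitude(int32_t raw, uint16_t bits)
{
    assert(bits == 8 || bits == 16);
    /* 8-bit samples are unsigned, so recentre them around zero first. */
    int32_t centered = (bits == 8) ? raw - 128 : raw;
    return centered < 0 ? -centered : centered;
}

/* Largest amplitude the format can represent, for use as the reference value. */
int32_t pcm_full_scale(uint16_t bits)
{
    assert(bits == 8 || bits == 16);
    return bits == 8 ? 128 : 32768;
}
```

A level relative to full scale could then be taken as 20 * log10(amplitude / full scale) for non-zero amplitudes, for example with log10 from math.h.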