Windows Recording API Learning Notes (repost)


Struct and Function Information

Structures

WAVEINCAPS

This structure describes the capabilities of a waveform-audio input device.

typedef struct {
    WORD      wMid;                 // Manufacturer identifier of the device driver for the waveform-audio input device.
    WORD      wPid;                 // Product identifier of the sound input device.
    MMVERSION vDriverVersion;       // Version number of the device driver. The high-order byte is the major version number; the low-order byte is the minor version number.
    CHAR      szPname[MAXPNAMELEN]; // Device name.
    DWORD     dwFormats;            // Standard formats supported; can be a combination of the WAVE_FORMAT_* flags.
    WORD      wChannels;            // Specifies whether the device supports mono (1) or stereo (2) input.
    WORD      wReserved1;           // Padding.
} WAVEINCAPS;

HWAVEIN is the handle obtained after opening the sound input device.

WAVEHDR

This structure defines the header used to identify a waveform-audio buffer.

typedef struct wavehdr_tag {
    LPSTR     lpData;           // Points to the waveform buffer.
    DWORD     dwBufferLength;   // Length of the buffer, in bytes.
    DWORD     dwBytesRecorded;  // For input, the amount of data actually recorded into the buffer.
    DWORD_PTR dwUser;           // User data.
    DWORD     dwFlags;          // Flags giving information about the buffer; see MSDN.
    DWORD     dwLoops;          // Number of times to play the loop; output buffers only.
    struct wavehdr_tag *lpNext; // Reserved.
    DWORD_PTR reserved;         // Reserved.
} WAVEHDR;

WAVEFORMATEX

This structure defines the format of waveform-audio data. Only format information common to all waveform-audio data formats is included in this structure. Formats that require additional information include this structure as the first member of a larger structure, followed by the extra data.

WORD wFormatTag; Waveform-audio format type. Format tags are registered for many of Microsoft's compression algorithms; a complete list of format tags can be found in the MMREG.H header file.
WORD nChannels; Number of channels in the waveform-audio data. Monaural data uses one channel; stereo data uses two.
DWORD nSamplesPerSec; Sample rate, in samples per second (Hz), at which each channel should be played or recorded. If wFormatTag is WAVE_FORMAT_PCM, common values for nSamplesPerSec are 8.0 kHz, 11.025 kHz, 22.05 kHz, and 44.1 kHz. For non-PCM formats, this must be computed according to the manufacturer's specification of the format tag.
DWORD nAvgBytesPerSec; Required average data-transfer rate, in bytes per second. If wFormatTag is WAVE_FORMAT_PCM, nAvgBytesPerSec should equal the product of nSamplesPerSec and nBlockAlign. For non-PCM formats, this must be computed per the manufacturer's specification. Playback and recording software can estimate buffer sizes using the nAvgBytesPerSec member.
WORD nBlockAlign; Block alignment, in bytes. The block alignment is the minimum atomic unit of data for the wFormatTag format type. If wFormatTag is WAVE_FORMAT_PCM, nBlockAlign should equal the product of nChannels and wBitsPerSample/8 (bits per byte). For non-PCM formats, compute it per the manufacturer's specification. Playback and recording software must process data in multiples of nBlockAlign bytes at a time, and data written to and read from the device must start at the beginning of a block. For example, it is not legal to start playback of PCM data in the middle of a sample (that is, on a non-block-aligned boundary).
WORD wBitsPerSample; Bits per sample for the wFormatTag format type. If wFormatTag is WAVE_FORMAT_PCM, wBitsPerSample should be 8 or 16. For non-PCM formats, set it per the manufacturer's specification. Note that some compression schemes cannot define a meaningful wBitsPerSample value, so this member can be zero.
WORD cbSize; Size, in bytes, of extra format information appended to the end of the WAVEFORMATEX structure. Non-PCM formats can use this information to store extra attributes for the wFormatTag. If no extra information is required for the wFormatTag, set this member to zero. Note that for the WAVE_FORMAT_PCM format (and only WAVE_FORMAT_PCM), this member is ignored.

Function Information

waveInGetNumDevs

This function returns the number of waveform-audio input devices. A return value of 0 indicates that no device exists or that an error occurred.

waveInGetDevCaps

This function retrieves the capabilities of a given waveform-audio input device. (The device handle itself is obtained later with the waveInOpen function, where initialization is performed.)

MMRESULT waveInGetDevCaps(
    UINT uDeviceID,    // Identifies the waveform-audio input device. Can be a device identifier or a handle to an open waveform-audio input device.
    LPWAVEINCAPS pwic, // Points to a WAVEINCAPS structure to be filled with information about the device's capabilities.
    UINT cbwic         // Size, in bytes, of the WAVEINCAPS structure.
);

MMRESULT waveInOpen(
    LPHWAVEIN phwi,               // Points to the handle that identifies the opened sound device. May be NULL when the fdwOpen parameter specifies WAVE_FORMAT_QUERY.
    UINT uDeviceID,               // Identifier of the device to open.
    LPWAVEFORMATEX pwfx,          // Points to a WAVEFORMATEX structure identifying the format desired for recording waveform-audio data. The structure may be freed immediately after waveInOpen returns.
    DWORD_PTR dwCallback,         // Callback function pointer, event handle, window handle, or thread identifier that receives progress messages during recording. Zero if no callback is needed.
    DWORD_PTR dwCallbackInstance, // User data passed to the callback mechanism. Not used for window callbacks.
    DWORD fdwOpen                 // Flags indicating what kind of value dwCallback is, e.g. an event handle or a function pointer.
);

This function prepares a buffer for the audio input device.

MMRESULT waveInPrepareHeader(
    HWAVEIN hwi,   // Handle to the sound input device.
    LPWAVEHDR pwh, // Points to a WAVEHDR structure identifying the buffer to be prepared.
    UINT cbwh      // Size, in bytes, of the WAVEHDR structure.
);

This function sends an input buffer to the given waveform-audio input device. The application is notified when the buffer has been filled.

MMRESULT waveInAddBuffer(
    HWAVEIN hwi,   // Handle to the sound input device.
    LPWAVEHDR pwh, // Points to a WAVEHDR structure identifying the buffer to be queued.
    UINT cbwh      // Size, in bytes, of the WAVEHDR structure.
);

This function starts input on the specified device.

MMRESULT waveInStart(
    HWAVEIN hwi // Device handle.
);

Audio analysis based on Windows API

Yesterday I read through the audio API and Windows WAV file material, and the ideas gradually became clearer, so I am summarizing here; once this article is done, I should be able to move on to the next step.

1 The first thing to know is how a computer represents a sound file.

The audio formats we are familiar with include MP3, WMA, FLAC, and WAV. Here I will focus only on WAV for the time being. Note that WAV is really "wave", meaning waveform. Real-world sounds are continuous analog signals, but what a computer stores is a digital signal. So before sound can be stored on a computer, it must be digitized, that is, converted into a form the computer can store.

Anyone who has studied signals and systems knows that a common way to convert an analog signal to a digital one is equal-interval sampling. According to the Nyquist theorem, the sampling frequency must be at least twice the signal frequency to reconstruct the original audio signal without distortion. The sampling frequency thus determines the fidelity of the digital signal; naturally, the higher the better. For example, for a sine signal with a 1 ms period, a reconstruction from 100 sampled points will certainly be better than one from 2 points.

After sampling, the analog signal becomes a series of discrete voltage values, which must then be quantized before the data can be stored on the computer. So-called quantization uses binary data to represent the size of each level, generally with 8-bit (256-level) or 16-bit (65,536-level) values, as determined by the ADC chosen in the hardware design. In Windows, you can use the waveInGetDevCaps function to get the sound-card information and determine whether the quantization depth is 8 or 16 bits: the dwFormats member of its WAVEINCAPS parameter carries the corresponding information as a combination of WAVE_FORMAT_* flags.

Assuming a sinusoidal signal over a 0-5 V range, 8-bit quantization divides 0-5 V into 256 steps with a voltage difference of about 0.0195 V per step. A 4 V sample then corresponds to the value 205 (204.8).

After these two processing steps, the analog sound signal has been converted into a digital signal with a fixed sampling frequency and a given quantization standard. The two main parameters of a digital sound signal are therefore the sample bit width and the sampling frequency.

Another thing to note is the channel count. The music we usually hear is stereo, that is, two-channel. The WAV file format has a special section on this, but I did not look closely; I just remember that the samples of the two channels appear alternately.

Now look back at the table above: when the sound card provides a signal, there are three main factors: sampling frequency, channel count, and quantization bit width.

I wrote some code that reads the fields of the WAV file header, and used it to open and inspect a WAV file on my computer (one previously ripped from a CD).

[The original post showed a screenshot of the program's output here.]

This was just a quick look and is not used for anything yet. With the above as background, the next thing to do is analyze how to use the API.

The first function to invoke:

UINT waveInGetNumDevs(VOID);

This function is only used to check whether the computer has a sound-card device; if the return value is 0, there is no sound card. In practice this rarely happens, since modern motherboards all have integrated sound cards.

The second function to invoke:

MMRESULT waveInGetDevCaps(
    UINT uDeviceID,
    LPWAVEINCAPS pwic,
    UINT cbwic);

The first parameter of this function is the device ID. We do not know the device ID yet, but it does not matter, as long as we know that a device is available. One sentence on MSDN confused me, though:

"Use this function to determine the number of waveform-audio input devices present in the system."

That is, use this function to determine the number of sound input devices in the system. To me the function name feels more like retrieving device capability information, and the second parameter points directly at its purpose: the capabilities of a sound input device.

Back to the point. For the first argument, the MSDN remarks say the value can range from zero to one less than the number of devices, or be WAVE_MAPPER. (WAVE_MAPPER is actually defined in mmsystem.h as ((UINT)-1), not 0; it asks the system to choose a suitable device.) So simply filling in 0 works here. As for how to handle multiple sound devices, I honestly do not know.

The second parameter is a pointer to the WAVEINCAPS structure, which deserves a careful analysis:

The first two members of the structure are the device manufacturer and product identifiers; I clicked through and saw a pile of manufacturer-ID macro definitions ... not worth worrying about.

The third member is an MMVERSION, essentially a UINT, holding the device driver version number; also nothing to worry about.

The fourth member is the device name, a null-terminated string; again nothing special.

The fifth member is important, because it indicates the combination of supported audio-signal standards; see the MSDN description of this structure for the details.

The sixth indicates channel support. In fact this can already be deduced while analyzing the fifth member.

The seventh is a reserved member, so far useless.

The third parameter simply tells the function the size of the structure passed as the second parameter.

Judging from the program's debug results, the effect of this function is to fill in the WAVEINCAPS structure: after the call completes, the structure has been populated.

The third function to invoke:

MMRESULT waveInOpen(
    LPHWAVEIN phwi,
    UINT uDeviceID,
    LPWAVEFORMATEX pwfx,
    DWORD_PTR dwCallback,
    DWORD_PTR dwCallbackInstance,
    DWORD fdwOpen
);

This function is very interesting, and also somewhat more complex. The first parameter is the device handle, except that here it is an output: the function initializes it, and you pass in its address. The second parameter is again the device ID, and again simply passing 0 seems to work. The third parameter, the WAVEFORMATEX structure, deserves careful study; my current feeling is that it is what actually configures the device.

Take a look at the description on MSDN.

The first member is the format-type parameter; its composition basically matches the dwFormats sub-member of the structure filled in by the second function, waveInGetDevCaps. If one of the standard-format macros mentioned above is used here, then the following members, such as nChannels, need no further handling. The more general approach, however, is to set wFormatTag to WAVE_FORMAT_PCM and initialize the following members accordingly.

This third parameter can be initialized according to your requirements, for example in this form:

void WaveInitFormat(WORD nCh, DWORD nSampleRate, WORD bitsPerSample)
{
    m_WaveFormat.wFormatTag      = WAVE_FORMAT_PCM;
    m_WaveFormat.nChannels       = nCh;
    m_WaveFormat.nSamplesPerSec  = nSampleRate;
    m_WaveFormat.nAvgBytesPerSec = nSampleRate * nCh * bitsPerSample / 8;
    m_WaveFormat.nBlockAlign     = nCh * bitsPerSample / 8;
    m_WaveFormat.wBitsPerSample  = bitsPerSample;
    m_WaveFormat.cbSize          = 0;
}

In this initialization function, the first member set is the format tag, WAVE_FORMAT_PCM; the second is the channel count; the third is the sampling frequency; the fourth is computed from the sampling frequency, the channel count, and the number of bytes per sample point. The fifth is the block alignment, the size of one complete sample frame, computed as the channel count times the number of bytes per sample point. The sixth is the number of bits per sample point, specifying the sound's quantization level. The seventh would need a word of explanation, but when wFormatTag is WAVE_FORMAT_PCM it is simply ignored.

The fourth parameter is the callback mechanism: a callback function, an event handle, or a thread handle, i.e. whatever is used to notify the application to process audio data. When the operating system receives an audio signal, this parameter decides in what way the data is handed to the application.

The fifth parameter is user data passed to the callback mechanism (not used for window callbacks). I do not quite understand it, but in practice passing NULL is fine.

The sixth parameter is a set of flags for opening the device; its main purpose is to tell the function whether the fourth parameter is a callback function or an event handle. That simple.

The fourth function called is waveInPrepareHeader. Its job is to prepare a buffer for the sound input device to hold the sound data, once the device has been initialized.

MMRESULT waveInPrepareHeader(
    HWAVEIN hwi,
    LPWAVEHDR pwh,
    UINT cbwh
);

The first parameter is the device handle; nothing more to say.

The second parameter requires a good look at this structure.

typedef struct wavehdr_tag {
    LPSTR     lpData;
    DWORD     dwBufferLength;
    DWORD     dwBytesRecorded;
    DWORD_PTR dwUser;
    DWORD     dwFlags;
    DWORD     dwLoops;
    struct wavehdr_tag *lpNext;
    DWORD_PTR reserved;
} WAVEHDR;

lpData points to the buffer, a memory area you provide. There seems to be no specific size limit: when the buffer fills up you are notified, and then you handle it with the appropriate function.

dwBufferLength is the buffer length; nothing more to say.

dwBytesRecorded is the number of bytes actually recorded, since the whole buffer may not be used every time.

dwUser is user data; MSDN does not say what it is for specifically. My guess is that it can be used to pass a message to the callback function.

dwFlags is additional information about the buffer; MSDN requires this member to be 0 when calling waveInPrepareHeader.

dwLoops is the number of times a loop is played, used only for output; it is not needed here.

lpNext is a pointer, but actually a reserved member.

reserved, the last one, is the same as above.

That finishes the structure: it describes the buffer provided for the input device.

The third parameter is the byte size of the structure above, i.e. a sizeof.

The purpose of this function is to prepare a buffer for the sound input device; the audio information will then be written into that memory area.

The fifth function is waveInAddBuffer, which is used to send the buffer prepared by waveInPrepareHeader to the sound input device.

MMRESULT waveInAddBuffer(
    HWAVEIN hwi,
    LPWAVEHDR pwh,
    UINT cbwh
);

Its prototype is identical to the previous function's, and the three arguments must be exactly the same as those passed to waveInPrepareHeader. Moreover, before this function is called, the buffer in the pwh parameter must already have been processed by waveInPrepareHeader. When the buffer is filled, the application is notified. See MSDN.

The sixth function is waveInStart; it is only used to turn recording on, the simplest of all.

The recording process for Windows is roughly as follows:

1 First check whether the local machine has a sound input device.

2 Get the sound input device's information.

Strictly speaking, the two steps above are not necessary, since virtually every computer now has an integrated sound card; but for stability and portability they are still worthwhile.

3 Open the device, get the device handle, and pass in the corresponding event handle.

4 Prepare an asynchronous thread dedicated to processing the recording once it completes, waiting on the event.

5 Prepare a buffer for the device through its handle.

6 Add the prepared buffer to the device through the handle.

The rest is the system's work: under normal circumstances, when the buffer fills, the event is triggered to notify the asynchronous thread to process it. After retrieving the sound data, you can add the buffer again to continue recording.
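The six steps above can be sketched as a single program. This is a minimal, Windows-only illustration (link against winmm.lib); for simplicity it waits on the event in main instead of a separate thread, and all error checking is omitted. The format values (44.1 kHz, 16-bit, stereo, a one-second buffer) are my own choices, not from the original post:

```c
#include <windows.h>
#include <mmsystem.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (waveInGetNumDevs() == 0) {                       /* step 1 */
        puts("no input device");
        return 1;
    }

    WAVEINCAPS caps;                                     /* step 2 */
    waveInGetDevCaps(0, &caps, sizeof caps);
    printf("device: %s\n", caps.szPname);

    WAVEFORMATEX wf = {0};                               /* step 3 */
    wf.wFormatTag      = WAVE_FORMAT_PCM;
    wf.nChannels       = 2;
    wf.nSamplesPerSec  = 44100;
    wf.wBitsPerSample  = 16;
    wf.nBlockAlign     = wf.nChannels * wf.wBitsPerSample / 8;
    wf.nAvgBytesPerSec = wf.nSamplesPerSec * wf.nBlockAlign;

    HANDLE  evt = CreateEvent(NULL, FALSE, FALSE, NULL); /* step 4 (event instead of a thread) */
    HWAVEIN hwi;
    waveInOpen(&hwi, WAVE_MAPPER, &wf, (DWORD_PTR)evt, 0, CALLBACK_EVENT);

    WAVEHDR hdr = {0};                                   /* step 5 */
    hdr.dwBufferLength = wf.nAvgBytesPerSec;             /* one second of audio */
    hdr.lpData         = (LPSTR)malloc(hdr.dwBufferLength);
    waveInPrepareHeader(hwi, &hdr, sizeof hdr);

    waveInAddBuffer(hwi, &hdr, sizeof hdr);              /* step 6 */
    waveInStart(hwi);

    WaitForSingleObject(evt, INFINITE); /* consume the open notification */
    WaitForSingleObject(evt, INFINITE); /* wait for the buffer to fill */
    printf("recorded %lu bytes\n", hdr.dwBytesRecorded);

    /* Teardown in the order discussed in the supplement below. */
    waveInReset(hwi);
    waveInUnprepareHeader(hwi, &hdr, sizeof hdr);
    waveInClose(hwi);
    free(hdr.lpData);
    CloseHandle(evt);
    return 0;
}
```

Note that with CALLBACK_EVENT the same event is signaled for the open, data, and close messages, which is why the sketch waits twice: once to consume the open notification and once for the filled buffer.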

20140118 Supplement:

When creating buffers, one usually chooses either stack allocation or heap allocation. Stack-allocated memory is destroyed by the stack itself and the user does not have to handle it manually, but heap allocation runs into trouble, as follows:

Once waveInPrepareHeader has been called on a buffer, the memory allocated for it can no longer be freed with delete or free, because the memory region is locked by the call. At that point you must call the waveInUnprepareHeader function to unlock it before releasing it.

However, after calling waveInPrepareHeader and then waveInAddBuffer, if the buffer has not yet been filled, an attempt to unlock it with waveInUnprepareHeader fails with error code 33 (WAVERR_STILLPLAYING). The solution is to call one more function, waveInReset, before deciding to free the space: after it returns, the memory is released from waveInAddBuffer's hold; it can then be unprepared normally with waveInUnprepareHeader, and a final delete or free releases the memory.

From:http://www.cnblogs.com/matrix-r/p/3523303.html
