FFmpeg tutorial 3: playing sound

Last Update:2018-12-07 Source: Internet

Author: User


Tutorial 3: play a sound

Now let's play the sound. SDL also gives us a way to output audio. The SDL_OpenAudio() function opens the audio device. It takes an SDL_AudioSpec struct as its parameter, which holds all the information about the audio we are going to output.

Before we show how to set this up, let's explain how computers handle audio. Digital audio consists of a long stream of samples. Each sample represents one value of the audio waveform. Sound is recorded at a specific sampling rate, which is the number of samples per second and therefore determines how fast the stream must be played back. Typical sampling rates are 22050 and 44100 samples per second, the rates used for radio and CD respectively. In addition, most audio has more than one channel, for stereo or surround sound. If the audio is stereo, for example, the samples come two at a time. When we get data from a movie file, we don't know how many samples we will get, but FFmpeg will not give us partial samples, which also means it will not split a stereo sample.
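To put numbers on this, the raw data rate of uncompressed audio follows directly from the sampling rate, the channel count and the sample width. The helper below is a minimal sketch; the function name pcm_bytes_per_second is made up for illustration and is not part of SDL or FFmpeg.

```c
#include <stddef.h>

/* Bytes of raw PCM audio consumed per second of playback.
 *   sample_rate      - samples per second per channel (e.g. 44100)
 *   channels         - 1 for mono, 2 for stereo
 *   bytes_per_sample - 2 for 16-bit audio
 * Hypothetical helper, not an SDL or FFmpeg function. */
size_t pcm_bytes_per_second(size_t sample_rate, size_t channels,
                            size_t bytes_per_sample)
{
    return sample_rate * channels * bytes_per_sample;
}
```

For CD-style audio (44100 Hz, stereo, 16-bit samples) this works out to 44100 x 2 x 2 = 176400 bytes per second.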

Playing audio with SDL works as follows. You first set up the audio options: the sampling rate (called freq in the SDL struct), the number of channels, and so on. You also set a callback function and userdata. When we start playing audio, SDL continually calls this callback and asks it to fill the audio buffer with a certain number of bytes. After we put this information into the SDL_AudioSpec struct, we call SDL_OpenAudio(), which opens the audio device and hands back another SDL_AudioSpec struct. That second struct holds the settings that will actually be used, because we are not guaranteed to get exactly what we asked for.

Setting up the audio

Keep all of that in mind for the moment, because we don't yet have any information about the audio streams. Let's go back to the place in our code where we found the video stream; we can find the audio stream in the same way.

// Find the first video stream and the first audio stream
videoStream = -1;
audioStream = -1;
for (i = 0; i < pFormatCtx->nb_streams; i++) {
    if (pFormatCtx->streams[i]->codec->codec_type == CODEC_TYPE_VIDEO &&
        videoStream < 0) {
        videoStream = i;
    }
    if (pFormatCtx->streams[i]->codec->codec_type == CODEC_TYPE_AUDIO &&
        audioStream < 0) {
        audioStream = i;
    }
}
if (videoStream == -1)
    return -1; // didn't find a video stream
if (audioStream == -1)
    return -1; // didn't find an audio stream

From here we can get all the information we want from the stream's AVCodecContext, just as we did for the video stream:

AVCodecContext *aCodecCtx;

aCodecCtx = pFormatCtx->streams[audioStream]->codec;

The codec context contains everything we need to set up the audio:

wanted_spec.freq = aCodecCtx->sample_rate;
wanted_spec.format = AUDIO_S16SYS;
wanted_spec.channels = aCodecCtx->channels;
wanted_spec.silence = 0;
wanted_spec.samples = SDL_AUDIO_BUFFER_SIZE;
wanted_spec.callback = audio_callback;
wanted_spec.userdata = aCodecCtx;

if (SDL_OpenAudio(&wanted_spec, &spec) < 0) {
    fprintf(stderr, "SDL_OpenAudio: %s\n", SDL_GetError());
    return -1;
}

Let's go through these:

· freq: the sampling rate, as explained above.

· format: tells SDL what format we will be giving it. In "S16SYS", "S" means signed, "16" means each sample is 16 bits long, and "SYS" means the byte order (endianness) is that of the system we are running on. This is the format in which avcodec_decode_audio2 gives us the audio.

· channels: the number of audio channels.

· silence: the value that represents silence. Because the audio is signed, 0 is of course the usual value.

· samples: the size of the audio buffer we want SDL to hand us when it asks for more audio. A good value is somewhere between 512 and 8192; ffplay uses 1024.

· callback: our callback function. We will discuss it in detail later.

· userdata: the pointer SDL passes to our callback. We give it our entire codec context; you will see why later.

Finally, we open the audio device with SDL_OpenAudio().
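As a rough check on the samples value, the sketch below works out how many bytes one audio buffer holds and how much playback time it covers. It assumes that samples counts sample frames (one sample per channel), which is how SDL's documentation describes the field; the helper names are hypothetical.

```c
/* Bytes in one audio buffer of `samples` sample frames.
 * Assumes each frame holds one sample per channel. */
unsigned buffer_bytes(unsigned samples, unsigned channels,
                      unsigned bytes_per_sample)
{
    return samples * channels * bytes_per_sample;
}

/* Playback time covered by one buffer, in milliseconds. */
double buffer_duration_ms(unsigned samples, unsigned sample_rate)
{
    return 1000.0 * samples / sample_rate;
}
```

With 1024 sample frames of 16-bit stereo at 44100 Hz, one buffer is 4096 bytes and lasts roughly 23 ms. Smaller values lower the latency but make the callback fire more often.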

If you remember from the previous tutorials, we still need to open the audio codec itself. This is straightforward:

AVCodec *aCodec;

aCodec = avcodec_find_decoder(aCodecCtx->codec_id);
if (!aCodec) {
    fprintf(stderr, "Unsupported codec!\n");
    return -1;
}
avcodec_open(aCodecCtx, aCodec);

Queues

There! Now we are ready to start pulling audio data out of the stream. But what do we do with it? We will be continuously getting packets from the movie file, while at the same time SDL will be calling the callback function. The solution is to create a global structure that we can put audio packets into, so that audio_callback has a place to get audio data from. So what we need is a queue of packets. FFmpeg even comes with a structure to help us with this: AVPacketList, which is simply a linked list of packets. Here is our queue structure:

typedef struct PacketQueue {
    AVPacketList *first_pkt, *last_pkt;
    int nb_packets;
    int size;
    SDL_mutex *mutex;
    SDL_cond *cond;
} PacketQueue;

First, note that nb_packets is not the same as size: size is the total number of bytes, which we get from packet->size. You will also notice that the struct holds a mutex and a condition variable, cond. This is because SDL runs the audio process in a separate thread. If we don't lock the queue properly, we could corrupt our data. We will see below how the queue is implemented. Every programmer should know how to build a queue, but we include the code so that you can learn the SDL functions.
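The distinction between the two counters can be seen in a stripped-down, single-threaded sketch. The types and functions here (ToyPacket, ToyQueue, toy_queue_put, toy_queue_get) are invented for illustration and deliberately leave out the mutex and condition variable; they are not FFmpeg or SDL APIs.

```c
#include <stdlib.h>

/* Hypothetical stand-in for AVPacket: only the payload size matters here. */
typedef struct ToyPacket { int size; } ToyPacket;

typedef struct ToyNode {
    ToyPacket pkt;
    struct ToyNode *next;
} ToyNode;

typedef struct ToyQueue {
    ToyNode *first, *last;
    int nb_packets;   /* how many packets are queued */
    int size;         /* total payload bytes across those packets */
} ToyQueue;

/* Appends a packet at the tail. Returns 0 on success, -1 if out of memory. */
int toy_queue_put(ToyQueue *q, ToyPacket pkt)
{
    ToyNode *n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->pkt = pkt;
    n->next = NULL;
    if (q->last)
        q->last->next = n;
    else
        q->first = n;
    q->last = n;
    q->nb_packets++;
    q->size += pkt.size;
    return 0;
}

/* Removes the head packet. Returns 1 if a packet was taken, 0 if empty. */
int toy_queue_get(ToyQueue *q, ToyPacket *out)
{
    ToyNode *n = q->first;
    if (!n)
        return 0;
    q->first = n->next;
    if (!q->first)
        q->last = NULL;
    q->nb_packets--;
    q->size -= n->pkt.size;
    *out = n->pkt;
    free(n);
    return 1;
}
```

After putting a 100-byte and a 250-byte packet, nb_packets is 2 while size is 350; both counters drop back as packets are taken out.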

First, we will create a function to initialize the queue:

void packet_queue_init(PacketQueue *q) {
    memset(q, 0, sizeof(PacketQueue));
    q->mutex = SDL_CreateMutex();
    q->cond = SDL_CreateCond();
}

Then we write a function to put packets into the queue:

int packet_queue_put(PacketQueue *q, AVPacket *pkt) {
    AVPacketList *pkt1;

    if (av_dup_packet(pkt) < 0) {
        return -1;
    }
    pkt1 = av_malloc(sizeof(AVPacketList));
    if (!pkt1)
        return -1;
    pkt1->pkt = *pkt;
    pkt1->next = NULL;

    SDL_LockMutex(q->mutex);

    // Append the new node at the tail of the list
    if (!q->last_pkt)
        q->first_pkt = pkt1;
    else
        q->last_pkt->next = pkt1;
    q->last_pkt = pkt1;
    q->nb_packets++;
    q->size += pkt1->pkt.size;
    SDL_CondSignal(q->cond);

    SDL_UnlockMutex(q->mutex);
    return 0;
}

SDL_LockMutex() locks the mutex in the queue so that we can add something to it. SDL_CondSignal() then sends a signal through the condition variable to our get function (if it is waiting), telling it that there is data and it can proceed. Finally, we unlock the mutex so that the queue can be accessed again.

Here is the corresponding get function. Notice how SDL_CondWait() makes the function block (that is, pause until there is data in the queue) if we ask it to.

int quit = 0;

static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block) {
    AVPacketList *pkt1;
    int ret;

    SDL_LockMutex(q->mutex);

    for (;;) {
        if (quit) {
            ret = -1;
            break;
        }

        pkt1 = q->first_pkt;
        if (pkt1) {
            // Unlink the head node and hand its packet to the caller
            q->first_pkt = pkt1->next;
            if (!q->first_pkt)
                q->last_pkt = NULL;
            q->nb_packets--;
            q->size -= pkt1->pkt.size;
            *pkt = pkt1->pkt;
            av_free(pkt1);
            ret = 1;
            break;
        } else if (!block) {
            ret = 0;
            break;
        } else {
            SDL_CondWait(q->cond, q->mutex);
        }
    }
    SDL_UnlockMutex(q->mutex);
    return ret;
}

As you can see, we have wrapped the function in a forever loop so that we are sure to get some data if we choose to block. We avoid spinning in that loop by using SDL's SDL_CondWait() function. Basically, all SDL_CondWait() does is wait for a signal from SDL_CondSignal() (or SDL_CondBroadcast()) and then continue. At first glance it looks as though we have trapped it inside our mutex: if we hold the lock, our put function can never put anything into the queue! However, SDL_CondWait() also unlocks the mutex we give it while it waits, and then tries to lock it again once it receives the signal.
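The same unlock-while-waiting behaviour can be tried outside SDL. The sketch below uses POSIX threads as a stand-in, on the assumption that pthread_cond_wait() and pthread_cond_signal() play the roles of SDL_CondWait() and SDL_CondSignal(); the Mailbox type and the function names are hypothetical.

```c
#include <pthread.h>

typedef struct Mailbox {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             items;   /* number of items waiting to be taken */
} Mailbox;

/* Producer thread: adds one item and wakes the consumer. */
static void *mailbox_producer(void *arg)
{
    Mailbox *m = arg;
    /* This lock can be obtained even while the consumer sleeps in
     * pthread_cond_wait(), because waiting releases the mutex. */
    pthread_mutex_lock(&m->mutex);
    m->items++;
    pthread_cond_signal(&m->cond);
    pthread_mutex_unlock(&m->mutex);
    return NULL;
}

/* Blocks until an item is available, takes it, returns how many remain. */
int mailbox_take(Mailbox *m)
{
    int left;
    pthread_mutex_lock(&m->mutex);
    while (m->items == 0)   /* re-check after every wakeup */
        pthread_cond_wait(&m->cond, &m->mutex);
    left = --m->items;
    pthread_mutex_unlock(&m->mutex);
    return left;
}

/* Starts a producer and waits for its item.
 * Returns the number of items left (0 here), or -1 on a setup error. */
int mailbox_demo(void)
{
    Mailbox m;
    pthread_t producer;
    int left;

    m.items = 0;
    if (pthread_mutex_init(&m.mutex, NULL) != 0)
        return -1;
    if (pthread_cond_init(&m.cond, NULL) != 0)
        return -1;
    if (pthread_create(&producer, NULL, mailbox_producer, &m) != 0)
        return -1;

    left = mailbox_take(&m);
    pthread_join(producer, NULL);
    pthread_cond_destroy(&m.cond);
    pthread_mutex_destroy(&m.mutex);
    return left;
}
```

If the mutex stayed locked during the wait, the producer would block forever at its own lock call and mailbox_demo() would never return. With gcc or clang, building this typically needs the -pthread option.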

Unexpected situation

You will also notice the global variable quit, which we check to make sure that the program has not been told to quit (SDL automatically handles TERM signals and the like). Otherwise, the thread would keep running forever and we would have to end the program with kill -9. FFmpeg also lets us register a callback that it uses to check whether some blocking function should give up: url_set_interrupt_cb.

int decode_interrupt_cb(void) {
    return quit;
}
...
main() {
    ...
    url_set_interrupt_cb(decode_interrupt_cb);
    ...
    SDL_PollEvent(&event);
    switch (event.type) {
    case SDL_QUIT:
        quit = 1;
        ...

Of course, this only applies to FFmpeg functions that block, not to SDL ones. We also make sure to set the quit flag to 1.
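The interrupt-callback idea is easy to picture with a toy stand-in for a blocking routine. Nothing below is FFmpeg code: slow_read is a hypothetical function that polls a callback between chunks of work, in the way a blocking I/O routine could poll the function registered with url_set_interrupt_cb.

```c
/* Set once the user asks to quit; checked by the interrupt callback. */
static volatile int quit_requested = 0;

void request_quit(void)
{
    quit_requested = 1;
}

/* Same shape as the tutorial's decode_interrupt_cb: nonzero means abort. */
int interrupt_cb(void)
{
    return quit_requested;
}

/* Hypothetical blocking routine. It processes up to `chunks` pieces of
 * work and polls `should_abort` before each one.
 * Returns the number of chunks completed, or -1 if it was interrupted. */
int slow_read(int chunks, int (*should_abort)(void))
{
    int done;
    for (done = 0; done < chunks; done++) {
        if (should_abort && should_abort())
            return -1;
        /* one chunk of slow, blocking work would happen here */
    }
    return done;
}
```

Before request_quit() is called, slow_read(5, interrupt_cb) runs to completion and returns 5; afterwards the same call gives up at once and returns -1.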

Feeding packets

All that is left for the queue is to set it up and feed it packets:

PacketQueue audioq;
main() {
    ...
    avcodec_open(aCodecCtx, aCodec);

    packet_queue_init(&audioq);
    SDL_PauseAudio(0);

SDL_PauseAudio() finally starts the audio device. If it is not given data, which it won't be right away, it plays silence.

We have our queue set up, and now we are ready to start feeding it packets. Let's go to our packet-reading loop:

while (av_read_frame(pFormatCtx, &packet) >= 0) {
    // Is this a packet from the video stream?
    if (packet.stream_index == videoStream) {
        // Decode video frame
        ....
        }
    } else if (packet.stream_index == audioStream) {
        packet_queue_put(&audioq, &packet);
    } else {
        av_free_packet(&packet);
    }
}

Note that we don't free the packet after we put it in the queue. We will free it later, once we have decoded it.

Fetching packets

Now let's finally write audio_callback, which fetches the packets from the queue. The callback has to be of the form void callback(void *userdata, Uint8 *stream, int len), where userdata is the pointer we gave to SDL, stream is the buffer we will write audio data to, and len is the size of that buffer. Here is the code:

void audio_callback(void *userdata, Uint8 *stream, int len) {
    AVCodecContext *aCodecCtx = (AVCodecContext *)userdata;
    int len1, audio_size;

    static uint8_t audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];
    static unsigned int audio_buf_size = 0;
    static unsigned int audio_buf_index = 0;

    while (len > 0) {
        if (audio_buf_index >= audio_buf_size) {
            // We have already sent all our data; get more
            audio_size = audio_decode_frame(aCodecCtx, audio_buf,
                                            sizeof(audio_buf));
            if (audio_size < 0) {
                // On error, output silence
                audio_buf_size = 1024;
                memset(audio_buf, 0, audio_buf_size);
            } else {
                audio_buf_size = audio_size;
            }
            audio_buf_index = 0;
        }
        len1 = audio_buf_size - audio_buf_index;
        if (len1 > len)
            len1 = len;
        memcpy(stream, (uint8_t *)audio_buf + audio_buf_index, len1);
        len -= len1;
        stream += len1;
        audio_buf_index += len1;
    }
}

This is basically a simple loop that pulls in data from another function we will write, audio_decode_frame(), stores the result in an intermediate buffer, attempts to write len bytes to stream, and gets more data if we don't have enough yet or saves it for later if we have some left over. The size of audio_buf is 1.5 times the size of the largest audio frame that FFmpeg will give us, which leaves a comfortable cushion.
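The buffering behaviour is easier to see with the codec taken out. In the sketch below, a hypothetical toy_decode_frame stands in for audio_decode_frame() and always produces a 6-byte frame of increasing values, while toy_fill follows the same copy loop as audio_callback. None of these names come from SDL or FFmpeg.

```c
#include <stdint.h>
#include <string.h>

#define TOY_FRAME_BYTES 6

static uint8_t toy_next_value = 0;

/* Hypothetical decoder: writes one 6-byte frame of increasing values
 * into buf and returns the number of bytes written. */
static int toy_decode_frame(uint8_t *buf)
{
    int i;
    for (i = 0; i < TOY_FRAME_BYTES; i++)
        buf[i] = toy_next_value++;
    return TOY_FRAME_BYTES;
}

/* Fills `stream` with exactly `len` bytes, decoding new frames only when
 * the intermediate buffer has been used up. */
void toy_fill(uint8_t *stream, int len)
{
    static uint8_t buf[TOY_FRAME_BYTES];
    static int buf_size = 0;    /* valid bytes in buf; persists across calls */
    static int buf_index = 0;   /* next unread byte in buf */

    while (len > 0) {
        int n;
        if (buf_index >= buf_size) {
            /* Everything decoded so far has been sent; decode more. */
            buf_size = toy_decode_frame(buf);
            buf_index = 0;
        }
        n = buf_size - buf_index;
        if (n > len)
            n = len;            /* keep the remainder for the next call */
        memcpy(stream, buf + buf_index, (size_t)n);
        len -= n;
        stream += n;
        buf_index += n;
    }
}
```

Asking for 4 bytes and then for 10 yields the values 0 to 3 followed by 4 to 13: the two bytes left over from the first frame are delivered first, before any new frame is decoded.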

Final audio decoding

Now for the real core of the decoder, audio_decode_frame:

int audio_decode_frame(AVCodecContext *aCodecCtx, uint8_t *audio_buf,
                       int buf_size) {
    static AVPacket pkt;
    static uint8_t *audio_pkt_data = NULL;
    static int audio_pkt_size = 0;

    int len1, data_size;

    for (;;) {
        while (audio_pkt_size > 0) {
            data_size = buf_size;
            len1 = avcodec_decode_audio2(aCodecCtx, (int16_t *)audio_buf,
                                         &data_size,
                                         audio_pkt_data, audio_pkt_size);
            if (len1 < 0) {
                // On error, skip the rest of this packet
                audio_pkt_size = 0;
                break;
            }
            audio_pkt_data += len1;
            audio_pkt_size -= len1;
            if (data_size <= 0) {
                // No data yet, decode more of the packet
                continue;
            }
            // We have data; return it and come back for more later
            return data_size;
        }
        if (pkt.data)
            av_free_packet(&pkt);

        if (quit) {
            return -1;
        }

        if (packet_queue_get(&audioq, &pkt, 1) < 0) {
            return -1;
        }
        audio_pkt_data = pkt.data;
        audio_pkt_size = pkt.size;
    }
}

This whole process actually starts towards the end of the function, where we call packet_queue_get(). We pick the packet up off the queue and save its information. Then, once we have a packet to work with, we call avcodec_decode_audio2(), which works much like avcodec_decode_video(), except that a packet may contain more than one audio frame, so you may have to call it several times to get all the data out of the packet. Also, remember to cast audio_buf, because SDL gives us an 8-bit integer buffer while FFmpeg gives us data in a 16-bit integer buffer. You should also notice the difference between len1 and data_size: len1 is how much of the packet we have consumed, and data_size is the amount of raw audio data returned.

When we have some data, we return immediately; the next call then finds out whether the current packet still holds more data or whether we need to fetch another packet from the queue. If the packet still has data to process, we keep it for the next call. Once we have finished with a packet, we free it.
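The difference between bytes consumed and bytes produced can be illustrated with a made-up packet format. In this sketch each frame is a length byte n followed by n payload bytes, and decoding widens every payload byte to a 16-bit sample. The format and the toy_decode and toy_decode_packet functions are assumptions for illustration only and have nothing to do with a real codec.

```c
#include <stdint.h>

/* Decodes ONE frame from `in`.
 * Returns bytes consumed from the packet (the role of len1), or -1 on
 * malformed input or insufficient output space.
 * *out_bytes receives bytes written to `out` (the role of data_size). */
int toy_decode(const uint8_t *in, int in_size,
               int16_t *out, int out_cap_bytes, int *out_bytes)
{
    int n, i;
    *out_bytes = 0;
    if (in_size < 1)
        return -1;
    n = in[0];
    if (in_size < 1 + n || out_cap_bytes < n * 2)
        return -1;
    for (i = 0; i < n; i++)
        out[i] = (int16_t)in[1 + i];   /* widen each byte to 16 bits */
    *out_bytes = n * 2;
    return 1 + n;
}

/* Drains a whole packet frame by frame.
 * Returns the total number of output bytes, or -1 on error. */
int toy_decode_packet(const uint8_t *pkt, int pkt_size,
                      int16_t *out, int out_cap_bytes)
{
    int total = 0;
    while (pkt_size > 0) {
        int produced;
        int consumed = toy_decode(pkt, pkt_size, out + total / 2,
                                  out_cap_bytes - total, &produced);
        if (consumed < 0)
            return -1;
        pkt += consumed;        /* advance past what the decoder used */
        pkt_size -= consumed;
        total += produced;
    }
    return total;
}
```

For the 5-byte packet {2, 1, 2, 1, 3}, the first call consumes 3 bytes and produces 4, the second consumes 2 and produces 2, so the whole packet yields 6 bytes of output from two separate decode calls.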

That's it! Our main read loop carries audio from the file to the queue, audio_callback then reads it from the queue and hands the data to SDL, and SDL sends it on to the sound card. Go ahead and compile:

gcc -o tutorial03 tutorial03.c -lavutil -lavformat -lavcodec -lz -lm \
`sdl-config --cflags --libs`

Hooray! The video still runs as fast as it can, but the audio plays at the right speed. Why? Because the audio information carries a sampling rate: we pump audio data out as fast as we can, but the audio device simply plays it back at the specified sampling rate.

We are almost ready to start synchronizing the video and audio ourselves, but first we need to reorganize the program a little. Queueing up audio and playing it from a separate thread worked very well: it made the code more manageable and more modular. Before we start syncing the video to the audio, we need to make our code easier to work with. So next time we will look at spawning threads.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
