In the previous articles, video and audio were played back separately, mainly for the convenience of learning. After a period of study, with some understanding of FFmpeg, this article introduces how to use multiple threads to play audio and video simultaneously (not yet synchronized), and refactors the code from the earlier articles to make later extension easier.
This article mainly covers the following topics:
- The overall flow of multi-threaded audio and video playback
- Multi-threaded queues
- Audio playback
- Video playback
- Summary and follow-up plan
1. Overall process
The initialization of FFmpeg and SDL is not covered here. The overall flow is as follows:
- For an opened video file (that is, once the AVFormatContext has been obtained), create a separate thread that continuously reads Packets from the stream and, according to each Packet's stream index, stores it in either the audio Packet queue or the video Packet queue as a cache.
- Audio playback thread: a callback function takes Packets from the audio Packet queue, decodes them, and sends the decoded data to the SDL audio device for playback.
- Video playback thread:
  - Create a video decoding thread that takes Packets from the video Packet queue, decodes them, and puts the decoded frames into the video Frame queue cache.
  - Enter the SDL window event loop, take Frames from the video Frame queue at a certain rate, convert them to the appropriate format, and display them on the SDL screen.
1.1 The main function after refactoring
In the earlier articles, the code mainly followed Dranger's tutorial. Because that tutorial is C-based, its code is not very convenient to reuse for multi-threaded audio and video playback, so in this article the code is refactored and encapsulated in C++.
The main function after encapsulation is as follows:
```cpp
av_register_all();
SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER);

char* filename = "F:\\test.rmvb";

MediaState media(filename);

if (media.openInput())
    SDL_CreateThread(decode_thread, "", &media); // create the demux thread: read packets into the queue caches

media.audio->audio_play(); // create audio thread
media.video->video_play(); // create video thread

SDL_Event event;
while (true) // SDL event loop
{
    SDL_WaitEvent(&event);
    switch (event.type)
    {
    case FF_QUIT_EVENT:
    case SDL_QUIT:
        quit = 1;
        SDL_Quit();
        return 0;
        break;
    case FF_REFRESH_EVENT:
        video_refresh_timer(media.video);
        break;
    default:
        break;
    }
}
```
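FF_QUIT_EVENT and FF_REFRESH_EVENT are custom SDL user events defined elsewhere in the project; the article does not show their definitions, but they are presumably along these lines:

```cpp
// Custom SDL user events (assumed definitions; the exact values live in the project headers)
#define FF_REFRESH_EVENT (SDL_USEREVENT)
#define FF_QUIT_EVENT    (SDL_USEREVENT + 1)
```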
The main function is divided into three parts:
- Initializing FFmpeg and SDL
- Creating an audio playback thread and a video playback thread
- The SDL event loop displays the image.
1.2 Data structures used
The main data needed during playback is encapsulated in three structures:
- MediaState: mainly contains AudioState and VideoState pointers, as well as the AVFormatContext
- AudioState: the data needed to play the audio
- VideoState: the data needed to play the video
MediaState is introduced here; the data structures related to audio and video playback are described later. The declaration of MediaState is as follows:
```cpp
struct MediaState
{
    AudioState *audio;
    VideoState *video;

    AVFormatContext *pFormatCtx;

    char* filename;
    //bool quit;

    MediaState(char *filename);
    ~MediaState();

    bool openInput();
};
```
The structure is relatively simple. Its main work is done in openInput, which opens the given video file and fills the relevant information into the VideoState and AudioState structures. It mainly does the following (a sketch follows the list):
- Call avformat_open_input to get the AVFormatContext pointer
- Find the index of the audio stream and open the corresponding AVCodecContext
- Find the index of the video stream and open the corresponding AVCodecContext
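The article does not list openInput itself. A minimal sketch of these steps, assuming the newer codecpar API and the AudioState/VideoState members declared later in this article, could look like this:

```cpp
bool MediaState::openInput()
{
    pFormatCtx = nullptr; // assumed to be null before avformat_open_input allocates it

    // Open the file and read the stream information
    if (avformat_open_input(&pFormatCtx, filename, nullptr, nullptr) < 0)
        return false;
    if (avformat_find_stream_info(pFormatCtx, nullptr) < 0)
        return false;

    // Locate the audio and video streams
    int audio_index = av_find_best_stream(pFormatCtx, AVMEDIA_TYPE_AUDIO, -1, -1, nullptr, 0);
    int video_index = av_find_best_stream(pFormatCtx, AVMEDIA_TYPE_VIDEO, -1, -1, nullptr, 0);
    if (audio_index < 0 || video_index < 0)
        return false;

    // Open a decoder context for the given stream
    auto open_codec = [this](int index) -> AVCodecContext*
    {
        AVCodecParameters *par = pFormatCtx->streams[index]->codecpar;
        const AVCodec *codec = avcodec_find_decoder(par->codec_id);
        AVCodecContext *ctx = avcodec_alloc_context3(codec);
        avcodec_parameters_to_context(ctx, par);
        avcodec_open2(ctx, codec, nullptr);
        return ctx;
    };

    audio = new AudioState(open_codec(audio_index), audio_index);

    video = new VideoState();
    video->video_ctx = open_codec(video_index);
    video->video_stream = video_index;

    return true;
}
```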
1.3 Packet demux thread
After openInput has been called and the necessary information obtained, the packet demux thread is created. It reads packets with av_read_frame and, according to each packet's stream index, places them into the corresponding packet cache queue.
Part of the code is as follows:
```cpp
if (packet->stream_index == media->audio->audio_stream) // audio stream
{
    media->audio->audioq.enQueue(packet);
    av_packet_unref(packet);
}
else if (packet->stream_index == media->video->video_stream) // video stream
{
    media->video->videoq->enQueue(packet);
    av_packet_unref(packet);
}
else
    av_packet_unref(packet);
```
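For context, the read loop around this snippet might look roughly like the following sketch; the quit flag and the exact error handling are assumptions, not the author's verbatim code:

```cpp
int decode_thread(void *arg)
{
    MediaState *media = (MediaState*)arg;
    AVPacket *packet = av_packet_alloc();

    while (!quit)
    {
        if (av_read_frame(media->pFormatCtx, packet) < 0)
            break; // end of file or read error

        if (packet->stream_index == media->audio->audio_stream)      // audio stream
            media->audio->audioq.enQueue(packet);
        else if (packet->stream_index == media->video->video_stream) // video stream
            media->video->videoq->enQueue(packet);

        av_packet_unref(packet); // enQueue keeps its own reference via av_packet_ref
    }

    av_packet_free(&packet);
    return 0;
}
```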
2. Multi-threaded queues
The demux thread reads packets into the audio and video packet queues. These queues are accessed by multiple threads: the demux thread fills them with packets, while the audio and video playback threads take packets out of them for decoding and playback. The declaration of PacketQueue is as follows:
```cpp
struct PacketQueue
{
    std::queue<AVPacket> queue;

    Uint32 nb_packets;
    Uint32 size;

    SDL_mutex *mutex;
    SDL_cond  *cond;

    PacketQueue();

    bool enQueue(const AVPacket *packet);
    bool deQueue(AVPacket *packet, bool block);
};
```
The standard library std::queue is used as the container that stores the data. SDL_mutex and SDL_cond are the mutex and condition variable provided by the SDL library for thread synchronization, and they control concurrent access to the queue.
When accessing elements in the queue, the SDL_mutex is used to lock it. When the queue is empty and a video or audio thread wants to take a packet from it, the thread waits on the SDL_cond condition variable until a new packet enters the queue.
The enqueue method is implemented as follows:
```cpp
bool PacketQueue::enQueue(const AVPacket *packet)
{
    AVPacket *pkt = av_packet_alloc();
    if (av_packet_ref(pkt, packet) < 0)
        return false;

    SDL_LockMutex(mutex);
    queue.push(*pkt);

    size += pkt->size;
    nb_packets++;

    SDL_CondSignal(cond);
    SDL_UnlockMutex(mutex);
    return true;
}
```
Note that av_packet_ref is called on the incoming packet; it increases the reference count instead of copying the packet's data. After the new packet is enqueued, the condition variable is signalled to notify waiting threads that a new packet has entered the queue, and then the queue's mutex is unlocked.
The dequeue method is implemented as follows:
```cpp
bool PacketQueue::deQueue(AVPacket *packet, bool block)
{
    bool ret = false;

    SDL_LockMutex(mutex);
    while (true)
    {
        if (quit)
        {
            ret = false;
            break;
        }

        if (!queue.empty())
        {
            if (av_packet_ref(packet, &queue.front()) < 0)
            {
                ret = false;
                break;
            }
            //av_packet_free(&queue.front());
            AVPacket pkt = queue.front();
            queue.pop();
            av_packet_unref(&pkt);

            nb_packets--;
            size -= packet->size;

            ret = true;
            break;
        }
        else if (!block)
        {
            ret = false;
            break;
        }
        else
        {
            SDL_CondWait(cond, mutex);
        }
    }
    SDL_UnlockMutex(mutex);
    return ret;
}
```
The parameter block indicates whether to block and wait when the queue is empty. When it is set to true, the thread taking packets blocks until the condition variable is signalled. In addition, av_packet_unref is called after a packet is removed, to decrease the reference count of the packet's data.
3. Audio playback
Audio playback was already covered in the earlier article FFmpeg Learning 3: Play audio. Its playback process mainly consists of setting up a callback function that feeds data to the audio device, so it is not described in detail again here. Unlike before, the data needed for playback is now encapsulated as follows:
```cpp
struct AudioState
{
    const uint32_t BUFFER_SIZE;  // size of the buffer

    PacketQueue audioq;

    uint8_t *audio_buff;         // buffer for the decoded data
    uint32_t audio_buff_size;    // number of bytes in the buffer
    uint32_t audio_buff_index;   // index of the not-yet-sent data in the buffer

    int audio_stream;            // index of the audio stream
    AVCodecContext *audio_ctx;   // already opened with avcodec_open2

    AudioState();                // default constructor
    AudioState(AVCodecContext *audio_ctx, int audio_stream);

    ~AudioState();

    /**
     * audio play
     */
    bool audio_play();
};
```
audioq is the queue that holds the audio packets, and audio_stream is the index of the audio stream.
The remaining fields are used to cache the decoded data; the callback function takes data from this buffer and sends it to the audio device:
- audio_buff: pointer to the buffer
- audio_buff_size: how much data is currently in the buffer
- audio_buff_index: index of the first byte in the buffer that has not yet been sent
- BUFFER_SIZE: maximum capacity of the buffer
The function audio_play sets the parameters required for playback and starts audio playback:
```cpp
bool AudioState::audio_play()
{
    SDL_AudioSpec desired;
    desired.freq = audio_ctx->sample_rate;
    desired.channels = audio_ctx->channels;
    desired.format = AUDIO_S16SYS;
    desired.samples = 1024;
    desired.silence = 0;
    desired.userdata = this;
    desired.callback = audio_callback;

    if (SDL_OpenAudio(&desired, nullptr) < 0)
    {
        return false;
    }

    SDL_PauseAudio(0); // playing
    return true;
}
```
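The audio_callback passed to SDL above was described in the earlier audio article. As a reminder, a minimal sketch of its shape could look like the following; audio_decode_frame is a hypothetical helper name standing in for the decoding routine, which is not shown here:

```cpp
// Called by SDL whenever the audio device needs more data.
// Sketch: copy decoded bytes from audio_buff into the device stream,
// refilling the buffer from audioq when it runs dry.
void audio_callback(void *userdata, Uint8 *stream, int len)
{
    AudioState *audio = (AudioState*)userdata;

    while (len > 0)
    {
        if (audio->audio_buff_index >= audio->audio_buff_size)
        {
            // Buffer exhausted: decode the next packet (hypothetical helper)
            int audio_size = audio_decode_frame(audio, audio->audio_buff, audio->BUFFER_SIZE);
            if (audio_size < 0)
            {
                // On error, output silence
                audio->audio_buff_size = 1024;
                SDL_memset(audio->audio_buff, 0, audio->audio_buff_size);
            }
            else
            {
                audio->audio_buff_size = audio_size;
            }
            audio->audio_buff_index = 0;
        }

        int send_len = audio->audio_buff_size - audio->audio_buff_index;
        if (send_len > len)
            send_len = len;

        SDL_memcpy(stream, audio->audio_buff + audio->audio_buff_index, send_len);

        len -= send_len;
        stream += send_len;
        audio->audio_buff_index += send_len;
    }
}
```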
4. Video Playback
4.1 VideoState
Similar to audio playback, the data needed for video playback is encapsulated in a VideoState structure:
```cpp
struct VideoState
{
    PacketQueue* videoq;        // queue caching the video packets
    int video_stream;           // index of the video stream
    AVCodecContext *video_ctx;  // already opened with avcodec_open2

    FrameQueue frameq;          // queue of the decoded raw frames

    AVFrame *frame;
    AVFrame *displayFrame;

    SDL_Window *window;
    SDL_Renderer *renderer;
    SDL_Texture *bmp;
    SDL_Rect rect;

    void video_play();

    VideoState();
    ~VideoState();
};
```
The fields in VideoState can be broadly divided into three categories:
- Data needed for video decoding: the packet queue, the stream index, and the AVCodecContext
- Intermediate data of the decoding process:
  - frameq: the frame queue that stores the frames decoded from the packets; to display a new picture, a frame is taken from this queue, converted to the proper format, and rendered to the window
  - frame: intermediate variable used during format conversion
  - displayFrame: frame is converted into displayFrame, and the data in displayFrame is what is finally rendered to the window
- Data required by SDL to display the video
The implementation of FrameQueue is similar to that of PacketQueue and is not repeated here; a possible declaration is sketched below.
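For reference, a declaration along the lines of PacketQueue might look like this; the container type, field names, and the capacity value are assumptions based on how frameq is used later:

```cpp
struct FrameQueue
{
    static const int capacity = 30;   // assumed maximum number of cached frames

    std::queue<AVFrame*> queue;
    Uint32 nb_frames;

    SDL_mutex *mutex;
    SDL_cond  *cond;

    FrameQueue();

    bool enQueue(const AVFrame *frame);   // copies/refs the frame into the queue
    bool deQueue(AVFrame **frame);        // hands out the oldest cached frame
};
```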
4.2 Video decoding and playback
The function video_play in VideoState initializes video playback and starts the video decoding thread:
```cpp
void VideoState::video_play()
{
    int width = 800;
    int height = 600;

    // Create the SDL window
    window = SDL_CreateWindow("FFmpeg Decode", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED,
        width, height, SDL_WINDOW_OPENGL);
    renderer = SDL_CreateRenderer(window, -1, 0);
    bmp = SDL_CreateTexture(renderer, SDL_PIXELFORMAT_YV12, SDL_TEXTUREACCESS_STREAMING,
        width, height);

    rect.x = 0;
    rect.y = 0;
    rect.w = width;
    rect.h = height;

    frame = av_frame_alloc();

    displayFrame = av_frame_alloc();
    displayFrame->format = AV_PIX_FMT_YUV420P;
    displayFrame->width = width;
    displayFrame->height = height;

    int numBytes = avpicture_get_size((AVPixelFormat)displayFrame->format,
        displayFrame->width, displayFrame->height);
    uint8_t *buffer = (uint8_t*)av_malloc(numBytes * sizeof(uint8_t));
    avpicture_fill((AVPicture*)displayFrame, buffer,
        (AVPixelFormat)displayFrame->format, displayFrame->width, displayFrame->height);

    SDL_CreateThread(decode, "", this);

    schedule_refresh(this, 40); // start displaying
}
```
First, the SDL window objects are created and the data space for displayFrame is allocated according to the target format. Then the video decoding thread is created, and the final call schedule_refresh(this, 40) starts the refresh timer so that the SDL event loop keeps drawing new frames in the window.
The decoding thread function for video is as follows:
```cpp
int decode(void *arg)
{
    VideoState *video = (VideoState*)arg;
    AVFrame *frame = av_frame_alloc();

    AVPacket packet;
    while (true)
    {
        video->videoq->deQueue(&packet, true);

        int ret = avcodec_send_packet(video->video_ctx, &packet);
        if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
            continue;

        ret = avcodec_receive_frame(video->video_ctx, frame);
        if (ret < 0 && ret != AVERROR_EOF)
            continue;

        if (video->frameq.nb_frames >= FrameQueue::capacity)
            SDL_Delay(500);

        video->frameq.enQueue(frame);
        av_frame_unref(frame);
    }
    av_frame_free(&frame);

    return 0;
}
```
The function is simple: it keeps taking packets from the packet queue, decodes them, and puts the decoded frames into the frame queue for the display thread, which finally renders them to the window. Note that the frame queue is given a maximum capacity; when it is full, the decoding thread sleeps for a while to give the display thread time to consume some frames.
4.3 Display thread
Frames are rendered with the SDL library, so the display thread is effectively the SDL window event loop. The display flow for video frames is as follows:
In the video_play function, after the video decoding thread has been started, schedule_refresh is called to kick off the frame display:
```cpp
// Refresh the video frame after a delay of `delay` milliseconds
void schedule_refresh(VideoState *video, int delay)
{
    SDL_AddTimer(delay, sdl_refresh_timer_cb, video);
}

uint32_t sdl_refresh_timer_cb(uint32_t interval, void *opaque)
{
    SDL_Event event;
    event.type = FF_REFRESH_EVENT;
    event.user.data1 = opaque;

    SDL_PushEvent(&event);

    return 0; /* 0 means stop timer */
}
```
schedule_refresh sets a delay after which sdl_refresh_timer_cb is called. sdl_refresh_timer_cb simply pushes an FF_REFRESH_EVENT into the SDL event loop. As seen in the event-handling code earlier, when an FF_REFRESH_EVENT is received, video_refresh_timer is called; this function takes a frame from the frame queue, converts its format, and renders it to the window.
```cpp
void video_refresh_timer(void *userdata)
{
    VideoState *video = (VideoState*)userdata;

    if (video->video_stream >= 0)
    {
        if (video->videoq->queue.empty())
            schedule_refresh(video, 1);
        else
        {
            /* Now, normally here goes a ton of code about timing, etc.
               We're just going to guess at a delay for now. You can
               increase and decrease this value and hard code the timing
               - but I don't suggest that ;) We'll learn how to do it
               for real later. */
            schedule_refresh(video, 40);

            video->frameq.deQueue(&video->frame);

            SwsContext *sws_ctx = sws_getContext(video->video_ctx->width, video->video_ctx->height, video->video_ctx->pix_fmt,
                video->displayFrame->width, video->displayFrame->height, (AVPixelFormat)video->displayFrame->format,
                SWS_BILINEAR, nullptr, nullptr, nullptr);

            sws_scale(sws_ctx, (uint8_t const * const *)video->frame->data, video->frame->linesize, 0,
                video->video_ctx->height, video->displayFrame->data, video->displayFrame->linesize);

            // Display the image on the screen
            SDL_UpdateTexture(video->bmp, &(video->rect), video->displayFrame->data[0], video->displayFrame->linesize[0]);
            SDL_RenderClear(video->renderer);
            SDL_RenderCopy(video->renderer, video->bmp, &video->rect, &video->rect);
            SDL_RenderPresent(video->renderer);

            sws_freeContext(sws_ctx);
            av_frame_unref(video->frame);
        }
    }
    else
    {
        schedule_refresh(video, 100);
    }
}
```
The implementation of this function is also straightforward: it keeps taking frames from the frame queue, creates an SwsContext according to the format parameters set in VideoState, and converts each frame before displaying it. One hard-won lesson is worth mentioning here: after using an SwsContext, remember to call sws_freeContext to release it. After finishing the demo for this article, I noticed that the memory used while playing a video kept growing, which obviously meant a memory leak. I focused on checking the several cache queues and found nothing wrong. In the end I had no choice but to go through the code piece by piece, and eventually found that the SwsContext was never released after use. At first I thought an SwsContext only held a few conversion parameters and did not care about it; who knew it would take up so much space. After playing a video for just over ten minutes, memory usage once reached a gigabyte.
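As an aside that is not part of the original code: instead of creating and freeing a fresh SwsContext on every refresh, one could keep a single context around and let libswscale reuse it via sws_getCachedContext, for example:

```cpp
// Sketch: keep one SwsContext per VideoState (here a static variable for brevity)
// and let libswscale reuse it. sws_getCachedContext returns the existing context
// when the parameters have not changed and only reallocates when they have.
static SwsContext *sws_ctx = nullptr;

sws_ctx = sws_getCachedContext(sws_ctx,
    video->video_ctx->width, video->video_ctx->height, video->video_ctx->pix_fmt,
    video->displayFrame->width, video->displayFrame->height,
    (AVPixelFormat)video->displayFrame->format,
    SWS_BILINEAR, nullptr, nullptr, nullptr);
```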
5. Summary
Since the previous post, I lingered for nearly half a month before finally finishing multi-threaded playback, and I really learned a lot from it.
It has been almost three months since I graduated and joined the company, and they have basically been three months of idling: I have not looked at the company's code, and I sit in front of the screen all day with nothing to do.
Here are some plans for what comes next, to push myself not to be so lazy:
- Synchronizing Video and Audio
- Use the C++11 threading library
- Refactor the code again to use a different UI library for rendering (perhaps switch to Qt)
The code for this article: Fsplayer
FFmpeg Learning 5: Multi-threaded playback video audio