FFmpeg Learning 5: Multi-threaded playback of video and audio


In the previous articles, video and audio were played separately. That was mainly for ease of learning; after some time studying, FFmpeg should now be reasonably familiar, so this article introduces
how to use multiple threads to play audio and video at the same time (not yet synchronized). The earlier code is also refactored to make later extension easier.
This article covers the following topics:

    • The overall process for multi-threaded playback of video and audio
    • Multi-threaded queues
    • Audio playback
    • Video playback
    • Summary and follow-up plan
1. Overall process

The initialization of FFmpeg and SDL is not covered here. The overall process is as follows:

    • Demuxing. For the opened video file (that is, the AVFormatContext obtained from it), create a separate thread that keeps reading packets from the stream and, according to the stream index, stores each Packet in either the audio packet queue or the video packet queue, both of which act as caches.
    • The audio playback thread. A callback function takes packets from the audio packet queue, decodes them, and sends the decoded data to the SDL audio device for playback.
    • The video playback thread:
      • A video decoding thread takes packets from the video packet queue, decodes them, and puts the decoded frames into the video frame queue cache.
      • The SDL window event loop takes frames from the video frame queue at a certain rate, converts them to the appropriate format, and displays them on the SDL screen.

(Figure: overall flow of the multi-threaded playback process.)

1.1 The main function after refactoring

In the earlier articles, the main function followed Dranger's tutorial. Because that tutorial is written in C, its code is not very convenient to reuse for multi-threaded audio and video playback, so in this article the code is refactored with C++ encapsulation.
The encapsulated main function is as follows:

av_register_all();
SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER);

char* filename = "F:\\test.rmvb";
MediaState media(filename);

if (media.openInput())
    SDL_CreateThread(decode_thread, "", &media); // create the demux thread: read packets into the queue caches

media.audio->audio_play(); // create audio thread
media.video->video_play(); // create video thread

SDL_Event event;
while (true) // SDL event loop
{
    SDL_WaitEvent(&event);
    switch (event.type)
    {
    case FF_QUIT_EVENT:
    case SDL_QUIT:
        quit = 1;
        SDL_Quit();
        return 0;
    case FF_REFRESH_EVENT:
        video_refresh_timer(media.video);
        break;
    default:
        break;
    }
}
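
FF_QUIT_EVENT and FF_REFRESH_EVENT are custom SDL user events. Their definitions are not shown in this excerpt; in the Dranger tutorial this code is based on, they are defined along these lines:

#define FF_REFRESH_EVENT (SDL_USEREVENT)
#define FF_QUIT_EVENT    (SDL_USEREVENT + 1)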

The main function is divided into three parts:

    • Initializing FFmpeg and SDL
    • Creating an audio playback thread and a video playback thread
    • The SDL event loop displays the image.
1.2 Data structures used

The main data needed during playback is encapsulated in three structures:

    • MediaState mainly contains AudioState and VideoState pointers, as well as the AVFormatContext
    • AudioState holds the data needed to play the audio
    • VideoState holds the data needed to play the video

Here we introduce MediaState; the structures related to audio and video playback are covered later.
The declaration of MediaState is as follows:

struct MediaState
{
    AudioState *audio;
    VideoState *video;

    AVFormatContext *pFormatCtx;
    char* filename;
    //bool quit;

    MediaState(char *filename);
    ~MediaState();

    bool openInput();
};

The structure is relatively simple. Its main functionality is in openInput, which opens the given video file, reads the relevant information, and fills it into the VideoState and AudioState structures.
Its main steps are as follows (see the sketch after this list):

    • Call avformat_open_input to get the AVFormatContext pointer
    • Locate the audio stream index and open the corresponding AVCodecContext
    • Locate the video stream index and open the corresponding AVCodecContext
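
The body of openInput is not listed in the article; the following is a minimal sketch of those three steps, assuming the fields declared above and that the stream indices are initialized to -1 (error handling trimmed):

static AVCodecContext* open_codec_context(AVStream *stream)
{
    // Find the decoder, copy the stream parameters into a fresh codec
    // context, and open it
    AVCodec *codec = avcodec_find_decoder(stream->codecpar->codec_id);
    if (!codec)
        return nullptr;

    AVCodecContext *ctx = avcodec_alloc_context3(codec);
    if (avcodec_parameters_to_context(ctx, stream->codecpar) < 0 ||
        avcodec_open2(ctx, codec, nullptr) < 0)
    {
        avcodec_free_context(&ctx);
        return nullptr;
    }
    return ctx;
}

bool MediaState::openInput()
{
    // Open the file and read the stream information
    if (avformat_open_input(&pFormatCtx, filename, nullptr, nullptr) < 0)
        return false;
    if (avformat_find_stream_info(pFormatCtx, nullptr) < 0)
        return false;

    // Locate the audio and video stream indices
    for (unsigned int i = 0; i < pFormatCtx->nb_streams; i++)
    {
        enum AVMediaType type = pFormatCtx->streams[i]->codecpar->codec_type;
        if (type == AVMEDIA_TYPE_AUDIO && audio->audio_stream < 0)
            audio->audio_stream = i;
        else if (type == AVMEDIA_TYPE_VIDEO && video->video_stream < 0)
            video->video_stream = i;
    }
    if (audio->audio_stream < 0 || video->video_stream < 0)
        return false;

    // Open an AVCodecContext for each stream
    audio->audio_ctx = open_codec_context(pFormatCtx->streams[audio->audio_stream]);
    video->video_ctx = open_codec_context(pFormatCtx->streams[video->video_stream]);
    return audio->audio_ctx != nullptr && video->video_ctx != nullptr;
}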
1.3 Packet demux thread

After openInput has been called and the required information is available, the demux thread is created. Following the stream indices obtained earlier, it places each packet read by av_read_frame into the corresponding packet cache queue.
Some of the code is as follows:

if (packet->stream_index == media->audio->audio_stream) // audio stream
{
    media->audio->audioq.enQueue(packet);
    av_packet_unref(packet);
}
else if (packet->stream_index == media->video->video_stream) // video stream
{
    media->video->videoq->enQueue(packet);
    av_packet_unref(packet);
}
else
    av_packet_unref(packet);
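
For context, the surrounding read loop in decode_thread might look like this: a sketch assuming the global quit flag used by the main function. Note that enQueue takes its own reference to the packet (via av_packet_ref), so the local packet is released on every path:

int decode_thread(void *arg)
{
    MediaState *media = (MediaState*)arg;
    AVPacket *packet = av_packet_alloc();

    while (!quit)
    {
        // Read the next packet from the container
        if (av_read_frame(media->pFormatCtx, packet) < 0)
            break; // end of file or read error

        if (packet->stream_index == media->audio->audio_stream) // audio stream
            media->audio->audioq.enQueue(packet);
        else if (packet->stream_index == media->video->video_stream) // video stream
            media->video->videoq->enQueue(packet);

        av_packet_unref(packet);
    }

    av_packet_free(&packet);
    return 0;
}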
2. Multi-threaded queues

The demux thread reads packets into the audio and video packet queues. These packet queues are accessed by multiple threads: the demux thread fills them with packets, while the video and audio playback threads remove packets from them for decoding and playback. The declaration of PacketQueue is as follows:

struct PacketQueue
{
    std::queue<AVPacket> queue;

    Uint32    nb_packets;
    Uint32    size;

    SDL_mutex *mutex;
    SDL_cond  *cond;

    PacketQueue();

    bool enQueue(const AVPacket *packet);
    bool deQueue(AVPacket *packet, bool block);
};

The standard library std::queue is used as the container that stores the data. SDL_mutex and SDL_cond are the mutex and condition variable provided by the SDL library for thread synchronization; they control access to the queue.
Whenever a thread wants to access the elements in the queue, it locks the queue with SDL_mutex. When the queue is empty and a video or audio thread wants to take a packet from it, the thread waits on the SDL_cond condition variable until a new packet is enqueued.

  • The enqueue method is implemented as follows:

    bool PacketQueue::enQueue(const AVPacket *packet)
    {
        // Take a new reference to the packet's data; the queue stores the
        // packet struct by value. (The original code allocated a packet with
        // av_packet_alloc and never freed the struct, leaking it per packet.)
        AVPacket pkt;
        av_init_packet(&pkt);
        if (av_packet_ref(&pkt, packet) < 0)
            return false;

        SDL_LockMutex(mutex);
        queue.push(pkt);

        size += pkt.size;
        nb_packets++;

        SDL_CondSignal(cond);
        SDL_UnlockMutex(mutex);
        return true;
    }
    Note that av_packet_ref is called on the incoming packet: it takes a new reference (increasing the reference count) rather than deep-copying the packet's data. After the new packet is pushed, the condition variable is signalled to announce that a new packet has entered the queue, and the queue is then unlocked.
  • The dequeue method is implemented as follows:

    bool PacketQueue::deQueue(AVPacket *packet, bool block)
    {
        bool ret = false;

        SDL_LockMutex(mutex);
        while (true)
        {
            if (quit)
            {
                ret = false;
                break;
            }

            if (!queue.empty())
            {
                if (av_packet_ref(packet, &queue.front()) < 0)
                {
                    ret = false;
                    break;
                }
                AVPacket pkt = queue.front();
                queue.pop();
                av_packet_unref(&pkt);

                nb_packets--;
                size -= packet->size;

                ret = true;
                break;
            }
            else if (!block)
            {
                ret = false;
                break;
            }
            else
            {
                SDL_CondWait(cond, mutex);
            }
        }
        SDL_UnlockMutex(mutex);
        return ret;
    }

    The block parameter controls whether the call blocks when the queue is empty: when it is true, the consuming thread waits on the cond condition variable until it is signalled that a new packet has arrived. Also note that av_packet_unref is called on the popped element to release its reference to the packet data once it has been taken out of the queue.

3. Audio playback

Audio playback was already covered in the earlier article FFmpeg Learning 3: Play audio. Playback mainly amounts to setting up a callback that feeds data to the audio device, so it is not described in detail again here. Unlike before, the playback data is now encapsulated as follows:

struct AudioState
{
    const uint32_t BUFFER_SIZE; // size of the buffer

    PacketQueue audioq;

    uint8_t *audio_buff;       // buffer for the decoded data
    uint32_t audio_buff_size;  // number of bytes in the buffer
    uint32_t audio_buff_index; // index of the first unsent byte in the buffer

    int audio_stream;          // index of the audio stream
    AVCodecContext *audio_ctx; // already opened with avcodec_open2

    AudioState();              // default constructor
    AudioState(AVCodecContext *audio_ctx, int audio_stream);

    ~AudioState();

    /**
    * audio play
    */
    bool audio_play();
};
    • audioq is the queue that holds the audio packets
    • audio_stream is the index of the audio stream

The other fields cache the decoded data; the callback function copies data from this buffer to the audio device.

    • audio_buff points to the buffer
    • audio_buff_size is the amount of data currently in the buffer
    • audio_buff_index marks how far into the buffer the data has already been sent
    • BUFFER_SIZE is the maximum capacity of the buffer

The audio_play function sets the parameters required for playback and starts the audio playback thread:

bool AudioState::audio_play()
{
    SDL_AudioSpec desired;
    desired.freq = audio_ctx->sample_rate;
    desired.channels = audio_ctx->channels;
    desired.format = AUDIO_S16SYS;
    desired.samples = 1024;
    desired.silence = 0;
    desired.userdata = this;
    desired.callback = audio_callback;

    if (SDL_OpenAudio(&desired, nullptr) < 0)
    {
        return false;
    }

    SDL_PauseAudio(0); // start playing
    return true;
}
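
The callback itself was covered in the earlier article and is not listed here. As a reminder, a minimal sketch of how it might use the buffer fields above (audio_decode_frame is a hypothetical helper name; memcpy/memset require <cstring>):

// Hypothetical helper: dequeues a packet from audio->audioq, decodes and
// resamples it into audio->audio_buff, returns the number of bytes produced
// (or a negative value on failure).
int audio_decode_frame(AudioState *audio);

void audio_callback(void *userdata, Uint8 *stream, int len)
{
    AudioState *audio = (AudioState*)userdata;

    while (len > 0)
    {
        if (audio->audio_buff_index >= audio->audio_buff_size)
        {
            // Buffer exhausted: decode the next packet into audio_buff
            int audio_size = audio_decode_frame(audio);
            if (audio_size < 0)
            {
                // On a decoding error, output a block of silence
                audio->audio_buff_size = 1024;
                memset(audio->audio_buff, 0, audio->audio_buff_size);
            }
            else
                audio->audio_buff_size = audio_size;
            audio->audio_buff_index = 0;
        }

        // Copy as much buffered data as the device asked for
        int remain = (int)(audio->audio_buff_size - audio->audio_buff_index);
        if (remain > len)
            remain = len;

        memcpy(stream, audio->audio_buff + audio->audio_buff_index, remain);
        stream += remain;
        len -= remain;
        audio->audio_buff_index += remain;
    }
}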
4. Video playback

4.1 VideoState

Similar to audio playback, a VideoState structure encapsulates the data needed for video playback:

struct VideoState
{
    PacketQueue* videoq;       // queue cache for the video packets
    int video_stream;          // index of the video stream
    AVCodecContext *video_ctx; // already opened with avcodec_open2

    FrameQueue frameq;         // holds the decoded raw frame data
    AVFrame *frame;
    AVFrame *displayFrame;

    SDL_Window *window;
    SDL_Renderer *renderer;
    SDL_Texture *bmp;
    SDL_Rect rect;

    void video_play();

    VideoState();
    ~VideoState();
};

The fields in VideoState can be broadly divided into three categories:

    • Data needed for video decoding: the packet queue, the stream index, and the AVCodecContext
    • Intermediate data for decoding and display:
      • frameq, a frame queue that stores the frames decoded from the packets. To refresh the display, a frame is taken from this queue, format-converted, and rendered to the window.
      • frame, an intermediate variable for the format conversion
      • displayFrame, which holds the converted data and is finally rendered to the window
    • Data required by SDL to display the video

The implementation of FrameQueue is similar to that of PacketQueue and is not repeated here.
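
For reference, a declaration consistent with how FrameQueue is used below (frameq.nb_frames, FrameQueue::capacity, enQueue, deQueue) might look like the following sketch; this is inferred from the usage, not taken from the original code:

struct FrameQueue
{
    static const int capacity = 30; // maximum number of cached frames (assumed value)

    std::queue<AVFrame*> queue;
    Uint32 nb_frames;

    SDL_mutex *mutex;
    SDL_cond  *cond;

    FrameQueue();

    bool enQueue(const AVFrame *frame); // copies the frame with av_frame_ref
    bool deQueue(AVFrame **frame);      // blocks until a frame is available
};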

4.2 Video decoding and playback

The video_play function of VideoState initializes video playback and starts the video decoding thread:

void VideoState::video_play()
{
    int width = 800;
    int height = 600;

    // Create the SDL window
    window = SDL_CreateWindow("FFmpeg Decode", SDL_WINDOWPOS_UNDEFINED, SDL_WINDOWPOS_UNDEFINED,
        width, height, SDL_WINDOW_OPENGL);
    renderer = SDL_CreateRenderer(window, -1, 0);
    bmp = SDL_CreateTexture(renderer, SDL_PIXELFORMAT_YV12, SDL_TEXTUREACCESS_STREAMING,
        width, height);

    rect.x = 0;
    rect.y = 0;
    rect.w = width;
    rect.h = height;

    frame = av_frame_alloc();

    displayFrame = av_frame_alloc();
    displayFrame->format = AV_PIX_FMT_YUV420P;
    displayFrame->width = width;
    displayFrame->height = height;

    int numBytes = avpicture_get_size((AVPixelFormat)displayFrame->format,
        displayFrame->width, displayFrame->height);
    uint8_t *buffer = (uint8_t*)av_malloc(numBytes * sizeof(uint8_t));
    avpicture_fill((AVPicture*)displayFrame, buffer, (AVPixelFormat)displayFrame->format,
        displayFrame->width, displayFrame->height);

    SDL_CreateThread(decode, "", this);

    schedule_refresh(this, 40); // start display
}

First, the SDL window, renderer, and texture are created, and the data space for displayFrame is allocated according to the chosen format. Then the video decoding thread is created, and the final call schedule_refresh(this, 40) schedules the first refresh so that frames start being drawn to the window through the SDL event loop.
The decoding thread function for video is as follows:

int decode(void *arg)
{
    VideoState *video = (VideoState*)arg;
    AVFrame *frame = av_frame_alloc();
    AVPacket packet;

    while (true)
    {
        video->videoq->deQueue(&packet, true);

        int ret = avcodec_send_packet(video->video_ctx, &packet);
        av_packet_unref(&packet); // release the reference taken in deQueue
        if (ret < 0 && ret != AVERROR(EAGAIN) && ret != AVERROR_EOF)
            continue;

        ret = avcodec_receive_frame(video->video_ctx, frame);
        if (ret < 0 && ret != AVERROR_EOF)
            continue;

        // Crude back-pressure: when the frame queue is full, wait for the
        // display thread to consume some frames
        if (video->frameq.nb_frames >= FrameQueue::capacity)
            SDL_Delay(500);

        video->frameq.enQueue(frame);
        av_frame_unref(frame);
    }

    av_frame_free(&frame);
    return 0;
}

The function is straightforward: it keeps taking packets from the packet queue, decodes them, and puts the decoded frames into the frame queue for the display thread to consume and finally render to the window. Note that the frame queue has a maximum capacity; when it is full, the decoding thread backs off and waits for the display thread to play for a while.

4.3 Display thread

Frames are rendered using the SDL library, so the display thread is in fact the SDL window event loop. The video frame display process is as follows:

In the video_play function, after the video decoding thread is started, the schedule_refresh function is called to kick off the frame display:

// refresh the video frame after a delay of `delay` ms
void schedule_refresh(VideoState *video, int delay)
{
    SDL_AddTimer(delay, sdl_refresh_timer_cb, video);
}

uint32_t sdl_refresh_timer_cb(uint32_t interval, void *opaque)
{
    SDL_Event event;
    event.type = FF_REFRESH_EVENT;
    event.user.data1 = opaque;
    SDL_PushEvent(&event);

    return 0; /* 0 means stop timer */
}

schedule_refresh sets a delay, after which SDL calls sdl_refresh_timer_cb. That callback pushes an FF_REFRESH_EVENT into the SDL event loop. As seen in the event handling earlier, video_refresh_timer is called when an FF_REFRESH_EVENT is received; this function takes a frame from the frame queue, converts its format, and renders it to the window.

void video_refresh_timer(void *userdata)
{
    VideoState *video = (VideoState*)userdata;

    if (video->video_stream >= 0)
    {
        if (video->videoq->queue.empty())
            schedule_refresh(video, 1);
        else
        {
            /* Now, normally here goes a ton of code about timing, etc.
               We're just going to guess at a delay for now. You can
               increase and decrease this value and hard code the timing
               - but I don't suggest that ;) We'll learn how to do it for
               real later. */
            schedule_refresh(video, 40);

            video->frameq.deQueue(&video->frame);

            SwsContext *sws_ctx = sws_getContext(video->video_ctx->width, video->video_ctx->height, video->video_ctx->pix_fmt,
                video->displayFrame->width, video->displayFrame->height, (AVPixelFormat)video->displayFrame->format,
                SWS_BILINEAR, nullptr, nullptr, nullptr);

            sws_scale(sws_ctx, (uint8_t const * const *)video->frame->data, video->frame->linesize,
                0, video->video_ctx->height,
                video->displayFrame->data, video->displayFrame->linesize);

            // Display the image on the screen
            SDL_UpdateTexture(video->bmp, &(video->rect), video->displayFrame->data[0], video->displayFrame->linesize[0]);
            SDL_RenderClear(video->renderer);
            SDL_RenderCopy(video->renderer, video->bmp, &video->rect, &video->rect);
            SDL_RenderPresent(video->renderer);

            sws_freeContext(sws_ctx);
            av_frame_unref(video->frame);
        }
    }
    else
    {
        schedule_refresh(video, 100);
    }
}

The implementation of this function is also quite clear: it keeps taking frames from the frame queue and creates a SwsContext, configured from the fields in VideoState, to convert the frame format. A hard lesson learned here: after using a SwsContext, you must remember to release it with sws_freeContext. After writing the demo for this article, I noticed while playing a video
that its memory usage kept growing; needless to say, a memory leak. I focused on checking the several cache queues and found nothing wrong. In the end I had no choice but to go through the code piece by piece, and eventually found that the SwsContext was never released after use. At first I thought a SwsContext was just a set of conversion parameters and did not give it any thought; who knew it would take up so much space. Playing one video, memory usage once reached a gigabyte after only a bit more than ten minutes of playback.
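
Besides freeing the context after every frame, another option (not what the code above does) is sws_getCachedContext, which returns the previous context unchanged as long as the conversion parameters stay the same, avoiding the per-frame allocation:

// sws_ctx persists across refreshes; it is reallocated only when the
// parameters change. Free it once with sws_freeContext when playback ends.
static SwsContext *sws_ctx = nullptr;
sws_ctx = sws_getCachedContext(sws_ctx,
    video->video_ctx->width, video->video_ctx->height, video->video_ctx->pix_fmt,
    video->displayFrame->width, video->displayFrame->height,
    (AVPixelFormat)video->displayFrame->format,
    SWS_BILINEAR, nullptr, nullptr, nullptr);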

Summary

From the previous post until now, after dawdling for nearly half a month, the multi-threaded playback is finally done, and I really learned a lot from it.
It has been almost three months since graduating and joining the company, and I have basically just been coasting: I have not looked at the company's code, and I spend all day in front of the screen with nothing to do.
Here are some follow-up plans, to push myself not to be so lazy:

    • Synchronizing Video and Audio
    • Use the C++11 threading library
    • Refactor the code again to use a different UI library for rendering (maybe try Qt)

The code for this article: Fsplayer

