Caching in the system of audio and video synchronization

Last Update:2018-08-21 Source: Internet

Author: User

Tags benchmark

Contents:
Two modes of audio and video synchronization
Based on the audio timestamp
Based on the video timestamp
Based on an external clock
Standard for audio and video synchronization
A digression: this post is not about audio and video synchronization itself, but about an easily neglected detail of it
Summary

Preface

I have recently been working on screen-recording software that generates H.264 files. Played back in VLC, audio and video are in sync, but in the player I wrote myself they are always slightly off. Judging by perception, the offset is somewhere between 0.2 s and 0.4 s.
The original video itself is properly synchronized.
Even when a person speaks quickly, the mouth movements line up with the sound.

Two modes of audio and video synchronization

Prerequisite: both the audio frames and the video frames must carry their own timestamps.
No timestamps:
Without this prerequisite, audio is still fine: set the audio playback parameters and play the data continuously.
Video frames are the problem: they either play at a fixed frame rate or end up playing too fast or too slow.
The case without timestamps is not considered further here; it is what beginners do.

Based on the audio timestamp

That is, the video is synchronized to the audio. This way the sound plays smoothly.
Application scenario: most video players.

Based on the video timestamp

Video frames play at their own pace (if many unplayed frames are waiting in the buffer, play faster; if few, play slower; or use a fixed frame rate).
The disadvantage: to keep audio and video in sync, the sound will stutter.
Application scenario: receiver-side players for live streaming.

Based on an external clock

Choose an external clock as the reference, and derive the playback speed of both video and audio from that clock.

Standard for audio and video synchronization

These figures come from my own perception, not from any standard.
Let's agree on some terms:
TV0: the timestamp of the video frame currently displayed;
TA0: the timestamp of the audio frame that should be playing while that video frame is displayed;
TA1: the timestamp of the audio frame that is actually playing while that video frame is displayed;
If |TA0 - TA1| < 0.1 s, playback can be considered synchronized.
If |TA0 - TA1| >= 0.3 s, it is obvious that speech and picture do not match.

A digression: this post is not about audio and video synchronization itself, but about an easily neglected detail of it

There are many articles online about audio and video synchronization. Look them up yourself; I will not explain it here.

(Figure: how sound is played back in the system.)

The terms used in the figure:
T0: the moment in the audio that is being played right now
T1 - T0: the playback time of the audio data in the system cache that has not been played yet
T2 - T1: the playback time of the data read by one call of the audio callback function
BUF0: the data in the system that has not been played yet
BUF1: the data read by each call of the audio callback function

If the player finishes playing BUF0 before BUF1 has been read, there is no sound at that moment.
The size of BUF1 probably depends on the system. I use Qt on Windows, and it reads 16,384 bytes at a time.
For 16,384 bytes:
At 11025 Hz, two channels, 16 bit, the sound lasts about 0.371519 s.
At 44100 Hz, two channels, 16 bit, the sound lasts about 0.0928798 s.
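
The two durations above follow from dividing the byte count by the PCM data rate (sample rate x channels x bytes per sample). A minimal sketch of that arithmetic; the function and constant names are made up for illustration and are not part of Qt:

```python
def buffer_duration_seconds(num_bytes, sample_rate_hz, channels, bits_per_sample):
    """Playback time of num_bytes of uncompressed PCM audio."""
    bytes_per_second = sample_rate_hz * channels * (bits_per_sample // 8)
    return num_bytes / bytes_per_second


READ_SIZE = 16384  # bytes per audio callback read, as reported in the text

low = buffer_duration_seconds(READ_SIZE, 11025, 2, 16)
high = buffer_duration_seconds(READ_SIZE, 44100, 2, 16)
print(f"11025 Hz: {low:.6f} s")
print(f"44100 Hz: {high:.6f} s")
```

Since the read size stays the same, raising the sample rate by a factor of 4 shortens the time covered by one read by the same factor.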

In general, the size of BUF0 <= the size of BUF1, but this is not absolute: the system may simply want to read a little more data.
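
One way to account for this cache, sketched here under assumptions rather than taken from the author's player: estimate T0 by subtracting the playback time of the unplayed bytes (BUF0 plus BUF1) from the timestamp at the end of the data already handed to the system, then compare the result against the 0.1 s and 0.3 s thresholds given earlier. All function names are hypothetical, the "borderline" label for the range between the two thresholds is my own, and taking the unplayed amount to be exactly one 16,384-byte read is an assumption.

```python
def pcm_seconds(num_bytes, sample_rate_hz, channels=2, bits_per_sample=16):
    """Playback time of num_bytes of uncompressed PCM audio."""
    return num_bytes / (sample_rate_hz * channels * (bits_per_sample // 8))


def audible_timestamp(handed_over_pts, unplayed_bytes, sample_rate_hz):
    """Estimate T0: the timestamp of the sound being heard right now.

    handed_over_pts is the timestamp at the end of the audio data already
    given to the system; unplayed_bytes is what still sits in BUF0 and BUF1.
    """
    return handed_over_pts - pcm_seconds(unplayed_bytes, sample_rate_hz)


def classify_offset(ta0, ta1):
    """Apply the perceptual thresholds from the text (0.1 s and 0.3 s)."""
    offset = abs(ta0 - ta1)
    if offset < 0.1:
        return "in sync"
    if offset >= 0.3:
        return "clearly out of sync"
    return "borderline"  # label for the range the text leaves open


# A player that ignores the cache shows the video frame for 10.0 s
# as soon as the audio up to 10.0 s has been handed over.
for rate in (11025, 44100):
    heard = audible_timestamp(10.0, 16384, rate)
    print(rate, round(10.0 - heard, 4), classify_offset(10.0, heard))
```

With one unplayed read of 16,384 bytes, the 11025 Hz case lands above the 0.3 s threshold and the 44100 Hz case below 0.1 s, which is consistent with the offset of 0.2 s to 0.4 s described in the preface.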

After I changed the playback sample rate, the value returned by QAudioOutput::bufferSize() changed as well, but I did not notice any effect from that change.
The number of bytes per read, however, stayed at 16,384.

In fact, I have always wanted to change the size of BUF1, but I have not found a way to do so.

Summary

This detail is easy to overlook. Before I wrote a player, I paid no attention to such a small gap; the problem only emerged once I had to handle it in my own code.
I did not really want to use the 44100 Hz sample rate, because it produces 4 times as much data as 11025 Hz.
But because of the time offset in audio and video synchronization, I had to adopt the 44100 Hz sample rate.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
