Direct Mode Coding for bi-predictive pictures in the H.264 Standard

Last Update:2018-12-05 Source: Internet

Author: User


The new H.264 (MPEG-4 AVC) video coding standard can achieve considerably higher coding efficiency compared to previous standards. This is mainly due to the consideration of variable block sizes for motion compensation, multiple reference frames, and intra prediction, but also due to better exploitation of the spatiotemporal correlation that may exist between adjacent macroblocks, with the Skip mode in predictive (P) slices and the two Direct modes in bi-predictive (B) slices. These modes, when signaled, can in effect represent the motion of a macroblock or block without having to transmit any of the additional motion information required by other inter macroblock types. This property also allows these modes to be highly compressible, especially due to the consideration of Run Length Coding strategies. Although for Skip mode the spatial correlation of motion vectors from adjacent macroblocks is used to predict its motion parameters, until recently Direct mode considered only the temporal correlation of adjacent pictures. In this paper we introduce alternative methods for the generation of the motion information for the Direct mode using spatial or combined spatiotemporal correlation. Considering that temporal correlation requires that the motion and timestamp information from previous pictures be available in both the encoder and decoder, it is shown that our spatial-only method can reduce or eliminate such requirements while, at the same time, achieving similar performance. The combined methods on the other hand, by jointly exploiting spatial and temporal correlation either at the macroblock or slice/picture level, can achieve even higher coding efficiency. Finally, improvements on the existing rate distortion optimization related to B slices within the H.264 codec are also presented, which can lead to improvements of up to 16% in bitrate reduction or, equivalently, more than 0.7 dB in SNR.

Contact Author: Alexis Michael Tourapis, Thomson Inc., Corporate Research - Princeton, 2 Independence Way, Princeton, NJ 08540, USA. Tel: (609) 987-7329, Fax: (609) 987-7299, Email: alexismt@ieee.org

The new H.264 (or JVT, H.26L, MPEG-4 AVC) [1] video coding standard has gained more and more attention recently, mainly due to its higher coding efficiency versus previous standards. This new standard essentially relies on several new features [2] such as the consideration of variable block sizes, ranging from 16x16 down to 4x4, and a quadtree structure for motion compensation, multiple reference frames, intra prediction, context adaptive entropy coders, and in-loop deblocking filtering, but also on the consideration of generalized bi-predictive (B) slices [3]. Unlike in older standards such as MPEG-2 [4] and MPEG-4 [5], B slices can use multiple predictions from pictures coming from the same direction (forward or backward), while these could also be used as references for other slices.

Unfortunately, the above also implies that for this standard a considerably higher percentage of bits is needed for encoding motion information, either due to signaling block sizes, multiple references, or multiple predictions. To alleviate this problem the Skip [2] and Direct modes were introduced within predictive (P) and B pictures respectively, according to which motion is derived directly from previously encoded information, thus not having to encode any additional motion data for a macroblock (MB) or block. Motion for these modes was obtained by exploiting either spatial (Skip) or temporal (Direct) correlation of the motion of adjacent MBs or pictures. Furthermore, given the bi-predictive nature of B pictures, Direct mode could derive two such motion vectors pointing to different references, namely the list 0 and list 1 references, thus leading to even further performance benefit.
Additionally, since these modes can be signaled without having to transmit any residual data, and due to their very high occurrence within a video stream, Run Length Coding (RLC) strategies are employed within H.264 for coding mode information. These strategies depend on the entropy coding scheme used and can increase efficiency even further. In particular, if several (i.e. a run of n) adjacent MBs according to the scanning order are to be coded using Skip mode or Direct mode without coefficients, then only the actual number of such MBs needs to be coded instead of coding each individually. On the other hand, a drawback of this method is that if no such MB exists, it is also necessary to signal a zero run-length code, which could introduce additional overhead within the bitstream. Nevertheless, due to the high occurrence of Skip and Direct modes, this negative impact is more than compensated in most cases.

Direct mode motion parameters using temporal correlation were essentially derived for the current MB/block by considering the motion information within a co-located position in a usually subsequent reference picture or, more precisely, the first list 1 reference. Following the assumption that an object is moving with constant speed, these parameters are scaled according to the temporal distances (Figure 1) of the reference pictures involved. The motion vectors $MV_{L0}$ and $MV_{L1}$ for a Direct coded block versus the motion vector $MV$ of its co-located position in the first list 1 reference were originally calculated as:

$$MV_{L0} = \frac{TR_B}{TR_D} \times MV \tag{1}$$

$$MV_{L1} = \frac{TR_B - TR_D}{TR_D} \times MV \tag{2}$$

but were later replaced with the equations:

$$X = \frac{16384 + \operatorname{abs}(TD_D / 2)}{TD_D} \tag{3}$$

$$ScaleFactor = \operatorname{clip}\left(-1024,\ 1023,\ (TD_B \times X + 32) \gg 6\right) \tag{4}$$

$$MV_{L0} = (ScaleFactor \times MV + 128) \gg 8 \tag{5}$$

$$MV_{L1} = MV_{L0} - MV \tag{6}$$

which can reduce the number of divisions required, since the variables $X$ and $ScaleFactor$ can be pre-computed at the slice/picture level.
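The run-length signaling of Skip and Direct macroblocks described earlier in this section can be illustrated with a small sketch. It is not the H.264 bitstream syntax itself: the function name `skip_runs` and the list-of-booleans input are hypothetical, chosen only to show how one run count replaces a series of per-macroblock flags, and where the zero-length run overhead comes from.

```python
from typing import List, Tuple


def skip_runs(skipped: List[bool]) -> Tuple[List[int], int]:
    """Convert per-macroblock skip flags (in scanning order) into run lengths.

    Returns (runs, trailing), where runs[i] is the number of skipped MBs
    immediately preceding the i-th coded (non-skipped) MB, and trailing is
    the number of skipped MBs after the last coded MB.
    Hypothetical helper for illustration only.
    """
    runs: List[int] = []
    run = 0
    for is_skipped in skipped:
        if is_skipped:
            run += 1          # extend the current run of skipped MBs
        else:
            runs.append(run)  # a coded MB ends the run (possibly of length 0)
            run = 0
    return runs, run


# Three skipped MBs, two coded MBs, then one skipped MB.
flags = [True, True, True, False, False, True]
runs, trailing = skip_runs(flags)
print(runs, trailing)  # [3, 0] 1
```

The second entry of `runs` is the zero run-length code mentioned in the text: the second coded MB has no skipped MB before it, yet a run value still has to be signaled.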
In equations (3) to (6), $TD_B$ and $TD_D$ are the temporal distances, or more precisely picture order count (POC) distances [1], of the reference picture used by the list 0 motion vector of the co-located block in the list 1 picture, measured from the current picture and from the list 1 picture respectively. The list 1 reference picture and the list 0 reference referred to by the motion vectors of the co-located block in list 1 are used as the two references of Direct mode. Due to the temporal nature of the derivation of the motion parameters, we will name this Direct mode temporal direct.
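The integer scaling of equations (3) to (6) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names (`trunc_div`, `scale_factor`, `temporal_direct_mv`) are hypothetical, division is assumed to truncate toward zero as in C, `td_d` is assumed to be non-zero, and any additional range clipping of the distances that a real codec may apply is left out.

```python
from typing import Tuple


def trunc_div(a: int, b: int) -> int:
    """Integer division truncating toward zero (C-style), unlike Python's //."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def clip(lo: int, hi: int, v: int) -> int:
    """Clamp v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))


def scale_factor(td_b: int, td_d: int) -> int:
    """Equations (3) and (4); can be computed once per slice/picture."""
    x = trunc_div(16384 + abs(trunc_div(td_d, 2)), td_d)
    return clip(-1024, 1023, (td_b * x + 32) >> 6)


def temporal_direct_mv(
    mv: Tuple[int, int], td_b: int, td_d: int
) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Equations (5) and (6), applied to each motion vector component."""
    sf = scale_factor(td_b, td_d)
    mv_l0 = tuple((sf * c + 128) >> 8 for c in mv)
    mv_l1 = tuple(l0 - c for l0, c in zip(mv_l0, mv))
    return mv_l0, mv_l1


# B picture halfway between its two references: td_b = 1, td_d = 2.
print(scale_factor(1, 2))                 # 128, i.e. 1/2 in units of 1/256
print(temporal_direct_mv((8, -4), 1, 2))  # ((4, -2), (-4, 2))
```

With `td_b = 1` and `td_d = 2` the scale factor is 128, so the list 0 vector is half of the co-located vector and the list 1 vector points the same distance in the opposite direction, which matches the constant-speed assumption behind equations (1) and (2).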

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
