Video vendors such as Polycom, Vidyo, and Radvision have introduced products based on H.264 SVC technology. Both Cisco and Polycom offer royalty-free H.264 implementations; Cisco's OpenH264 is currently the most visible of these.
1. What is SVC?
H.264 SVC (Scalable Video Coding) extends the H.264/AVC syntax and toolset to support bitstreams with a layered (scalable) structure. It is specified in Annex G of the H.264 standard and, as a new set of H.264 profiles, became an official standard in October 2007.
2. The concept of SVC scalable coding
The bitstream produced by an SVC encoder contains one or more sub-streams that can be decoded independently, and the sub-streams can differ in bit rate, frame rate, and spatial resolution.
Types of scalability:
Temporal scalability: sub-streams with different frame rates can be extracted from the bitstream.
Spatial scalability: sub-streams with different picture sizes can be extracted from the bitstream.
Quality scalability: sub-streams with different picture quality can be extracted from the bitstream.
Figure 1 Types of scalability
3. Applications of SVC scalable coding
1. Surveillance: a surveillance camera usually produces two video streams, a high-quality one for storage and another for live preview. With an SVC encoder, a single two-layer scalable stream can be generated: the base layer serves the preview, and the enhancement layer keeps the stored image quality high. For remote preview on a mobile phone, a low bit-rate base layer can be used.
2. Video conferencing: conferencing terminals use SVC to encode multiple resolutions and quality layers, and the conference server routes and forwards the layered video instead of using the traditional MCU's decode-and-re-encode approach. Temporal scalability also helps in lossy networks: network adaptation can be achieved by discarding some temporal layers. SVC likewise has room to grow in cloud video applications.
3. Streaming and IPTV: the server can discard quality layers according to network conditions to keep video playback smooth.
4. Compatibility with different network environments and terminals.
Figure 2 Applications across different networks and terminals
4. Advantages and disadvantages of SVC scalable coding
Advantages: a scalable bitstream is very flexible in application; different streams can be generated, or different sub-streams extracted, as needed. Using SVC to produce the layers in one pass is more efficient than encoding multiple independent streams with AVC. Layered coding also has a technical advantage: the newer H.265 codec adopts the layering idea as well, which enables flexible applications and can improve network adaptability.
Disadvantages: the decoding complexity of a scalable bitstream is higher. The base layer remains an AVC-compatible bitstream, but under the same conditions the compression efficiency of a scalable stream is roughly 10% lower than that of a single-layer stream, and the more layers there are, the larger the loss; the current JSVM encoder supports at most 3 spatial layers. Under the same conditions, the decoding complexity of a scalable stream is also higher than that of a single-layer stream. SVC only became an official standard in October 2007, and its compatibility and interoperability are far behind AVC, so SVC is not yet widely used in practice.
Figure 3 Comparison of the coding efficiency of scalable coding and single-layer coding
Note: the figure is taken from the website of HHI (Heinrich Hertz Institute), Germany.
(1) Temporal scalability can already be achieved with AVC and has no effect on coding efficiency.
(2) Quality scalability: Figure 3(a) shows that a quality-scalable stream reduces coding efficiency by about 10%.
(3) Spatial scalability: Figure 3(b) shows that SVC spatial scalable coding not only affects the overall coding efficiency, but also reduces the coding efficiency of the base layer (the AVC layer) by about 10%, because prediction in the base layer is constrained.
5. Extensions of SVC to H.264
Syntax extensions:
(1) The NAL (Network Abstraction Layer) header is extended to describe the layering information of the bitstream. To carry layering information for the AVC-compatible base-layer stream, a prefix NAL unit (NAL type 14) is placed in front of each NAL unit of the AVC-compatible stream and describes the layering of the AVC base layer. See Figure 4 and Figure 5.
(2) The reserved NAL types 14 and 20 are used to carry the enhancement-layer bitstream.
Figure 4 NAL header extension
Figure 5 Contents of the extended NAL header
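To make the NAL-level structure concrete, here is a minimal Python sketch that scans an Annex-B byte stream for start codes and reports each NAL unit's type, flagging type 14 (prefix NAL unit) and type 20 (coded slice extension) used by SVC. It assumes plain Annex-B framing (0x000001 / 0x00000001 start codes) and is an illustration, not a production parser.

```python
# Minimal Annex-B scanner: lists NAL unit types, highlighting the SVC ones.
SVC_NAL_TYPES = {14: "prefix NAL unit", 20: "coded slice extension (SVC)"}

def iter_nal_units(data: bytes):
    """Yield (offset, payload) for each NAL unit found after a 0x000001 start code."""
    i, n, starts = 0, len(data), []
    while i + 3 <= n:
        if data[i:i + 3] == b"\x00\x00\x01":
            starts.append(i + 3)
            i += 3
        else:
            i += 1
    for k, s in enumerate(starts):
        e = (starts[k + 1] - 3) if k + 1 < len(starts) else n
        yield s, data[s:e]

def describe(data: bytes):
    for off, nal in iter_nal_units(data):
        header = nal[0]
        nal_ref_idc = (header >> 5) & 0x3
        nal_type = header & 0x1F
        label = SVC_NAL_TYPES.get(nal_type, "AVC NAL unit")
        print(f"offset {off}: type {nal_type:2d} ({label}), nal_ref_idc={nal_ref_idc}")
```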
Tool extensions: to improve the efficiency of layered coding, the correlation between layers has to be exploited as much as possible. SVC therefore adds a set of inter-layer prediction tools, mainly:
1. Inter-layer intra prediction.
2. Inter-layer macroblock mode and motion parameter prediction.
3. Inter-layer residual prediction.
The figures below describe these newly added inter-layer prediction tools.
6. Key technologies of SVC
6.1 Temporal scalability
Figure 6 Temporal scalability
Note: streams with different frame rates are obtained by discarding the brown, green, and blue pictures in turn.
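As an illustration of how pictures map to temporal layers, the following sketch assigns a temporal_id to each picture of a dyadic hierarchical-B GOP: the key picture gets layer 0, and dropping the highest remaining layer halves the frame rate. The GOP size here is an assumption chosen for illustration; it is not taken from the figure.

```python
def temporal_ids(gop_size: int):
    """temporal_id per display position in one dyadic hierarchical-B GOP.

    Position 0 (the key picture) is layer 0; position i (1..gop_size-1) sits
    at layer log2(gop_size) - trailing_zeros(i).  gop_size must be a power of two.
    """
    k = gop_size.bit_length() - 1
    ids = [0]                              # key picture
    for i in range(1, gop_size):
        tz = (i & -i).bit_length() - 1     # number of trailing zero bits of i
        ids.append(k - tz)
    return ids

# Example: a GOP of 8 pictures -> [0, 3, 2, 3, 1, 3, 2, 3].
# Dropping layer 3 halves the frame rate; dropping layers 3 and 2 quarters it.
print(temporal_ids(8))
```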
6.2 Spatial scalability
Figure 7 Spatial scalability
6.3 Inter-layer prediction
Figure 8 Inter-layer prediction: (left) inter-layer intra prediction, (middle) inter-layer mode prediction, (right) inter-layer residual prediction
Inter-layer intra prediction: for macroblocks whose texture is complex and for which inter-frame search finds no good match, if the base layer uses intra prediction, the enhancement layer can use inter-layer intra prediction to improve coding efficiency. The intra (I) block of the base layer is upsampled to predict the enhancement layer, so the enhancement layer only needs to transmit the residual between the original image and the inter-layer prediction.
Inter-layer macroblock mode and motion parameter prediction: as shown in Figure 8, the macroblock type of the enhancement layer can be obtained by prediction from the base layer, and the motion parameters of the enhancement layer can be obtained by upsampling the motion parameters of the base layer. This is one of the differences between H.264 SVC and other scalable coding techniques: other scalable coding schemes generally predict by upsampling in the pixel domain, whereas H.264 SVC predicts regions with strong temporal correlation from the inter-layer motion parameters, giving more efficient motion compensation in the enhancement layer. For inter-layer motion parameter prediction, the granularity supported by the syntax ranges from a whole macroblock down to a minimum of an 8x8 block.
Inter-layer residual prediction: as shown in Figure 8, for inter-coded macroblocks the residual of the enhancement layer is correlated with the residual of the base layer, so the base-layer residual can be upsampled to reduce the residual that the enhancement layer has to encode. When the spatial resolution changes between layers, residual prediction takes place in the residual pixel domain and is computationally expensive; when the spatial resolution does not change (quality scalability), it takes place in the transform-coefficient (transform-level) domain and is computationally cheap.
6.4 Multi-layer bitstream with a single motion compensation loop
SVC is designed so that decoding a multi-layer bitstream requires only one motion compensation loop. Because inter-layer prediction does not use the reconstructed inter-coded blocks of the reference layer, the reference (base) layer does not have to be fully decoded and reconstructed; inter-layer prediction uses the motion vectors for prediction, and decoding needs only one final motion compensation.
The benefits are: (1) computation is saved and decoding complexity is reduced; (2) the memory requirements of the decoder are reduced.
6.5 Syntax elements describing the layer structure
dependency_id: the D (dependency) layer identifier, usually called the spatial layer identifier; it ranges from 0 to 7, so there are at most 8 D layers. The base layer has the value 0. CGS quality scalability is treated as a special case of spatial scalability.
quality_id: the MGS quality layer identifier, ranging from 0 to 15.
temporal_id: the temporal layer identifier, ranging from 0 to 7, so there are at most 8 temporal layers.
use_ref_base_pic_flag: a syntax element used by MGS. Normally the reconstructed picture of the current layer is used as the reference picture, but for key frames the reconstructed picture of the reference layer is used instead. Note the direction: it is not the reference layer using the current layer's reconstruction as a reference.
discardable_flag: set to 1 when the current picture is not used as an inter-layer reference. During stream extraction, such a layer is discarded if it is not the target layer.
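The fields above live in the 3-byte SVC extension that follows the regular 1-byte NAL header for NAL unit types 14 and 20. The sketch below unpacks them in Python following the Annex G bit layout as I read it; treat the exact bit positions as an assumption to be checked against the standard.

```python
def parse_svc_nal_header_extension(ext: bytes) -> dict:
    """Unpack the 3 extension bytes of an SVC NAL unit (types 14 and 20).

    Assumed bit layout (24 bits total, per Annex G):
      svc_extension_flag(1) idr_flag(1) priority_id(6)
      no_inter_layer_pred_flag(1) dependency_id(3) quality_id(4)
      temporal_id(3) use_ref_base_pic_flag(1) discardable_flag(1)
      output_flag(1) reserved_three_2bits(2)
    """
    assert len(ext) >= 3
    b0, b1, b2 = ext[0], ext[1], ext[2]
    return {
        "svc_extension_flag":       (b0 >> 7) & 0x1,
        "idr_flag":                 (b0 >> 6) & 0x1,
        "priority_id":               b0 & 0x3F,
        "no_inter_layer_pred_flag": (b1 >> 7) & 0x1,
        "dependency_id":            (b1 >> 4) & 0x7,
        "quality_id":                b1 & 0xF,
        "temporal_id":              (b2 >> 5) & 0x7,
        "use_ref_base_pic_flag":    (b2 >> 4) & 0x1,
        "discardable_flag":         (b2 >> 3) & 0x1,
        "output_flag":              (b2 >> 2) & 0x1,
        "reserved_three_2bits":      b2 & 0x3,
    }
```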
6.6 Syntax elements controlling inter-layer prediction
NAL header: no_inter_layer_pred_flag switches inter-layer prediction on or off for the entire slice.
Slice header: flags that decide whether each macroblock adaptively selects its inter-layer prediction mode or uses the default inter-layer prediction mode:
adaptive_base_mode_flag
default_base_mode_flag
adaptive_motion_prediction_flag
default_motion_prediction_flag
adaptive_residual_prediction_flag
default_residual_prediction_flag
Macroblock-level inter-layer prediction flags:
base_mode_flag: whether the macroblock uses inter-layer mode and motion parameter prediction; if set, the motion parameters come directly from the inter-layer prediction and are not transmitted in the bitstream.
motion_prediction_flag_l0[mbPartIdx]: whether the macroblock partition uses inter-layer motion vector prediction; in this mode the inter-layer predicted motion vector is used as the predictor, and the residual between it and the actual motion vector is still transmitted.
residual_prediction_flag: whether the macroblock uses inter-layer residual prediction.
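The adaptive_*/default_* pairs in the slice header decide whether each macroblock-level flag is actually read from the bitstream. A minimal sketch of that decision logic, with the slice header as a plain dict and read_flag standing in for the entropy decoder (both are assumptions for illustration):

```python
def resolve_base_mode_flag(slice_hdr: dict, read_flag) -> int:
    """Derive base_mode_flag for one macroblock.

    If adaptive_base_mode_flag is set, the flag is coded per macroblock and
    read from the bitstream; otherwise every macroblock inherits
    default_base_mode_flag.  The motion and residual prediction flags
    follow the same adaptive/default pattern.
    """
    if slice_hdr["adaptive_base_mode_flag"]:
        return read_flag()                      # per-MB flag from the entropy coder
    return slice_hdr["default_base_mode_flag"]
```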
6.7 Inter-layer mode prediction
If base_mode_flag == 1, inter-layer mode prediction is used. If the reference-layer blocks corresponding to all 16 4x4 blocks of the current macroblock are intra (I) blocks, the current macroblock type is I_BL; otherwise it is an inter macroblock predicted from the base layer (INTER_BL): its motion vectors and reference indices are predicted from the reference layer, the macroblock type is by default marked as P_8x8, and the sub-block partition type is derived as described in Section 6.8.
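A minimal sketch of this decision, where intra_map holds, for each of the 16 4x4 blocks of the current macroblock, whether its co-located reference-layer block is intra-coded (the representation is an assumption for illustration):

```python
def derive_mb_type_from_base(intra_map: list) -> str:
    """Decide the inter-layer predicted macroblock type (Section 6.7 logic).

    intra_map: 16 booleans, True where the co-located reference-layer 4x4
    block is an intra (I or I_BL) block.
    """
    if all(intra_map):
        return "I_BL"      # inter-layer intra: predict pixels from the base layer
    # Otherwise an inter macroblock: default to P_8x8; motion vectors, reference
    # indices and the sub-8x8 partitioning are derived as in Section 6.8.
    return "P_8x8"
```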
6.8 Inter-layer motion vector prediction
For a macroblock with base_mode_flag equal to 1, or a macroblock partition whose motion vectors use inter-layer prediction, the motion vectors have to be upsampled whenever the resolution changes between layers. The derivation is explained below step by step.
Step 1: for each 4x4 block of the current layer, compute the corresponding position in the reference layer. If the corresponding reference-layer block is an intra block (including I_BL), mark it as intra; otherwise record the coordinates of the corresponding 4x4 block in the reference layer.
Step 2: if the reference-layer positions corresponding to all 16 4x4 blocks of the current macroblock are intra blocks, the current macroblock is of type I_BL and no further computation is needed. Otherwise, for any 4x4 or 8x8 block whose reference-layer block is intra, substitute the reference-layer block of a neighbouring 4x4 block; this prevents an invalid reference index (-1) from appearing in the upsampling.
Step 3: derive the upsampled motion vectors, using 8x8 blocks of the current layer as the minimum processing unit. After the motion vector and reference index of each 4x4 block are obtained, the smallest non-negative value among the 4 reference indices becomes the reference index of the 8x8 block, and the 4 motion vectors are post-processed according to their similarity:
(1) If the absolute differences between the 4 motion vectors are all less than or equal to 1, the mean of the 4 motion vectors is taken as the final motion vector.
(2) If, for each pair of horizontally adjacent 4x4 blocks, the absolute difference of the two motion vectors is less than or equal to 1, the motion vectors of each pair are averaged and the partition mode of the 8x8 block is 8x4.
(3) If, for each pair of vertically adjacent 4x4 blocks, the absolute difference of the two motion vectors is less than or equal to 1, the motion vectors of each pair are averaged and the partition mode of the 8x8 block is 4x8.
Note: when the spatial resolution ratio is 1:2, one 4x4 block of the reference layer maps to one 8x8 block of the current layer, so no motion vector post-processing is needed. Only when the spatial resolution ratio is non-dyadic (for example 1:1.5), so that an 8x8 block of the current layer covers more than one 4x4 block of the reference layer, does the motion vector post-processing take place.
Figure 9 Inter-layer prediction of motion information
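The step-3 post-processing can be sketched as follows: given the 4 upsampled motion vectors and reference indices of the 4x4 blocks inside one 8x8 block of the current layer, choose the reference index and merge similar vectors into 8x8, 8x4 or 4x8 partitions. This is a simplification (it ignores the neighbour substitution of step 2 and treats "similar" as every component differing by at most 1); it only mirrors rules (1)-(3) above.

```python
def close(a, b):
    """Motion vectors are 'similar' if both components differ by at most 1."""
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1

def avg(vs):
    return (round(sum(v[0] for v in vs) / len(vs)),
            round(sum(v[1] for v in vs) / len(vs)))

def merge_8x8(mvs, ref_idx):
    """mvs / ref_idx: values of the four 4x4 blocks of one 8x8 block,
    indexed [top-left, top-right, bottom-left, bottom-right].
    Step 2 guarantees at least one non-negative reference index.
    Returns (partition, merged motion vectors, chosen reference index)."""
    ref = min(r for r in ref_idx if r >= 0)             # smallest non-negative index
    if all(close(mvs[0], m) for m in mvs[1:]):
        return "8x8", [avg(mvs)], ref                   # rule (1): one merged vector
    if close(mvs[0], mvs[1]) and close(mvs[2], mvs[3]):
        return "8x4", [avg(mvs[:2]), avg(mvs[2:])], ref # rule (2): horizontal pairs
    if close(mvs[0], mvs[2]) and close(mvs[1], mvs[3]):
        return "4x8", [avg([mvs[0], mvs[2]]),
                       avg([mvs[1], mvs[3]])], ref      # rule (3): vertical pairs
    return "4x4", list(mvs), ref                        # keep the four vectors
```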
6.9 Inter-layer residual prediction derivation
When the resolution changes between layers, the current block is an inter-coded block, and residual_prediction_flag == 1, inter-layer residual prediction is performed as follows.
Figure 10 Principle of inter-layer residual prediction
(1) Let the pixel position in the current layer be (x, y); the corresponding pixel position (xRef, yRef) in the reference layer is obtained from the inter-layer pixel-position mapping formula.
(2) If the points (xRef, yRef) and (xRef+1, yRef) belong to the same transform block, an intermediate result (the black dot in the figure) is obtained by bilinear interpolation; similarly, another intermediate result is computed from (xRef, yRef+1) and (xRef+1, yRef+1).
(3) If the points (xRef, yRef) and (xRef, yRef+1) belong to the same transform block, the inter-layer residual prediction is obtained by bilinear interpolation of the two intermediate results; otherwise the y phase decides which intermediate result is taken as the final value.
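A sketch of the rule that residual upsampling never interpolates across transform-block boundaries. Here same_tb is a hypothetical helper that reports whether two reference-layer positions lie in the same transform block, and frac_x / frac_y stand for the sub-pixel phases of the mapped position; all of these names are assumptions for illustration.

```python
def upsample_residual_sample(res, x_ref, y_ref, frac_x, frac_y, same_tb):
    """Bilinear residual upsampling for one current-layer sample (Section 6.9).

    res[y][x]      : reference-layer residual samples
    (x_ref, y_ref) : integer reference-layer position mapped from (x, y)
    frac_x, frac_y : horizontal / vertical interpolation phases in [0, 1)
    same_tb(p, q)  : True if positions p and q fall in the same transform block
    """
    def horiz(y):
        a, b = res[y][x_ref], res[y][x_ref + 1]
        if same_tb((x_ref, y), (x_ref + 1, y)):
            return (1 - frac_x) * a + frac_x * b   # interpolate inside the block
        return a if frac_x < 0.5 else b            # otherwise take the nearer sample

    top = horiz(y_ref)                             # intermediate result, row y_ref
    bottom = horiz(y_ref + 1)                      # intermediate result, row y_ref + 1
    if same_tb((x_ref, y_ref), (x_ref, y_ref + 1)):
        return (1 - frac_y) * top + frac_y * bottom
    return top if frac_y < 0.5 else bottom         # y phase decides which to keep
```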
6.10 Inter-layer intra (pixel) prediction derivation
Usage condition: the current-layer macroblock is of type I_BL, i.e. the reference-layer positions corresponding to all 16 of its 4x4 blocks are intra blocks, and the current macroblock has base_mode_flag == 1.
Figure 11 Pixel upsampling (luma)
(1) First, the I macroblocks and I_BL macroblocks of the reference layer are reconstructed, and an 8-pixel border extension is made into the surrounding P blocks.
(2) The pixel position in the reference layer corresponding to position (0, 0) of the current macroblock is computed; it is the red dot (xRef, yRef) in the figure.
(3) First, a set of intermediate points is computed with a vertical 4-tap filter. The calculation formula is as follows.
(4) Based on the result of step (3), a horizontal 4-tap filter is applied to obtain the final upsampled result. The formula is as follows.
Note: the computation should be optimized, and attention must be paid to numerical stability and word length (bit width) in the code.
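A sketch of the separable vertical-then-horizontal 4-tap filtering described in steps (3) and (4). The actual Annex G filter taps depend on the 1/16-pel phase of (xRef, yRef) and come from a normative table; the coefficients used here are placeholders meant only to show the structure, and boundary handling is omitted.

```python
# Placeholder 4-tap coefficients; the real Annex G taps are phase-dependent
# and taken from a normative table.  These values are illustrative only.
TAPS = (-3, 28, 8, -1)      # sums to 32

def filter4(samples):
    """Apply one 4-tap filter (placeholder taps), normalised by the tap sum."""
    acc = sum(c * s for c, s in zip(TAPS, samples))
    return (acc + 16) >> 5   # round and divide by 32

def upsample_pixel(base, x_ref, y_ref):
    """Vertical 4-tap filtering first (step 3), then horizontal (step 4)."""
    # Step 3: one vertically filtered intermediate value per needed column.
    cols = [filter4([base[y_ref - 1 + k][x] for k in range(4)])
            for x in range(x_ref - 1, x_ref + 3)]
    # Step 4: horizontal filtering over the intermediate values.
    return filter4(cols)
```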
6.11 Principle of spatial scalable coding
Take two-layer coding as an example: the base layer is AVC-coded, and the enhancement layer uses adaptive inter-layer prediction.
(1) The base layer uses AVC encoding, with the restriction that intra prediction is constrained.
(2) The enhancement layer can be coded using upsampled base-layer motion vectors, upsampled residual prediction, upsampled I-block pixel prediction, and macroblock type prediction. Enhancement-layer macroblocks can also be coded without inter-layer prediction, in a way similar to AVC.
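A very rough sketch of the per-macroblock choice an enhancement-layer encoder makes: try the inter-layer prediction modes and the plain AVC-style modes, and keep whichever costs least. cost() stands in for a rate-distortion measure and the candidate list is illustrative, not exhaustive; both are assumptions, not part of the standard.

```python
def choose_enh_mb_mode(mb, base_mb, cost):
    """Pick the cheapest coding mode for one enhancement-layer macroblock.

    mb      : the enhancement-layer macroblock to code
    base_mb : the co-located (upsampled) base-layer information
    cost    : rate-distortion cost function, cost(mb, mode, base_mb) -> float
    """
    candidates = [
        "inter_layer_intra",    # predict pixels from the upsampled base-layer I block
                                # (only valid when the base-layer block is intra-coded)
        "inter_layer_motion",   # reuse/scale base-layer mode and motion parameters
        "inter_layer_residual", # own motion search + base-layer residual prediction
        "avc_intra",            # plain AVC-style coding, no inter-layer prediction
        "avc_inter",
    ]
    return min(candidates, key=lambda mode: cost(mb, mode, base_mb))
```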
6.12 Principle of quality scalable coding
The principle of quality scalable coding is illustrated with a 2-layer CGS (coarse-grain scalability) example, with the SVC-to-AVC rewrite option disabled. The base layer uses AVC encoding, with the restriction that intra prediction is constrained. Because the resolution does not change, the inter-layer prediction information can be exploited more effectively.
I_BL macroblocks: inter-layer intra blocks; the reconstructed base-layer I block is used as the prediction, and the residual between the original image and the inter-layer prediction is encoded.
P blocks using motion vector prediction: the base-layer motion vectors can be used directly as the motion vectors of the current layer, or used as predicted motion vectors with the motion vector difference transmitted in the bitstream.
P blocks using transform-coefficient-domain prediction: the enhancement-layer residual is transformed, the dequantized transform coefficients of the base layer are subtracted from the resulting coefficients, and the coefficient difference is quantized and entropy-coded.
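A sketch of the transform-coefficient-domain prediction for CGS described above, using hypothetical transform / quantization helpers (the names are stand-ins, not a real API); the point is only the order of operations: transform the enhancement-layer residual, subtract the dequantized base-layer coefficients, then quantize and entropy-code the difference.

```python
def code_cgs_p_block(enh_residual, base_levels, transform, quantize, dequantize):
    """Coefficient-domain inter-layer prediction for a CGS P block (Section 6.12).

    enh_residual : enhancement-layer prediction residual (spatial domain, 2-D list)
    base_levels  : quantized transform levels of the co-located base-layer block
    transform / quantize / dequantize : stand-ins for the codec's 4x4 transform
                                        and quantization at the two layers' QPs
    """
    enh_coeff = transform(enh_residual)          # transform the enhancement residual
    base_coeff = dequantize(base_levels)         # reconstruct base-layer coefficients
    diff = [[e - b for e, b in zip(er, br)]      # coefficient-domain difference
            for er, br in zip(enh_coeff, base_coeff)]
    return quantize(diff)                        # levels to be entropy-coded
```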
6.13 Principle of the SVC-to-AVC rewrite tool
The principle can be illustrated with the diagram in the slides of JVT proposal "V-035".
6.14 Implementation of temporal scalability
Temporal scalability is implemented with hierarchical B frames or hierarchical P frames; in practice, hierarchical B frames are generally used. Different temporal layers can easily be extracted using the temporal_id syntax element.
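A sketch of bitstream extraction by temporal layer: keep every NAL unit whose temporal_id does not exceed the target layer. Here nal_units is assumed to be a list of (temporal_id, payload) pairs already parsed, for example with the header parser sketched in Section 6.5.

```python
def extract_temporal_layers(nal_units, target_tid):
    """Keep only the NAL units of temporal layers 0..target_tid.

    nal_units  : iterable of (temporal_id, payload) pairs
    target_tid : highest temporal layer to keep; e.g. with 4 temporal layers,
                 target_tid = 2 halves the frame rate of the full stream.
    """
    return [payload for tid, payload in nal_units if tid <= target_tid]
```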
7. References
1. https://en.wikipedia.org/wiki/Scalable_Video_Coding
2. http://ip.hhi.de/imagecom_G1/assets/pdfs/Overview_SVC_IEEE07.pdf
3. JVT Proposal "V-035" http://wftp3.itu.int/av-arch/jvt-site/2007_01_Marrakech/JVT-V035
4. http://www.polycom.com/company/news/press-releases/2012/20121004.html