Implementation Technology of a Real-Time Video Network Transmission System

Abstract: The study of real-time video network transmission technology is of great significance for video applications. This paper presents an implementation diagram of a real-time video network transmission system and, with reference to that diagram, analyzes its key technologies: video capture, video encoding/decoding, and network transmission control protocols. Particular emphasis is placed on communication-oriented video coding at the sending end, error concealment and error recovery at the receiving end, and transmission control protocols that provide quality of service (QoS) for video applications; specific implementation methods are also given. Finally, a user-terminal development platform is built through embedded system development.

Keywords: video capture; video communication; video encoding; embedded operating system

I. Introduction

The development of video technology and network communication technology has made video streaming applications increasingly widespread. A system that provides streaming media services must address two main questions: how to obtain digital video information, and how to transmit that information effectively and reliably. For the former, the emergence of high-performance video capture chips such as the SAA7111, SAA7114, and TVP5145 has made video capture systems increasingly stable and reliable, with satisfactory capture quality. Progress on the latter is embodied in the formulation of various video coding standards (such as the H.26x series [1, 2], JPEG [3], and the MPEG series [4]) and in the development of network transmission technology (including network switching technology and transmission control protocols).

Streaming media applications require synchronization between sender and receiver, wide transmission bandwidth, low jitter, and bounded transmission latency; they are not very sensitive to bit errors but are highly sensitive to the latency introduced by error retransmission. At present, the TCP/IP-based data network is essentially a best-effort network designed for traditional data services: bandwidth fluctuation is inevitable and transmission latency is random [5]. How to provide streaming media services fairly alongside traditional data services on TCP/IP networks is therefore a core issue.

Based on an analysis of the system structure and transmission protocols of real-time video network transmission systems, this paper presents a schematic diagram of such a system. With reference to this diagram, the three core sub-modules of the sending module are analyzed: the video capture sub-module, which converts analog video to digital video; the video encoding sub-module, which encodes the raw digital video into a standards-compliant, communication-oriented video stream; and the video network transmission control sub-module, which provides real-time, robust network services to the video encoder. The receiving module is the inverse of the sending module. Finally, a complete user-terminal development platform for a real-time video transmission system is established through embedded system development.

II. Implementation of the Real-Time Video Network Transmission System

A complete real-time video network transmission system comprises video capture, video encoding, transmission control protocol processing, the communication network, and video decoding. Its function is to provide video application services over TCP/IP-based communication networks subject to random latency and packet loss. The system schematic is shown in Figure 1.

In the system shown in Figure 1, the video stream is processed and transmitted as follows. At the sending end, analog video is sampled to obtain digital video, which is then encoded (alternatively, a digital video input can be encoded directly) to generate a communication-oriented video stream suitable for the network. Based on receiver feedback, the available network transmission bandwidth is estimated and the encoder output rate is adjusted adaptively (including both source bit rate and channel bit rate adjustment), so that the video bit rate stays within the current network bandwidth limit. At the receiving end, the received video stream is decoded to reconstruct the video signal, current network transmission parameters (such as the packet loss rate) are calculated, and feedback control information is sent back to the sender.
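
This feedback loop can be sketched in a few lines. The following is an illustrative model only (the function names, thresholds, and factors are assumptions, not the paper's actual implementation): the sender updates its bandwidth estimate from receiver loss reports and splits the resulting budget between source bits and channel (FEC) bits.

```python
def estimate_bandwidth(prev_bps, loss_rate, backoff=0.85, probe=1.05):
    """Crude available-bandwidth estimate from receiver feedback:
    back off multiplicatively when loss is reported, probe upward otherwise."""
    return prev_bps * (backoff if loss_rate > 0.01 else probe)

def split_rate(total_bps, loss_rate, base_fec=0.10, max_fec=0.50):
    """Split the total budget between source bits and channel (FEC) bits,
    allocating a larger FEC share as the reported loss rate rises."""
    fec_fraction = min(max_fec, base_fec + 2.0 * loss_rate)
    channel_bps = total_bps * fec_fraction
    return total_bps - channel_bps, channel_bps
```

A real controller would smooth the loss reports over time and account for RTT, but the shape of the loop is the same: feedback in, rate split out.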

III. Video Capture Module

The video capture module consists of video A/D, video D/A, synchronization logic control, video processing, and data storage. Figure 2 shows the basic structure of the video capture system.

The A/D stage converts standard analog video signals into digital video signals, which serve as the input to the video processing subunit. FPGAs or CPLDs are generally used to implement the various synchronization logic controls that guarantee real-time capture. Analyzing and processing the video data is the core of the entire capture module, but the required computation is often substantial; to guarantee real-time performance, dedicated video processing chips, high-speed DSPs, or FPGA-plus-DSP combinations are typically used.

For video sampling, TI's digital video decoder TVP5145 can be used, which converts NTSC, PAL, and SECAM analog video into digital video. In the system built in this paper, the video A/D stage uses Philips' programmable video input processing chip SAA7111. This CMOS device contains four analog video input channels and an I2C bus interface through which the host can easily initialize it. The SAA7111 internally selects among the analog channels and performs anti-aliasing filtering on the input video; it also contains two 8-bit A/D converters and implements automatic clamping, automatic gain control (AGC), clock generation, and multi-standard decoding. Brightness, color, and saturation are likewise controlled on-chip. Its main feature is that a single 24.576 MHz crystal oscillator suffices for all video standards, with the standard detected automatically on-chip, making it extremely convenient for developing video desktop systems, digital TV systems, video phones, and image processing systems.

IV. Video Encoding/Decoding Module

The video encoding module compresses digital video signals into standards-compliant data streams that meet given visual quality requirements. In network communication applications of video streams, it is particularly important that the stream generated by the encoder adapt to random fluctuations in network transmission bandwidth, so scalable video encoders are often used. Scalable video coding can be performed in the time domain, the spatial domain, or an orthogonal transform domain. The basic idea is to divide the bitstream into a base layer and enhancement layers. The base-layer bitstream must always be transmitted; it provides the minimum quality level and carries the motion vectors of the video sequence. The enhancement layers are transmitted optionally and can be truncated at any point according to the network's transmission conditions. Ideally, video quality improves as more of the enhancement-layer bitstream is received.
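
The base/enhancement idea can be sketched with a toy model (byte counts stand in for bits; all names are illustrative, and this is not the paper's encoder): the base layer is always kept, and enhancement chunks are appended only while the per-frame budget allows.

```python
def layer_bitstream(frame_bytes, base_len, chunk_len):
    """Split one encoded frame into a mandatory base layer and a list of
    enhancement chunks that may be truncated at any chunk boundary."""
    base = frame_bytes[:base_len]
    rest = frame_bytes[base_len:]
    chunks = [rest[i:i + chunk_len] for i in range(0, len(rest), chunk_len)]
    return base, chunks

def truncate_for_bandwidth(base, chunks, budget):
    """Always send the base layer; append enhancement chunks while the
    per-frame byte budget allows. Quality rises with each chunk kept."""
    out = bytearray(base)
    remaining = budget - len(base)
    for chunk in chunks:
        if len(chunk) > remaining:
            break
        out.extend(chunk)
        remaining -= len(chunk)
    return bytes(out)
```

In a real scalable codec the truncation point is chosen by the feedback control module of Figure 1, and the base layer additionally carries the motion vectors.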

A TCP/IP-based computer communication network provides best-effort service without QoS guarantees: packet transmission latency is random, and packet loss and bandwidth fluctuation are inherent. To provide stable and smooth streaming media services, the encoder's output rate must be adjusted (including source bitstream adjustment and channel bit rate adjustment) and selective frame dropping must be performed, based on real-time estimation of the available network bandwidth. Figure 3 shows encoder-side data stream processing suited to the requirements of video network transmission.


As shown in Figure 3, the feedback control module adjusts the encoder's coding rate (source bit rate adjustment) and channel error control (channel bit rate adjustment) based on network feedback. The goal is an optimal distribution of source and channel bit rates under a bounded total output rate (≤ Rmax); the theoretical basis is the rate-distortion (RD) function. For unequal error protection, the base layer is channel-encoded with an RS code of maximum erasure-correction capability (an RS code is a maximum-distance-separable, or MDS, code: under erasure decoding, the original information can be recovered from any k of the n = k + r packets) or with a Turbo code of strong protection capability, while the enhancement layers receive ordinary error protection. To reduce the impact of burst channel errors on the video stream, video packets are often interleaved, lowering the probability that neighboring packets are corrupted simultaneously and thereby facilitating error concealment and recovery at the receiving end.
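
Two of these ideas admit a compact illustration. A single XOR parity packet is the simplest MDS erasure code (n = k + 1: any one lost packet is recoverable from the remaining k), and row/column interleaving spreads a burst of consecutive wire losses across different protection blocks. The sketch below is toy code under those assumptions, not the paper's RS/Turbo implementation; all names are illustrative.

```python
from functools import reduce

def add_parity(block):
    """Append one XOR parity packet to k equal-length data packets;
    this forms the simplest MDS erasure code (n = k + 1)."""
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), block)
    return block + [parity]

def recover_erasure(block_with_gap):
    """Recover the single erased packet (marked None): the XOR of all n
    packets is zero, so XOR-ing the surviving ones yields the missing one."""
    present = [p for p in block_with_gap if p is not None]
    missing = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), present)
    return [missing if p is None else p for p in block_with_gap]

def interleave(packets, depth):
    """Write packets row-wise into rows of `depth`, read out column-wise,
    so a burst of `depth` consecutive wire losses hits `depth` different
    protection blocks (rows) instead of one."""
    rows = [packets[i:i + depth] for i in range(0, len(packets), depth)]
    return [row[c] for c in range(depth) for row in rows if c < len(row)]
```

When the packet count is a multiple of the depth, de-interleaving is just interleaving again with the complementary depth (rows become columns).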

The user-terminal decoding module performs the inverse function of the encoding module. Packet loss during network transmission of video streams is inevitable (especially in wireless environments). A retransmission policy can guarantee fully correct packet delivery, but video stream applications are more sensitive to latency than to packet loss, so the receiving end does not require every packet to arrive correctly. How to deliver the most satisfactory video quality from the packets that are received correctly is the central issue of the receiver's decoding module. The problem is equivalent to exploiting the redundant information in the received packets to produce a more satisfactory decoded video output, and the solution is error concealment and error recovery at the receiving end [6].

The main error concealment methods are as follows: ① concealment based on spatial correlation, in which the data of an erroneous block is reconstructed by interpolating from correct data in adjacent blocks within the same frame; this method effectively recovers regions that are smooth or similar to their surroundings. ② concealment based on temporal correlation, which exploits the high correlation between temporally adjacent frames.
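
A minimal sketch of method ① follows (illustrative only; practical codecs interpolate from all available neighboring blocks and edge directions): here the lost block is rebuilt by linear interpolation between the correct rows immediately above and below it.

```python
def conceal_block(frame, top, left, size):
    """Conceal a lost size x size block in `frame` (a 2-D list of luma
    samples) by linear interpolation between the correct boundary rows
    above and below the block, a simple spatial-correlation method."""
    height = len(frame)
    for i in range(size):
        for j in range(size):
            above = frame[top - 1][left + j] if top > 0 else None
            below = frame[top + size][left + j] if top + size < height else None
            if above is not None and below is not None:
                w = (i + 1) / (size + 1)          # distance-based weight
                frame[top + i][left + j] = round((1 - w) * above + w * below)
            else:
                frame[top + i][left + j] = above if above is not None else below
    return frame
```

As the text notes, such interpolation works well where the image is locally smooth and less well in highly detailed areas, which is what motivates the adaptive schemes below.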

Building on these traditional methods, a newer development in error concealment is the use of adaptive schemes: the restoration method, or a combination of methods, is chosen according to the characteristics of the image and the type of error. One adaptive criterion is to maximize the peak signal-to-noise ratio (PSNR) of the restored image; both linear weighted merging and maximum-SNR merging are possible. Meanwhile, with the refinement of MPEG-4's object-based coding technology, error concealment based on principal component analysis can be used. The specific implementation is to find a feature model for each object via model-based image coding, project onto the feature model to obtain projection coefficients, and then reconstruct the image from those coefficients as the final restoration; the projection process can be iterated.
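
The PSNR criterion itself is straightforward to state in code (names are illustrative):

```python
import math

def psnr(reference, candidate, peak=255):
    """Peak signal-to-noise ratio (dB) between two equal-length sample
    sequences; the adaptive scheme prefers the candidate with higher PSNR."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return float('inf')                      # identical signals
    return 10 * math.log10(peak * peak / mse)
```

In a live decoder the true reference is of course unavailable, so surrogate measures (such as boundary smoothness) or offline evaluation stand in for it; the formula above is the criterion being maximized.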

On the transmission user-terminal platform built in this paper, the encoding module uses Analog Devices' real-time video compression/decompression chip ADV611 together with the DSP chip ADSP-2185 to complete video encoding. The ADV611 belongs to AD's wavelet-transform-based ADV6xx video codec series, which also includes the ADV601, ADV601LC, and ADV612. Its most distinctive feature is the quality window, which allows part of each video image to be coded at higher fidelity than the rest. The principle is that the user defines a box of arbitrary size and position; the image inside the box is compressed at the original compression ratio, while the image outside the box is compressed at a user-defined higher ratio c. On decompression, the image inside the box therefore shows better quality than the image outside it.

Figure 4 shows the internal function diagram of ADV611.

After the raw digital video passes through the I/O interface, wavelet transform, frame extraction, quantization, run-length encoding, and entropy encoding are performed under the control of the quality window to generate the compressed video data stream, which is written into the on-chip FIFO. When the data in the FIFO reaches the threshold set by the host in a register, an interrupt request is raised and the data stream is transferred through the host interface to the host. When the ADV611 is used for decoding, the data stream flows in the opposite direction.

It follows that the various compression and decompression parameters used during encoding or decoding must be supplied to the ADV611 by the host through the host interface; that is, the host must provide the ADV611 with its encoding parameters. An Analog Devices ADSP-2185 is therefore used to compute and transfer the parameters required by the ADV611. At the same time, it applies network transmission coding to the raw compressed data stream produced by the ADV611, generating a video stream suited to network transmission.

V. Transmission Control and Protocol Processing Module

Video stream transmission differs significantly from traditional TCP/IP data transmission. Traditional data transmission imposes no strict requirements on latency or jitter but enforces strict error control and retransmission. Video streams require real-time transmission and tight synchronization and are sensitive to latency and jitter, yet under certain circumstances they can tolerate packet loss, i.e., a certain degree of transmission error is acceptable. In addition, streaming media services must support broadcast and multicast applications and must be able to adjust video transmission quality according to the network's real-time available bandwidth.

To provide streaming media services over the Internet, the RTP/RTCP (Real-time Transport Protocol / Real-time Transport Control Protocol) protocols are used. RTP works in one-to-one or one-to-many transmission, carrying timing information with the data packets and enabling stream synchronization. RTCP works alongside RTP, providing flow control and congestion control during network transmission.

Network congestion control is the core issue a transmission control protocol must address when providing streaming media services. The basic approach is to estimate the network's available bandwidth and then adjust the terminal's data output rate according to the current throughput, so that the terminal's bit rate tracks changes in network transmission conditions. The available bandwidth is estimated mainly from the RTT (round-trip time) and the packet loss rate [7]. For rate adjustment, the AIMD (additive increase, multiplicative decrease) algorithm [8] is commonly used for congestion control.
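
One AIMD control step can be sketched as follows (the parameter values are illustrative assumptions, not taken from the paper):

```python
def aimd_update(rate_bps, loss_detected,
                increase_bps=50_000, decrease_factor=0.5,
                floor_bps=64_000, ceiling_bps=2_000_000):
    """One AIMD step: probe additively while the network is clean,
    back off multiplicatively when the receiver reports loss, and
    clamp the result to the terminal's supported rate range."""
    if loss_detected:
        rate_bps = rate_bps * decrease_factor
    else:
        rate_bps = rate_bps + increase_bps
    return max(floor_bps, min(ceiling_bps, rate_bps))
```

The multiplicative decrease is what lets competing flows converge toward a fair bandwidth share, which is the fairness property analyzed in [8].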

Figure 5 shows the transmission control protocol used in video network transmission.

The application generates streaming media data and packetizes it; the packets are encapsulated in RTP, the RTP packets in UDP datagrams, and these in turn in IP packets, which are sent over the transmission network. The RTP packet header carries the payload type, sequence number, timestamp, and synchronization source (SSRC) identifier of the packet. Using this header information, the receiver can buffer packets appropriately for smooth video playback. RTCP packets carry no streaming media data; they carry only sending and receiving statistics (such as transmission delay and packet loss rate) from the sender or receiver. With RTCP feedback, the sender can estimate the network's transmission bandwidth and adjust the encoder's coding rate in real time according to network conditions, thereby providing a stable video stream service despite bandwidth fluctuation.
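
The fixed RTP header fields listed above fit in twelve bytes. The sketch below packs them in the RFC 3550 field order (version 2, no padding, no extension, no CSRCs); it is a simplified illustration, not a full RTP stack.

```python
import struct

def build_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the fixed 12-byte RTP header: V=2, P=0, X=0, CC=0, then the
    marker bit and payload type, sequence number, timestamp, and SSRC."""
    byte0 = 2 << 6                                   # version 2
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack('!BBHII', byte0, byte1,
                       seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)
```

The sequence number lets the receiver detect loss and reorder packets, while the timestamp drives the playout buffer and inter-stream synchronization.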

When point-to-point, one-to-many, or many-to-many streaming media communication services are needed, the transmission control protocol appropriate to the specific communication network environment must also be used. For example, H.320 supports video conferencing in the ISDN environment and H.324 in the PSTN environment; supported rates include 64 kbit/s, 192 kbit/s, and 384 kbit/s, among others.

On the video network transmission platform we built, the hardware of the transmission control and protocol processing module is developed around the MPC860 and runs an embedded Linux system. In the user program design, RTP/RTCP-based applications use the transport-layer UDP protocol provided by the operating system and Linux's socket network programming interface [9] to realize network transmission of real-time video.

VI. Conclusion

As video codecs increase in speed and performance, video processing terminals will become faster and smaller, and growing user demand makes real-time video transmission and network services inevitable. Embedded systems, built on computer technology, place particular emphasis on factors such as size, power consumption, cost, and portability. User terminals for video transmission systems that combine the two therefore enjoy clear advantages. The system implemented in this paper demonstrates this in several respects: fast and efficient front-end image processing (adaptive handling of various standard video formats and efficient image compression); flexible choice of coding method (the encoding program in the DSP can be changed easily); and strong network processing and system control capability (rich network functions can be implemented on the MPC860, from simple point-to-point data transfer to complex video conferencing, with convenient and flexible expansion to various applications). There is thus good reason to believe that video transmission platforms built on embedded systems have broad market prospects. Application-driven development will, in turn, promote research on embedded operating systems, video processing algorithms, and chips.

On the user-terminal transmission platform built in this paper, how to provide better network traffic control and smoother video stream transmission services will be studied further in future work.

References

[1] ITU-T Recommendation H.263: Video Coding for Low Bit Rate Communication [S].
[2] Joint Video Team (JVT) of ISO/IEC MPEG and ITU-T VCEG. Document JVT-C167: Draft ITU-T Recommendation H.264 (a.k.a. "H.26L") [S].
[3] ISO/IEC JTC 1/SC 29/WG 1 N1803: 2000, Requirements and Profiles Version 6.3 [S].
[4] ISO/IEC JTC1/SC29/WG11 N2687: MPEG-4 Video Verification Model Version 13.0 [S].
[5] Busse I, Deffner B, et al. Dynamic QoS Control of Multimedia Applications Based on RTP [J]. Computer Communications, 1996, 19(1): 49-58.
[6] Wang Y, Zhu Q-F. Error Control and Concealment for Video Communication: A Review [J]. Proceedings of the IEEE, 1998, 86(5): 974-997.
[7] Padhye J, Firoiu V, Towsley D, et al. Modeling TCP Throughput: A Simple Model and Its Empirical Validation [C]. ACM SIGCOMM '98, Vancouver, 1998.
[8] Chiu D, Jain R. Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks [J]. Computer Networks and ISDN Systems, 1989, 17(1): 1-14.
[9] Pomerantz O. The Linux Kernel Module Programming Guide (Version 1.0) [Z]. Linux Documentation Project, 1999.

Authors: Yang Zhiwei, Feng Zongzhe, Guo Baolong (School of Mechano-Electronic Engineering, Xidian University, Xi'an 710071, China)

 
