Streaming media delivery, video surveillance, video conferencing, and VoIP all depend on RTP. But if you chose RTP out of habit, or because other applications use it, you may have wondered: why use RTP for streaming media at all? Is it really required? Can't TCP, UDP, or some other network protocol meet our needs?
This article collects my thoughts from learning and applying RTP. I hope it is useful to you, and I welcome comments with your own ideas.
1. Wikipedia
Reliable protocols, such as the Transmission Control Protocol (TCP), guarantee correct delivery of each bit in the media stream. However, they accomplish this with a system of timeouts and retries, which makes them more complex to implement. It also means that when there is data loss on the network, the media stream stalls while the protocol handlers detect the loss and retransmit the missing data. Clients can minimize this effect by buffering data for display. While delay due to buffering is acceptable in video on demand scenarios, users of interactive applications such as video conferencing will experience a loss of fidelity if the delay that buffering contributes to exceeds 200 ms.
In other words, reliable transport protocols such as TCP guarantee that every bit of the stream arrives correctly by using timeouts and retransmission. This makes the protocol more complex to implement, and it also means that when data is lost in transit, the stream stalls while the loss is detected (by timeout) and the missing data is retransmitted.
You might say the client can simply build a buffer large enough to keep playback smooth. That works for on-demand audio and video playback, but for real-time interactive scenarios such as video chat and video conferencing, buffering delay beyond 200 ms makes the experience unacceptable.
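To make the stall concrete, here is a small toy model, not a real TCP implementation. It assumes one packet every 20 ms, a single lost packet, and a fixed 300 ms wait for the retransmitted copy (real TCP computes its retransmission timeout dynamically). The function name delivery_times and all of the numbers are made up for illustration.

```python
def delivery_times(arrivals, lost, rto_ms, in_order):
    """Return the time (ms) at which each packet reaches the application.

    arrivals: arrival time of each packet if nothing were lost
    lost:     set of packet indexes whose first copy is lost
    rto_ms:   assumed wait until the retransmitted copy arrives
    in_order: True models TCP-like in-order delivery; False models
              UDP-like delivery where a lost packet is simply skipped
    """
    out = []
    ready = 0  # earliest time the next in-order packet may be delivered
    for i, t in enumerate(arrivals):
        if i in lost:
            if not in_order:
                out.append(None)   # never delivered; the player conceals the gap
                continue
            t = t + rto_ms         # wait for the retransmitted copy
        if in_order:
            ready = max(ready, t)  # a packet cannot overtake an earlier one
            out.append(ready)
        else:
            out.append(t)
    return out


arrivals = [0, 20, 40, 60, 80]  # one packet every 20 ms
tcp_like = delivery_times(arrivals, {1}, rto_ms=300, in_order=True)
udp_like = delivery_times(arrivals, {1}, rto_ms=300, in_order=False)
print(tcp_like)
print(udp_like)
```

In the in-order model every packet behind the lost one is held back until the retransmission arrives, so packet 2 is delivered 280 ms late, already past the 200 ms budget. In the skip-on-loss model the later packets keep their original timing and only the lost packet is missing.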
2. Why does RTP solve the above latency problem?
RTP usually runs on top of UDP. RTP itself does not guarantee reliable or in-order delivery, and it provides no flow control or congestion control. Instead, the companion protocol RTCP reports reception quality, and the application uses that feedback to provide these services. As a result, a lost packet does not stall the stream while a timeout is detected. The upper layer can also decide whether to retransmit a lost packet based on its importance. For example, I frames, P frames, and B frames are progressively less important, so when the network is poor you can skip retransmission for lost B or P frames. The client may briefly see a degraded picture, but the real-time experience is preserved.
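The importance-based decision described above could be sketched roughly as follows. This is only an illustration: the function should_retransmit, the loss-rate thresholds (2% and 10%), and the 200 ms budget check are my own assumptions, not something defined by RTP or RTCP.

```python
from enum import IntEnum


class FrameType(IntEnum):
    # Ordered by importance: B < P < I
    B = 0
    P = 1
    I = 2


def should_retransmit(frame_type, loss_rate, rtt_ms, budget_ms=200):
    """Decide whether a lost packet is worth retransmitting.

    loss_rate and rtt_ms are assumed to come from receiver feedback
    (for example RTCP reports). Thresholds are hypothetical.
    """
    if rtt_ms >= budget_ms:
        return False                  # the copy would arrive too late to be useful
    if loss_rate < 0.02:
        minimum = FrameType.B         # good network: retransmit everything
    elif loss_rate < 0.10:
        minimum = FrameType.P         # moderate loss: give up on B frames
    else:
        minimum = FrameType.I         # poor network: only I frames
    return frame_type >= minimum


print(should_retransmit(FrameType.I, loss_rate=0.20, rtt_ms=50))
print(should_retransmit(FrameType.B, loss_rate=0.20, rtt_ms=50))
```

A real application would tune these thresholds to its own codec and latency target.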
3. Multicast support
Multicast is widely used in network video conferencing, mainly in scenarios like the following:
Suppose the red circle is a streaming media server that stores video data, and the other circles are clients connected to it. When all the green clients want to watch the same video from the red server at the same time, having the server open a separate connection to each client clearly wastes bandwidth. Multicast solves this well: the server sends the data once to a shared multicast address, and every client listens on that address to receive it. This saves bandwidth and also keeps the video that the clients watch in sync.
TCP does not support multicast, because it is a connection-oriented, point-to-point protocol. RTP was originally designed for exactly this kind of video conferencing application and supports it well.
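As a rough sketch of the "one sender, many listeners" idea, here is how a UDP multicast sender and receiver can be opened with Python's standard socket module. The group address 239.1.2.3 and port 5004 are placeholders I picked for illustration, and whether multicast traffic actually flows depends on your network and operating system.

```python
import socket

GROUP = "239.1.2.3"  # hypothetical multicast group address
PORT = 5004          # hypothetical port


def membership_request(group, interface="0.0.0.0"):
    """Pack the group and local interface addresses for IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group=GROUP, port=PORT):
    """Each client calls this to listen on the shared multicast address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def open_sender(ttl=1):
    """The server sends each packet once; ttl=1 keeps it on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

The server would then call open_sender().sendto(packet, (GROUP, PORT)) once per packet, and every client that called open_receiver() reads the same data with recvfrom(), however many clients there are.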
4. Streaming media features in the RTP header
First, let's look at the RTP Header.
V: version (2 bits). Identifies the RTP version.
P: padding (1 bit). When set to 1, the packet contains one or more additional padding bytes at the end, which are not part of the payload.
X: extension (1 bit). When set to 1, the fixed header is followed by a header extension.
M: marker (1 bit). Its interpretation is defined by the profile. It marks significant events in the stream, such as frame boundaries.
Payload type (7 bits). Identifies the format of the payload; the application determines how to interpret it. A profile may specify a default static mapping from payload type codes to payload formats. Additional payload type codes can be defined dynamically through non-RTP means.
Sequence number (16 bits). Increases by 1 for each RTP packet sent. The receiver can use it to detect packet loss and restore packet order.
Timestamp (32 bits). Reflects the sampling instant of the first byte of the RTP packet's payload. The clock frequency depends on the payload format and is specified in the profile or payload format description.
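To see how these fields sit in the 12-byte fixed header, here is a minimal parsing sketch using Python's struct module. Besides the fields listed above, the fixed header defined in RFC 3550 also carries a 4-bit CSRC count and a 32-bit SSRC identifier, so the sketch decodes those too. The names RtpHeader and parse_rtp_header are my own, and the sample packet is hand-built for illustration.

```python
import struct
from collections import namedtuple

RtpHeader = namedtuple(
    "RtpHeader",
    "version padding extension csrc_count marker payload_type "
    "sequence timestamp ssrc",
)


def parse_rtp_header(packet):
    """Decode the 12-byte RTP fixed header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("an RTP fixed header needs at least 12 bytes")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,              # V: top 2 bits of the first byte
        padding=bool(b0 & 0x20),      # P
        extension=bool(b0 & 0x10),    # X
        csrc_count=b0 & 0x0F,         # CC
        marker=bool(b1 & 0x80),       # M: top bit of the second byte
        payload_type=b1 & 0x7F,       # PT: low 7 bits of the second byte
        sequence=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Hand-built sample: version 2, marker set, payload type 96
sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF)
header = parse_rtp_header(sample)
print(header)
```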
The header shows that RTP adds many features designed specifically for streaming media transmission.
For example, the M (marker) bit marks frame boundaries, so the receiver can locate them quickly. The payload type field tells the receiver (or player) which media format is being carried (for example G.729, H.264, or MPEG-4), so it knows the format of the stream and can choose the matching decoder. The timestamp field records the sampling time of the data; the receiver can use it to smooth out the packet jitter introduced by the network and to synchronize playback.
Therefore, RTP is better suited to carrying audio and video than raw TCP or UDP.
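As an illustration of how a receiver might use the sequence number and timestamp, here is a simplified sketch. The loss counter assumes packets arrive in order. The jitter estimate follows the running formula from RFC 3550, J = J + (|D| - J) / 16, but ignores timestamp wrap-around. The function names and sample values are made up.

```python
def count_lost(sequence_numbers):
    """Count missing packets from 16-bit sequence numbers received in order."""
    lost = 0
    for prev, cur in zip(sequence_numbers, sequence_numbers[1:]):
        gap = (cur - prev) % 65536    # modulo handles the wrap 65535 -> 0
        if gap > 1:
            lost += gap - 1
    return lost


def interarrival_jitter(samples, clock_rate):
    """Estimate jitter in timestamp units.

    samples:    list of (arrival_time_seconds, rtp_timestamp) pairs
    clock_rate: RTP clock frequency of the payload format, in Hz
    """
    jitter = 0.0
    prev_transit = None
    for arrival, rtp_ts in samples:
        transit = arrival * clock_rate - rtp_ts   # relative transit time
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter


print(count_lost([65534, 65535, 0, 2]))                      # one packet missing
print(interarrival_jitter([(0.00, 0), (0.03, 160)], 8000))   # second packet 10 ms late
```

A player would typically size its playout buffer from the jitter estimate, so that late packets still arrive before they are needed.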
5. RTP profile mechanism
RTP gives specific applications a great deal of flexibility. It separates the transport protocol from the application environment and control policies: the protocol itself only provides a mechanism for real-time transmission, and developers choose the configuration and control policies that suit their application.
Control policy here means that you can implement RTCP-based control algorithms to suit your application's requirements, for example packet loss detection, a retransmission policy for lost packets, and control schemes for video conferencing applications (these may be covered in later articles).
Configuration here mainly refers to the RTP profile and the payload format definitions. RTP supports a wide range of multimedia formats (such as H.264, MPEG-4, MJPEG, and MPEG). It does not build application-specific settings into the protocol itself; they are supplied through profile documents and payload format specifications. For a particular class of applications, RTP defines a profile and the associated payload formats. Related documents include:
RTP Profile for Audio and Video Conferences with Minimal Control (RFC 3551)
RTP Payload Format for H.264 Video (RFC 3984)
RTP Payload Format for MPEG-4 Audio/Visual Streams (RFC 3016)
And so on. For more, see: http://en.wikipedia.org/wiki/RTP_audio_video_profile
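To show what a profile's payload type mapping looks like in practice, here is a small lookup sketch. The static entries are a subset of the assignments in RFC 3551; types 96 to 127 are dynamic and must be mapped out of band (for example during session setup). The function name and the H264 entry in the usage example are assumptions for illustration.

```python
# A few static payload types from RFC 3551: type -> (encoding name, clock rate in Hz)
STATIC_PAYLOAD_TYPES = {
    0: ("PCMU", 8000),
    3: ("GSM", 8000),
    8: ("PCMA", 8000),
    18: ("G729", 8000),
    26: ("JPEG", 90000),
    31: ("H261", 90000),
    32: ("MPV", 90000),
    34: ("H263", 90000),
}


def describe_payload_type(pt, dynamic_map=None):
    """Return (encoding, clock_rate) for a payload type number.

    dynamic_map holds mappings negotiated outside RTP for types 96..127.
    """
    if pt in STATIC_PAYLOAD_TYPES:
        return STATIC_PAYLOAD_TYPES[pt]
    if 96 <= pt <= 127:
        if dynamic_map and pt in dynamic_map:
            return dynamic_map[pt]
        raise LookupError("dynamic payload type %d has no negotiated mapping" % pt)
    raise LookupError("payload type %d is not in this table" % pt)


# Hypothetical session in which type 96 was negotiated as H.264
negotiated = {96: ("H264", 90000)}
print(describe_payload_type(18))
print(describe_payload_type(96, negotiated))
```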
Note: If an application uses standard RTP rather than a proprietary scheme to carry the payload type, sequence number, and timestamp, it can interoperate more easily with other network applications. For example, if two companies develop Internet telephony software and both build RTP into their products, there is a good chance that users of the two products can talk to each other.
6. Other good features of RTP
(1) RTP was designed with security in mind and supports data encryption and authentication.
(2) Lower header overhead
TCP and XTP carry more header overhead than RTP: 40 bytes for TCP and XTP 3.6 and 32 bytes for XTP 4.0, versus only 12 bytes for the RTP fixed header. (The TCP figure appears to include the 20-byte IPv4 header, whereas RTP's 12 bytes cover only its own fixed header and exclude the 8-byte UDP header beneath it.)
(3) ... (to be added)
7. Related resource list
Here are some RTP resources that may help you.
(1) Wikipedia's introduction to RTP:
http://en.wikipedia.org/wiki/Real-time_Transport_Protocol
(2) Wikipedia's introduction to streaming media:
http://en.wikipedia.org/wiki/Streaming_media
(3) Stack Overflow discussion of RTP vs. TCP:
http://stackoverflow.com/questions/361943/why-does-rtp-use-udp-instead-of-tcp
(4) Introduction to RTP payload types and timestamps:
http://ticktick.blog.51cto.com/823160/350142
(5) RTP FAQ
Some Frequently Asked Questions about RTP
Reposted from the "three shadows" blog: http://ticktick.blog.51cto.com/823160/462746