1. Transmission Mechanism Applicable to H.264 Video
Having discussed the RTP protocol and the basic stream structure of H.264 earlier, how can we use RTP to transmit H.264 video? An effective method is to extract each NALU from the H.264 stream, prepend the corresponding RTP header to it, and then send the resulting packet containing the RTP header and the NALU. The following describes the RTP header and the NALU in turn.
The complete RTP fixed header format was shown in Figure 1 above. Following rfc3984 [3], the specific setting of each field is given here.
V: version number, 2 bits. The current RTP version is 2, so these two bits are set to binary 10 (the value 2, not hexadecimal 0x10).
P: padding bit, 1 bit. Padding is mainly needed by encryption algorithms that work on fixed block sizes; none is used here, so this bit is set to 0.
X: extension bit, 1 bit. No header extension follows the fixed header here, so this bit is also 0.
CC: CSRC count, 4 bits. Indicates the number of CSRC identifiers that follow the fixed header. The basic streaming media server implemented in this article does not use a mixer, so this field is set to 0x0.
M: marker bit, 1 bit. If the current NALU is the last NALU of an access unit, the M bit is set to 1; it is also set to 1 when the current RTP packet carries the last fragment of a NALU (NALU fragmentation is described later). In all other cases the M bit remains 0.
PT: payload type, 7 bits. No static PT value is assigned to H.264 video, so a dynamic value greater than 95 (that is, in the range 96 to 127) is chosen. Here it is set to 0x60 (96 in decimal).
SN: sequence number, 16 bits. The initial value of the sequence number should be random; here it is simply set to 0. The sequence number is incremented by 1 for each RTP packet sent.
TS: timestamp, 32 bits. Like the sequence number, the initial value of the timestamp should be random; here it is set to 0. According to rfc3984, the clock rate of the timestamp must be 90000 Hz.
SSRC: synchronization source identifier, 32 bits. The SSRC should be generated randomly so that no two synchronization sources in the same RTP session share the same identifier. There is only one synchronization source here, so it is set to 0x12345678.
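The field settings above can be packed into the 12-byte fixed header roughly as follows. This is only a minimal sketch: the function names build_rtp_header and next_timestamp are hypothetical, and the frame rate of 25 frames per second used for the timestamp step is an assumption that does not come from the article.

```python
import struct

RTP_VERSION = 2          # two bits, binary 10
PAYLOAD_TYPE = 0x60      # dynamic payload type 96, as chosen in this article
SSRC = 0x12345678        # fixed value used in this article
CLOCK_RATE = 90000       # timestamp clock rate in Hz


def build_rtp_header(seq, timestamp, marker=0):
    """Pack the 12-byte RTP fixed header (hypothetical helper)."""
    padding, extension, csrc_count = 0, 0, 0
    # First byte: V (2 bits) | P (1 bit) | X (1 bit) | CC (4 bits)
    byte0 = (RTP_VERSION << 6) | (padding << 5) | (extension << 4) | csrc_count
    # Second byte: M (1 bit) | PT (7 bits)
    byte1 = ((marker & 1) << 7) | (PAYLOAD_TYPE & 0x7F)
    # Network byte order: 2 bytes, 16-bit sequence number, 32-bit TS, 32-bit SSRC
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, SSRC)


def next_timestamp(timestamp, fps=25):
    """Advance the timestamp by one frame; fps=25 is an assumed frame rate."""
    return (timestamp + CLOCK_RATE // fps) & 0xFFFFFFFF
```

All packets that belong to the same frame would carry the same timestamp, while the sequence number advances with every packet; both counters wrap around at their field width, which the masks above take care of.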
The size of each NALU varies with the amount of data it contains. In an IP network, when an IP packet to be transmitted exceeds the maximum transmission unit (MTU), IP fragmentation occurs. On Ethernet the MTU is 1500 bytes. If an IP packet is larger than the MTU, it is split for transmission, which produces many packet fragments, raises the packet loss rate, and reduces network throughput. For video transmission, if an RTP packet is larger than the MTU and is split arbitrarily by the underlying protocol, the receiving player may play back with delay or even fail to play correctly. Therefore, large NALUs must be split before they are sent. rfc3984 provides three different RTP packetization schemes:
(1) Single NALU packet: only one NALU is encapsulated in an RTP packet. In this article, this scheme is used for NALUs smaller than 1400 bytes.
(2) Aggregation packet: multiple NALUs are encapsulated in one RTP packet. This can be used for small NALUs to improve transmission efficiency.
(3) Fragmentation unit: one NALU is encapsulated in multiple RTP packets. In this article, this scheme is used to split NALUs larger than 1400 bytes.
2. Implementation of the H.264 Streaming Media Transmission System
A complete streaming media transmission system consists of two parts: the server and the client [5] [6]. The server's main tasks are to read the H.264 video, separate each NALU from the bitstream, analyze the NALU type, set the corresponding RTP header, and encapsulate and send the RTP packets. The client's main tasks are to receive the RTP packets, parse the NALUs out of them, and pass them to the decoder for decoding and playback. The framework of the streaming media transmission system is shown in Figure 3.
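The server and client tasks described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation of this article: the function names split_nalus, packetize and depacketize are hypothetical, the input is assumed to be an Annex B byte stream with 00 00 01 or 00 00 00 01 start codes, a NALU of exactly 1400 bytes is assumed to go into a single NALU packet, fragments are assumed to be FU-A (type 28), and aggregation packets, the RTP header, and the actual network sending are left out. The client side assumes that payloads arrive in order and without loss.

```python
MAX_PAYLOAD = 1400   # size threshold used in this article
FU_A = 28            # NAL unit type of an FU-A fragmentation unit


def split_nalus(stream):
    """Server: separate NALUs from an Annex B byte stream."""
    nalus = []
    for chunk in stream.split(b"\x00\x00\x01"):
        nalu = chunk.rstrip(b"\x00")   # drop the extra zero of a 4-byte start code
        if nalu:
            nalus.append(nalu)
    return nalus


def packetize(nalu, max_payload=MAX_PAYLOAD):
    """Server: turn one NALU into a list of RTP payloads."""
    if len(nalu) <= max_payload:
        return [nalu]                            # single NALU packet
    fu_indicator = (nalu[0] & 0xE0) | FU_A       # keep F and NRI bits, type 28
    nal_type = nalu[0] & 0x1F                    # original type goes in the FU header
    body = nalu[1:]                              # the NALU header byte is not repeated
    step = max_payload - 2                       # room for FU indicator and FU header
    payloads = []
    for offset in range(0, len(body), step):
        start = 0x80 if offset == 0 else 0       # S bit on the first fragment
        end = 0x40 if offset + step >= len(body) else 0   # E bit on the last
        fu_header = start | end | nal_type
        payloads.append(bytes([fu_indicator, fu_header]) + body[offset:offset + step])
    return payloads


def depacketize(payloads):
    """Client: rebuild NALUs from in-order RTP payloads (no loss handling)."""
    nalus, pending = [], None
    for payload in payloads:
        nal_type = payload[0] & 0x1F
        if nal_type == FU_A:
            if payload[1] & 0x80:                # start: restore the NALU header byte
                pending = bytearray([(payload[0] & 0xE0) | (payload[1] & 0x1F)])
            if pending is not None:
                pending += payload[2:]
                if payload[1] & 0x40:            # end: the NALU is complete
                    nalus.append(bytes(pending))
                    pending = None
        elif 1 <= nal_type <= 23:                # single NALU packet
            nalus.append(payload)
    return nalus
```

In a real server each payload returned by packetize would be prefixed with an RTP fixed header and sent over UDP, and the client would hand the NALUs returned by depacketize to the decoder, typically after putting a start code back in front of each one.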