Introduction to Streaming Media Protocols (RTP/RTCP/RTSP/RTMP/MMS/HLS)

Source: Internet
Author: User

RTP Reference documents RFC3550/RFC3551

RTP (Real-time Transport Protocol) is a transport-layer protocol for multimedia traffic on the Internet. It specifies a standard packet format for carrying audio and video over IP networks. RTP is commonly used in streaming media systems (together with RTCP), video conferencing, and push-to-talk systems (together with H.323 or SIP), which makes it the technical foundation of the IP telephony industry. RTP is used together with its control protocol, RTCP, and usually runs on top of UDP.

RTP itself does not provide timely delivery or any other quality of service (QoS) guarantee; it relies on lower-layer services for that. RTP does not guarantee delivery, does not prevent out-of-order delivery, and does not assume that the underlying network is reliable. What it does provide is a sequence number in every packet. The receiver uses it to reconstruct the sender's packet order, and it can also be used to work out where a packet belongs, for example in video decoding, where packets do not have to be decoded in sequence.
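As a rough illustration of how a receiver can use the sequence number, the sketch below unpacks the 12-byte fixed RTP header described in RFC 3550 and reorders a small batch of packets. The function names and the `make_rtp_packet` helper are illustrative assumptions, header extensions are ignored, and a real receiver would use a jitter buffer rather than sorting a complete list.

```python
import struct


def parse_rtp_packet(packet: bytes) -> dict:
    """Decode the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = b0 & 0x0F
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": csrc_count,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        # Simplification: skips the CSRC list but not a header extension.
        "payload": packet[12 + 4 * csrc_count:],
    }


def reorder(packets: list) -> list:
    """Sort parsed packets by sequence number, allowing for 16-bit wrap-around.

    Distances are measured relative to the first packet received, as a
    signed 16-bit value, so 65535 -> 0 -> 1 sorts in that order.
    """
    if not packets:
        return []
    base = packets[0]["sequence"]

    def distance(pkt):
        return ((pkt["sequence"] - base + 0x8000) & 0xFFFF) - 0x8000

    return sorted(packets, key=distance)


def make_rtp_packet(seq: int, timestamp: int, payload: bytes) -> bytes:
    """Demo helper (hypothetical): version 2, payload type 96, fixed SSRC."""
    return struct.pack("!BBHII", 0x80, 96, seq, timestamp, 0x1234) + payload


# Packets arrive out of order across the sequence-number wrap.
received = [parse_rtp_packet(make_rtp_packet(s, 0, b"x")) for s in (65535, 1, 0)]
ordered = [p["sequence"] for p in reorder(received)]
```

Measuring distance from the first packet seen is a simplification; it is enough to show why the sequence number, not arrival order, defines the packet order.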

RTP consists of two tightly linked parts: RTP itself, which carries data with real-time properties, and the RTP Control Protocol (RTCP), which monitors quality of service and conveys information about the participants in an ongoing session.


The Real-time Transport Control Protocol (RTP Control Protocol, or RTCP for short) is the sister protocol of the Real-time Transport Protocol (RTP). RTCP provides out-of-band control for RTP media streams. RTCP does not carry media data itself; it works alongside RTP, which packages and sends the multimedia data. RTCP periodically exchanges control data among the participants in a streaming multimedia session. Its main function is to provide feedback on the quality of service (QoS) delivered by RTP.
RTCP collects statistics about a media connection, such as the number of bytes transferred, the number of packets transmitted, the number of packets lost, jitter, and one-way and round-trip network delay. Applications can use the information provided by RTCP to try to improve quality of service, for example by limiting the traffic rate or by switching to a different codec. RTCP itself provides neither encryption nor authentication; SRTCP can be used for those purposes.
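To make the statistics above more concrete, here is a minimal sketch of two values a receiver could report: the interarrival jitter estimate, using the running formula J = J + (|D| - J) / 16 from RFC 3550, and a simple packet-loss count. The function names and the sample numbers are assumptions for illustration, and sequence-number wrap-around is not handled.

```python
def update_jitter(jitter, prev_transit, arrival, rtp_timestamp):
    """Update the interarrival jitter estimate for one received packet.

    `arrival` and `rtp_timestamp` must be in the same units (RTP clock
    ticks). Returns the new (jitter, transit) pair.
    """
    transit = arrival - rtp_timestamp
    if prev_transit is None:
        # First packet: nothing to compare against yet.
        return jitter, transit
    d = abs(transit - prev_transit)
    jitter += (d - jitter) / 16.0
    return jitter, transit


def packets_lost(first_seq, highest_seq, received_count):
    """Expected minus received packets (ignores sequence wrap-around)."""
    expected = highest_seq - first_seq + 1
    return expected - received_count


# Three packets sent 160 ticks apart; the third arrives 20 ticks late.
samples = [(1000, 0), (1160, 160), (1340, 320)]  # (arrival, rtp_timestamp)
jitter, transit = 0.0, None
for arrival, ts in samples:
    jitter, transit = update_jitter(jitter, transit, arrival, ts)
```

The division by 16 smooths the estimate, so a single late packet shifts the reported jitter only slightly.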

SRTP & SRTCP Reference documentation RFC3711

The Secure Real-time Transport Protocol (SRTP) is a protocol defined on top of the Real-time Transport Protocol (RTP). It is designed to provide encryption, message authentication, integrity protection, and replay protection for RTP data in both unicast and multicast applications. It was developed by David Oran (Cisco) and Rolf Blom (Ericsson) and was first published by the IETF in March 2004 as RFC3711.
Because RTP is closely tied to the RTP Control Protocol (RTCP), which is used to control it, SRTP also has a companion protocol called the Secure Real-time Transport Control Protocol (Secure RTCP, or SRTCP). SRTCP provides for RTCP the same kind of security features that SRTP provides for RTP.
Using SRTP or SRTCP with RTP or RTCP is optional. Even when SRTP or SRTCP is used, every feature it provides, such as encryption and authentication, is optional and can be enabled or disabled independently. The only exception is that message authentication is mandatory when SRTCP is used.


RTSP Reference document RFC2326

RTSP (Real Time Streaming Protocol) was proposed jointly by RealNetworks and Netscape. The protocol defines how a one-to-many application can efficiently deliver multimedia data over an IP network. RTSP provides an extensible framework that enables controlled, on-demand delivery of real-time data such as audio and video. Data sources include both live feeds and stored clips. The protocol is designed to control multiple data delivery sessions, to provide a way of choosing delivery channels such as UDP, multicast UDP, and TCP, and to provide a way of choosing delivery mechanisms based on RTP.

RTSP is a multimedia streaming protocol used to control audio or video, and it allows several streams to be controlled at the same time. The network protocol used to carry the media is outside its scope: the server can choose TCP or UDP to deliver the stream content. Its syntax and operation are similar to HTTP/1.1, but it does not place special emphasis on time synchronization, so it is relatively tolerant of network latency. The simultaneous control of multiple streams mentioned above (multicast) not only reduces network usage on the server side but also makes multi-party video conferencing possible. Because RTSP works in much the same way as HTTP/1.1, proxy server caches can be used with RTSP as well. RTSP also has a redirection capability, so requests can be switched to a different server according to the actual load, which avoids delays caused by too much load concentrating on one server.

The relationship between RTSP and RTP

Unlike HTTP and FTP, which can download an entire video file, RTP sends data over the network at a fixed data rate, and the client plays the video at that same rate. Once a part of the video has been played it cannot be replayed unless the client requests the data from the server again.

The biggest difference between RTSP and RTP is that RTSP is a bidirectional control protocol: it lets the client send requests to the server, such as play, fast-forward, and rewind. RTSP can deliver data over RTP, but it can also choose TCP, UDP, multicast UDP, or other channels, so it is highly extensible. It is an application-layer protocol similar to HTTP. One application encountered in practice: the server captures, encodes, and sends two video streams in real time, and the client receives and displays both. Because the client never needs to replay or rewind the video, this can be implemented directly with UDP + RTP + multicast.

RTP: Real-time Transport Protocol
RTP/RTCP is the protocol pair that actually transmits the data.
RTP carries the audio/video data: for playback (PLAY) the server sends to the client; for recording (RECORD) the client can send to the server.
The RTP protocol as a whole consists of two closely related parts: the RTP data protocol and the RTP control protocol (RTCP).
RTSP: Real Time Streaming Protocol
RTSP requests mainly include DESCRIBE, SETUP, PLAY, PAUSE, TEARDOWN, and OPTIONS; as the names suggest, they provide session dialogue and control.
During an RTSP session, SETUP determines the ports used by RTP/RTCP, and PLAY/PAUSE/TEARDOWN start or stop the RTP transmission.
RTCP includes Sender Reports and Receiver Reports, which are used for audio/video synchronization and other purposes; it is a control protocol.
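Because RTSP messages are text and resemble HTTP/1.1, the request sequence described above can be sketched with plain string formatting. The URL, ports, and session identifier below are made-up examples, and `build_rtsp_request` and `parse_status_line` are illustrative helper names rather than part of any library.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request: request line, CSeq, extra headers."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line terminates the header section, as in HTTP.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(response):
    """Return (status_code, reason) from the first line of a response."""
    first_line = response.split("\r\n", 1)[0]
    version, code, reason = first_line.split(" ", 2)
    if not version.startswith("RTSP/"):
        raise ValueError("not an RTSP response")
    return int(code), reason


url = "rtsp://example.com/stream"  # hypothetical server and stream name

describe = build_rtsp_request("DESCRIBE", url, 1, {"Accept": "application/sdp"})
# SETUP tells the server which client ports will receive RTP and RTCP.
setup = build_rtsp_request(
    "SETUP", url + "/track1", 2,
    {"Transport": "RTP/AVP;unicast;client_port=5004-5005"},
)
# The Session value would come from the server's SETUP response.
play = build_rtsp_request("PLAY", url, 3, {"Session": "12345678"})
teardown = build_rtsp_request("TEARDOWN", url, 4, {"Session": "12345678"})
```

Each request carries an increasing CSeq so the client can match responses to requests; the media itself flows separately over RTP on the ports negotiated in SETUP.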


The Session Description Protocol (SDP) describes multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation.
The session directory is used to help announce multimedia conferences and to convey the relevant setup information to session participants. SDP is the format used to deliver this information to the recipients. SDP is purely a session description format, not a transport protocol; it is carried by whichever transport protocol is appropriate, including the Session Announcement Protocol (SAP), the Session Initiation Protocol (SIP), the Real Time Streaming Protocol (RTSP), e-mail using MIME extensions, and the Hypertext Transfer Protocol (HTTP).

SDP is designed to be general purpose, so it can be used in a wide range of network environments and applications, not only in multicast session directories. However, SDP does not support negotiation of session content or media encodings.
On the Internet multicast backbone (MBone), a session directory tool is used to advertise multimedia conferences and to convey the conference addresses and the conference-specific tool information that participants need; this is done with SDP. An SDP description conveys enough information for a recipient to join the session. SDP information is sent using the Session Announcement Protocol (SAP), which periodically multicasts announcement packets to a well-known multicast address and port. Each announcement is a UDP packet containing a SAP header and a text payload, and the text payload is the SDP session description. The information can also be sent by e-mail or over the WWW (World Wide Web).

SDP text information includes:

    1. Session name and purpose;
    2. Session duration;
    3. The media that make up the session;
    4. Information needed to receive the media (address, etc.).

Protocol structure

SDP information is text, encoded in UTF-8 using the ISO 10646 character set. An SDP session description has the following form (fields marked with * are optional):

Session description
v= (protocol version)
o= (owner/creator and session identifier)
s= (session name)
i=* (session information)
u=* (URI of description)
e=* (email address)
p=* (phone number)
c=* (connection information; not required if included in all media)
b=* (bandwidth information)
One or more time descriptions (see below)
z=* (time zone adjustments)
k=* (encryption key)
a=* (zero or more session attribute lines)
Zero or more media descriptions (see below)

Time description
t= (time the session is active)
r=* (zero or more repeat times)

Media description
m= (media name and transport address)
i=* (media title)
c=* (connection information; optional if included at session level)
b=* (bandwidth information)
k=* (encryption key)
a=* (zero or more media attribute lines)
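Since an SDP description is just a sequence of `<type>=<value>` text lines, the field layout above can be read with a very small parser. The following is a minimal sketch: the sample description (addresses, ports, codecs) is invented for illustration, `parse_sdp` is a hypothetical helper name, and no validation of field order or contents is attempted.

```python
def parse_sdp(text):
    """Split an SDP description into session-level and media-level fields.

    Returns (session, media): `session` maps each field letter to a list of
    values, and `media` is a list of such dicts, one per "m=" section.
    """
    session = {}
    media = []
    target = session
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # skip blank or malformed lines
        key, value = line[0], line[2:]
        if key == "m":
            # Every "m=" line opens a new media description; the lines
            # that follow belong to it until the next "m=" line.
            target = {}
            media.append(target)
        target.setdefault(key, []).append(value)
    return session, media


EXAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 127.0.0.1
s=Example session
c=IN IP4 224.2.17.12
t=0 0
m=audio 49170 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
"""

session, media = parse_sdp(EXAMPLE_SDP)
```

Values are kept as lists because several fields, such as `a=`, may appear more than once at the same level.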


RTMP (Real Time Messaging Protocol) is an open protocol developed by Adobe Systems for transmitting audio, video, and data between Flash players and servers.
It has three variants:

1) Plain RTMP, which works over TCP and uses port 1935;

2) RTMPT, which is encapsulated in HTTP requests and can pass through firewalls;

3) RTMPS, which is similar to RTMPT but uses an HTTPS connection;

Flash uses RTMP to transmit objects, video, and audio. The protocol runs on top of TCP or over polling HTTP.

RTMP acts as a container for data packets, which may carry data in AMF format or video/audio data as found in FLV. A single connection can carry multiple network streams over different channels, and the packets in these channels are transmitted as fixed-size chunks.
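The idea of several channels sharing one connection in fixed-size pieces can be sketched as follows. This is a deliberately simplified model, not the RTMP wire format: chunks are plain `(channel_id, data)` tuples without real chunk headers, and the function names and channel numbers are assumptions. The size of 128 bytes matches RTMP's default chunk size, and the last chunk of a message may be shorter.

```python
from itertools import zip_longest

CHUNK_SIZE = 128  # RTMP's default maximum chunk size, in bytes


def split_into_chunks(channel_id, message, chunk_size=CHUNK_SIZE):
    """Cut one message into (channel_id, data) pieces of at most chunk_size."""
    return [
        (channel_id, message[i:i + chunk_size])
        for i in range(0, len(message), chunk_size)
    ]


def interleave(*chunk_lists):
    """Round-robin the chunks of several channels onto one connection."""
    wire = []
    for group in zip_longest(*chunk_lists):
        wire.extend(chunk for chunk in group if chunk is not None)
    return wire


def reassemble(wire):
    """Rebuild each channel's message from the interleaved chunk sequence."""
    messages = {}
    for channel_id, data in wire:
        messages[channel_id] = messages.get(channel_id, b"") + data
    return messages


audio = b"a" * 300   # hypothetical audio message on channel 4
video = b"v" * 500   # hypothetical video message on channel 6
wire = interleave(split_into_chunks(4, audio), split_into_chunks(6, video))
restored = reassemble(wire)
```

Interleaving small chunks keeps a large video message from blocking a small audio message that shares the same connection.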


MMS (Microsoft Media Server protocol) is a protocol used to access and stream .asf files from a Windows Media server. The MMS protocol is used to access unicast content on a Windows Media publishing point, and it is the default method for connecting to the Windows Media unicast service. If viewers type a URL into Windows Media Player to connect to the content, rather than reaching it through a hyperlink, they must refer to the stream using the MMS protocol. The default port of MMS is 1755.

When you use the MMS protocol to connect to a publishing point, protocol rollover is used to obtain the best connection. Protocol rollover starts by trying to connect to the client through MMSU, which is the MMS protocol combined with UDP data transfer. If the MMSU connection is unsuccessful, the server tries MMST, which is the MMS protocol combined with TCP data transfer.
If you are connecting to an indexed .asf file and want to fast forward, rewind, pause, start, and stop the stream, you must use MMS. You cannot fast forward or rewind with a UNC path. If you are connecting to a publishing point from a standalone Windows Media Player, you must specify the URL of the unicast content. If the content is published on demand at the main publishing point, the URL consists of the server name and the .asf file name. For example: mms://windows_media_server/sample.asf, where windows_media_server is the Windows Media server name and sample.asf is the name of the .asf file that you want to stream.
If you have live content to be published via broadcast unicast, the URL consists of the server name and the publishing point alias. For example: mms://windows_media_server/liveevents, where windows_media_server is the Windows Media server name and liveevents is the publishing point alias.
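The rollover behaviour, trying MMSU first and falling back to MMST, amounts to an ordered list of connection attempts. The sketch below only models that control flow: the connector callables are stand-ins supplied by the caller, and no actual MMS handshake is implemented. `parse_mms_url` and `connect_with_rollover` are hypothetical names.

```python
from urllib.parse import urlsplit

MMS_DEFAULT_PORT = 1755


def parse_mms_url(url):
    """Split an mms:// URL into (host, port, path), defaulting the port."""
    parts = urlsplit(url)
    if parts.scheme.lower() != "mms":
        raise ValueError(f"not an MMS URL: {url}")
    return parts.hostname, parts.port or MMS_DEFAULT_PORT, parts.path.lstrip("/")


def connect_with_rollover(url, connectors):
    """Try each (name, connect) pair in order and return the first success.

    `connect` is any callable taking (host, port, path) that raises OSError
    on failure. Returns (transport_name, connection).
    """
    host, port, path = parse_mms_url(url)
    errors = []
    for name, connect in connectors:
        try:
            return name, connect(host, port, path)
        except OSError as exc:
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all transports failed: " + "; ".join(errors))


def fake_mmsu(host, port, path):
    """Stand-in for a UDP attempt that fails, e.g. blocked by a firewall."""
    raise OSError("UDP blocked")


def fake_mmst(host, port, path):
    """Stand-in for a TCP attempt that succeeds."""
    return f"tcp://{host}:{port}/{path}"


transport, connection = connect_with_rollover(
    "mms://windows_media_server/sample.asf",
    [("MMSU", fake_mmsu), ("MMST", fake_mmst)],
)
```

Passing the connectors in as a list keeps the fallback order explicit and makes the logic easy to exercise without a real server.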


HTTP Live Streaming (HLS) is the HTTP-based streaming media transmission protocol implemented by Apple Inc. It supports both live and on-demand streaming and is used mainly on iOS, where it provides live and on-demand audio and video for iOS devices such as the iPhone and iPad. HLS on-demand is essentially ordinary segmented HTTP on-demand delivery; the difference is that its segments are very small.

Compared with common live streaming protocols such as RTMP, RTSP, and MMS, the biggest difference in HLS live streaming is that the client does not receive one complete data stream. Instead, the server stores the live stream as a sequence of consecutive, short media files in MPEG-TS format, and the client keeps downloading and playing these small files, while the server keeps generating new files from the latest live data. As long as the client plays the files fetched from the server in order, the result is a live broadcast. In other words, HLS implements live streaming with on-demand techniques. Because the data travels over HTTP, firewalls and proxies are generally not a concern, and because the segment files are very short, the client can quickly select and switch bitrates to suit different bandwidth conditions. However, this same design means that HLS latency will always be higher than that of ordinary live streaming protocols.

Based on the above, implementing HTTP Live Streaming for live broadcast requires studying and implementing the following key technical points:

      1. Capturing data from video and audio sources
      2. H.264 encoding and AAC encoding of the raw data
      3. Encapsulating the video and audio data in MPEG-TS packets
      4. HLS segment generation strategy and the m3u8 index file
      5. HTTP transport protocol
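Of the points above, the m3u8 index file is the easiest to illustrate. The sketch below generates a sliding-window live playlist that lists only the most recent segments. The segment names, durations, and window size are assumptions, and `build_live_playlist` is a hypothetical helper; a real segmenter would also write the MPEG-TS files themselves.

```python
import math


def build_live_playlist(segments, first_sequence=0, window=3):
    """Return m3u8 text listing the newest `window` segments.

    `segments` is a list of (uri, duration_seconds) tuples, oldest first,
    covering everything produced so far. `first_sequence` is the media
    sequence number of segments[0].
    """
    if not segments:
        raise ValueError("at least one segment is required")
    recent = segments[-window:]
    # The sequence number of the first listed segment tells the client how
    # far the window has moved since the stream started.
    media_sequence = first_sequence + len(segments) - len(recent)
    target_duration = max(math.ceil(duration) for _, duration in recent)
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-MEDIA-SEQUENCE:{media_sequence}",
    ]
    for uri, duration in recent:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    # No #EXT-X-ENDLIST tag: its absence marks the playlist as live.
    return "\n".join(lines) + "\n"


produced = [("seg0.ts", 10.0), ("seg1.ts", 10.0), ("seg2.ts", 9.5), ("seg3.ts", 10.0)]
playlist = build_live_playlist(produced, first_sequence=0, window=3)
```

The client reloads this file periodically; each time the server drops the oldest entry and appends a new one, the media sequence number increases, which is how ordinary HTTP file downloads add up to a live stream.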
