Real-time streaming media programming in Linux (RTP, RTCP, RTSP)

Source: Internet
Author: User

Streaming media refers to continuous, time-based media transmitted over a network using streaming technology. Its defining feature is that the entire file does not need to be downloaded before playback; the media is played while it is being downloaded. It is the technical basis for video conferencing, IP telephony, and similar applications. RTP is a standard protocol and a key technology for real-time streaming media transmission. This article describes how to program real-time streaming media using jrtplib on Linux.

1. Introduction to streaming media
As the Internet has grown more popular, the data transmitted over the network is no longer limited to text and graphics and has gradually shifted toward multimedia such as sound and video. There are currently two basic ways to deliver audio/video (A/V) and other multimedia files over a network: downloading and streaming. A/V files generally occupy a large amount of storage space, and on a network with limited bandwidth a download may take several minutes or even hours, so this approach has high latency. With streaming, audio, video, animation, and other multimedia files are sent continuously and in real time by a dedicated streaming media server. The user does not have to wait until the whole file has been downloaded; playback starts after a startup delay of only a few seconds. While the multimedia data plays on the client, the rest of the file continues to download from the streaming media server.
Streaming is a new concept that has emerged on the Internet in recent years. The term is loosely defined and serves as a general name for multimedia data transmission over the network. It has two meanings. In the broad sense, streaming media refers to the set of techniques, methods, and protocols that allow audio and video to form a stable, continuous transmission and playback stream, that is, streaming media technology. In the narrow sense, streaming media is contrasted with the traditional download-then-play method: it is a way of obtaining audio, video, and other multimedia data from the Internet that supports real-time transmission and real-time playback of the data stream. With streaming media technology, the server sends a stable, continuous multimedia data stream to the client, and the client plays the data back at a steady rate while receiving it, without waiting for all the data to be downloaded. Given the limits of network bandwidth, computer processing power, and protocol specifications, downloading large amounts of audio and video data from the Internet is unrealistic in terms of both download time and storage space, and streaming media technology solves this problem well. Currently, there are two main methods of streaming media transmission:
sequential streaming and real-time streaming. They are suitable for different applications.

Sequential stream transmission
Sequential stream transmission delivers data as a sequential download. The user can play the multimedia data while it downloads, but at any given moment can only view the part that has already been downloaded and cannot skip ahead to a part that has not. The transmission rate also cannot be adjusted to network conditions during transmission. Because a standard HTTP server can send this form of streaming media without the support of any special protocol, it is often called HTTP streaming. Sequential stream transmission is suitable for high-quality multimedia clips, such as titles, credits, and advertisements.

Real-time stream transmission
Real-time stream transmission matches the bandwidth of the media signal to current network conditions so that the media is always delivered in real time, which makes it particularly suitable for live events. Real-time stream transmission supports random access: users can fast-forward or rewind to view later or earlier content. In theory a real-time stream never pauses once playback starts, but in practice periodic pauses still occur, especially when network conditions deteriorate. Unlike sequential stream transmission, real-time stream transmission requires a dedicated streaming media server and specific network protocols.

2. Streaming media protocols
The Real-time Transport Protocol (RTP) is a network protocol for handling multimedia data streams on the Internet. It can carry streaming media data in real time in one-to-one (unicast) or one-to-many (multicast) network environments. RTP usually uses UDP to transmit multimedia data, but other protocols such as TCP or ATM can be used if needed. The RTP protocol as a whole consists of two closely related parts: the RTP data protocol and the RTP control protocol (RTCP). The Real Time Streaming Protocol (RTSP) was first proposed by RealNetworks and Netscape. It sits above RTP and RTCP and aims to deliver multimedia data efficiently over IP networks.

2.1 RTP data protocol
The RTP data protocol packages streaming media data and implements real-time transmission of media streams. Each RTP data packet consists of a header and a payload. The first 12 bytes of the header are fixed, while the payload can be audio or video data. The header format of an RTP datagram is shown in Figure 1.

The important fields and their meanings are as follows:
CSRC count (CC) indicates the number of CSRC identifiers. The CSRC identifiers follow the fixed RTP header and indicate the sources that contributed to the RTP datagram. RTP allows multiple data sources in the same session, and an RTP mixer can combine them into a single data source. For example, in a teleconference an RTP mixer can combine the voice data of all speakers into one RTP data source and generate a CSRC list that identifies them.
Payload type (PT) indicates the format of the RTP payload, including the encoding algorithm, sampling frequency, and number of channels. For example, type 2 indicates that the RTP data packet carries voice data encoded with the ITU G.721 algorithm, sampled at 8000 Hz, on a single channel.
Sequence number lets the receiver detect data loss, but how to handle the lost data is up to the application. RTP itself is not responsible for retransmission.
Timestamp records the sampling time of the first byte of the payload. The receiver can use the timestamp to determine whether the arrival of data is affected by delay jitter, but how to compensate for the jitter is again up to the application.
It is easy to see from the RTP datagram format that it carries the payload type, format, sequence number, timestamp, and an indication of whether additional data is present; these provide the foundation for real-time streaming media transmission. RTP is designed to provide end-to-end transport services for real-time data such as interactive audio and video. RTP therefore has no concept of a connection and can be built on either connection-oriented or connectionless underlying transport protocols. RTP does not depend on any particular network address format; it only requires the underlying transport protocol to support framing and segmentation. In addition, RTP itself provides no reliability mechanism; reliability must be ensured by the transport protocol or by the application. In typical scenarios, RTP is implemented as part of the application, on top of the transport protocol, as shown in Figure 2.
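The fixed 12-byte header and the fields described in this section can be unpacked in a few lines. The following is only a minimal sketch in Python, not jrtplib code; the names `RtpHeader` and `parse_rtp_header` are made up for illustration, and the bit layout follows the standard RTP header (version, padding, extension, CC, marker, PT, sequence number, timestamp, SSRC, then CC CSRC identifiers).

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc_list: Tuple[int, ...]


def parse_rtp_header(packet: bytes) -> Tuple[RtpHeader, bytes]:
    """Split an RTP packet into its header fields and the bytes that follow."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Fixed header: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC, all in network byte order.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count (low 4 bits of byte 0)
    end = 12 + 4 * cc                   # each CSRC identifier is 32 bits
    if len(packet) < end:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % cc, packet[12:end])
    header = RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=cc,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,         # PT (low 7 bits of byte 1)
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
        csrc_list=csrcs,
    )
    # What follows is the payload (preceded by a header extension
    # if the extension bit is set, which this sketch does not decode).
    return header, packet[end:]
```

A receiver would typically compare `sequence_number` across consecutive packets to detect loss and use `timestamp` for playout timing; both decisions are left to the application, as noted above.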

2.2 RTCP Control Protocol
The RTCP control protocol must be used together with the RTP data protocol. When an application starts an RTP session, it uses two ports, one for RTP and one for RTCP. RTP itself does not guarantee reliable, in-order delivery of data packets, nor does it provide flow control or congestion control; these rely on the feedback that RTCP provides. RTCP normally uses the same distribution mechanism as RTP to send control information periodically to all session members. From this data the application obtains information about the session participants, together with feedback such as network conditions and the packet loss rate, which it can use to control quality of service or diagnose network problems.

 

The functions of RTCP are implemented through different types of RTCP datagrams, mainly the following:
SR (sender report). A sender is an application or terminal that sends RTP datagrams; a sender can also be a receiver.
RR (receiver report). A receiver is an application or terminal that only receives RTP datagrams and does not send them.
SDES (source description). It mainly carries identity information about session members, such as user name, email address, and phone number. It can also convey session control information to session members.
BYE (goodbye notification). It indicates that one or more sources are no longer active, that is, it notifies the other members of the session that the source is leaving.
APP (application-defined). Its content is defined by the application itself, which addresses RTCP extensibility and gives protocol implementers great flexibility.
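To make the packet types above concrete, here is a rough Python sketch that walks a compound RTCP datagram and names each packet in it. The numeric type codes (200 to 204) and the 4-byte common header layout are taken from the RTP specification rather than from this article, and the function name `split_compound_rtcp` is hypothetical.

```python
import struct

# RTCP packet type codes as assigned by the RTP specification.
RTCP_TYPES = {200: "SR", 201: "RR", 202: "SDES", 203: "BYE", 204: "APP"}


def split_compound_rtcp(data: bytes):
    """Return a list of (type_name, count, body) for each RTCP packet in data."""
    packets = []
    offset = 0
    while offset + 4 <= len(data):
        # Common header: V/P/count byte, packet type, length in 32-bit words minus one.
        b0, pt, length_words = struct.unpack("!BBH", data[offset:offset + 4])
        size = 4 * (length_words + 1)
        if offset + size > len(data):
            raise ValueError("truncated RTCP packet")
        name = RTCP_TYPES.get(pt, "unknown(%d)" % pt)
        count = b0 & 0x1F               # report/source count (subtype for APP)
        packets.append((name, count, data[offset + 4:offset + size]))
        offset += size
    return packets
```

The sketch only separates the packets; decoding the report blocks inside an SR or RR body would be a further step.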
RTCP datagrams carry the information needed for quality-of-service monitoring, which makes it possible to adjust service quality dynamically and to control network congestion effectively. Because RTCP datagrams are sent by multicast, all members of the session can use the control information they carry to learn the current status of the other participants.
In a typical application, the application that sends a media stream periodically generates sender reports (SR). These RTCP datagrams contain synchronization information between different media streams, as well as counts of the datagrams and bytes sent, from which the receiver can estimate the actual data rate. The receiver, in turn, sends receiver reports (RR) to all known senders. These RTCP datagrams contain the highest sequence number received, the number of datagrams lost, the delay jitter, timestamps, and other important information. Based on this information, the sending application can estimate the round-trip delay and dynamically adjust its transmission rate according to the loss rate and jitter to relieve network congestion, or it can smoothly adapt its quality of service to network conditions.
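As an illustration of the jitter value a receiver reports, the sketch below keeps a running interarrival jitter estimate. The smoothing formula (move the estimate 1/16 of the way toward each new transit-time difference) comes from the RTP specification, not from this article; the class name `JitterEstimator` is hypothetical, and the sketch assumes arrival times are already expressed in the same clock units as the RTP timestamp and ignores timestamp wraparound.

```python
class JitterEstimator:
    """Running interarrival jitter, in RTP timestamp units."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, arrival, rtp_timestamp):
        """Feed one received packet and return the updated jitter estimate."""
        # Relative transit time: arrival time minus the sender's timestamp.
        transit = arrival - rtp_timestamp
        if self._prev_transit is not None:
            # Difference in transit time between this packet and the previous one.
            d = abs(transit - self._prev_transit)
            # Smooth with gain 1/16 so single outliers do not dominate.
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter
```

A receiver would call `update` for every packet and copy the current value into its next receiver report; how the sender reacts to that value is an application decision.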

2.3 RTSP real-time stream Protocol
As an application-layer protocol, RTSP provides an extensible framework that makes controlled, on-demand delivery of real-time streaming media possible. In general, RTSP is a streaming media protocol used mainly to control the delivery of data with real-time characteristics. RTSP does not transmit the data itself; it relies on services provided by lower-layer transport protocols. RTSP offers streaming media operations such as play, pause, and fast-forward. It defines the specific control messages, methods, and status codes, and it also describes how it interacts with RTP.

RTSP drew heavily on HTTP/1.1 when it was specified, and many of its descriptions are identical to those of HTTP/1.1. RTSP deliberately uses syntax and operations similar to HTTP/1.1 so that it is compatible with the existing web infrastructure; as a result, most HTTP/1.1 extension mechanisms can be adopted in RTSP directly.
The set of media streams controlled by RTSP is defined by a presentation description. A presentation is the set of one or more media streams that the streaming media server provides to the client, and the presentation description contains information about each media stream in it, such as the encoding/decoding algorithm, network address, and content.
Although the RTSP server uses identifiers to distinguish sessions, an RTSP session is not bound to a transport-layer connection (such as a TCP connection). During an RTSP session, the client can open and close multiple reliable transport connections to the server to issue RTSP requests. RTSP can also run over a connectionless transport protocol (such as UDP).
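Because RTSP borrows its message syntax from HTTP/1.1, a request is just CRLF-terminated text lines. The following Python sketch formats a request and reads the status line of a response; the helper names and the URL `rtsp://example.com/demo` are placeholders for illustration, and the request-line format, `CSeq` header, and `Session` header are taken from the RTSP specification rather than from this article.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format an RTSP/1.0 request as text, in the HTTP/1.1 style."""
    lines = ["%s %s RTSP/1.0" % (method, url), "CSeq: %d" % cseq]
    for name, value in (headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # Lines are CRLF-terminated; an empty line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_status_line(response):
    """Return (status_code, reason) from the first line of an RTSP response."""
    first_line = response.split("\r\n", 1)[0]
    version, code, reason = first_line.split(" ", 2)
    if not version.startswith("RTSP/"):
        raise ValueError("not an RTSP response")
    return int(code), reason


# Example: a PLAY request that names its session in a header,
# so it does not depend on which transport connection carries it.
request = build_rtsp_request(
    "PLAY", "rtsp://example.com/demo", 3, {"Session": "12345678"}
)
```

The `Session` header is what lets the server associate a request with an existing session even when the client has reconnected over a new TCP connection.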

RTSP currently supports the following operations:
Retrieving media. The client can request a presentation description from the media server through HTTP or another method. If the presentation is multicast, the description includes the multicast address and port number used for the media stream. If the presentation is unicast, the client supplies the destination address, for security reasons.
Inviting a media server. A media server can be invited to join an ongoing conference, either to play media back into the presentation or to record all or a subset of the media in the presentation. This is very suitable for distributed teaching.
Adding media. The server can notify the client that new media streams have become available, which is particularly useful for live lectures.
As with HTTP/1.1, RTSP requests can also be handled by proxies, tunnels, and caches.
