I. About ortp
Ortp is an open-source library that implements the RTP and RTCP protocols. At present, the main user of the ortp library is Linphone (an application for voice and video calls over IP).
As Linphone's RTP library, ortp handles the transmission of voice and video data over the RTP protocol.
II. Source code structure
Just as the filter is the central structure in mediastreamer2, ortp has an important structure of its own: the payload type. It specifies the encoding type together with related parameters such as the clock rate and sampling rate; see Figure 2-1.
The RTP header contains a dedicated field that identifies the encoding of the data currently being transmitted. In the code, each media type, such as H.263, G.729, or MPEG-4, has its own payloadtype struct. Because every encoding has its own characteristics and parameters, the payload type field in the RTP header marks the kind of payload carried. This serves two purposes: the receiver can determine the payload type and select the matching decoder for decoding and playback, and the timestamp for that encoding becomes easier to calculate.
The payloadtype struct defines many attributes of a payload, such as whether the data is audio or video, the clock rate, the number of bits per sample, the normal bit rate, the MIME type, and the number of channels. The code already provides payloadtype instances for the common audio and video codecs. When initializing the ortp library, the application can select the ones it needs and add them to the system. All payload types currently supported by the system are stored in an array that is referenced by the global av_profile struct instance, as shown in Figure 2-2.
The position of a payloadtype struct in the payload array is given by the payload type number of its encoding. These numbers are defined in RFC 3551, section 6, "Payload Type Definitions". The avprofile.c file defines all the payload types, and the payloadtype.c file implements the operations on payload types and profiles.
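The indexing scheme described above can be illustrated with a minimal sketch. The struct and function names below (payload_desc, profile_table, profile_set, profile_get, profile_init_av) are hypothetical stand-ins, not the actual ortp definitions; only the two entries shown (payload type 0 for PCMU and 8 for PCMA, both 8000 Hz mono) are taken from RFC 3551.

```c
#include <stddef.h>

#define PROFILE_SLOTS 128 /* RTP payload type numbers are 7 bits wide */

/* Hypothetical, reduced stand-in for a payload type description. */
typedef struct {
    const char *mime_type;
    int clock_rate; /* Hz */
    int channels;
} payload_desc;

/* Hypothetical profile: one slot per payload type number. */
typedef struct {
    const payload_desc *slots[PROFILE_SLOTS];
} profile_table;

static const payload_desc desc_pcmu = { "PCMU", 8000, 1 };
static const payload_desc desc_pcma = { "PCMA", 8000, 1 };

/* Store a description at the index given by its payload type number. */
static int profile_set(profile_table *p, int number, const payload_desc *d)
{
    if (number < 0 || number >= PROFILE_SLOTS) return -1;
    p->slots[number] = d;
    return 0;
}

/* Look up a description by payload type number; NULL if unassigned. */
static const payload_desc *profile_get(const profile_table *p, int number)
{
    if (number < 0 || number >= PROFILE_SLOTS) return NULL;
    return p->slots[number];
}

/* Fill the table with two static assignments from RFC 3551. */
static void profile_init_av(profile_table *p)
{
    for (int i = 0; i < PROFILE_SLOTS; i++) p->slots[i] = NULL;
    profile_set(p, 0, &desc_pcmu);
    profile_set(p, 8, &desc_pcma);
}
```

With such a table, the receiver can take the payload type number from the RTP header and go directly to the matching description without searching.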
Besides the payloadtype structure, an even more important struct is rtpsession. It is the abstraction of a session: all session-related information is defined in this struct or can be reached through it. To transmit media data with ortp, you must first create a session; all subsequent data transmission takes place on that session. The rtpsession struct is defined as shown in Figure 2-3.
This is a very large struct, which indirectly shows how much state a session has to maintain.
A detailed description of the struct is given later. A session is initialized by the rtp_session_init interface, and external code obtains a new session by calling rtp_session_new. Both are defined in rtpsession.c.
When using ortp for data transmission, a single task can receive and send multiple session streams. This is made possible by the scheduling module in ortp. To use it, the application initializes the scheduler during ortp initialization and registers the sessions to be scheduled and managed with the scheduling module. Then, before each receive or send operation, the application asks the scheduler whether the current session can send or receive; if it cannot, the application moves on to the next session. This is similar to the select operation on I/O descriptors. The scheduling module uses the rtpscheduler data structure, as shown in Figure 2-4.
The list member stores all sessions to be processed. The r, w, and e members have meanings similar to those in select: they stand for receiving, sending, and exceptions respectively. The scheduling module is implemented in posixtimer.c, rtptimer.c, scheduler.c, sessionset.c, and related files. Data is actually sent and received at the lowest layer through the socket interface, which is implemented in rtpsession_inet.c. To make porting to different platforms easier, ortp wraps the operating system interfaces it needs, including the creation and destruction of tasks, condition variables, mutexes, and an inter-process pipe communication mechanism. These wrappers are implemented in port.c.
In addition to the operating system interfaces, ortp implements some data structures for internal use. One is a doubly linked list in utils.c; the other is the queue in str_utils.c. The linked list is relatively simple, while the queue is more complex. The queue data structure consists of a queue header, message blocks, and data blocks, as shown in Figure 2-5.
From left to right, the figure shows the queue header, the message blocks, and the data blocks. The queue header points to the message blocks, which can form a doubly linked list; these are the basic elements of the queue. A message block does not contain a buffer itself: the data is stored in a separate data block that the message block points to. The figure shows the initial state. The read and write pointers of the message block point to the start of the data block's buffer, and the base and lim pointers of the data block point to the start and end addresses of the buffer space respectively. Figure 2-6 shows how the state changes after data is written to and read from the buffer.
Besides adding message blocks to the queue, this design also supports chaining a new message block onto an existing message block, which allows one message to hold more data, as shown in Figure 2-7.
The b_cont pointer of the message block links to the next message block.
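The relationship between message blocks and data blocks can be sketched as follows. This is a simplified model written for illustration: the type and function names (data_block, msg_block, msg_init, msg_write, msg_total_size) and the field names are assumptions modeled on the description above, not the definitions from the ortp source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical data block: owns the bounds of the buffer. */
typedef struct data_block {
    unsigned char *base; /* first byte of the buffer */
    unsigned char *lim;  /* one past the last byte of the buffer */
} data_block;

/* Hypothetical message block: no buffer of its own, only pointers. */
typedef struct msg_block {
    struct msg_block *prev, *next; /* neighbours in the queue */
    struct msg_block *cont;        /* continuation of the same message */
    data_block *data;
    unsigned char *rptr; /* next byte to read */
    unsigned char *wptr; /* next byte to write */
} msg_block;

/* Initial state: read and write pointers both at the start of the buffer. */
static void msg_init(msg_block *m, data_block *d, unsigned char *buf, size_t size)
{
    d->base = buf;
    d->lim = buf + size;
    m->prev = m->next = m->cont = NULL;
    m->data = d;
    m->rptr = m->wptr = d->base;
}

/* Append bytes at the write pointer; returns the number actually written. */
static size_t msg_write(msg_block *m, const unsigned char *src, size_t len)
{
    size_t room = (size_t)(m->data->lim - m->wptr);
    size_t n = len < room ? len : room;
    memcpy(m->wptr, src, n);
    m->wptr += n;
    return n;
}

/* Unread bytes in a message, following the continuation chain. */
static size_t msg_total_size(const msg_block *m)
{
    size_t total = 0;
    for (; m != NULL; m = m->cont) total += (size_t)(m->wptr - m->rptr);
    return total;
}
```

Writing advances the write pointer toward lim, reading advances the read pointer toward the write pointer, and chaining through cont lets one logical message span several buffers.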
Before sending payload data from the upper-layer application, ortp constructs a message block whose data pointer points to the payload, which avoids copying the data. The lower-layer interfaces all work on the message block structure when processing data. Received data is copied from the message block into the user buffer. The functions that parse and process received RTP and RTCP packets are implemented in rtpparse.c and rtcpparse.c. In addition, rtcp.c implements the construction and processing of RTCP packets.
In IP-based audio and video streaming, jitter compensation is an important feature that helps maintain a good user experience. In ortp this work is done by the jitter module; the related data structure is shown in Figure 2-8.
To use jitter compensation, set the enabled variable; to support adaptive compensation, also set the adaptive variable.
Some events that occur during data transmission (such as an SSRC change or the arrival of DTMF data) are handled in ortp through signal tables. A signaltable associates an event type with the callback functions registered for it. Ortp uses signal tables for the following events: ssrc_changed (the SSRC changed), payload_type_changed (the payload type changed), telephone-event_packet (a telephone event packet arrived), telephone-event (a telephone event occurred), timestamp_jump (the timestamp jumped), network_error (a network error occurred), and rtcp_bye (an RTCP BYE packet arrived). You can register a callback handler for each of these events. When the low-level receive function receives an RTP packet, it inspects the packet, and if one of these events has occurred, the registered callbacks are executed. The rtpsignaltable.c file implements the operations on the table: initialization, adding and removing callbacks, and executing callbacks.
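The idea of a table that maps one event type to a set of callbacks can be sketched as below. The names (signal_table, event_cb, signal_table_add, and so on) and the capacity of five callbacks are assumptions made for this illustration and are not taken from rtpsignaltable.c.

```c
#include <stddef.h>

#define MAX_CALLBACKS 5 /* assumed capacity, chosen for the sketch */

/* Hypothetical callback type: receives the session and user data. */
typedef void (*event_cb)(void *session, void *user_data);

/* Hypothetical table for a single event type. */
typedef struct {
    const char *signal_name;           /* e.g. "ssrc_changed" */
    event_cb callbacks[MAX_CALLBACKS]; /* NULL marks a free slot */
    void *user_data[MAX_CALLBACKS];
} signal_table;

static void signal_table_init(signal_table *t, const char *name)
{
    t->signal_name = name;
    for (int i = 0; i < MAX_CALLBACKS; i++) {
        t->callbacks[i] = NULL;
        t->user_data[i] = NULL;
    }
}

/* Register a callback in the first free slot; returns 0, or -1 if full. */
static int signal_table_add(signal_table *t, event_cb cb, void *user_data)
{
    for (int i = 0; i < MAX_CALLBACKS; i++) {
        if (t->callbacks[i] == NULL) {
            t->callbacks[i] = cb;
            t->user_data[i] = user_data;
            return 0;
        }
    }
    return -1;
}

/* Remove the first slot holding cb; returns 0, or -1 if not found. */
static int signal_table_remove(signal_table *t, event_cb cb)
{
    for (int i = 0; i < MAX_CALLBACKS; i++) {
        if (t->callbacks[i] == cb) {
            t->callbacks[i] = NULL;
            t->user_data[i] = NULL;
            return 0;
        }
    }
    return -1;
}

/* Run every registered callback; returns how many were called. */
static int signal_table_emit(const signal_table *t, void *session)
{
    int called = 0;
    for (int i = 0; i < MAX_CALLBACKS; i++) {
        if (t->callbacks[i] != NULL) {
            t->callbacks[i](session, t->user_data[i]);
            called++;
        }
    }
    return called;
}

/* Example handler: counts how often the event fired. */
static void count_event(void *session, void *user_data)
{
    (void)session;
    ++*(int *)user_data;
}
```

In this model the receive path would call signal_table_emit on the table for the event it has detected, and every handler registered for that event runs in turn.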
Ortp processes events using event structures and an event queue. The queue stores the event structs, and each struct stores the data of one event. The related processing is defined in event.c. Telephone events in particular are handled in telephone_event.c, which covers how to construct an RTP packet that carries telephone events, how to add telephone events to the packet, how to send DTMF data, and what to do when such a packet is received. Figure 2-9 shows the composition of a telephone event.
The leftmost structure holds the telephone event data stored in the RTP packet. The detailed information about the telephone event can be reached through the packet pointer. What is finally placed in the event queue is what packet points to.
Before using the RTP library provided by ortp, you need to initialize it. This is implemented in ortp.c. Initialization mainly involves two interfaces: ortp_init, which registers the payload types, and ortp_scheduler_init, which initializes the scheduling task.
III. Timestamp description
1. Description of the timestamp in RTP transmission (this part comes from the Internet)
Timestamp unit: the timestamp used in the RTP protocol is measured in units derived from the sampling frequency rather than in seconds, which makes the timestamp more precise. For example, if the sampling frequency of an audio stream is 8000 Hz, the timestamp unit is 1/8000 of a second.
Timestamp increment: the time difference between two adjacent RTP packets, measured in timestamp units.
Sampling frequency: the number of samples taken per second, for example 8000 Hz for the audio stream above.
Frame rate: the number of frames transmitted or displayed per second, for example 25 frames per second.
RTP does not specify the timestamp granularity; it depends on the payload type. The RTP timestamp is therefore also called a media timestamp, to emphasize that its granularity depends on the signal type. For example, if a voice signal sampled at 8 kHz forms a data block every 20 ms, each data block contains 160 samples (0.02 × 8000 = 160), so the timestamp increases by 160 with each RTP packet.
If the sampling frequency is 90000 Hz, then by the discussion above the timestamp unit is 1/90000 of a second; that is, one second is divided into 90000 time units. If 25 frames are sent per second, how many time units does each frame occupy? The answer is 90000/25 = 3600. Since the timestamp increment is, by definition, the time difference between two adjacent RTP packets, the timestamp increment here is 3600.
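Both worked examples follow the same rule: the increment is the clock rate multiplied by the packet interval in seconds. A small sketch, with a hypothetical helper name ts_increment and the interval passed as a fraction so that no floating point is needed:

```c
#include <stdint.h>

/* Timestamp increment per packet.
   clock_rate is in Hz; the packet interval is num/den seconds.
   ts_increment is a name invented for this sketch. */
static uint32_t ts_increment(uint32_t clock_rate, uint32_t num, uint32_t den)
{
    return (uint32_t)(((uint64_t)clock_rate * num) / den);
}
```

For the audio example, ts_increment(8000, 20, 1000) covers a 20 ms block at 8000 Hz; for the video example, ts_increment(90000, 1, 25) covers one frame at 25 frames per second.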
Calculating the NTP timestamp in RTCP: the number of seconds elapsed since January 1, 1900 is stored in the high 32 bits of the NTP timestamp. The low 32 bits are calculated from the sub-second part of the time. One second is divided into 2^32 parts, so each part lasts about 232 picoseconds. If the current time is X seconds and 232 milliseconds, then 232 milliseconds equal 232,000 microseconds, 232,000,000 nanoseconds, or 232,000,000,000 picoseconds, which is roughly 1,000,000,000 units of 232 picoseconds. In other words, of the 2^32 blocks of about 232 picoseconds that make up the low 32 bits, about 1,000,000,000 are occupied, which is 3b9aca00 in hexadecimal. So when the sub-second part of the current time is 232 milliseconds, the low 32 bits of the NTP timestamp are set to approximately 3b9aca00.
In Linux, a commonly used time value is the number of seconds elapsed since January 1, 1970. In RTCP, adding 83aa7e80 (hexadecimal) to this value gives the number of seconds since January 1, 1900. In decimal the offset is 2208988800, computed as (70*365 + 17) * 24*60*60, where 17 is the number of leap days between 1900 and 1970.
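The conversion from a Unix time (seconds plus microseconds) to the two halves of an NTP timestamp can be sketched as follows. The type ntp_time and the function unix_to_ntp are names made up for this example. The fraction is computed exactly as microseconds × 2^32 / 1,000,000, so for 232 ms it comes out slightly below the 3b9aca00 obtained above, where the unit was rounded down to 232 picoseconds.

```c
#include <stdint.h>

/* Seconds between 1900-01-01 and 1970-01-01 (0x83aa7e80). */
#define NTP_UNIX_OFFSET UINT32_C(2208988800)

/* Hypothetical holder for the two 32-bit halves of an NTP timestamp. */
typedef struct {
    uint32_t seconds;  /* high 32 bits: seconds since 1900 */
    uint32_t fraction; /* low 32 bits: units of 1/2^32 second */
} ntp_time;

/* Convert Unix seconds and microseconds (0..999999) to NTP format. */
static ntp_time unix_to_ntp(uint32_t unix_sec, uint32_t usec)
{
    ntp_time t;
    t.seconds = unix_sec + NTP_UNIX_OFFSET;
    t.fraction = (uint32_t)(((uint64_t)usec << 32) / 1000000u);
    return t;
}
```

Half a second, for instance, maps to a fraction of 0x80000000, which is exactly half of 2^32.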
2. Description of timestamp variables in the code
Many time-recording variables are used in the process of receiving and sending data. With these variables, ortp implements flow control for RTP data. They are all defined in the rtpstream struct, as shown in Figure 3-1 (only the time-related variables are excerpted here).
The meanings of these variables are described together below:
uint32_t snd_time_offset; the scheduler time when the application sent its first timestamp.
uint32_t snd_ts_offset; the first application timestamp sent by the application.
uint32_t snd_rand_offset; a random number added to the user offset to generate the stream timestamp.
uint32_t snd_last_ts; the last timestamp sent on the stream.
The first three variables end in offset and record starting values: the scheduler time offset at the moment the application started sending, the timestamp of the first data the application sent (its own timestamp), and a random number added to that offset. The fourth records the actual timestamp of the most recent data sent on the stream.
uint32_t rcv_time_offset; the scheduler time when the application queried its first timestamp. Querying here means fetching received packets, so this should be the scheduler time at which receiving started.
uint32_t rcv_ts_offset; the first timestamp of the stream, that is, the timestamp value of the first RTP packet that arrived on the stream.
uint32_t rcv_query_ts_offset; the first user timestamp requested by the application, that is, the application's timestamp when it started receiving the stream.
uint32_t rcv_last_ts; the last stream timestamp obtained by the application, that is, the timestamp of the last RTP packet the application received. This is the value carried in the packet, not the application's own time.
uint32_t rcv_last_app_ts; the last application timestamp queried by the application, that is, the application's timestamp when it received the last packet. It is a timestamp that the application itself advances according to the payload type and its sampling rate; it is neither the system time nor the time in the packet.
uint32_t rcv_last_ret_ts; the timestamp of the last returned sample, used only for continuous audio.
There is a difference between receiving and sending. On the receiving side there are three separate times: the current system time, the timestamp recorded in the packet, and the time recorded by the scheduler. On the sending side, the application's current timestamp is the timestamp written into the packet, so these two values are the same.
uint32_t hwrcv_extseq; the last extended sequence number received on the socket.
uint32_t hwrcv_seq_at_last_sr; this variable is set to hwrcv_extseq each time a report packet is sent, so it holds the highest extended sequence number at the time the most recent RTCP report was sent.
uint32_t hwrcv_since_last_sr; incremented by 1 each time an RTP packet is received and reset to zero after an RTCP report is constructed, so it counts the RTP packets received since the previous report.
The packet loss rate can be calculated from these three variables. First, the number of packets lost recently (since the last SR or RR was sent) is hwrcv_extseq - hwrcv_seq_at_last_sr - hwrcv_since_last_sr. What seems odd is that the loss rate is then obtained by dividing this number by hwrcv_since_last_sr, the total number of packets actually received since the last report was sent. Shouldn't the divisor be the number of packets that should have been received (the highest sequence number minus the starting sequence number)?
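The calculation can be sketched as below. The struct recv_stats is a hypothetical reduced struct that reuses the three field names from the text. Note that this sketch deliberately divides by the number of packets expected in the interval, which is the approach given in RFC 3550 (appendix A.3), rather than by the number of packets received as the code discussed above is said to do.

```c
#include <stdint.h>

/* Hypothetical reduced struct; field names follow the text above. */
typedef struct {
    uint32_t hwrcv_extseq;         /* highest extended sequence number */
    uint32_t hwrcv_seq_at_last_sr; /* same value at the last report */
    uint32_t hwrcv_since_last_sr;  /* packets received since then */
} recv_stats;

/* Packets lost since the last report. Clamped at zero, because
   duplicates can make "received" exceed "expected". */
static uint32_t lost_since_last_report(const recv_stats *s)
{
    uint32_t expected = s->hwrcv_extseq - s->hwrcv_seq_at_last_sr;
    if (expected <= s->hwrcv_since_last_sr) return 0;
    return expected - s->hwrcv_since_last_sr;
}

/* Fraction lost as an 8-bit fixed-point number: lost / expected * 256,
   saturated at 255 so that it fits the 8-bit report field. */
static uint8_t fraction_lost(const recv_stats *s)
{
    uint32_t expected = s->hwrcv_extseq - s->hwrcv_seq_at_last_sr;
    uint32_t lost = lost_since_last_report(s);
    if (expected == 0) return 0;
    uint64_t f = ((uint64_t)lost << 8) / expected;
    return (uint8_t)(f > 255 ? 255 : f);
}
```

With 100 packets expected and 90 received, 10 are lost and the fraction is 10 × 256 / 100 = 25 (integer division), which corresponds to roughly 10 percent.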
The cumulative number of packets lost is obtained by adding up the losses calculated for each report.
uint32_t last_rcv_sr_ts; the NTP timestamp of the last received SR, taken as the middle 32 bits of that timestamp. This value is also the source of the LSR field in the report packet.
struct timeval last_rcv_sr_time; the system time at which the last SR was received. When the current report packet is sent, the system time is read again, the saved time is subtracted from it, and the difference is multiplied by 65536 to obtain a delay value in units of 1/65536 of a second.
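Both values can be sketched in a few lines. The function names lsr_from_ntp and delay_since_last_sr are invented for this example; the arithmetic follows the description above (middle 32 bits of the 64-bit NTP timestamp, and elapsed time scaled to units of 1/65536 of a second).

```c
#include <stdint.h>
#include <sys/time.h>

/* Middle 32 bits of a 64-bit NTP timestamp: the low 16 bits of the
   seconds part followed by the high 16 bits of the fraction part. */
static uint32_t lsr_from_ntp(uint32_t ntp_seconds, uint32_t ntp_fraction)
{
    return (ntp_seconds << 16) | (ntp_fraction >> 16);
}

/* Delay between receiving the last SR and now, in units of 1/65536 s.
   A negative difference (clock stepped backwards) is treated as zero. */
static uint32_t delay_since_last_sr(const struct timeval *last_sr,
                                    const struct timeval *now)
{
    int64_t usec = (int64_t)(now->tv_sec - last_sr->tv_sec) * 1000000
                 + (int64_t)(now->tv_usec - last_sr->tv_usec);
    if (usec < 0) usec = 0;
    return (uint32_t)((usec * 65536) / 1000000);
}
```

A delay of 1.5 seconds, for example, becomes 1.5 × 65536 = 98304.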
uint16_t snd_seq; the send sequence number. It is incremented to keep track of the session's sequence number as packets are sent.
uint32_t last_rtcp_report_snt_r; the time the last RTCP report was sent, in receive timestamp units. In the code this value is updated from rcv_last_app_ts, the application timestamp of the last RTP receive operation. Is it updated regardless of whether a packet was actually received?
uint32_t last_rtcp_report_snt_s; the time the last RTCP report was sent, in send timestamp units. In the code this value is updated from snd_last_ts, the timestamp the application advances on each RTP send operation. Is it updated regardless of whether an RTCP report was actually sent?
uint32_t rtcp_report_snt_interval; the interval between RTCP reports, in timestamp units. The code computes it as the default interval of 5 seconds multiplied by the payload clock rate. Is this calculation too simple?
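The interval calculation and the resulting "is a report due" check can be sketched as follows. The macro and function names are hypothetical; the only facts taken from the text are the 5-second default and the multiplication by the clock rate. The comparison uses unsigned subtraction so that it keeps working when the 32-bit timestamp wraps around.

```c
#include <stdint.h>

#define RTCP_DEFAULT_INTERVAL_SEC 5u /* default interval from the text */

/* Report interval in timestamp units for a given clock rate in Hz. */
static uint32_t rtcp_interval_ts(uint32_t clock_rate)
{
    return RTCP_DEFAULT_INTERVAL_SEC * clock_rate;
}

/* Nonzero when at least interval_ts units have passed since the last
   report. Unsigned subtraction handles timestamp wrap-around. */
static int rtcp_report_due(uint32_t now_ts, uint32_t last_report_ts,
                           uint32_t interval_ts)
{
    return (uint32_t)(now_ts - last_report_ts) >= interval_ts;
}
```

For an 8000 Hz audio stream the interval is 40000 timestamp units, that is, one report every 250 packets of 160 samples.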
uint32_t last_rtcp_packet_count; the total number of RTP packets sent by the sender, as recorded in the last RTCP SR packet. It is kept in order to implement the protocol rule: when the next RTCP packet is due, if any RTP packets have been sent since the previous RTCP packet, an RTCP SR is sent; otherwise only an RTCP RR needs to be sent.
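The rule reduces to comparing the current packet count with the count saved at the previous report. A minimal sketch, with an enum and function name invented for illustration:

```c
#include <stdint.h>

/* Hypothetical result type for the decision. */
typedef enum { RTCP_SEND_RR, RTCP_SEND_SR } rtcp_report_kind;

/* Send an SR if RTP packets were sent since the previous RTCP report,
   otherwise an RR. The comparison uses "!=" so that it still works
   if the 32-bit packet counter wraps around. */
static rtcp_report_kind choose_report(uint32_t packets_sent_now,
                                      uint32_t last_rtcp_packet_count)
{
    return packets_sent_now != last_rtcp_packet_count ? RTCP_SEND_SR
                                                      : RTCP_SEND_RR;
}
```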
uint32_t sent_payload_bytes; the source of the sender's payload byte count reported in RTCP. It holds the total number of payload bytes sent from the start of transmission until the RTCP report is sent, excluding headers and padding.
The preceding time-related variables are used in RTCP packets.
unsigned int sent_bytes; used for bandwidth estimation.
struct timeval send_bw_start; these two variables are used to calculate the send bandwidth: send_bw_start records the start time and sent_bytes records the number of bytes sent, which is updated after the RTP API is called to send data. After a bandwidth value has been computed, sent_bytes is reset to zero and the next estimate begins.
unsigned int recv_bytes; struct timeval recv_bw_start; the same as above: their role and processing logic mirror those of the sending side.
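The bandwidth bookkeeping described above amounts to a byte counter plus a start time. The following is a sketch under that reading; bw_meter and its functions are hypothetical names, and the result is expressed in bits per second, which is an assumption about the unit.

```c
#include <sys/time.h>

/* Hypothetical meter: plays the role of sent_bytes plus send_bw_start
   (or recv_bytes plus recv_bw_start on the receiving side). */
typedef struct {
    unsigned int bytes;
    struct timeval start;
} bw_meter;

/* Begin a new measurement period at time "now". */
static void bw_reset(bw_meter *m, const struct timeval *now)
{
    m->bytes = 0;
    m->start = *now;
}

/* Account for n bytes sent or received. */
static void bw_add(bw_meter *m, unsigned int n)
{
    m->bytes += n;
}

/* Bandwidth of the current period in bits per second; afterwards the
   counter is cleared and the next period starts at "now". */
static double bw_compute(bw_meter *m, const struct timeval *now)
{
    double elapsed = (double)(now->tv_sec - m->start.tv_sec)
                   + 1e-6 * (double)(now->tv_usec - m->start.tv_usec);
    double bps = elapsed > 0.0 ? (8.0 * m->bytes) / elapsed : 0.0;
    bw_reset(m, now);
    return bps;
}
```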
IV. Scheduling implementation
To use the ortp scheduling function, you must call ortp_scheduler_init when initializing the ortp library in order to initialize the scheduling module. This interface creates a struct of type rtpscheduler named __ortp_scheduler (see Figure 2-4) and calls rtp_scheduler_init to initialize it.
In rtp_scheduler_init, the timer posix_timer (a struct of type rtptimer, see Figure 2-4) is assigned and attached to the scheduler struct, with the initial timer interval set to posixtimer_interval. The remaining parts of __ortp_scheduler are then initialized, including the mutex and the condition variables. Throughout the life of the scheduling module, all related operations revolve around this struct; __ortp_scheduler is defined as a global variable.
After initialization, rtp_scheduler_start is called to start the scheduling task. The body of the scheduling task is rtp_scheduler_schedule, and its parameter is the scheduler struct itself.
When the scheduling task starts running, it first initializes the timer: the timer is set to the running state and the current system time is saved. The task then enters its while loop and traverses all sessions registered with the scheduler. If the session list is not empty, the application has asked for sessions to be scheduled and managed, and rtp_session_process is called for each of them. After all such sessions have been processed, the scheduler broadcasts on the condition variable unblock_select_cond to wake up every task sleeping in select, so that each task can check whether its sessions need processing (this is explained later). At this point the scheduler has finished its work for the current round and prepares to sleep, while the other tasks check the mask results to decide whether to send and receive data or to wait for the next round.
The scheduler sleeps by calling the timer's timer_do interface, which is posix_timer_do. This function reads the current system time and computes its difference from the initial start time (saved during initialization), converted to milliseconds. posix_timer_time records when the timer should next expire. Each time, the difference between the current system time and the start time is subtracted from posix_timer_time. If the result is greater than zero, the scheduling point has not yet arrived, so select is called to wait for that long (posix_timer_time minus the difference); afterwards the current system time is read again and a new difference is computed. The flowchart is shown in Figure 4-1.
Intuitively, the precision of the scheduler is determined by posixtimer_interval. If processing the session set in one round takes longer than this interval, the next round starts immediately. If the interval is not used up, the remaining time, diff, is consumed by the select system call. The points in time at which the scheduler runs are therefore essentially fixed, while diff varies from round to round with the time spent processing the session set.
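The timing logic can be modeled without any system calls. In the sketch below, sched_timer and its two functions are hypothetical; elapsed_ms stands for the difference between the current system time and the start time, and the step that moves the expiry forward by one interval after each round is an assumption that follows from the fixed scheduling points described above.

```c
#include <stdint.h>

/* Hypothetical model of the timer state, in milliseconds. */
typedef struct {
    int32_t interval_ms;   /* plays the role of posixtimer_interval */
    int32_t timer_time_ms; /* plays the role of posix_timer_time:
                              next expiry, relative to the start time */
} sched_timer;

/* How long the scheduler still has to sleep before the next scheduling
   point. Zero means the point has already passed: run immediately. */
static int32_t timer_remaining_ms(const sched_timer *t, int32_t elapsed_ms)
{
    int32_t diff = t->timer_time_ms - elapsed_ms;
    return diff > 0 ? diff : 0;
}

/* Move the expiry to the next scheduling point (assumed behaviour). */
static void timer_advance(sched_timer *t)
{
    t->timer_time_ms += t->interval_ms;
}
```

In a real loop the value returned by timer_remaining_ms would be handed to a sleep primitive such as select with a timeout, after which the elapsed time is measured again.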
In each round the scheduler checks, at an essentially fixed point in time, all sessions that need to be managed, that is, all sessions the application has added to the session set. If processing these sessions takes longer than the scheduler's default interval, the scheduler goes straight on to the next round when it finishes; otherwise it waits until the next scheduling point arrives.
The scheduler checks each session through the rtp_session_process interface. For one session, the logic is as follows. First, it examines the waitpoint structure of the session's sending part and compares the time stored there (the wake-up time set by the send or receive interface) with the scheduler's current time. If the session is waiting to be woken up and the wake-up point has been reached (the scheduler's current time has passed it), the waiting flag is cleared, the session's mask bit is set in the w_sessions set of the scheduler struct (the global variable created during scheduler initialization), and the waiting task is woken up through the condition variable. The same check is then made for the receiving part and the r_sessions set. In short, the scheduler checks whether the wake-up point set for each session has arrived; if so, it wakes the session up and sets its mask bit in the corresponding set. The sending and receiving tasks then check the mask to see whether they may continue. Once the application has sent or received, it clears these mask bits again, so it must wait for the scheduler to check and set them the next time before sending or receiving.
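The per-session check can be sketched as follows. The struct wait_point and both functions are simplified, hypothetical stand-ins; the condition-variable wake-up is left out, and the ready set is reduced to a single 32-bit mask.

```c
#include <stdint.h>

/* Hypothetical wait point: when to wake up, and whether anyone waits. */
typedef struct {
    uint32_t wake_time; /* scheduler time at which to wake the session */
    int wakeup;         /* nonzero while a wake-up is pending */
} wait_point;

/* Wrap-safe test for "now is at or after target" on 32-bit times. */
static int time_reached(uint32_t now, uint32_t target)
{
    return (uint32_t)(now - target) < 0x80000000u;
}

/* If a wake-up is pending and its time has come, clear the pending
   flag, set the session's bit in the ready mask, and return 1.
   Otherwise leave everything unchanged and return 0. */
static int wait_point_check(wait_point *wp, uint32_t sched_time,
                            uint32_t *ready_mask, unsigned bit)
{
    if (wp->wakeup && time_reached(sched_time, wp->wake_time)) {
        wp->wakeup = 0;
        *ready_mask |= (uint32_t)1 << bit;
        return 1;
    }
    return 0;
}
```

The scheduler would run this check once for the sending wait point against the send mask and once for the receiving wait point against the receive mask.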
Upper-layer applications add a session to the scheduler by calling rtp_session_set_scheduling_mode. This first obtains the global scheduler struct and points the session's sched pointer at it. It then adds rtp_session_scheduled to the session flags, which places the session under the scheduler's management. Finally it calls rtp_scheduler_add_session to register the session in the session set managed by the scheduler. In rtp_scheduler_add_session, the session is first linked into the session list of the scheduler struct (the scheduler takes the sessions to be processed from this list on every loop). The function then finds a free position in the all_sessions set, records that mask position in the session, and sets the corresponding bit in the set. In this way the scheduler can find the sessions to be scheduled through the session list and use the mask position recorded in each session to set that session's bit in a set. Similarly, the interface for removing a session is rtp_scheduler_remove_session: it finds the session in the session list, unlinks it, and clears its bit in the set.
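The step of finding a free position in the set and recording it can be sketched with a plain bit array. The type session_mask, the capacity of 64, and the function names are assumptions for this illustration; the linked-list handling is omitted.

```c
#include <stdint.h>

#define MAX_SESSIONS 64 /* assumed capacity for the sketch */

/* Hypothetical bit set: one bit per session position. */
typedef struct {
    uint32_t words[MAX_SESSIONS / 32];
} session_mask;

static int mask_is_set(const session_mask *m, int pos)
{
    return (int)((m->words[pos / 32] >> (pos % 32)) & 1u);
}

/* Claim the first free position; returns it, or -1 when the set is
   full. *all_max tracks the highest position handed out so far. */
static int mask_alloc(session_mask *all, int *all_max)
{
    for (int pos = 0; pos < MAX_SESSIONS; pos++) {
        if (!mask_is_set(all, pos)) {
            all->words[pos / 32] |= (uint32_t)1 << (pos % 32);
            if (pos > *all_max) *all_max = pos;
            return pos;
        }
    }
    return -1;
}

/* Give a position back when the session is removed. */
static void mask_release(session_mask *all, int pos)
{
    all->words[pos / 32] &= ~((uint32_t)1 << (pos % 32));
}
```

The position returned by mask_alloc is what a session would store as its mask position, so that later checks can address its bit directly.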
The upper-layer application determines whether data can be sent or received by checking session sets. First, the application calls session_set_new to create a new set; this interface allocates a sessionset struct and initializes it, and all subsequent operations work on this struct. For each session to be scheduled, the application calls session_set_set to set its mask bit in the set to 1, marking it. Before receiving or sending, the application calls session_set_select to check whether it can proceed. This call suspends the caller until an event arrives. session_set_select resembles the familiar select system call and is used in a similar way.
session_set_select is the key interface between applications and the scheduler. Let's look at its implementation:
First, ortp_get_scheduler is called to obtain the global scheduler struct, and a while (1) loop is entered.
If the receive set is not empty (that is, the caller wants to check for receive events), session_set_init is called to initialize a temporary result set and session_set_and is called to check the session sets. The check works on three quantities. The first is the scheduler's r_sessions set, set up for receive detection when sessions are added to the scheduler; it represents the sessions the scheduler has marked as ready. The second is the set the user passes to the select call; it represents the sessions the application wants to process. The third is all_max, the highest mask position among the currently scheduled sessions; mask bits are checked from low to high up to all_max. A set is implemented as an array in which each bit of each element represents one session. With all_max as the upper limit, the bits of the scheduler's receive set and the user's set are ANDed together (note that the bits in the receive set are set by the scheduler, and a set bit means that a receive event has occurred for that session). The result contains exactly the sessions that the scheduler has marked as ready to receive and that the application also asked about; it is stored in the result set. At the same time, the bits copied into the result set are cleared in the receive set, because they have now been consumed. session_set_and returns the number of bits set in the result set, that is, the number of sessions that can be processed. If this value is greater than zero, session events have arrived, and the result is copied back into the user's set to tell the user which sessions have events.
The same processing is applied to the send set and the error set. If any of the three sets has an event after processing (receive, send, or error), the function returns immediately. Otherwise it waits on the condition variable until the scheduler signals an event.
It then jumps back to the top of the while (1) loop for the next iteration.
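The AND step at the heart of this loop can be sketched as follows. The type sess_set and the function sess_set_and are simplified stand-ins written for this example, not the actual sessionset.c code; the set is fixed at two 32-bit words.

```c
#include <stdint.h>

#define SET_WORDS 2 /* assumed size: 64 session positions */

/* Hypothetical session set: each bit represents one session. */
typedef struct {
    uint32_t bits[SET_WORDS];
} sess_set;

/* Intersect the scheduler's ready set with the user's set, checking
   positions 0..max_pos. Matching bits go into *result and are cleared
   from the scheduler's set because they have been consumed.
   Returns the number of bits set in *result. */
static int sess_set_and(sess_set *sched, int max_pos,
                        const sess_set *user, sess_set *result)
{
    int count = 0;
    for (int w = 0; w < SET_WORDS; w++) result->bits[w] = 0;
    for (int pos = 0; pos <= max_pos && pos < SET_WORDS * 32; pos++) {
        int w = pos / 32;
        uint32_t bit = (uint32_t)1 << (pos % 32);
        if ((sched->bits[w] & bit) && (user->bits[w] & bit)) {
            result->bits[w] |= bit;
            sched->bits[w] &= ~bit;
            count++;
        }
    }
    return count;
}
```

A select-style loop would call this once each for the receive, send, and error sets, return if any count is greater than zero, and otherwise wait on the condition variable before trying again.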
In addition to session_set_select, ortp also provides a select interface with a timeout: session_set_timedselect. It lets the caller set a time limit after which the call returns, instead of blocking indefinitely as session_set_select does.
From the interaction between the application and the scheduler, we can see that the precision of the scheduler (its scheduling interval) affects the data receiving speed to some extent. If a session cannot send or receive during the current check, it has to wait for the next scheduling point. Even if the data arrives right after the current check, the application learns about it only after the scheduler's next check. Consequently, if the scheduling interval is too large, receiving inevitably slows down.
When sending and receiving data, an application can use the scheduler to manage sessions, and it can choose between blocking and non-blocking mode. The two settings are related as follows. If the scheduler is used, either mode works: the scheduler can operate in blocking or non-blocking mode. If you want blocking mode, however, the scheduler must be started, because the blocking behavior itself is implemented on top of the scheduler. With the scheduler running in non-blocking mode, when data cannot be sent or received the upper-layer task does not wait and can go on to other application-level work. With the scheduler running in blocking mode, when data cannot be sent or received the upper-layer task waits on the condition variable and can continue only after the scheduler signals it. Therefore, if the upper-layer application opens several sending or receiving ports, in non-blocking mode the ports that cannot currently send or receive are skipped and the others proceed; if none is ready, the loop simply spins idle. In blocking mode, once one port blocks, no other port can send or receive: the task must wait until the scheduler signals an event on the blocked port before it can serve the others. Blocking mode should therefore not be used in an application that handles multiple sending and receiving streams.
In non-blocking mode, the application's waiting time is spent inside session_set_select. In blocking mode, the application may block inside the send and receive interfaces.
There is a problem with the current library: enabling blocking mode while the scheduler is in use causes the program to hang. The specific symptom is that, in blocking mode, the packet time is later than the scheduler time (schedtime) when the packet is sent, yet the scheduler does not wake the sender when it checks, because by then the wake-up point is already older than the scheduler time. The root cause is that, while blocked, the sender waits for the scheduler to run, and by the time it does, the scheduler time has passed the packet time. In non-blocking mode the packet is sent directly; at the next send, that is, during the next select wait, the scheduler catches up with the packet's send time and wakes the sender. In blocking mode, by the next select the scheduler has already gone past the packet's send time.
Figure 4-2 shows the relationship between the scheduler and the application.
The scheduler checks the session set, wakes up the receive streams whose time has come, and sets their mask bits. The application checks the mask bit to see whether the receive stream has been woken up and then receives from it. The mask bit set by the scheduler is cleared during the receive operation.
Ortp usage 1