Libjingle source code analysis (5) - [Pseudo TCP] TCP over UDP (3): processing bulk data streams


Processing Bulk Data Streams with Pseudo TCP

The previous article covered how TCP and PseudoTcp handle interactive data streams. This article looks at the other kind of traffic, bulk data streams. Bulk data streams use the sliding window protocol and the slow start algorithm to control the flow of data.

Sliding Window

The sliding window lets the sender transmit several segments in a row before it has to stop and wait for an acknowledgement, so it does not stop and wait after every send, which speeds up data transfer. Does this conflict with the Nagle algorithm? No, because bulk data segments are sent at full size: under the Nagle algorithm, when both the amount of data to send and the window are at least one MSS, the data is sent immediately.
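To make that rule concrete, here is a minimal sketch of a Nagle-style send check (an illustration under assumed names such as nagle_allows_send, pending_bytes, in_flight and window, not PseudoTcp's actual code): a full-sized segment that fits in the window may go out immediately, and so may any data when nothing is in flight.

#include <cstdint>

// Sketch of a Nagle-style send decision (hypothetical helper, not part of
// PseudoTcp). Returns true when the pending data may be transmitted now.
bool nagle_allows_send(uint32_t pending_bytes,  // data waiting in the send buffer
                       uint32_t in_flight,      // bytes sent but not yet acknowledged
                       uint32_t window,         // usable window (min of advertised window and cwnd)
                       uint32_t mss) {
  if (pending_bytes >= mss && window >= mss)
    return true;   // a full segment fits in the window: send immediately
  if (in_flight == 0)
    return true;   // nothing outstanding: even a small segment may be sent
  return false;    // otherwise wait, so small segments can coalesce
}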

If the sender transmits continuously, packets are easily lost, especially when a fast sender talks to a slow receiver. While the receiver is still processing earlier data, back-to-back segments from the sender fill up the receiver's buffer and everything after that is dropped. To reduce this kind of loss on the network, a mechanism is needed to limit how much the sender may transmit.

That mechanism is the sliding window, illustrated below:


The sliding window divides the sequence space into four parts (see the sketch after this list):

Segments 1 ~ 3 have been sent and acknowledged.

Segments 4 ~ 6 have been sent but not yet acknowledged.

Segments 7 ~ 9 are the usable window, i.e. the sliding window: the space the sender may still fill.

Segment 10 and beyond cannot be sent yet.
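As a quick illustration of these four regions, the sketch below (standalone code whose field names are only loosely modeled on PseudoTcp's m_snd_una, m_snd_nxt and m_snd_wnd members) reproduces the 1 ~ 3 / 4 ~ 6 / 7 ~ 9 / 10+ layout from the list above:

#include <cstdint>
#include <cstdio>

// Sender-side view of the sliding window, expressed with byte sequence numbers.
struct SenderWindow {
  uint32_t snd_una;  // oldest byte sent but not yet acknowledged
  uint32_t snd_nxt;  // next byte to be sent
  uint32_t wnd;      // window advertised by the receiver
};

int main() {
  SenderWindow w = {4, 7, 6};  // bytes 1..3 acked, 4..6 in flight, window of 6 bytes

  printf("sent and acknowledged : 1..%u\n", w.snd_una - 1);                      // 1..3
  printf("sent, unacknowledged  : %u..%u\n", w.snd_una, w.snd_nxt - 1);          // 4..6
  printf("usable window         : %u..%u\n", w.snd_nxt, w.snd_una + w.wnd - 1);  // 7..9
  printf("cannot send yet       : %u and beyond\n", w.snd_una + w.wnd);          // 10+
  return 0;
}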


As the receiver acknowledges data, both edges of the sliding window keep moving to the right.

Window closing: as sent data is acknowledged, the left edge of the window moves to the right.

Window opening: as the receiver reads acknowledged data and frees buffer space, the right edge moves to the right.

Window shrinking: if the receiver's buffer becomes smaller, the right edge moves to the left; this is strongly discouraged.

The window size is updated as the window slides: after consuming received data, the receiver recalculates the free space in its receive buffer and advertises the new size to the sender. If the advertised window is 0, the sender must stop transmitting until a nonzero window is advertised again; this effectively prevents segments from being dropped because the receiver's buffer is full.
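The zero-window interaction can be summarized in a small sketch (hypothetical helpers, not the PseudoTcp call flow, which appears in the Recv and attemptSend excerpts later in this article):

#include <algorithm>
#include <cstdint>

// Receiver side: the window to advertise is simply the free space left in
// the receive buffer; it drops to 0 when the buffer is full.
uint32_t advertised_window(uint32_t buffer_capacity, uint32_t buffered_bytes) {
  return buffer_capacity - buffered_bytes;
}

// Sender side: how much may be sent right now. With a zero advertised
// window this is 0, so the sender stays blocked until the receiver
// advertises space again.
uint32_t usable_window(uint32_t advertised_wnd, uint32_t cwnd, uint32_t in_flight) {
  uint32_t window = std::min(advertised_wnd, cwnd);
  return (in_flight < window) ? (window - in_flight) : 0;
}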

How does PseudoTcp implement this?

PseudoTcp uses m_rbuf_len for the size of the receive buffer. If the buffer is no larger than 65535 bytes, m_rwnd_scale is 0 and m_rcv_wnd holds the window size directly. If the buffer is larger than 65535 bytes, the algorithm below adjusts m_rbuf_len and m_rwnd_scale; after the adjustment, the window size m_rcv_wnd is updated from the space still available in the buffer. Why 65535 as the limit? Because the window field in the PseudoTcp header is 16 bits, so it can only represent values from 0 to 65535 (inclusive).

void PseudoTcp::resizeReceiveBuffer(uint32 new_size) {
  uint8 scale_factor = 0;

  // Handle buffers larger than 65535 bytes: compute a scale factor so
  // that the scaled window still fits in the 16-bit window field.
  while (new_size > 0xFFFF) {
    ++scale_factor;
    new_size >>= 1;
  }
  new_size <<= scale_factor;

  bool result = m_rbuf.SetCapacity(new_size);  // resize the receive buffer
  m_rbuf_len = new_size;                       // update the buffer size
  m_rwnd_scale = scale_factor;                 // update the window scale factor
  m_ssthresh = new_size;

  size_t available_space = 0;
  m_rbuf.GetWriteRemaining(&available_space);
  m_rcv_wnd = available_space;                 // update the advertised window size
}

During the three-way handshake, PseudoTcp uses the TCP_OPT_WND_SCALE option to tell the peer its m_rwnd_scale value.

void PseudoTcp::queueConnectMessage() {
  talk_base::ByteBuffer buf(talk_base::ByteBuffer::ORDER_NETWORK);
  buf.WriteUInt8(CTL_CONNECT);
  if (m_support_wnd_scale) {            // is the window scale option enabled?
    buf.WriteUInt8(TCP_OPT_WND_SCALE);  // append the window scale option
    buf.WriteUInt8(1);
    buf.WriteUInt8(m_rwnd_scale);       // window scale factor
  }
  m_snd_wnd = buf.Length();
  queue(buf.Data(), buf.Length(), true);
}

After PseudoTcp receives the control packet carrying the window scale factor, it parses it with the parseOptions method, as follows:

void PseudoTcp::parseOptions(const char* data, uint32 len) {
  std::set<uint8> options_specified;
  talk_base::ByteBuffer buf(data, len);

  while (buf.Length()) {
    uint8 kind = TCP_OPT_EOL;
    buf.ReadUInt8(&kind);

    if (kind == TCP_OPT_EOL) {
      // End of the option list.
      break;
    } else if (kind == TCP_OPT_NOOP) {
      // No-op option.
      continue;
    }

    // Length of this option.
    UNUSED(len);
    uint8 opt_len = 0;
    buf.ReadUInt8(&opt_len);

    // Content of this option.
    if (opt_len <= buf.Length()) {
      applyOption(kind, buf.Data(), opt_len);  // apply the option's value
      buf.Consume(opt_len);
    } else {
      return;
    }

    options_specified.insert(kind);
  }

  if (options_specified.find(TCP_OPT_WND_SCALE) == options_specified.end()) {
    if (m_rwnd_scale > 0) {
      // The peer does not support window scaling: if the local buffer is
      // larger than 65535 bytes, fall back to the default 60 KB buffer.
      // Window scaling (m_swnd_scale) is only used when both ends support it.
      resizeReceiveBuffer(DEFAULT_RCV_BUF_SIZE);
      m_swnd_scale = 0;
    }
  }
}

The receiver adjusts the window size as follows:

Window closing: when the receiver accepts data into its buffer, it subtracts the amount of buffer space that data consumes from the window size.

bool PseudoTcp::process(Segment& seg) {
  ......
  uint32 nOffset = seg.seq - m_rcv_nxt;
  talk_base::StreamResult result =
      m_rbuf.WriteOffset(seg.data, seg.len, nOffset, NULL);
  ASSERT(result == talk_base::SR_SUCCESS);
  UNUSED(result);

  if (seg.seq == m_rcv_nxt) {
    // The segment just received is exactly the next one expected.
    m_rbuf.ConsumeWriteBuffer(seg.len);  // consume space in the receive buffer
    m_rcv_nxt += seg.len;                // advance the next expected sequence number
    m_rcv_wnd -= seg.len;                // shrink the window by the consumed amount
    bNewData = true;

    RList::iterator it = m_rlist.begin();
    while ((it != m_rlist.end()) && (it->seq <= m_rcv_nxt)) {
      if (it->seq + it->len > m_rcv_nxt) {
        sflags = sfImmediateAck;  // (fast recovery)
        uint32 nAdjust = (it->seq + it->len) - m_rcv_nxt;
        // A segment received earlier already contains the next expected
        // sequence number: consume the buffered data, advance m_rcv_nxt,
        // and shrink the window accordingly.
        m_rbuf.ConsumeWriteBuffer(nAdjust);
        m_rcv_nxt += nAdjust;
        m_rcv_wnd -= nAdjust;
      }
      it = m_rlist.erase(it);
    }
  } else {
    // Not the segment expected next, but still a valid one: insert it into
    // the list of received segments so it can be reassembled when the
    // missing segment arrives.
    RSegment rseg;
    rseg.seq = seg.seq;
    rseg.len = seg.len;
    RList::iterator it = m_rlist.begin();
    while ((it != m_rlist.end()) && (it->seq < rseg.seq)) {
      ++it;
    }
    m_rlist.insert(it, rseg);
  }
  ......
}

Window opening: when the application layer calls Recv to read the data PseudoTcp has received, that data is removed from the receive buffer, freeing space, and the window grows.

int PseudoTcp::Recv(char* buffer, size_t len) {
  ......
  talk_base::StreamResult result = m_rbuf.Read(buffer, len, &read, NULL);
  ......
  size_t available_space = 0;
  m_rbuf.GetWriteRemaining(&available_space);  // free space left in the receive buffer

  if (uint32(available_space) - m_rcv_wnd >=
      talk_base::_min<uint32>(m_rbuf_len / 2, m_mss)) {
    bool bWasClosed = (m_rcv_wnd == 0);  // !?! Not sure about this was closed business
    m_rcv_wnd = available_space;         // update the window size: this is the window opening

    if (bWasClosed) {
      // The window just went from 0 to available_space: notify the peer
      // immediately so it can resume sending data.
      attemptSend(sfImmediateAck);
    }
  }
  return read;
}

Advertising the window size to the peer:

IPseudoTcpNotify::WriteResult PseudoTcp::packet(uint32 seq, uint8 flags,
                                                uint32 offset, uint32 len) {
  ASSERT(HEADER_SIZE + len <= MAX_PACKET);
  uint32 now = Now();
  uint8 buffer[MAX_PACKET];
  long_to_bytes(m_conv, buffer);
  long_to_bytes(seq, buffer + 4);
  long_to_bytes(m_rcv_nxt, buffer + 8);
  buffer[12] = 0;
  buffer[13] = flags;
  // Apply the window scale factor before writing the 16-bit window field.
  short_to_bytes(static_cast<uint16>(m_rcv_wnd >> m_rwnd_scale), buffer + 14);
  ......
}

After receiving the window size advertised by the receiver, the sender subtracts the amount of data already sent but not yet acknowledged to work out how much it may still send.

void PseudoTcp::attemptSend(SendFlags sflags) {
  ......
  uint32 nWindow = talk_base::_min(m_snd_wnd, cwnd);   // receiver's advertised window vs. cwnd
  uint32 nInFlight = m_snd_nxt - m_snd_una;             // data sent but not yet acknowledged
  uint32 nUseable = (nInFlight < nWindow) ? (nWindow - nInFlight) : 0;  // amount that may still be sent
  ......
}

Slow Start

When there are several routers and slower links between the sender and the receiver, the intermediate routers must queue packets. If the sender injects many segments at once, it can fill those queues, which severely reduces TCP throughput.

TCP uses the slow start algorithm to solve this: the congestion window cwnd starts at one segment, and each time the sender receives an ACK the congestion window grows by one segment. The sender may transmit at most the minimum of the congestion window and the advertised window. The congestion window is flow control applied by the sender, while the advertised window is flow control applied by the receiver.

The sender first transmits one segment. When its ACK arrives, cwnd becomes 2 and two segments may be sent; when those two ACKs arrive, cwnd becomes 4 and four segments may be sent, and so on. Slow start therefore grows the window exponentially.
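A tiny standalone simulation (not PseudoTcp code) shows this exponential growth, counting cwnd in segments and assuming every segment sent in a round trip is acknowledged:

#include <cstdint>
#include <cstdio>

// Toy slow start simulation: each ACK grows cwnd by one segment, so cwnd
// doubles every round trip while all segments are acknowledged.
int main() {
  uint32_t cwnd = 1;  // congestion window, in segments
  for (int rtt = 1; rtt <= 5; ++rtt) {
    printf("RTT %d: send %u segment(s)\n", rtt, cwnd);
    uint32_t acks = cwnd;  // assume every segment of this round trip is ACKed
    cwnd += acks;          // +1 segment per ACK => 1, 2, 4, 8, 16, ...
  }
  return 0;
}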

PseudoTcp implements slow start as follows:

The initial value of cwnd is two MSS; each time a valid ACK is received, cwnd grows by one MSS.

bool PseudoTcp::process(Segment& seg) {
  ......
  // Check if this is a valuable ack
  if ((seg.ack > m_snd_una) && (seg.ack <= m_snd_nxt)) {
    if (m_dup_acks >= 3) {
      ......
    } else {
      m_dup_acks = 0;
      // Slow start, congestion avoidance
      if (m_cwnd < m_ssthresh) {
        m_cwnd += m_mss;  // a valid ACK during slow start adds one MSS to cwnd
      } else {
        m_cwnd += talk_base::_max<uint32>(1, m_mss * m_mss / m_cwnd);
      }
    }
  }
  ......
}

When the sender transmits, the window size is the minimum of the advertised window (m_snd_wnd) and the congestion window (cwnd); subtracting the data already sent but not yet acknowledged gives the amount of data that may currently be sent (nUseable).

void PseudoTcp::attemptSend(SendFlags sflags) {
  ......
  while (true) {
    uint32 cwnd = m_cwnd;
    if ((m_dup_acks == 1) || (m_dup_acks == 2)) {  // limited transmit
      cwnd += m_dup_acks * m_mss;
    }
    uint32 nWindow = talk_base::_min(m_snd_wnd, cwnd);  // minimum of the advertised window and the congestion window
    uint32 nInFlight = m_snd_nxt - m_snd_una;
    uint32 nUseable = (nInFlight < nWindow) ? (nWindow - nInFlight) : 0;  // window minus sent-but-unacknowledged data

    size_t snd_buffered = 0;
    m_sbuf.GetBuffered(&snd_buffered);
    uint32 nAvailable =
        talk_base::_min(static_cast<uint32>(snd_buffered) - nInFlight, m_mss);  // buffered data that can still be sent
    ......
  }
}
