uTorrent Transport Protocol

Reposted from: http://bittorrent.org/beps/bep_0029.html

BEP: 29
Title: uTorrent transport protocol
Version: 11189
Last-Modified: 2010-01-22 01:49:42 +0000 (Fri)
Author: Arvid Norberg <arvid at bittorrent.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Jun-2009
Post-History:

Contents

  uTorrent Transport Protocol
    Credits
    Rationale
    Overview
    Header format
      version
      connection_id
      timestamp_microseconds
      timestamp_difference_microseconds
      wnd_size
      extension
      Selective ACK
      Extension bits
      type
      seq_nr
      ack_nr
    Connection setup
    Packet loss
    Timeouts
    Packet sizes
    Congestion control

Credits

The uTorrent transport protocol was designed by Ludvig Strigeus, Greg Hazel, Stanislav Shalunov, Arvid Norberg and Bram Cohen.

Rationale

The motivation for uTP is for BitTorrent clients to not disrupt Internet connections, while still utilizing the unused bandwidth fully.

The problem is that DSL and cable modems typically have a send buffer disproportional to their max send rate, which can hold several seconds worth of packets. BitTorrent traffic is typically background transfers, and should have lower priority than checking email, phone calls and browsing the web, but when using regular TCP connections BitTorrent quickly fills up the send buffer, adding seconds of delay to all interactive traffic.

The fact that BitTorrent uses multiple TCP connections gives it an unfair advantage when competing with other services for bandwidth, which exaggerates the effect of BitTorrent filling the upload pipe. The reason is that TCP distributes the available bandwidth evenly across connections, so the more connections one application uses, the larger the share of the bandwidth it gets.

The traditional solution to this problem is to cap the upload rate of the BitTorrent client to 80% of the up-link capacity. 80% leaves some headroom for interactive traffic.

The main drawbacks with this solution are:

- The user needs to configure his/her BitTorrent client; it won't work out of the box.
- The user needs to know his/her Internet connection's upload capacity. This capacity may change, especially on laptops that may connect to a large number of different networks.
- The headroom of 20% is arbitrary and wastes bandwidth. Whenever there is no interactive traffic competing with BitTorrent, the extra 20% is wasted. Whenever there is competing interactive traffic, it cannot use more than 20% of the capacity.

uTP solves this problem by using the modem queue size as a controller for its send rate. When the queue grows too large, it throttles back.

This lets it utilize the full upload capacity when there is no competition for it, and it lets it throttle back to virtually nothing when there is a lot of interactive traffic.

Overview

This document assumes some knowledge of how TCP and window based congestion control work.

uTP is a transport protocol layered on top of UDP. As such, it must (and has the ability to) implement its own congestion control.

The main difference compared to TCP is the delay based congestion control. See the congestion control section.

Like TCP, uTP uses window based congestion control. Each socket has a max_window which determines the maximum number of bytes the socket may have in-flight at any given time. Any packet that has been sent, but not yet acked, is considered to be in-flight.

The number of bytes in-flight is cur_window.

A socket may only send a packet if cur_window + packet_size is less than or equal to min(max_window, wnd_size). The packet size may vary, see the packet sizes section.

wnd_size is the advertised window from the other end. It sets an upper limit on the number of bytes in-flight.

An implementation may violate the above rule if the max_window is smaller than the packet size, and it paces the packets so that the average cur_window is less than or equal to max_window.
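The send rule above can be sketched in a few lines of Python. This is only an illustration: the function name can_send is hypothetical, and the pacing exception for a max_window smaller than one packet is deliberately left out.

```python
def can_send(cur_window: int, packet_size: int, max_window: int, wnd_size: int) -> bool:
    """Return True if a packet of packet_size bytes fits in the send window.

    cur_window is the number of bytes in flight, max_window the local limit
    and wnd_size the window advertised by the other end (all in bytes).
    """
    return cur_window + packet_size <= min(max_window, wnd_size)


# 3000 bytes in flight, local limit 4500 bytes, peer advertises 8000 bytes.
print(can_send(3000, 1400, 4500, 8000))  # True: 4400 <= 4500
print(can_send(3200, 1400, 4500, 8000))  # False: 4600 > 4500
```

Whichever of the two limits is smaller decides, so a peer with a nearly full receive buffer can stall the sender even when max_window is large.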

Each socket keeps a state for the last delay measurement from the other endpoint (reply_micro). Whenever a packet is received, this state is updated by subtracting timestamp_microseconds from the host's current time, in microseconds (see the header format section).

Every time a packet is sent, the socket's reply_micro value is put in the timestamp_difference_microseconds field of the packet header.
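A minimal sketch of this bookkeeping, under a few assumptions that are not part of the text above: the class and method names are hypothetical, time.time_ns() stands in for the host clock, and the subtraction is wrapped to 32 bits because the two timestamp fields occupy 32 bits each in the header.

```python
import time

MASK32 = 0xFFFFFFFF  # assumption: timestamps wrap at 32 bits, like the header fields


def now_micro() -> int:
    """Host clock in microseconds, truncated to 32 bits."""
    return (time.time_ns() // 1000) & MASK32


class DelayState:
    """Tracks reply_micro for one socket (hypothetical helper class)."""

    def __init__(self) -> None:
        # No delay sample yet, so 0 is what gets sent.
        self.reply_micro = 0

    def on_packet_received(self, timestamp_microseconds: int, local_now: int) -> None:
        # Local time minus the sender's timestamp, wrapped to 32 bits.
        self.reply_micro = (local_now - timestamp_microseconds) & MASK32

    def timestamp_difference_for_next_packet(self) -> int:
        # Value to place in timestamp_difference_microseconds when sending.
        return self.reply_micro


state = DelayState()
state.on_packet_received(timestamp_microseconds=1000, local_now=1500)
print(state.timestamp_difference_for_next_packet())  # 500
```

Because the two clocks are not synchronized, the value is only meaningful as a relative measurement: what matters is how it changes from one packet to the next.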

Unlike TCP, sequence numbers and ACKs in uTP refer to packets, not bytes. This means uTP cannot repackage data when resending it.

Each socket keeps a state of the next sequence number to use when sending a packet, seq_nr. It also keeps a state of the sequence number that was last received, ack_nr. The oldest unacked packet is seq_nr - cur_window.

Header format

Version 1 header:

0       4       8               16              24              32
+-------+-------+---------------+---------------+---------------+
| ver   | type  | extension     | connection_id                 |
+-------+-------+---------------+---------------+---------------+
| timestamp_microseconds                                        |
+---------------+---------------+---------------+---------------+
| timestamp_difference_microseconds                             |
+---------------+---------------+---------------+---------------+
| wnd_size                                                      |
+---------------+---------------+---------------+---------------+
| seq_nr                        | ack_nr                        |
+---------------+---------------+---------------+---------------+
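As an illustration, the 20-byte layout can be packed and parsed with Python's struct module, using network byte order. The names UtpHeader, pack_header and unpack_header are hypothetical. The nibble order of ver and type in the first byte is an assumption read directly off the diagram as drawn here (ver in the upper four bits); check it against a deployed implementation before relying on it.

```python
import struct
from typing import NamedTuple

# One byte for ver/type, one for extension, then 16, 32, 32, 32, 16 and 16 bit
# fields. "!" selects network byte order (big endian) with no padding.
HEADER = struct.Struct("!BBHIIIHH")


class UtpHeader(NamedTuple):
    ver: int
    type: int
    extension: int
    connection_id: int
    timestamp_microseconds: int
    timestamp_difference_microseconds: int
    wnd_size: int
    seq_nr: int
    ack_nr: int


def pack_header(h: UtpHeader) -> bytes:
    # Assumption: ver occupies the upper nibble, as the diagram is drawn.
    first = ((h.ver & 0xF) << 4) | (h.type & 0xF)
    return HEADER.pack(
        first,
        h.extension,
        h.connection_id,
        h.timestamp_microseconds,
        h.timestamp_difference_microseconds,
        h.wnd_size,
        h.seq_nr,
        h.ack_nr,
    )


def unpack_header(data: bytes) -> UtpHeader:
    first, *rest = HEADER.unpack_from(data)
    return UtpHeader(first >> 4, first & 0xF, *rest)


hdr = UtpHeader(ver=1, type=4, extension=0, connection_id=0x1234,
                timestamp_microseconds=100, timestamp_difference_microseconds=0,
                wnd_size=65536, seq_nr=1, ack_nr=0)
raw = pack_header(hdr)
print(len(raw), unpack_header(raw) == hdr)  # 20 True
```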

All fields are in network byte order (big endian).

version

This is the protocol version. The current version is 1.

connection_id

This is a random, unique number identifying all the packets that belong to the same connection. Each socket has one connection ID for sending packets and a different connection ID for receiving packets. The endpoint initiating the connection decides which ID to use, and the return path has the same ID + 1.

timestamp_microseconds

This is the 'microseconds' part of the timestamp of when this packet was sent. It is set using gettimeofday() on POSIX and QueryPerformanceCounter() on Windows. The higher resolution this timestamp has, the better. The closer to the actual transmit time it is set, the better.

timestamp_difference_microseconds

This is the difference between the local time and the timestamp in the last received packet, at the time the last packet was received. This is the latest one-way delay measurement of the link from the remote peer to the local machine.

When a socket is newly opened and doesn't have any delay samples yet, this must be set to 0.

wnd_size

Advertised receive window. This is 32 bits wide and specified in bytes.

The window size is the number of bytes currently in-flight, i.e. sent but not acked. The advertised receive window lets the other end cap the window size if it cannot receive any faster, that is, if its receive buffer is filling up.

When sending packets, this should be set to the number of bytes left in the socket's receive buffer.

extension

The type of the first extension in a linked list of extension headers. 0 means no extension.

There are two extensions:

- Selective acks
- Extension bits

Extensions are linked, just like TCP options. If the extension field is non-zero, immediately following the uTP header are two bytes:

0               8               16
+---------------+---------------+
| extension     | len           |
+---------------+---------------+

Here, extension specifies the type of the next extension in the linked list; 0 terminates the list. len specifies the number of bytes of this extension. Unknown extensions can be skipped by simply advancing len bytes.

Selective ACK

Selective ACK is an extension that can selectively ACK packets non-sequentially. Its payload is a bitmask of at least 32 bits, in multiples of 32 bits. Each bit represents one packet in the send window. Bits that are outside of the send window are ignored. A set bit specifies that the packet has been received, a cleared bit specifies that the packet has not been received. The header looks like this:

0               8               16
+---------------+---------------+---------------+---------------+
| extension     | len           | bitmask
+---------------+---------------+---------------+---------------+
                                |
+---------------+---------------+

Note that the len field of extensions refers to bytes, which in this extension must be at least 4, and in multiples of 4.
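The extension chaining and the length rule can be sketched as follows. This is a hedged illustration, not a reference parser: the function names are hypothetical, the 20-byte offset is taken from the version 1 header layout, and the extension type numbers used in the example are arbitrary placeholders.

```python
from typing import List, Tuple

HEADER_SIZE = 20  # size of the version 1 header


def walk_extensions(packet: bytes, first_extension: int) -> Tuple[List[Tuple[int, bytes]], int]:
    """Follow the extension linked list.

    first_extension is the extension field of the uTP header. Returns the
    list of (extension_type, payload) pairs and the offset where the data
    payload starts.
    """
    found = []
    ext_type = first_extension
    offset = HEADER_SIZE
    while ext_type != 0:
        if offset + 2 > len(packet):
            raise ValueError("truncated extension header")
        # Two bytes: type of the NEXT extension, then length of THIS one.
        next_type, length = packet[offset], packet[offset + 1]
        payload = packet[offset + 2 : offset + 2 + length]
        if len(payload) != length:
            raise ValueError("truncated extension payload")
        found.append((ext_type, payload))
        ext_type = next_type
        offset += 2 + length
    return found, offset


def valid_selective_ack_len(length: int) -> bool:
    """len must be at least 4 bytes and a multiple of 4."""
    return length >= 4 and length % 4 == 0


# Example with one 4-byte extension of (placeholder) type 1, then payload.
pkt = bytes(HEADER_SIZE) + bytes([0, 4]) + b"\x01\x00\x00\x00" + b"data"
exts, data_offset = walk_extensions(pkt, first_extension=1)
print(exts, pkt[data_offset:])
```

Note that a parser never needs to understand an extension to step over it, which is what makes unknown extensions skippable.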

The selective ACK is only sent when at least one sequence number was skipped in the received stream. The first bit in the mask therefore represents ack_nr + 2. ack_nr + 1 is assumed to have been dropped or be missing when this packet was sent. A set bit represents a packet that has been received, a cleared bit represents a packet that has not yet been received.

The bitmask has reverse byte order. The first byte represents packets [ack_nr + 2, ack_nr + 2 + 7] in reverse order. The least significant bit in the byte represents ack_nr + 2, the most significant bit in the byte represents ack_nr + 2 + 7. The next byte in the mask represents [ack_nr + 2 + 8, ack_nr + 2 + 15] in reverse order, and so on. The bitmask is not limited to 32 bits but can be of any size.
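The bit-to-sequence-number mapping described above can be sketched like this. The function name acked_seq_nrs is hypothetical, and wrapping the result to 16 bits is an assumption based on seq_nr and ack_nr being 16-bit header fields.

```python
from typing import List


def acked_seq_nrs(ack_nr: int, bitmask: bytes) -> List[int]:
    """Return the sequence numbers whose bits are set in a selective ACK mask."""
    acked = []
    for byte_index, byte in enumerate(bitmask):
        for bit in range(8):  # bit 0 is the least significant bit
            if byte & (1 << bit):
                # Bit 0 of byte 0 stands for ack_nr + 2.
                acked.append((ack_nr + 2 + byte_index * 8 + bit) & 0xFFFF)
    return acked


# Bits 0 and 2 of the first byte, and the top bit of the second byte.
print(acked_seq_nrs(100, bytes([0b00000101, 0b10000000, 0, 0])))  # [102, 104, 117]
```

In this example packets 101 and 103 are among those still missing, so they are the candidates for resending.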

This is the layout of a bitmask representing the first 32 packet ACKs in a selective ACK bitfield:

0               8               16
+---------------+---------------+---------------+---------------+
| 9 8 ...   3 2 | 17   ...   10 | 25   ...   18 | 33   ...   26 |
+---------------+---------------+---------------+---------------+

The number in the diagram maps the bit in the bitmask to the offset to add to ack_nr in order to calculate the sequence number that the bit is ACKing.

Extension bits

The extension bits are intended to communicate support for extensions. Currently it is always set to 0. It is an 8 byte bitmask where each bit specifies support for a specific feature:

0               8               16
+---------------+---------------+---------------+---------------+
| extension     | len           | extension bitmask
+---------------+---------------+---------------+---------------+

+---------------+---------------+---------------+---------------+
                                |
+---------------+---------------+
type

The type field describes the type of packet.

It can be one of:

ST_DATA = 0
    Regular data packet. The socket is in connected state and has data to send. An ST_DATA packet always has a data payload.

ST_FIN = 1
    Finalize the connection. This is the last packet. It closes the connection, similar to the TCP FIN flag. This connection will never have a sequence number greater than the sequence number in this packet. The socket records this sequence number as eof_pkt. This lets the socket wait for packets that might still be missing and arrive out of order even after receiving the ST_FIN packet.

ST_STATE = 2
    State packet. Used to transmit an ACK with no data. Packets that don't include any payload do not increase the seq_nr.

ST_RESET = 3
    Terminate connection forcefully. Similar to the TCP RST flag. The remote host does not have any state for this connection. It is stale and should be terminated.

ST_SYN = 4
    Connect SYN. Similar to the TCP SYN flag, this packet initiates a connection. The sequence number is initialized to 1. The connection ID is initialized to a random number. The SYN packet is special: all subsequent packets sent on this connection (except for re-sends of the ST_SYN) are sent with the connection ID + 1. The connection ID is what the other end is expected to use in its responses.

    When receiving an ST_SYN, the new socket should be initialized with the ID in the packet header. The send ID for the socket should be initialized to the ID + 1. The sequence number for the return channel is initialized to a random number. The other end expects an ST_STATE packet (only an ACK) in response.

seq_nr

This is the sequence number of this packet. As opposed to TCP, uTP sequence numbers do not refer to bytes, but to packets. The sequence number tells the other end in which order packets should be served back to the application layer.

ack_nr

This is the sequence number the sender of the packet last received in the other direction.

Connection setup

This is a diagram illustrating the exchanges and states to initiate a connection. The c.* refers to a state in the socket itself, pkt.* refers to a field in the packet header.

initiating endpoint                           accepting endpoint

          | c.state = CS_SYN_SENT                         |
          | c.seq_nr = 1                                  |
          | c.conn_id_recv = rand()                       |
          | c.conn_id_send = c.conn_id_recv + 1           |
          |                                               |
          |                                               |
          | ST_SYN                                        |
          |   seq_nr=c.seq_nr++                           |
          |   ack_nr=*                                    |
          |   conn_id=c.rcv_conn_id                       |
          | >-------------------------------------------> |
          |             c.receive_conn_id = pkt.conn_id+1 |
          |             c.send_conn_id = pkt.conn_id      |
          |             c.seq_nr = rand()                 |
          |             c.ack_nr = pkt.seq_nr             |
          |             c.state = CS_CONNECTED            |
          |                                               |
          |                                               |
          |                                               |
          |                                               |
          |                     ST_STATE                  |
          |                       seq_nr=c.seq_nr++       |
          |                       ack_nr=c.ack_nr         |
          |                       conn_id=c.send_conn_id  |
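The two steps shown in the diagram can be sketched in Python as below. This is an illustration under stated assumptions: Conn, initiate and accept are hypothetical names, packets are modelled as plain dictionaries rather than wire bytes, IDs and sequence numbers are wrapped to 16 bits to match the header field widths, and the ack_nr of the SYN (shown as * in the diagram) is simply set to 0.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

ST_STATE = 2
ST_SYN = 4


def _rand16() -> int:
    return random.getrandbits(16)


@dataclass
class Conn:
    state: str
    seq_nr: int
    ack_nr: int
    conn_id_recv: int
    conn_id_send: int


def initiate(rand16: Callable[[], int] = _rand16) -> Tuple[Conn, Dict[str, int]]:
    """Initiating endpoint: set up state and build the ST_SYN."""
    conn_id_recv = rand16()
    c = Conn("CS_SYN_SENT", 1, 0, conn_id_recv, (conn_id_recv + 1) & 0xFFFF)
    # The SYN carries the ID the initiator will receive on.
    syn = {"type": ST_SYN, "seq_nr": c.seq_nr, "ack_nr": 0, "conn_id": c.conn_id_recv}
    c.seq_nr = (c.seq_nr + 1) & 0xFFFF  # seq_nr=c.seq_nr++
    return c, syn


def accept(syn: Dict[str, int], rand16: Callable[[], int] = _rand16) -> Tuple[Conn, Dict[str, int]]:
    """Accepting endpoint: set up state from the ST_SYN and build the ST_STATE."""
    c = Conn(
        state="CS_CONNECTED",
        seq_nr=rand16(),
        ack_nr=syn["seq_nr"],
        conn_id_recv=(syn["conn_id"] + 1) & 0xFFFF,
        conn_id_send=syn["conn_id"],
    )
    reply = {"type": ST_STATE, "seq_nr": c.seq_nr, "ack_nr": c.ack_nr,
             "conn_id": c.conn_id_send}
    c.seq_nr = (c.seq_nr + 1) & 0xFFFF  # seq_nr=c.seq_nr++
    return c, reply


initiator, syn = initiate()
acceptor, reply = accept(syn)
# The reply arrives with the ID the initiator listens on.
print(reply["conn_id"] == initiator.conn_id_recv)  # True
```

The random source is passed in as a parameter only so that the example can be exercised deterministically.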
