Peer wire protocol (TCP)
Overview
The peer protocol facilitates the exchange of pieces as described in the metainfo file.
Note: The original specification also uses the term "piece" when describing the peer protocol, but with a different meaning than "piece" in the metainfo file. For this reason, this specification uses the term "block" for the data exchanged between peers over the wire.
A client must maintain state information for each connection that it has with a remote peer:
- choked: whether the remote peer has choked this client. When a peer chokes the client, it is notifying the client that no requests will be answered until the client is unchoked. The client should not attempt to send requests for blocks, and it should consider all pending (unanswered) requests to be discarded by the remote peer.
- interested: whether the remote peer is interested in something this client has to offer. This is a notification that the remote peer will begin requesting blocks when the client unchokes it.
Note that this also implies that the client must keep track of whether it is interested in the remote peer, and whether it has the remote peer choked or unchoked. So the real list looks like this:
- am_choking: this client is choking the remote peer.
- am_interested: this client is interested in the remote peer.
- peer_choking: the remote peer is choking this client.
- peer_interested: the remote peer is interested in this client.
Client connections start out as "choked" and "not interested". In other words:
- am_choking = 1
- am_interested = 0
- peer_choking = 1
- peer_interested = 0
A block is downloaded by the client when the client is interested in a peer and that peer is not choking the client. A block is uploaded by the client when the client is not choking a peer and that peer is interested in the client.
It is important for the client to keep its peers informed as to whether or not it is interested in them. This state information should be kept up to date with each peer even when the client is choked. This allows peers to know whether the client will begin downloading when it is unchoked (and vice versa).
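The four flags and the download/upload conditions above can be sketched as a small per-connection record. This is only an illustration: the class name `PeerState` and its method names are hypothetical and are not part of the protocol.

```python
from dataclasses import dataclass


@dataclass
class PeerState:
    """Per-connection state; a hypothetical container, not a protocol type."""

    # Connections start out choked and not interested on both sides.
    am_choking: bool = True
    am_interested: bool = False
    peer_choking: bool = True
    peer_interested: bool = False

    def can_download(self) -> bool:
        # We may request blocks only if we want them and the peer allows it.
        return self.am_interested and not self.peer_choking

    def can_upload(self) -> bool:
        # We serve blocks only if the peer wants them and we allow it.
        return self.peer_interested and not self.am_choking
```

A freshly created `PeerState()` reports `False` for both checks, which matches the initial state listed above.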
Data Types
Unless specified otherwise, all integers in the peer wire protocol are encoded as 4-byte big-endian values. This includes the length prefix on all messages that come after the handshake.
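As a minimal sketch of this encoding, Python's standard `struct` module can pack and unpack 4-byte big-endian unsigned integers with the `">I"` format. The helper names `encode_u32` and `decode_u32` are made up for this example.

```python
import struct


def encode_u32(value: int) -> bytes:
    """Encode an integer as a 4-byte big-endian unsigned value."""
    return struct.pack(">I", value)


def decode_u32(data: bytes) -> int:
    """Decode the first four bytes of data as a big-endian unsigned value."""
    (value,) = struct.unpack(">I", data[:4])
    return value
```

For instance, `encode_u32(13)` yields the bytes `00 00 00 0d`, with the most significant byte first.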
Message Flow
The peer wire protocol consists of an initial handshake. After that, peers communicate via an exchange of length-prefixed messages. The length prefix is an integer as described above.
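Because TCP delivers a byte stream rather than discrete messages, a receiver has to cut the post-handshake stream back into messages using the length prefix. The following is one possible sketch; the function name `split_messages` and the choice to return the unconsumed tail are assumptions for illustration.

```python
import struct
from typing import List, Tuple


def split_messages(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split buffered bytes into complete message bodies plus leftover bytes.

    Each returned body excludes its 4-byte length prefix. A zero-length
    body is returned as b"".
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        end = offset + 4 + length
        if end > len(buffer):
            break  # message not fully received yet; wait for more data
        messages.append(buffer[offset + 4:end])
        offset = end
    return messages, buffer[offset:]
```

The caller would keep the returned leftover bytes and prepend them to the next chunk read from the socket.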
Handshake
The handshake is a required message and must be the first message transmitted by the client. It is (49 + len(pstr)) bytes long.
handshake: <pstrlen><pstr><reserved><info_hash><peer_id>
- pstrlen: the length of <pstr>, as a single raw byte.
- pstr: string identifier of the protocol.
- reserved: eight reserved bytes. All current implementations use all zeroes. Each bit in these bytes can be used to change the behavior of the protocol. An email from Bram suggests that trailing bits should be used first, so that leading bits may be used to change the meaning of trailing bits.
- info_hash: 20-byte SHA1 hash of the value of the info key in the metainfo file. This is the same info_hash that is transmitted in tracker requests.
- peer_id: 20-byte string used as a unique ID for the client. This is usually the same peer_id that is transmitted in tracker requests (but not always; for example, Azureus has an anonymity option).
In version 1.0 of the BitTorrent protocol, pstrlen = 19 and pstr = "BitTorrent protocol".
The initiator of a connection is expected to transmit its handshake immediately. The recipient may wait for the initiator's handshake if it is capable of serving multiple torrents simultaneously (torrents are uniquely identified by their info_hash). However, the recipient must respond as soon as it sees the info_hash part of the handshake. The tracker's NAT-checking feature does not send the peer_id field of the handshake.
If a client receives a handshake with an info_hash that it is not currently serving, then the client must drop the connection.
If the initiator of the connection receives a handshake in which the peer_id does not match the expected peer_id, then the initiator is expected to drop the connection. Note that the initiator presumably received the peer information from the tracker, which includes the peer_id that was registered by the peer. The peer_id from the tracker and the peer_id in the handshake are expected to match.
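The handshake layout can be illustrated with a short sketch that builds and parses the message for the 1.0 protocol string. The function names and the strict length check are assumptions for this example. A real client reading from a socket would read the first byte to learn pstrlen before reading the rest.

```python
from typing import Tuple

PSTR = b"BitTorrent protocol"  # 19 bytes in version 1.0 of the protocol


def build_handshake(info_hash: bytes, peer_id: bytes,
                    reserved: bytes = bytes(8)) -> bytes:
    """Build <pstrlen><pstr><reserved><info_hash><peer_id>."""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash and peer_id must be 20 bytes, reserved 8")
    return bytes([len(PSTR)]) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data: bytes) -> Tuple[bytes, bytes, bytes, bytes]:
    """Return (pstr, reserved, info_hash, peer_id) from a complete handshake."""
    if not data:
        raise ValueError("empty handshake")
    pstrlen = data[0]
    if len(data) != 49 + pstrlen:
        raise ValueError("handshake must be 49 + len(pstr) bytes long")
    pstr = data[1:1 + pstrlen]
    reserved = data[1 + pstrlen:9 + pstrlen]
    info_hash = data[9 + pstrlen:29 + pstrlen]
    peer_id = data[29 + pstrlen:49 + pstrlen]
    return pstr, reserved, info_hash, peer_id
```

With the 19-byte protocol string the full handshake is 68 bytes. After parsing, a client would compare the returned info_hash against the torrents it serves and drop the connection on a mismatch.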
peer_id
The peer_id is exactly 20 bytes long. There are two main conventions for encoding client and client version information into the peer_id: Azureus style and Shadow's style.
Azureus style uses the following encoding: '-', two characters for the client ID, four ASCII digits for the version number, '-', followed by random numbers.
Example: '-AZ2060-'...
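A minimal sketch of recognizing this layout is shown below. The function name is hypothetical, and the check is deliberately strict (exactly four ASCII digits for the version), following the description above. Clients in the wild may deviate from it.

```python
from typing import Optional, Tuple


def parse_azureus_peer_id(peer_id: bytes) -> Optional[Tuple[str, str]]:
    """Return (client_id, version_digits), or None if the layout differs."""
    if len(peer_id) != 20:
        return None
    # Layout: '-' + 2-character client ID + 4 version digits + '-' + random.
    if peer_id[0:1] != b"-" or peer_id[7:8] != b"-":
        return None
    client = peer_id[1:3]
    version = peer_id[3:7]
    if not client.isascii() or not version.isdigit():
        return None
    return client.decode("ascii"), version.decode("ascii")
```

For a peer_id beginning with '-AZ2060-', the sketch reports client ID 'AZ' and version digits '2060'.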
Well-known clients using this encoding style are:
- 'AG' - Ares
- 'A~' - Ares
- 'AR' - Arctic
- 'AT' - Artemis
- 'AX' - BitPump
- 'AZ' - Azureus
- 'BB' - BitBuddy
- 'BC' - BitComet
- 'BF' - Bitflu
- 'BG' - BTG (uses Rasterbar libtorrent)
- 'BP' - BitTorrent Pro (Azureus + spyware)
- 'BR' - BitRocket
- 'BS' - BTSlave
- 'BW' - BitWombat
- 'BX' - ~Bittorrent X
- 'CD' - Enhanced CTorrent
- 'CT' - CTorrent
- 'DE' - DelugeTorrent
- 'DP' - Propagate Data Client
- 'EB' - EBit
- 'ES' - Electric Sheep
- 'FC' - FileCroc
- 'FT' - FoxTorrent
- 'GS' - GSTorrent
- 'HL' - Halite
- 'HN' - Hydranode
- 'KG' - KGet
- 'KT' - KTorrent
- 'LC' - LeechCraft
- 'LH' - LH-ABC
- 'LP' - Lphant
- 'LT' - libtorrent
- 'lt' - libTorrent
- 'LW' - LimeWire
- 'MO' - MonoTorrent
- 'MP' - MooPolice
- 'MR' - Miro
- 'MT' - MoonlightTorrent
- 'NX' - Net Transport
- 'OT' - OmegaTorrent
- 'PD' - Pando
- 'qB' - qBittorrent
- 'QD' - QQDownload
- 'QT' - Qt 4 Torrent example
- 'RT' - Retriever
- 'RZ' - RezTorrent
- 'S~' - Shareaza alpha/beta
- 'SB' - ~Swiftbit
- 'SS' - SwarmScope
- 'ST' - SymTorrent
- 'st' - sharktorrent
- 'SZ' - Shareaza
- 'TN' - TorrentDotNET
- 'TR' - Transmission
- 'TS' - Torrentstorm
- 'TT' - TuoTu
- 'UL' - uLeecher!
- 'UM' - µTorrent for Mac
- 'UT' - µTorrent
- 'VG' - Vagaa
- 'WT' - BitLet
- 'WY' - FireTorrent
- 'XL' - Xunlei
- 'XT' - XanTorrent
- 'XX' - Xtorrent
- 'ZT' - ZipTorrent
Clients that have been seen in the wild and still need to be identified:
- 'BD' (example: -BD0300-)
- 'NP' (example: -NP0201-)
- 'SD' (example: -SD0100-)
- 'wF' (example: -wF2200-)
- 'hk' (example: -hk0010-): IP addresses in China; sends an unrequested info dict in message 0xA, reconnects immediately after being disconnected, reserved bytes = 01,01,01,01,00,00,02,01
Shadow's style uses the following encoding: one ASCII alphanumeric character for client identification, up to five characters for the version number (padded with '-' if fewer than five are used), followed by three characters (commonly '---', but not always), followed by random characters. Each character in the version string represents a number from 0 to 63: '0' = 0, ..., '9' = 9, 'A' = 10, ..., 'Z' = 35, 'a' = 36, ..., 'z' = 61, '.' = 62, '-' = 63.
A more detailed description of Shadow's encoding style, including the conventions for the three characters after the version string, is documented elsewhere.
Example: 'S58B-----'... for Shadow's 5.8.11
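The character-to-number mapping above can be sketched as a lookup into a 64-character alphabet. Since '-' doubles as padding and as the value 63, this example assumes that trailing '-' characters in the version field are padding. That is an interpretation for illustration, not something the encoding can express unambiguously. The function names are hypothetical.

```python
import string
from typing import List, Tuple

# '0'-'9' = 0-9, 'A'-'Z' = 10-35, 'a'-'z' = 36-61, '.' = 62, '-' = 63
SHADOW_ALPHABET = (string.digits + string.ascii_uppercase
                   + string.ascii_lowercase + ".-")


def decode_shadow_version(version_field: str) -> List[int]:
    """Decode the version characters, treating trailing '-' as padding."""
    trimmed = version_field.rstrip("-")
    # str.index raises ValueError for characters outside the alphabet.
    return [SHADOW_ALPHABET.index(ch) for ch in trimmed]


def parse_shadow_peer_id(peer_id: bytes) -> Tuple[str, List[int]]:
    """Return (client_letter, version_numbers) from a Shadow-style peer_id."""
    text = peer_id[:6].decode("ascii")
    return text[0], decode_shadow_version(text[1:6])
```

Applied to a peer_id starting with 'S58B-----', the version field '58B--' decodes to 5, 8, 11, which corresponds to Shadow's 5.8.11 in the example above.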
Well-known clients using this encoding style are:
- 'A' - ABC
- 'O' - Osprey Permaseed
- 'Q' - BTQueue
- 'R' - Tribler
- 'S' - Shadow's client
- 'T' - BitTornado
- 'U' - UPnP NAT Bit Torrent
Bram's client now uses this style: 'M3-4-2--' or 'M4-20-8-'.
BitComet uses a different encoding style. Its peer_id consists of the four ASCII characters 'exbc', followed by two bytes x and y, followed by random characters. The version number is x before the decimal point and y as two digits after the decimal point. BitLord uses the same scheme, but adds 'LORD' after the version bytes. An unofficial patch for BitComet once replaced 'exbc' with 'FUTB'. Since version 0.59, BitComet peer IDs use the Azureus style.
XBT Client also has its own style. Its peer_id consists of the three uppercase characters 'XBT' followed by three ASCII digits representing the version number. If the client is a debug build, the seventh byte is the lowercase character 'd'; otherwise it is '-'. This is followed by '-', then random digits, uppercase letters and lowercase letters. For example, a peer_id beginning with 'XBT054d-' indicates a debug build of version 0.5.4.
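As an illustration of the XBT layout, the sketch below extracts the version and the debug flag. It assumes the prefix is the uppercase string 'XBT' and that the three version digits map to a dotted version one digit per component, as in the 0.5.4 example. The function name is made up for this example.

```python
from typing import Optional, Tuple


def parse_xbt_peer_id(peer_id: bytes) -> Optional[Tuple[str, bool]]:
    """Return (dotted_version, is_debug_build), or None if not XBT style."""
    if len(peer_id) != 20 or peer_id[:3] != b"XBT":
        return None
    digits = peer_id[3:6]
    if not digits.isdigit():
        return None
    flag = peer_id[6:7]  # 'd' for a debug build, '-' otherwise
    if flag not in (b"d", b"-") or peer_id[7:8] != b"-":
        return None
    version = ".".join(digits.decode("ascii"))
    return version, flag == b"d"
```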
Opera 8 previews and Opera 9.x releases use the following peer_id scheme: the first two characters are 'OP' and the next four digits are the build number. All following characters are random lowercase hexadecimal digits.
MLdonkey uses the following peer_id scheme: the first characters are '-ML', followed by a dotted version, followed by '-', followed by a random string. For example: '-ML2.7.2-kgjjfkd'.
Bits on Wheels uses the pattern '-BOWxxx-yyyyyyyyyyyy', where y is random (uppercase letters) and x depends on the version. Version 1.0.6 has xxx = A0C.
Queen Bee uses Bram's new style: 'Q1-0-0--' or 'Q1-10-0-', followed by random bytes.
BitTyrant is an Azureus fork. In its version 1.1, its peer ID is simply 'AZ2500BT' + random bytes.
TorrenTopia version 1.90 pretends to be, or is derived from, Mainline 3.4.6. Its peer ID starts with '346'.
BitSpirit has several modes for its peer ID. In one mode it reads the ID of its peer and reconnects using the first eight bytes as the basis for its own ID. Its real ID appears to use '\0\3BS' (C notation) as the first four bytes for version 3.x and '\0\2BS' for version 2.x. In all modes the ID may end in 'UDP0'.
Rufus uses its version as decimal ASCII values for the first two bytes. The third and fourth bytes are 'RS'. These are followed by the user's nickname and some random bytes.
G3 Torrent starts its peer ID with '-G3' and appends up to nine characters of the user's nickname.
FlashGet uses the Azureus style with the client ID 'FG', but without the trailing '-'. Version 1.82.1002 still uses the version number '123'.
BT Next Evolution is derived from BitTornado but tries to mimic the Azureus style. As a result, its peer ID starts with '-NE', followed by four digits for the version, followed by three characters describing the client type in Shadow's peer ID style.
AllPeers takes the SHA1 hash of a user-dependent string and replaces the first few characters with "AP" + version string + "-".
Qvod's ID starts with the four letters "QVOD", followed by four decimal digits of the build number (currently "0054"). The last 12 characters are random uppercase hexadecimal digits. A modified version exists in China that replaces the first four characters with random bytes.
Many clients use all random numbers, or 12 zeroes followed by random numbers (like older versions of Bram's client).
Messages
All of the remaining messages in the protocol take the form <length prefix><message ID><payload>. The length prefix is a 4-byte big-endian value. The message ID is a single decimal byte. The payload is message dependent.
- keep-alive: <len=0000>
The keep-alive message is a message of zero bytes, specified with the length prefix set to zero. There is no message ID and no payload. Peers may close a connection if they receive no messages (keep-alive or any other message) for a certain period of time, so a keep-alive message must be sent to maintain the connection if no other message has been sent for a given amount of time. This amount of time is generally two minutes.
- choke: <len=0001><id=0>
The choke message is fixed-length and has no payload.
- unchoke: <len=0001><id=1>
The unchoke message is fixed-length and has no payload.
- interested: <len=0001><id=2>
The interested message is fixed-length and has no payload.
- not interested: <len=0001><id=3>
The not interested message is fixed-length and has no payload.
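The keep-alive message and the four payload-free state messages described so far can be encoded in a few lines. This is a minimal sketch; the constant and function names are chosen for this example only.

```python
import struct

CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED = 0, 1, 2, 3

# keep-alive: a length prefix of zero, with no message ID and no payload.
KEEP_ALIVE = struct.pack(">I", 0)


def encode_state_message(message_id: int) -> bytes:
    """Encode choke/unchoke/interested/not interested: <len=0001><id>."""
    if message_id not in (CHOKE, UNCHOKE, INTERESTED, NOT_INTERESTED):
        raise ValueError("not a payload-free state message ID")
    # ">IB": 4-byte big-endian length (always 1) plus a 1-byte message ID.
    return struct.pack(">IB", 1, message_id)
```

Each state message is five bytes on the wire, and the keep-alive is four.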
- have: <len=0005><id=4><piece index>
The have message is fixed-length. The payload is the zero-based index of a piece that has just been successfully downloaded and verified via the hash.
Implementer's note: That is the strict definition; in practice, clients may deviate from it. Because peers are extremely unlikely to download pieces that they already have, a peer may choose not to advertise having a piece to a peer that already has that piece. At a minimum, this "HAVE suppression" halves the number of have messages, which translates to roughly a 25-35% reduction in protocol overhead. At the same time, it may be worthwhile to send a have message to a peer that already has the piece, since it helps in determining which pieces are rare.
A malicious peer might also choose to advertise having pieces that it knows the other peer will never download. For this reason, attempting to model peers using this information is a bad idea.
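A short sketch of the have message and of the suppression idea follows. The bookkeeping structure (a dict mapping a peer label to the set of piece indexes that peer is known to have) and the function names are assumptions made for this illustration.

```python
import struct
from typing import Dict, List, Set

HAVE = 4


def encode_have(piece_index: int) -> bytes:
    """Encode <len=0005><id=4><piece index>."""
    return struct.pack(">IBI", 5, HAVE, piece_index)


def decode_have(message: bytes) -> int:
    """Decode a have message body (without the length prefix)."""
    message_id, piece_index = struct.unpack(">BI", message)
    if message_id != HAVE:
        raise ValueError("not a have message")
    return piece_index


def peers_to_notify(piece_index: int,
                    peer_pieces: Dict[str, Set[int]]) -> List[str]:
    """HAVE suppression: skip peers already known to hold the piece."""
    return sorted(peer for peer, pieces in peer_pieces.items()
                  if piece_index not in pieces)
```

A client that prefers accurate rarity statistics over lower overhead could skip `peers_to_notify` and send the have message to every peer.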
- bitfield: <len=0001+X><id=5><bitfield>
The bitfield message may only be sent immediately after the handshaking sequence is completed, and before any other messages are sent. It is optional, and need not be sent if a client has no pieces.
The bitfield message is variable-length, where X is the length of the bitfield. The payload is a bitfield representing the pieces that have been successfully downloaded. The high bit in the first byte corresponds to piece index 0. Bits that are cleared indicate a missing piece, and set bits indicate a valid and available piece. Spare bits at the end are set to zero.
A bitfield of the wrong length is considered an error. Clients should drop the connection if they receive a bitfield that is not of the correct size, or if the bitfield has any of the spare bits set.
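The bit layout and the two validity checks can be sketched as follows. The function names are hypothetical, and raising `ValueError` stands in for dropping the connection.

```python
from typing import Set


def build_bitfield(have: Set[int], num_pieces: int) -> bytes:
    """Build the bitfield payload; piece 0 is the high bit of byte 0."""
    field = bytearray((num_pieces + 7) // 8)
    for index in have:
        if not 0 <= index < num_pieces:
            raise ValueError("piece index out of range")
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def parse_bitfield(field: bytes, num_pieces: int) -> Set[int]:
    """Return the set of available pieces, rejecting malformed bitfields."""
    if len(field) != (num_pieces + 7) // 8:
        raise ValueError("bitfield has the wrong length")
    have = set()
    for index in range(len(field) * 8):
        if field[index // 8] & (0x80 >> (index % 8)):
            if index >= num_pieces:
                raise ValueError("spare bit set in bitfield")
            have.add(index)
    return have
```

For a torrent with 10 pieces, having pieces 0 and 9 produces the two bytes 0x80 and 0x40. The remaining six bits of the second byte are spare and stay zero.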
- request: <len=0013><id=6><index><begin><length>
The request message is fixed-length and is used to request a block. The payload contains the following information:
  - index: integer specifying the zero-based piece index.
  - begin: integer specifying the zero-based byte offset within the piece.
  - length: integer specifying the requested length.
- piece: <len=0009+X><id=7><index><begin><block>
The piece message is variable-length, where X is the length of the block. The payload contains the following information:
  - index: integer specifying the zero-based piece index.
  - begin: integer specifying the zero-based byte offset within the piece.
  - block: block of data, which is a subset of the piece specified by index.
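The request and piece layouts can be sketched with `struct` as below. The function names are made up for this example, and the decoder expects the message body with the length prefix already removed.

```python
import struct
from typing import Tuple

REQUEST, PIECE = 6, 7


def encode_request(index: int, begin: int, length: int) -> bytes:
    """Encode <len=0013><id=6><index><begin><length>."""
    return struct.pack(">IBIII", 13, REQUEST, index, begin, length)


def encode_piece(index: int, begin: int, block: bytes) -> bytes:
    """Encode <len=0009+X><id=7><index><begin><block>, X = len(block)."""
    header = struct.pack(">IBII", 9 + len(block), PIECE, index, begin)
    return header + block


def decode_piece(message: bytes) -> Tuple[int, int, bytes]:
    """Decode a piece message body into (index, begin, block)."""
    message_id, index, begin = struct.unpack_from(">BII", message)
    if message_id != PIECE:
        raise ValueError("not a piece message")
    return index, begin, message[9:]
```

A request is always 17 bytes on the wire (4-byte prefix plus 13), while a piece message is 13 bytes plus the block.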
- cancel: <len=0013><id=8><index><begin><length>
The cancel message is fixed-length and is used to cancel block requests. The payload is identical to that of the request message. It is typically used during the end game, when the download is finishing.
- port: <len=0003><id=9><listen-port>
The port message is sent by newer versions of the Mainline client that implement a DHT tracker. The listen port is the port on which this peer's DHT node is listening. This peer should be inserted into the local routing table (if DHT tracker is supported).
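The port message is the one message here whose integer field is not four bytes: a length of 3 leaves two bytes for the port after the message ID. A minimal sketch, with hypothetical function names:

```python
import struct

PORT = 9


def encode_port(listen_port: int) -> bytes:
    """Encode <len=0003><id=9><listen-port> with a 2-byte big-endian port."""
    if not 0 <= listen_port <= 0xFFFF:
        raise ValueError("port must fit in two bytes")
    return struct.pack(">IBH", 3, PORT, listen_port)


def decode_port(message: bytes) -> int:
    """Decode a port message body (without the length prefix)."""
    message_id, listen_port = struct.unpack(">BH", message)
    if message_id != PORT:
        raise ValueError("not a port message")
    return listen_port
```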