Original address: http://bbs.nju.edu.cn/bbstcon? Board = BitTorrent & file = M.1209531185.A
Thanks to the original author
BitTorrent (BT) is a file distribution protocol. It identifies content by URL and is designed to integrate seamlessly with the web. Its advantage over plain HTTP is that when several people download the same file at the same time, the downloaders upload to each other, so the file source can serve a very large number of downloaders with only a modest increase in its own load.
A BitTorrent file distribution consists of the following entities:
· An ordinary web server
· A static metainfo file
· A BitTorrent tracker
· An "original" downloader (the origin)
· The end users' web browsers
· The end users' downloaders
Ideally there are many end users for a single file.
To start serving a file over BT, a host goes through the following steps:
1. Start running a tracker (skip this step if one is already running);
2. Start running an ordinary web server such as Apache (skip this step if one is already running);
3. Associate the .torrent extension with the MIME type application/x-bittorrent on the web server (skip this step if already done);
4. Generate a metainfo file (.torrent file) from the complete file to be served and the tracker URL;
5. Put the metainfo file on the web server;
6. Link to the metainfo file (.torrent file) from a web page;
7. Start the original downloader, which already has the complete file (the origin).
The steps for downloading through BT are as follows:
1. Install a BT client (skip this step if it is already installed);
2. Browse the web;
3. Click a link to a .torrent file;
4. Select where to save the file locally, or select a partial download to resume;
5. Wait until the download is complete;
6. Tell the downloader to exit (it keeps uploading until the user does so).
The connectivity is as follows:
· The web site serves static files as usual, but launches the BT program on the client;
· The tracker receives information from all downloaders and gives each of them a random list of peers. This is done over HTTP or HTTPS;
· Downloaders periodically check in with the tracker to report their progress, and upload to and download from each other over direct connections. These connections use the BitTorrent peer protocol, which runs over TCP;
· The origin only uploads and never downloads, because it already has the entire file. The origin is needed to get the entire file into the network. For popular downloads the origin can often be taken down after a while, because several downloaders will have finished the entire file and kept uploading.
Both the metainfo file and tracker responses are sent in a simple, efficient, and extensible format called bencoding (B encoding). Bencoded messages are nested dictionaries and lists (as in Python) that can contain strings and integers. Extensibility comes from ignoring unexpected dictionary keys, so new optional keys can be added later.
The bencoding rules are as follows:
· A string is encoded as its length in decimal, followed by a colon, followed by the string itself. For example, 4:spam corresponds to 'spam'.
· An integer is encoded as an 'i', followed by the number in decimal, followed by an 'e'. For example, i3e corresponds to 3 and i-3e corresponds to -3. Integers have no size limit. i-0e is invalid, and any encoding with a leading zero, such as i03e, is invalid, except i0e, which corresponds to 0.
· A list is encoded as an 'l', followed by its elements (each bencoded), followed by an 'e'. For example, l4:spam4:eggse corresponds to ['spam', 'eggs'].
· A dictionary is encoded as a 'd', followed by alternating keys and their corresponding values, followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'}, and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and must appear in sorted order (sorted as raw strings, not alphanumerically).
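The four rules above are small enough to try out directly. The following is a minimal sketch, not the reference implementation; the function names bencode and bdecode are chosen here for illustration, and strings are decoded as raw bytes because the rules do not specify a text encoding.

```python
def bencode(value):
    """Encode an int, str/bytes, list, or dict using the rules above."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        pairs = []
        for key, item in value.items():
            if isinstance(key, str):
                key = key.encode("utf-8")
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be strings")
            pairs.append((key, item))
        pairs.sort(key=lambda pair: pair[0])  # sorted as raw bytes
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def bdecode(data):
    """Decode one complete bencoded value from a bytes object."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data, pos):
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        text = data[pos + 1:end]
        digits = text[1:] if text.startswith(b"-") else text
        leading_zero = len(digits) > 1 and digits.startswith(b"0")
        if not digits.isdigit() or text == b"-0" or leading_zero:
            raise ValueError("invalid integer")
        return int(text), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be strings")
            item, pos = _decode(data, pos)
            result[key] = item
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        if start + length > len(data):
            raise ValueError("string runs past end of data")
        return data[start:start + length], start + length
    raise ValueError("unexpected byte at offset %d" % pos)
```

For example, bencode(['spam', 'eggs']) yields l4:spam4:eggse, and decoding d4:spaml1:a1:bee gives a dictionary whose single key spam maps to the list of a and b (as bytes).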
Metainfo files are bencoded dictionaries with the following keys:
announce
The URL of the tracker.
info
This key maps to a dictionary with the following keys:
The name key maps to a string, which is the suggested name to save the file or directory as. It is purely advisory.
The piece length key maps to the number of bytes in each piece the file is split into. For transfer, the file is split into pieces of equal size, except the last one, which may be shorter. The piece length is almost always a power of 2, most commonly 256 K (2 to the 18th power).
The pieces key maps to a string whose length is a multiple of 20. It is subdivided into strings of 20 bytes each, each of which is the SHA1 hash of the piece at the corresponding index.
There is also either a length key or a files key, but never both and never neither. If length is present, the download is a single file; otherwise it is a set of files in a directory structure.
In the single-file case, length maps to the length of the file in bytes.
In the multi-file case, the files are treated as one large file formed by concatenating them in the order they appear in the file list. The files key maps to that file list, which is a list of dictionaries. Each dictionary contains the following keys:
length
The length of the file in bytes.
path
A list of strings giving the subdirectory names, the last of which is the file name (a zero-length list is an error).
In the single-file case, the name key is the file name; in the multi-file case, it is the directory name.
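To make the info dictionary concrete, here is a hedged sketch that builds one for a single file held in memory and then reads it back. The helper names build_info, piece_hashes, and total_length are invented for this example; a real client would read the file from disk piece by piece.

```python
import hashlib


def build_info(name, data, piece_length=2 ** 18):
    """Build a single-file info dictionary for in-memory data."""
    pieces = b"".join(
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    )
    return {
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,       # concatenated 20-byte SHA1 hashes
        "length": len(data),    # single-file case, so no "files" key
    }


def piece_hashes(info):
    """Split the pieces string into one 20-byte hash per piece index."""
    pieces = info["pieces"]
    if len(pieces) % 20 != 0:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def total_length(info):
    """Total number of bytes, for both single-file and multi-file cases."""
    if ("length" in info) == ("files" in info):
        raise ValueError("exactly one of length and files must be present")
    if "length" in info:
        return info["length"]
    return sum(entry["length"] for entry in info["files"])
```

A file of 2^18 + 5 bytes with the default piece length produces two hashes: one for the full first piece and one for the 5-byte final piece.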
Tracker queries are two-way. The tracker receives information through HTTP GET parameters and returns a bencoded message. Although the tracker runs on the server side with its own web server, it could run just as well as, for example, an Apache module.
Tracker GET requests have the following keys:
info_hash
The 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file. Note that this is a substring of the metainfo file. This value almost always has to be URL-escaped.
peer_id
A 20-byte string that the downloader uses as its ID. Each downloader generates its own ID at random when it starts a new download. This value also almost always has to be URL-escaped.
ip
An optional parameter giving the IP address (or DNS host name) of this peer. It is generally used for the origin when it runs on the same machine as the tracker.
port
The port this peer is listening on. The common behavior is to try port 6881 first and, if it is taken, to try 6882, 6883, and so on, giving up after port 6889.
uploaded
The total amount uploaded so far, encoded in decimal ASCII.
downloaded
The total amount downloaded so far, encoded in decimal ASCII.
left
The number of bytes this peer still has to download, encoded in decimal ASCII. This cannot be computed from downloaded and the file length, because the download may be a resume and some downloaded data may have failed the integrity check and had to be downloaded again.
event
An optional key whose value is started, completed, or stopped (or empty, which is the same as not being present). If it is not present, the request is one of the announcements made at regular intervals. started is sent when a download first begins, and completed is sent when the download finishes. No completed is sent if the file was already complete when the downloader started. stopped is sent when the downloader stops downloading.
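The request keys above can be assembled into an announce URL as sketched below. This is an illustration only: the function name announce_url and the tracker address in the usage note are assumptions, and the escaping is delegated to urlencode from the Python standard library, which percent-escapes raw bytes.

```python
from urllib.parse import urlencode

VALID_EVENTS = ("started", "completed", "stopped")


def announce_url(announce, info_hash, peer_id, port,
                 uploaded, downloaded, left, event=None, ip=None):
    """Build the tracker GET URL from the keys described above."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    if event is not None and event not in VALID_EVENTS:
        raise ValueError("event must be started, completed, or stopped")
    params = {
        "info_hash": info_hash,   # raw bytes, escaped by urlencode
        "peer_id": peer_id,       # raw bytes, escaped by urlencode
        "port": port,
        "uploaded": uploaded,     # integers become decimal ASCII
        "downloaded": downloaded,
        "left": left,
    }
    if ip is not None:
        params["ip"] = ip
    if event is not None:
        params["event"] = event   # omitted for regular interval announces
    separator = "&" if "?" in announce else "?"
    return announce + separator + urlencode(params)
```

Calling announce_url("http://tracker.example/announce", info_hash, peer_id, 6881, 0, 0, 100, event="started") with a hypothetical tracker address gives a URL whose info_hash parameter is written as %XX escapes wherever a byte is not URL-safe.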
The tracker's response is also a bencoded dictionary. If the response has a failure reason key, it maps to a human-readable string explaining why the query failed, and no other keys are required. Otherwise, the response must have two keys: interval, which maps to the number of seconds the downloader should wait between regular requests; and peers, which maps to a list of dictionaries, one per peer, each containing the keys peer id, ip, and port, which map to the peer's self-selected ID, its IP address or DNS host name as a string, and its port number. Note that downloaders may send requests outside the scheduled interval if an event occurs or they need more peers.
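A client might interpret an already decoded tracker response roughly as follows. This sketch assumes the response has been bdecoded into a Python dictionary whose keys and string values are bytes; the function name parse_tracker_response is hypothetical.

```python
def parse_tracker_response(response):
    """Return (interval, peers) or raise if the tracker reported a failure.

    Assumes `response` is the decoded dictionary with bytes keys.
    """
    if b"failure reason" in response:
        reason = response[b"failure reason"].decode("utf-8", "replace")
        raise RuntimeError("tracker query failed: " + reason)
    interval = response[b"interval"]  # seconds between regular requests
    peers = [
        (peer[b"peer id"], peer[b"ip"].decode("ascii"), peer[b"port"])
        for peer in response[b"peers"]
    ]
    return interval, peers
```

The returned interval only governs the regular announcements; a client is still free to announce earlier when an event occurs or it runs short of peers.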
If you want to extend the metainfo file or tracker queries, please coordinate with Bram Cohen to make sure that all extensions are compatible.
The BitTorrent peer protocol operates over TCP. It performs efficiently without setting any socket options.
Peer connections are symmetric. Messages sent in both directions look the same, and data can flow in either direction.
The peer protocol refers to pieces of the file by index, as described in the metainfo file, starting at zero. When a peer finishes downloading a piece and verifies that its hash matches, it announces to all of its peers that it has that piece.
Each end of a connection keeps two bits of state: choked or not, and interested or not. Choking is a notification that no data will be sent until unchoking happens. The reasoning and common techniques behind choking are explained later.
Data transfer takes place whenever one side is interested and the other side is not choking. The interest state must be kept up to date at all times: whenever a downloader has nothing it would currently request from a peer if unchoked, it must express lack of interest, even while it is choked. Implementing this properly is tricky, but it lets downloaders know which peers will start downloading immediately once unchoked.
Connections start out choked and not interested.
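The per-connection state can be modeled with four flags, two for each direction. The class and field names below are assumptions made for this sketch and do not come from any particular client.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    """State of one peer connection as seen from the local side."""
    am_choking: bool = True        # we are choking the remote peer
    am_interested: bool = False    # we want data the remote peer has
    peer_choking: bool = True      # the remote peer is choking us
    peer_interested: bool = False  # the remote peer wants our data

    def can_download(self):
        # Data flows to us when we are interested and not choked.
        return self.am_interested and not self.peer_choking

    def can_upload(self):
        # Data flows from us when the peer is interested and we do not choke it.
        return self.peer_interested and not self.am_choking
```

The defaults encode the rule that every connection starts out choked and not interested in both directions.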
When data is being transferred, the downloader should keep several requests queued up at once to get good TCP performance (this is called "pipelining"). On the other side, requests that cannot be written to the TCP buffer immediately should be queued in memory rather than kept in an application-level network buffer, so that they can all be thrown away when a choke happens.
The peer wire protocol consists of a handshake followed by a never-ending stream of length-prefixed messages. The handshake starts with the character 19 (decimal) followed by the string 'BitTorrent protocol'. The leading character is a length prefix, put there in the hope that other new protocols will do the same and so be easy to tell apart.
All later integers sent in the protocol are encoded as four bytes, big-endian.
After the fixed header come eight reserved bytes, which are all zero in current implementations. If you want to extend the protocol using these eight reserved bytes, please coordinate with Bram Cohen to make sure that all extensions are compatible.
Next comes the 20-byte SHA1 hash of the bencoded form of the info value from the metainfo file (the same value that is announced to the tracker as info_hash, only here it is sent raw instead of URL-escaped). If the two sides do not send the same value, they close the connection. The one exception is a downloader that wants to handle multiple downloads over a single port: it may wait for the incoming connection to send a hash first, and respond with the same hash if it is in its list.
After the hash comes the 20-byte peer ID, which is reported in tracker requests and contained in the peer lists of tracker responses. If the receiving side's peer ID does not match the one the initiating side expects, the connection is closed.
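Laid out byte by byte, the handshake is 1 + 19 + 8 + 20 + 20 = 68 bytes. The sketch below builds and checks one; the function names are illustrative, and raising ValueError stands in for closing the connection.

```python
PROTOCOL = b"BitTorrent protocol"
HANDSHAKE_LENGTH = 1 + len(PROTOCOL) + 8 + 20 + 20  # 68 bytes


def build_handshake(info_hash, peer_id):
    """Length prefix, protocol string, 8 reserved zero bytes, hash, peer ID."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    return bytes([len(PROTOCOL)]) + PROTOCOL + bytes(8) + info_hash + peer_id


def parse_handshake(data, expected_info_hash, expected_peer_id=None):
    """Validate a received handshake and return (reserved, info_hash, peer_id)."""
    if len(data) != HANDSHAKE_LENGTH:
        raise ValueError("handshake must be exactly 68 bytes")
    if data[0] != len(PROTOCOL) or data[1:20] != PROTOCOL:
        raise ValueError("unknown protocol header")
    reserved, info_hash, peer_id = data[20:28], data[28:48], data[48:68]
    if info_hash != expected_info_hash:
        raise ValueError("info hash mismatch, closing connection")
    if expected_peer_id is not None and peer_id != expected_peer_id:
        raise ValueError("unexpected peer id, closing connection")
    return reserved, info_hash, peer_id
```

The reserved bytes are returned rather than checked against zero, so that a peer using them for an extension is not rejected outright.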
That completes the handshake. Next comes an alternating stream of length prefixes and messages. Messages of length zero are keepalives and are ignored. Keepalives are generally sent once every two minutes, but timeouts can be much shorter when data is expected.
All non-keepalive messages start with a single byte that gives their type. The possible values are:
· 0 - choke
· 1 - unchoke
· 2 - interested
· 3 - not interested
· 4 - have
· 5 - bitfield
· 6 - request
· 7 - piece
· 8 - cancel
The "choke", "unchoke", "interested", and "not interested" messages have no payload.
The "bitfield" message is only ever sent as the first message. Its payload is a bitfield in which each index the downloader has is set to 1 and the rest are set to 0. Downloaders that do not have anything yet may skip the "bitfield" message. The first byte of the bitfield corresponds to indices 0-7 from the high bit to the low bit, the second byte to indices 8-15, and so on. Spare bits at the end are set to 0.
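The high-bit-first layout is easy to get backwards, so here is a small sketch of packing and unpacking a bitfield. The function names and the num_pieces parameter are introduced for this example.

```python
def encode_bitfield(have, num_pieces):
    """Pack a set of piece indices into bitfield bytes, high bit first."""
    field = bytearray((num_pieces + 7) // 8)
    for index in have:
        if index not in range(num_pieces):
            raise ValueError("piece index out of range")
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def decode_bitfield(field, num_pieces):
    """Return the set of piece indices whose bit is set."""
    if len(field) != (num_pieces + 7) // 8:
        raise ValueError("bitfield has the wrong length")
    have = set()
    for index in range(len(field) * 8):
        if field[index // 8] & (0x80 >> (index % 8)):
            if index >= num_pieces:
                raise ValueError("spare bits must be zero")
            have.add(index)
    return have
```

With 10 pieces, holding pieces 0 and 9 gives the two bytes 10000000 and 01000000 in binary; the last six bits of the second byte are spare.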
The payload of a "have" message is a single number: the index of the piece the downloader has just finished downloading and verified.
A "request" message contains an index, a begin, and a length. The last two are byte offsets. The length is generally a power of 2 unless it is truncated by the end of the file. Current implementations use 2 to the 15th power, and close connections that request more than 2 to the 17th power.
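As a worked example of the arithmetic, the sketch below splits one piece into requests of 2^15 bytes and packs each request payload as three 4-byte big-endian integers. The helper names are hypothetical.

```python
import struct

BLOCK_SIZE = 2 ** 15    # request length used by current implementations
MAX_REQUEST = 2 ** 17   # larger requests cause the connection to be closed


def block_requests(index, piece_size, block_size=BLOCK_SIZE):
    """Yield (index, begin, length) tuples that cover one piece."""
    for begin in range(0, piece_size, block_size):
        yield index, begin, min(block_size, piece_size - begin)


def request_payload(index, begin, length):
    """Pack the payload of a request (or cancel) message."""
    if length > MAX_REQUEST:
        raise ValueError("request length exceeds 2**17")
    return struct.pack(">III", index, begin, length)
```

A full 256 K piece therefore needs 8 requests, and only the last request of a truncated final piece has a length that is not a power of 2.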
A "cancel" message has the same payload as a "request" message. Cancels are generally only sent towards the end of a download, in what is called "endgame mode". When a download is almost complete, the last few pieces tend to all be downloaded over a single slow line, which takes a long time. To make sure the last few pieces arrive quickly, once requests for all the pieces a downloader still lacks are pending, it sends requests for everything to every peer it is downloading from. To keep this from becoming very inefficient, it sends cancels to all the other peers each time a piece arrives.
A "piece" message contains an index, a begin, and a piece of data. Note that piece messages are implicitly correlated with "request" messages. An unexpected piece may arrive if "choke" and "unchoke" messages are sent in quick succession, if the transfer is very slow, or both.
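Putting the framing rules together, every message on the wire is a 4-byte big-endian length, then (unless the length is zero) one type byte and the payload. The following sketch encodes messages and splits a received byte buffer back into them; the names encode_message and decode_messages are assumptions for this example.

```python
import struct

MESSAGE_NAMES = {
    0: "choke", 1: "unchoke", 2: "interested", 3: "not interested",
    4: "have", 5: "bitfield", 6: "request", 7: "piece", 8: "cancel",
}


def encode_message(msg_type=None, payload=b""):
    """Frame one message; calling with no type produces a keepalive."""
    if msg_type is None:
        return struct.pack(">I", 0)
    return struct.pack(">I", 1 + len(payload)) + bytes([msg_type]) + payload


def decode_messages(buffer):
    """Split a buffer into complete messages.

    Returns (messages, remaining), where messages is a list of
    (name, payload) pairs and remaining holds any incomplete tail.
    """
    messages = []
    pos = 0
    while len(buffer) - pos >= 4:
        (length,) = struct.unpack_from(">I", buffer, pos)
        if length > len(buffer) - pos - 4:
            break  # wait for more data from the socket
        body = buffer[pos + 4:pos + 4 + length]
        pos += 4 + length
        if length == 0:
            continue  # keepalive, ignored
        messages.append((MESSAGE_NAMES.get(body[0], "unknown"), body[1:]))
    return messages, buffer[pos:]
```

Returning the unconsumed tail lets the caller append the next chunk read from TCP and call decode_messages again, since message boundaries do not line up with reads.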
Downloaders generally download pieces in random order, which does a reasonably good job of keeping a downloader from holding a strict subset or superset of the pieces held by any of its peers.
Choking is done for several reasons. TCP congestion control behaves very poorly when sending over many connections at once. Choking also lets each peer use a tit-for-tat style algorithm to ensure a consistent download rate.
The choking algorithm described below is the currently deployed one. It is important that any new algorithm work well both in a network consisting entirely of peers running that new algorithm and in a network consisting mostly of peers running this one.
A good choking algorithm should meet several criteria. It should cap the number of simultaneous uploads for good TCP performance. It should avoid choking and unchoking in rapid succession, which is known as "fibrillation". It should reciprocate to peers that let it download. Finally, it should occasionally try unused connections to find out whether they are better than the ones currently in use, which is known as optimistic unchoking.
The current choking algorithm avoids fibrillation by changing who is choked only once every 10 seconds. It reciprocates and caps the number of uploads by unchoking the four interested peers from which it has the best download rates. Peers that have a better upload rate but are not interested are unchoked, and if they become interested, the peer with the worst upload rate is choked. If a downloader has the complete file, it uses its upload rate rather than its download rate to decide whom to unchoke.
For optimistic unchoking, at any one time there is a single peer that is unchoked regardless of its upload rate (if it is interested, it counts as one of the four allowed downloaders). Which peer is optimistically unchoked rotates every 30 seconds. To give new connections a decent chance of getting a complete piece to upload, they are three times as likely as other connections to start as the current optimistic unchoke in the rotation.
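One round of the selection described above could look like the sketch below. It is a simplified reading of the text, not the deployed code: the Peer class and choose_unchoked function are invented here, the caller is assumed to run it every 10 seconds, and the 30-second rotation of the optimistic peer (including the threefold weighting of new connections) is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    rate: float        # download rate from this peer (upload rate to it when seeding)
    interested: bool   # whether the peer wants data from us


def choose_unchoked(peers, optimistic=None, max_downloaders=4):
    """Return the names of the peers to unchoke for the next 10 seconds."""
    unchoked = set()
    downloaders = 0
    if optimistic is not None:
        # The optimistic unchoke ignores rate; it uses a slot only if interested.
        unchoked.add(optimistic.name)
        if optimistic.interested:
            downloaders += 1
    for peer in sorted(peers, key=lambda p: p.rate, reverse=True):
        if downloaders >= max_downloaders:
            break
        if peer.name in unchoked:
            continue
        # Faster peers that are not interested are unchoked without using a
        # slot, so they can start at once if they become interested.
        unchoked.add(peer.name)
        if peer.interested:
            downloaders += 1
    return unchoked
```

Walking the peers from fastest to slowest and stopping at the fourth interested one produces both behaviors in the text: the four best interested peers are unchoked, and so is any faster peer that is not currently interested.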