Translated from "The eMule Protocol Specification" by Yoram Kulbak and Danny Bickson
Translation: lzcx
QQ: 402722857
Email: lzcx_cn@yahoo.com.cn
For study purposes only. For more information, see the source.
1
Introduction
1.1
Purpose and scope
eMule is a popular file sharing program based on the eDonkey protocol. This report describes the network behavior of eMule and explains the basic terms required to understand the protocol. It also provides a complete specification of the eMule network protocol, including an appendix that lists the message formats. The document is based on the open-source eMule client. The rest of this introduction provides the background needed to read and understand the document. More information about eMule can be found in the original specification.
1.2
Overview
The eMule network is composed of hundreds of eMule servers and millions of eMule clients. A client must connect to a server to obtain network services, and the server connection stays open for as long as the client is in the system. The servers mainly provide centralized indexing services (as in Napster) and do not communicate with one another. Each eMule client is preconfigured with a server list and a list of shared files on its local file system. The client uses a TCP connection to an eMule server to log on to the network and to obtain information about desired files and available clients. The eMule client also uses hundreds of TCP connections to other clients to upload and download files.
Each eMule client maintains an upload queue for each shared file. A downloading client joins the queue at the bottom and advances gradually until it reaches the top and starts to download its file. A client can download different blocks of the same file from several different eMule clients, and it can also upload blocks of a file that it has not finished downloading. Finally, eMule extends the eDonkey capabilities and allows clients to exchange information about servers, other clients, and files. Note that the communication between the client and the server is based on TCP.
The server uses an internal database to store information about clients and files. An eMule server does not store any files; it acts as a centralized index of file locations. Another function of the server, which is starting to be deprecated, is to bridge between clients that are behind a firewall and cannot accept incoming connections. This bridging function increases the server load. eMule also uses UDP to enhance the client's capabilities toward servers and other clients. The ability to send and receive UDP messages is not mandatory for daily use: the client works flawlessly when a firewall prevents it from sending and receiving UDP messages.
1.2.1
Client-to-server connection
On startup, the client connects to a single eMule server using TCP. The server provides a client ID that is valid only for the lifetime of the client-server connection (note: a client with a high ID receives the same ID from all servers until its IP address changes). After the connection is established, the client sends its shared file list to the server. The server stores this list in its internal database, which usually contains hundreds of thousands of available files and active clients. The eMule client also sends a download list containing the files it wants to download. Chapter 2 describes the client-server TCP message exchange in detail. After the connection is established, the eMule server sends the client a list of other clients that hold the files it wants to download (these clients are called "sources"). From this point on, the eMule client starts to establish connections with other clients, as described in section 1.2.2. Note that the client-server TCP connection remains open during the entire client session.
After the initial handshake, communication is driven mainly by user activity. From time to time the client sends a file search request, which the server answers with a search result. A search is usually followed by a source query for a specific file, which is answered with a list of sources (IP address and port) from which the file can be downloaded. UDP is used for communication between the client and servers other than the one it is connected to. The purpose of the UDP messages is to enhance file search, to enhance source search, and to keep connections alive (that is, to make sure that the eMule servers in the client's server list are valid). Chapter 3 gives more details about the client-server UDP message exchange.
1.2.2
Client-to-client connection
An eMule client connects to another eMule client (a source) to download a file. A file is divided into parts, which are further divided into fragments. A client can download the same file from several different clients, obtaining different fragments from each. When two clients connect, they exchange capability information and negotiate the start of a download (or upload, depending on your point of view).
Each client has a download queue, which holds the list of clients waiting to download files from it. When the queue is empty, a download request will most likely start a download (unless, for example, the requester is banned). When the queue is not empty, the requesting client is added to the queue. Only a few clients are served at any given time, so that each one can be given a minimum bandwidth of 2.4 KB/s. A downloading client may be preempted by a waiting client with a higher queue rank. During the first 15 minutes of a download session, the queue rank of the downloading client is boosted to prevent thrashing. When a waiting client reaches the head of the download queue, the uploading client initiates a connection to send it the file blocks it needs.
An eMule client can be registered in the waiting queues of several other clients for the same file blocks. When the waiting client obtains those blocks from one of them, it does not notify the others to remove it from their queues; when it reaches the head of their queues, it simply rejects their upload attempt.
eMule uses a credit system to encourage uploads. To prevent impersonation, the credit system is protected with RSA public-key cryptography. Client connections may use a set of messages not defined by the eDonkey protocol, called the extended protocol. The extended protocol is used to implement the credit system, to exchange general information (such as server and source list updates), and to improve performance by sending and receiving compressed file blocks. While an eMule client is waiting to start downloading a file, it uses UDP to periodically check its status in the upload queue of its peer client.
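The preemption rule described above can be illustrated with a minimal Python sketch. The `Peer` class, the convention that a higher rank is served first, and the size of the boost (`ASSUMED_BOOST`) are all assumptions made for illustration; the text gives only the 15-minute window, not how the rank is computed or boosted.

```python
from dataclasses import dataclass

BOOST_WINDOW_S = 15 * 60   # the 15-minute window mentioned in the text
ASSUMED_BOOST = 1000       # hypothetical boost size; the text gives no figure


@dataclass
class Peer:
    name: str
    rank: int              # assumed convention: higher rank is served first


def effective_rank(downloader: Peer, session_age_s: float) -> int:
    """Rank of an active downloader, boosted while its session is young."""
    if session_age_s < BOOST_WINDOW_S:
        return downloader.rank + ASSUMED_BOOST
    return downloader.rank


def should_preempt(downloader: Peer, session_age_s: float, waiter: Peer) -> bool:
    """True if the waiting peer outranks the (possibly boosted) downloader."""
    return waiter.rank > effective_rank(downloader, session_age_s)


active = Peer("active", rank=10)
waiting = Peer("waiting", rank=50)
print(should_preempt(active, 60, waiting))        # False: inside the boost window
print(should_preempt(active, 20 * 60, waiting))   # True: the boost has expired
```

The boost keeps a freshly started download from being interrupted immediately by a slightly better-ranked waiter, which is the thrashing the text refers to.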
1.3
Client ID
The client ID is a 4-byte identifier provided by the server during the connection handshake. It is valid only for the lifetime of the client-server TCP connection, although a client with a high ID is assigned the same ID by all servers until its IP address changes. Client IDs are divided into low IDs and high IDs. The eMule server assigns a low ID when the client cannot accept incoming connections. Having a low ID limits the client's use of the eMule network and may cause a server to reject the client's connection. A high ID is calculated from the client's IP address, as described below. This section describes the assignment and significance of client IDs from the perspective of the eMule protocol.
Clients that allow other clients to connect freely to the eMule TCP port on their local machine (the default port number is 4662) are assigned a high ID. A client with a high ID has no restrictions on its use of the eMule network. When the server cannot open a TCP connection to the client's eMule port, it assigns the client a low ID. This mainly happens when a firewall on the client's machine blocks incoming connections. A client also receives a low ID in the following cases:
- when the client connects to the server through a NAT or a proxy server
- when the server is busy (causing the server's reconnect timer to expire)
A high ID is calculated as follows: if the host IP address is x.y.z.w, the ID is x + 2^8 * y + 2^16 * z + 2^24 * w. A low ID is always less than 16777216 (0x1000000). I could not find any clue about how it is calculated, and different servers appear to assign different low IDs.
A low-ID client has no public IP address that other clients can connect to, so all communication must go through the eMule server. This increases the computational load on the server and makes servers reluctant to accept low-ID clients. It also means that a low-ID client cannot connect to a low-ID client on a different server, because eMule does not support tunneling requests between servers. A callback mechanism was introduced to support low-ID clients. With this mechanism, a high-ID client can ask the low-ID client (through the eMule server) to connect to it in order to exchange files.
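The high-ID formula and the low-ID threshold can be checked with a short Python sketch. The function names are hypothetical; only the formula and the 0x1000000 limit come from the text.

```python
LOW_ID_LIMIT = 16777216  # 0x1000000, from the text


def high_id_from_ip(ip: str) -> int:
    """Compute x + 2^8*y + 2^16*z + 2^24*w for an address x.y.z.w."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {ip!r}")
    x, y, z, w = octets
    return x + (y << 8) + (z << 16) + (w << 24)


def ip_from_high_id(client_id: int) -> str:
    """Invert the formula: recover x.y.z.w from a high ID."""
    return ".".join(str((client_id >> shift) & 0xFF) for shift in (0, 8, 16, 24))


def is_low_id(client_id: int) -> bool:
    """Low IDs are always below 0x1000000."""
    return client_id < LOW_ID_LIMIT


cid = high_id_from_ip("10.0.0.1")
print(cid)                   # 16777226
print(is_low_id(cid))        # False
print(ip_from_high_id(cid))  # 10.0.0.1
```

Because the first octet lands in the lowest byte, the ID is simply the four address bytes read as a little-endian 32-bit integer, which is why a peer can recover the address from a high ID.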
1.4
User ID
eMule supports a credit system to encourage users to share files. The more files a user uploads to other clients, the more credit it receives and the faster it advances in their waiting queues. The user ID is a 128-bit (16-byte) GUID created by concatenating random numbers. The 6th and 15th bytes are not randomly generated; their values are 14 and 111, respectively. Whereas the client ID is valid only for a client's session with a specific server, the user ID (also called the user hash) is unique and identifies the client across sessions (the user ID identifies the workstation). The user ID plays an important role in the credit system, which gives attackers a motive to impersonate other users in order to obtain the privileges granted by their credits. eMule provides an encryption scheme designed to prevent fraud and impersonation. The implementation is a simple challenge-response exchange that relies on RSA public/private key encryption.
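As a rough illustration of the user ID layout, the Python sketch below builds a 16-byte value with the two fixed bytes in place. It assumes that "6th" and "15th" are counted from 1 (zero-based indices 5 and 14) and uses `os.urandom` as the random source; both are assumptions, and the function names are hypothetical.

```python
import os


def new_user_id() -> bytes:
    """Return a 16-byte user ID with the two fixed marker bytes set."""
    raw = bytearray(os.urandom(16))
    raw[5] = 14     # 6th byte, counting from 1 (assumed convention)
    raw[14] = 111   # 15th byte, counting from 1 (assumed convention)
    return bytes(raw)


def has_emule_markers(user_id: bytes) -> bool:
    """Check the length and the two fixed bytes described in the text."""
    return len(user_id) == 16 and user_id[5] == 14 and user_id[14] == 111


uid = new_user_id()
print(uid.hex())
print(has_emule_markers(uid))
```

The fixed bytes only mark the ID's format. They do not prove who owns the ID, which is why the text relies on the RSA challenge-response exchange to prevent impersonation.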
1.5
File ID
File IDs are used both to uniquely identify files in the network and to detect and repair file corruption. Note that eMule does not rely on the file name to identify and catalog a file; a file is identified by a GUID computed by hashing its content. There are two types of file ID: one is used mainly to generate the unique file ID, and the other is used to detect and repair corruption.
1.5.1
File hash
Files are identified by a 128-bit GUID hash that the client calculates from the file content. The GUID is calculated by applying the MD4 algorithm to the file data. When the file ID is calculated, the file is divided into parts of 9.28 MB each. A GUID hash is calculated for each part, and all the hashes are then combined into the unique file ID. When a downloading client finishes downloading a part of the file, it calculates the part hash and compares it with the hash sent by its peers. If the part is corrupted, the client tries to repair it by gradually replacing fixed-size blocks of the part (the block size in KB is not given in this text) until the hash matches.
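The per-part hashing and the corruption check can be sketched in Python as follows. Several points are assumptions rather than statements from the text: the exact part size in bytes (`ASSUMED_PART_SIZE`), the way part hashes are combined (here, hashing their concatenation, with a single-part file using its part hash directly), and all function names. MD4 is often unavailable in Python's `hashlib`, so the digest is passed in as a parameter and the demo uses MD5 purely as a 128-bit stand-in.

```python
import hashlib
from typing import Callable, List

# Assumption: the exact byte count behind the "9.28 MB" figure in the text.
ASSUMED_PART_SIZE = 9_728_000

Digest = Callable[[bytes], bytes]


def part_hashes(data: bytes, digest: Digest,
                part_size: int = ASSUMED_PART_SIZE) -> List[bytes]:
    """Hash every fixed-size part of the data separately."""
    if not data:
        return [digest(b"")]
    return [digest(data[i:i + part_size])
            for i in range(0, len(data), part_size)]


def file_id(data: bytes, digest: Digest,
            part_size: int = ASSUMED_PART_SIZE) -> bytes:
    """Combine the part hashes into one identifier."""
    hashes = part_hashes(data, digest, part_size)
    if len(hashes) == 1:
        return hashes[0]  # assumption: a single part is its own file ID
    return digest(b"".join(hashes))  # assumption: hash of the concatenation


def corrupted_parts(data: bytes, expected: List[bytes], digest: Digest,
                    part_size: int = ASSUMED_PART_SIZE) -> List[int]:
    """Return the indices of parts whose hash differs from the expected one."""
    actual = part_hashes(data, digest, part_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


def md5_stand_in(chunk: bytes) -> bytes:
    """Stand-in digest for the demo only; the protocol text specifies MD4."""
    return hashlib.md5(chunk).digest()


original = bytes(range(10)) * 3                      # 30 bytes of sample data
expected = part_hashes(original, md5_stand_in, 8)    # parts of 8, 8, 8, 6 bytes
damaged = original[:9] + b"\xff" + original[10:]     # corrupt one byte in part 1
print(len(expected))                                         # 4
print(corrupted_parts(damaged, expected, md5_stand_in, 8))   # [1]
```

Hashing each part separately is what lets a client locate the damaged part and fetch only that part again instead of the whole file.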
1.5.2
Root hash
A root hash is calculated for each part using the SHA-1 algorithm, based on fixed-size blocks within the part (the block size in KB is not given in this text). It provides a higher level of reliability and fault recovery. More information can be found on the official eMule website.
1.6
eMule protocol extensions
Although eMule is fully compatible with eDonkey, it implements several extensions that allow two eMule clients to offer their users additional functions. The extensions mainly concern client-to-client communication, especially security and the use of UDP. In this document, all messages that belong to the eMule extension are shown in gray in the message flow diagrams.
1.7
Soft and hard limits
The server configuration contains two limits on the number of active users: a soft limit and a hard limit. The hard limit is greater than or equal to the soft limit. When the number of active users reaches the soft limit, the server stops accepting new low-ID client connections. When the number of users reaches the hard limit, the server is full and does not accept any client connections.
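The two-threshold admission rule above can be summarized in a few lines of Python. This is only a sketch: the function and parameter names (`admits`, `soft_limit`, `hard_limit`) are hypothetical, and it assumes the lower threshold never exceeds the upper one.

```python
def admits(active_users: int, is_low_id: bool,
           soft_limit: int, hard_limit: int) -> bool:
    """Decide whether a server would accept one more client connection."""
    if soft_limit > hard_limit:
        raise ValueError("the soft limit must not exceed the hard limit")
    if active_users >= hard_limit:
        return False   # server is full: nobody is accepted
    if is_low_id and active_users >= soft_limit:
        return False   # past the soft limit: low-ID clients are refused
    return True


print(admits(900, True, soft_limit=1000, hard_limit=1200))    # True
print(admits(1000, True, soft_limit=1000, hard_limit=1200))   # False
print(admits(1000, False, soft_limit=1000, hard_limit=1200))  # True
print(admits(1200, False, soft_limit=1000, hard_limit=1200))  # False
```

Refusing low-ID clients first fits the earlier observation that they place extra load on the server.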