How eMule handles file corruption

Source: Internet
Author: User

eMule uses several methods to ensure that files downloaded and shared on the network are free of errors. When an error does occur (this is called corruption), eMule's advanced features can repair the damage while re-downloading as little data as possible.

File hash and ICH - Intelligent Corruption Handling

File hash, file segment hash value and HashSet

For every file shared on the network, the MD4 hash algorithm is used to compute a value that is, for practical purposes, unique. This value is called the file hash (or simply: hash) and is included in every standard ed2k link. For example: ed2k://|file|name|12043984|6744fc42eda527b27f0b2f2538728b3e|/

Here 6744fc42eda527b27f0b2f2538728b3e is the file hash, which uniquely identifies the file on the network.

To compute the file hash, the file is divided into file segments of 9.28 MB each, and a file segment hash is computed for every segment with the same MD4 algorithm. These file segment hashes, together called the hashset, are then used to calculate the final file hash. For example, a 600 MB file is divided into 65 file segments, each with its own file segment hash, and the final file hash is calculated from them.
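The segment arithmetic above can be checked with a short Python sketch. It assumes that a file segment is exactly 9,728,000 bytes (the 9.28 MB figure) and that 1 MB means 1024 * 1024 bytes; the constant and function names are illustrative only.

```python
SEGMENT_SIZE = 9_728_000  # assumed exact size of one 9.28 MB file segment, in bytes


def segment_count(file_size):
    """Number of file segments needed to cover file_size bytes (rounds up)."""
    return (file_size + SEGMENT_SIZE - 1) // SEGMENT_SIZE


print(segment_count(600 * 1024 * 1024))  # 600 MB file -> 65
print(segment_count(12_043_984))         # 12043984-byte file -> 2
```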

To ensure that eMule always receives the correct hashset, you can create a special link that contains the hashset, for example: ed2k://|file|name|12043984|6744fc42eda527b27f0b2f2538728b3e|p=264e6f6b587985d87eb0157a2a7baf40:17b9a4d1dce0e4c2b672df257145e98a|/

The value after p= is the hashset. The individual file segment hashes are separated by a colon ":". The file in the example is 12043984 bytes (= 11.49 MB), so it is split into two file segments: one complete 9.28 MB file segment and one holding the remainder, each with its own file segment hash.
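As an illustration of the link layout described above, the following Python sketch splits an ed2k file link into its fields. The function name and the returned dictionary keys are hypothetical, and the sketch only separates the fields; it does not verify that the hashes are valid.

```python
def parse_ed2k_link(link):
    """Split an ed2k file link into name, size, file hash and optional hashset."""
    prefix = "ed2k://|file|"
    if not link.startswith(prefix):
        raise ValueError("not an ed2k file link")
    # Split on "|", trim stray spaces, drop empty fields and the final "/".
    fields = [f.strip() for f in link[len(prefix):].split("|")]
    fields = [f for f in fields if f and f != "/"]
    if len(fields) < 3:
        raise ValueError("link needs at least name, size and file hash")
    name, size, file_hash = fields[0], int(fields[1]), fields[2]
    # Any remaining fields are treated as key=value options such as p=...
    options = {}
    for field in fields[3:]:
        key, _, value = field.partition("=")
        options[key.strip()] = value.strip()
    segment_hashes = options["p"].split(":") if "p" in options else []
    return {
        "name": name,
        "size": size,
        "file_hash": file_hash,
        "segment_hashes": segment_hashes,
    }


link = ("ed2k://|file|name|12043984|6744fc42eda527b27f0b2f2538728b3e|"
        "p=264e6f6b587985d87eb0157a2a7baf40:17b9a4d1dce0e4c2b672df257145e98a|/")
info = parse_ed2k_link(link)
print(info["size"], len(info["segment_hashes"]))  # 12043984 2
```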

ICH - Intelligent Corruption Handling

When eMule finishes downloading a file segment, it checks whether the data matches the file segment hash for that segment. If it does, the segment is made available for upload to help spread the file.

If the data does not match, the corruption is reported and the segment has to be downloaded again. To avoid re-downloading the full 9.28 MB, ICH downloads only the first 180 KB of the file segment and then checks the file segment hash again. If the hash still does not match, the next 180 KB is downloaded and the check is repeated, and so on until the file segment hash is correct. In the best case, eMule only needs to re-download the first 180 KB of the file segment. In the worst case, when the damaged part lies at the end of the file segment, the entire file segment is downloaded again. On average, ICH saves about 50% of the re-download when a file segment is corrupted.
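The stepwise repair described above can be sketched in Python as follows. This is a simplified model, not eMule's code: the fetch callback, the function names and the return value are hypothetical, and SHA-1 from hashlib stands in for MD4 because MD4 is not available in every hashlib build.

```python
import hashlib

STEP = 180 * 1024  # 180 KB re-download step


def segment_digest(data):
    # Stand-in hash (assumption): eMule uses MD4 for file segment hashes.
    return hashlib.sha1(data).digest()


def ich_repair(corrupt, fetch, expected_digest):
    """Replace the segment STEP bytes at a time from the start until it verifies.

    fetch(start, end) is a hypothetical callback returning fresh bytes for
    that range. Returns (repaired_bytes, bytes_redownloaded).
    """
    data = bytearray(corrupt)
    redownloaded = 0
    for start in range(0, len(data), STEP):
        if segment_digest(bytes(data)) == expected_digest:
            break  # segment verifies, no further download needed
        end = min(start + STEP, len(data))
        data[start:end] = fetch(start, end)
        redownloaded += end - start
    if segment_digest(bytes(data)) != expected_digest:
        raise ValueError("segment still corrupt after full re-download")
    return bytes(data), redownloaded


# Demo with a small 5-step segment whose second 180 KB range is damaged.
good = bytes(range(256)) * (5 * STEP // 256)
bad = bytearray(good)
bad[STEP + 10] ^= 0xFF
repaired, cost = ich_repair(bytes(bad), lambda s, e: good[s:e], segment_digest(good))
print(repaired == good, cost // STEP)  # True 2
```

In the demo the damage sits in the second 180 KB range, so two steps are re-downloaded before the segment verifies; damage near the end would cost almost the whole segment.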

AICH - Advanced Intelligent Corruption Handling

Standard ICH is quite effective, but it has one limitation: it can only verify a whole 9.28 MB file segment, not finer pieces. If more than one location is damaged, or if malicious clients deliberately send corrupted data over and over again or even spread forged file segment hashes, ICH is no longer effective.

AICH closes this gap. By using much finer hashes, it ensures the integrity of the entire data with minimal overhead and minimal re-downloading.

Root hash, block hash & AICH HashSet

AICH also starts from the 9.28 MB file segments. Each file segment is divided into blocks of 180 KB, which gives 53 blocks per file segment, and a hash value is computed for each block with the SHA1 hash algorithm. These values are called block hashes and form the lowest level of a complete AICH hashset.
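A minimal Python sketch of the block split described above, assuming a file segment of exactly 9,728,000 bytes and blocks of 180 * 1024 bytes (so the last block of a segment is shorter than the others). The function name is illustrative.

```python
import hashlib

SEGMENT_SIZE = 9_728_000   # assumed exact size of a 9.28 MB file segment
BLOCK_SIZE = 180 * 1024    # 180 KB block


def block_hashes(segment):
    """SHA1 digest of every 180 KB block in one file segment."""
    return [
        hashlib.sha1(segment[start:start + BLOCK_SIZE]).digest()
        for start in range(0, len(segment), BLOCK_SIZE)
    ]


hashes = block_hashes(bytes(SEGMENT_SIZE))  # a segment of zero bytes as sample data
print(len(hashes))  # 53
```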

The figure above shows how a complete hash tree is built for a file with 4 file segments. Each file segment contains 53 blocks, for a total of 212 block hashes, which lead up to the root hash through a hash tree of 7 levels. The entire hash tree is called the AICH hashset.

In the figure, the green and yellow dots show the mathematical dependencies between the lowest-level block hashes and the root hash. This means that if we trust the root hash, the entire hash tree can be validated against it.
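The dependency between block hashes and the root hash can be illustrated with a generic binary hash tree in Python. This is a simplified sketch under its own assumptions (pairwise combination, an odd node carried up unchanged); eMule's actual AICH tree layout may split and combine nodes differently.

```python
import hashlib


def sha1(data):
    return hashlib.sha1(data).digest()


def root_hash(leaves):
    """Combine leaf hashes pairwise, level by level, until one hash remains."""
    if not leaves:
        raise ValueError("need at least one leaf hash")
    level = list(leaves)
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(sha1(level[i] + level[i + 1]))
            else:
                parents.append(level[i])  # odd node is carried up unchanged
        level = parents
    return level[0]


# 212 sample block hashes, matching the 4 x 53 blocks in the example above.
leaves = [sha1(bytes([i])) for i in range(212)]
trusted_root = root_hash(leaves)

# Changing any single block hash changes the root, so a trusted root
# is enough to detect a forged or damaged hash anywhere in the tree.
tampered = list(leaves)
tampered[100] = sha1(b"forged")
print(root_hash(tampered) == trusted_root)  # False
```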

eMule can create a link that contains the root hash, for example: ed2k://|file|name|12043984|6744fc42eda527b27f0b2f2538728b3e|h=a2nwotyuruu3p3gcub6kcnw3ftyyelqb|/

The value after h= is the root hash. Publishing a trusted root hash with a release significantly improves the ability to repair file corruption.
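The text does not say how the h= value is encoded. Assuming it is a Base32-encoded SHA1 digest (an assumption, although the 32-character length fits a 20-byte hash), it can be decoded with Python's standard library as a quick sanity check:

```python
import base64

root_text = "a2nwotyuruu3p3gcub6kcnw3ftyyelqb"  # h= value from the example link

# casefold=True accepts the lowercase spelling used in the example.
root_bytes = base64.b32decode(root_text, casefold=True)
print(len(root_bytes))  # 20, the length of a SHA1 digest
```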

Recovery of corrupted data

When eMule detects corruption in a file segment, it requests a recovery packet from a randomly chosen client that holds a complete AICH hashset. The recovery packet contains all 53 block hashes of the corrupted file segment, plus a number of checksum hashes from the hash tree. The figure above shows the recovery packet for a file with 4 file segments. The number of checksum hashes depends on the number of file segments: 2^x >= number of file segments, where x is the number of checksum hashes.

After receiving the recovery packet, eMule checks the checksum hashes against the trusted root hash. If they match, eMule checks each of the 53 blocks of the corrupted file segment against the block hashes in the recovery packet. AICH then keeps every block that matches its block hash and re-downloads only the corrupted blocks.

A successful recovery produces log entries similar to the following:
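The relation 2^x >= number of file segments can be evaluated directly. The sketch below assumes x is the smallest integer that satisfies it, which the text implies but does not state; the function name is illustrative.

```python
def checksum_hash_count(segments):
    """Smallest x with 2**x >= segments (assumed reading of the formula above)."""
    if segments < 1:
        raise ValueError("a file has at least one segment")
    return (segments - 1).bit_length()


print(checksum_hash_count(4))   # file with 4 file segments -> 2
print(checksum_hash_count(65))  # 600 MB file with 65 file segments -> 7
```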
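The block-by-block check can be sketched in Python as follows. It assumes the block hashes have already been validated against the trusted root hash; the function name is hypothetical and the demo uses a small 4-block segment instead of a full 53-block one.

```python
import hashlib

BLOCK_SIZE = 180 * 1024  # 180 KB block


def corrupt_blocks(segment, trusted_block_hashes):
    """Indices of the blocks whose SHA1 digest does not match the trusted hash."""
    bad = []
    for index, expected in enumerate(trusted_block_hashes):
        block = segment[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        if hashlib.sha1(block).digest() != expected:
            bad.append(index)
    return bad


# Demo: a 4-block segment with damage in blocks 1 and 3.
good = bytes(range(256)) * (4 * BLOCK_SIZE // 256)
trusted = [hashlib.sha1(good[i:i + BLOCK_SIZE]).digest()
           for i in range(0, len(good), BLOCK_SIZE)]
damaged = bytearray(good)
damaged[BLOCK_SIZE + 5] ^= 0xFF
damaged[3 * BLOCK_SIZE] ^= 0x01
print(corrupt_blocks(bytes(damaged), trusted))  # [1, 3]
```

Only the listed blocks need to be downloaded again; every other block of the segment is kept.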

09.09.2004 02:43:43: Downloaded file segment 6 is corrupt ([filename])

09.09.2004 02:43:46: AICH successfully recovered 8.22 MB of 9.28 MB from file segment 6 for [filename]

Trusting the root hash

It is best to download from a link that contains a root hash. Assuming the source of the link is reliable, the root hash is trusted for this file and saved to disk.

If the link contains no root hash, eMule has to rely on the root hash values sent by the sources of the file. It only trusts a root hash if at least 10 different sources have sent the same value and at least 92% of all sources agree on it. Because such a root hash is less reliable, it is valid only for the current session: it is not saved and is not used to create links with a root hash.

Once eMule has built a complete AICH hashset, for example when the file has finished downloading, it propagates the root hash to other clients.
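The trust rule above can be written as a small Python check. The function name is hypothetical, and the sketch assumes the input list holds exactly one reported root hash per distinct source.

```python
from collections import Counter

MIN_SOURCES = 10            # at least 10 sources must send the same value
MIN_AGREEMENT_PERCENT = 92  # and at least 92% of all sources must agree


def trusted_root_hash(reported):
    """Return the root hash to trust for this session, or None if none qualifies."""
    if not reported:
        return None
    value, count = Counter(reported).most_common(1)[0]
    enough_sources = count >= MIN_SOURCES
    # Integer arithmetic avoids rounding problems at exactly 92%.
    enough_agreement = count * 100 >= MIN_AGREEMENT_PERCENT * len(reported)
    return value if enough_sources and enough_agreement else None


print(trusted_root_hash(["A"] * 23 + ["B"] * 2))  # 92% agree -> A
print(trusted_root_hash(["A"] * 22 + ["B"] * 3))  # 88% agree -> None
print(trusted_root_hash(["A"] * 9))               # too few sources -> None
```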

Notes:

Newly published or rare files may not have enough sources to establish a trusted root hash. Publishers are strongly advised to include the root hash in the link when publishing a file.

If no root hash is available, or only a forged one, eMule can usually still download and complete the file successfully. AICH, however, is not available in that case.

Because AICH hashsets are very large, they are not kept in memory. They are saved to the Known2.met file and read only when needed.

AICH is only supported by eMule v.44a and later, but compatibility with older clients is preserved.
