What exactly is a hash? How does it work?

Source: Internet
Author: User
It has been about two years since eMule was born. As eMule has grown more popular, more and more people have come to like it. But eMule has a higher technical threshold than BT and is not as easy to get started with, so many friends have long had questions of one kind or another. Today is the weekend, so I will offer my humble effort and write an article about the hash.

Everyone who uses eMule knows that "hash" is the word that appears most often in eMule. So what is a hash?

Let's first go over some basic knowledge as a warm-up, so that we can better understand the hash.

A hash (the Chinese term is sometimes a translation meaning "scatter" and sometimes a direct transliteration of the English word) takes an input of arbitrary length, also known as the pre-image, and transforms it through a hashing algorithm into an output of fixed length. That output is the hash value. The transformation is a compressing map: the space of hash values is usually much smaller than the space of inputs, so different inputs may hash to the same output, and the input cannot be uniquely determined from the hash value.
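The remark that different inputs may share one output follows from simple counting: there are more possible inputs than hash values. The sketch below is only an illustration, assuming Python's standard hashlib; the 8-bit `tiny_hash` helper (the first byte of an MD5 digest) is made up so that a collision turns up quickly.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Hypothetical 8-bit hash: the first byte of the MD5 digest."""
    return hashlib.md5(data).digest()[0]

def find_collision():
    """Return two different inputs with the same tiny_hash value."""
    seen = {}  # hash value -> first input that produced it
    # Only 256 outputs exist, so 257 distinct inputs must contain a collision.
    for i in range(257):
        data = str(i).encode()
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data
        seen[h] = data
    raise AssertionError("unreachable: pigeonhole principle")

a, b = find_collision()
print(a, b, tiny_hash(a))
```

With a real 128-bit or 160-bit hash the same argument still holds, but finding a collision by trial like this is no longer practical.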

Simply put, it is a function that compresses a message of any length into a message digest of fixed length.

Hashes are mainly used in cryptographic algorithms in the field of information security. They turn information of varying lengths into a scrambled fixed-length code (128 bits in the case of MD4 and MD5), called the hash value. One can also say that hashing means finding a mapping between the content of data and the address where the data is stored.
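To make the "any length in, fixed length out" idea concrete, here is a small sketch using MD5 from Python's standard hashlib. The sample inputs are arbitrary; only the lengths matter.

```python
import hashlib

inputs = [b"", b"a", b"emule" * 1000]
for data in inputs:
    digest = hashlib.md5(data).hexdigest()
    # 32 hex characters = 16 bytes = 128 bits, whatever the input length
    print(len(data), len(digest) * 4, digest)

# Collect the digest length in bits for each input.
digest_bits = [len(hashlib.md5(d).digest()) * 8 for d in inputs]
print(digest_bits)
```

An empty input, a single byte, and a 5000-byte input all produce a 128-bit value.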

Now that we understand the basic definition of a hash, we cannot avoid mentioning some well-known hash algorithms. MD5 and SHA1 are arguably the most widely used hash algorithms, and both are based on the design of MD4. So what do these names mean?
Here is a quick overview:

1) MD4
MD4 (RFC 1320) was designed by Ronald L. Rivest of MIT in 1990; MD stands for Message Digest. It is suited to fast software implementation on 32-bit processors, because it is based on bit operations on 32-bit operands.

2) MD5
MD5 (RFC 1321) is Rivest's 1991 improvement on MD4. The input is still processed in 512-bit blocks, and the output is the concatenation of four 32-bit words, the same as MD4. MD5 is more complex than MD4 and somewhat slower, but it is more secure and resists cryptanalysis and differential attacks better.

3) SHA1 and others
SHA1 was designed by NIST and the NSA for use with DSA. It produces a 160-bit hash value for inputs shorter than 2^64 bits, and therefore resists brute-force attacks better. The design of SHA-1 is based on the same principles as MD4 and is modeled on that algorithm.
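The block and output sizes quoted above can be checked against Python's standard hashlib, as in the sketch below. MD4 is left out on purpose: whether hashlib offers it depends on the underlying OpenSSL build, so this sketch does not assume it is available.

```python
import hashlib

sizes = {}
for name in ("md5", "sha1"):
    h = hashlib.new(name)
    # block_size and digest_size are reported in bytes; convert to bits
    sizes[name] = (h.block_size * 8, h.digest_size * 8)
    print(name, "block bits:", h.block_size * 8, "digest bits:", h.digest_size * 8)
```

Both algorithms consume the input in 512-bit blocks; MD5 yields 128 bits and SHA1 yields 160 bits.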

So what are these hash algorithms used for?
The application of hash algorithms in information security shows up mainly in the following three areas:

1) File verification
We are familiar with parity checks and CRC checksums. These two kinds of check offer no protection against tampering: to some extent they can detect and correct channel errors in data transmission, but they cannot prevent malicious modification of the data.
The "digital fingerprint" property of the MD5 hash algorithm has made it one of the most widely used file integrity checksum algorithms, and many UNIX systems provide a command to compute an MD5 checksum.
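A file checksum of this kind can be sketched in a few lines. The example below assumes Python's standard hashlib; the helper name `file_md5`, the chunk size, and the demo file are made up for illustration. It reads the file in chunks, so large files need not fit in memory, and it shows that changing a single bit changes the checksum.

```python
import hashlib
import os
import tempfile

def file_md5(path: str, chunk_size: int = 64 * 1024) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a temporary file (the content is made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    with open(path, "wb") as f:
        f.write(b"hello emule" * 1000)
    before = file_md5(path)

    # Flip one bit of one byte in the middle of the file.
    with open(path, "r+b") as f:
        f.seek(500)
        byte = f.read(1)
        f.seek(500)
        f.write(bytes([byte[0] ^ 0x01]))
    after = file_md5(path)

print(before)
print(after)
print("modified:", before != after)
```

Comparing the digest computed locally with one published by the sender is the usual way such a checksum is used.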
2) Digital Signature
Hash algorithms are also an important part of modern cryptographic systems. Because asymmetric algorithms are slow, one-way hash functions play an important role in digital signature protocols. Digitally signing the hash value, also known as the "digital digest", can be considered statistically equivalent to digitally signing the file itself. Such a protocol has other advantages as well.
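The "sign the digest, not the file" idea can be sketched with textbook RSA. Everything in this example is an assumption made for illustration: the key is a toy built from two known Mersenne primes, there is no padding, and MD5 is used only because its 128-bit digest fits under the toy modulus. It is not secure and is not how any particular product signs data. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. NOT secure; illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # modulus of about 150 bits, larger than any MD5 digest
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (modular inverse, Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Hash the message and read the 128-bit MD5 digest as an integer."""
    return int.from_bytes(hashlib.md5(message).digest(), "big")

def sign(message: bytes) -> int:
    """Apply the slow private-key operation to the short digest only."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Recover the digest from the signature and compare it with a fresh hash."""
    return pow(signature, E, N) == digest_as_int(message)

document = b"a long document " * 10000
sig = sign(document)
print(verify(document, sig))         # the untouched document
print(verify(document + b"!", sig))  # a modified document
```

However long the document is, the expensive asymmetric step always works on a 128-bit number, which is the speed advantage the paragraph above refers to.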
3) Authentication protocol
This kind of authentication protocol is also known as the "challenge-authentication" (challenge-response) mode. When the transmission channel can be eavesdropped on but not tampered with, it is a simple and secure method.
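A challenge-based exchange of this kind can be sketched as follows. This is a minimal illustration, not a specific protocol from the article: it assumes both sides already share a secret, and it uses HMAC-SHA1 from Python's standard library as the keyed hash. The function names and the secret are hypothetical.

```python
import hashlib
import hmac
import os

def make_challenge() -> bytes:
    """Server side: a fresh random challenge for each login attempt."""
    return os.urandom(16)

def respond(secret: bytes, challenge: bytes) -> bytes:
    """Client side: prove knowledge of the secret without sending it."""
    return hmac.new(secret, challenge, hashlib.sha1).digest()

def check(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the expected response and compare."""
    expected = hmac.new(secret, challenge, hashlib.sha1).digest()
    return hmac.compare_digest(expected, response)

shared_secret = b"correct horse"  # hypothetical shared password
challenge = make_challenge()
ok = check(shared_secret, challenge, respond(shared_secret, challenge))
bad = check(shared_secret, challenge, respond(b"wrong guess", challenge))
print(ok, bad)
```

An eavesdropper sees only the challenge and the response. Because the hash is one-way, neither reveals the secret, and because each challenge is fresh, an old response cannot simply be replayed.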

That covers some basic preliminaries about hashing and related topics. So what exactly does the hash do in eMule?

What is the hash value of the file?

As we all know, eMule is based on peer-to-peer (P2P) technology and uses the "Multisource File Transfer Protocol" (MFTP). The protocol defines a series of standards for transmission, compression, packaging, and integration. eMule also computes a hash for every file (the file hash eMule uses is built on MD4), which makes the file unique and lets it be tracked across the whole network.

The file's hash is its digital digest, computed by a hash function. Regardless of the length of the file, the hash function produces a fixed-length number. Unlike an encryption algorithm, a hash algorithm is an irreversible one-way function. With a high-security hash algorithm such as MD5 or SHA, it is almost impossible for two different files to produce the same hash result. Therefore, once a file has been modified, the change can be detected.

When we put a file into eMule to share it, eMule automatically computes the file's hash value with the hash algorithm. This value is the file's unique identity and represents the basic information of the file; eMule then submits it to the connected server. When someone requests the file for download, the hash value lets them know whether the file being downloaded is the one they want. This is especially important when other properties of the file, such as its name, have changed. The server also provides the address, port, and similar information for the users who currently hold the file, so eMule knows where to download it from.

Generally speaking, when we search for a file, eMule obtains this information and then sends the connected server a request for files with the same hash value. The server returns information about the users who hold the file. Our client can then communicate directly with those users and see whether the required file can be downloaded from them.
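The server's role in this exchange is essentially a lookup table from hash value to sources. The sketch below is a hypothetical in-memory model, not eMule's actual server code or wire protocol; the class name, hash strings, addresses, and ports are placeholders.

```python
from collections import defaultdict

class HashIndex:
    """Hypothetical server-side index: file hash -> set of (address, port) sources."""

    def __init__(self):
        self._sources = defaultdict(set)

    def publish(self, file_hash: str, address: str, port: int) -> None:
        """A client announces that it shares the file with this hash."""
        self._sources[file_hash].add((address, port))

    def lookup(self, file_hash: str):
        """Return the known sources for a hash, sorted for stable output."""
        return sorted(self._sources.get(file_hash, ()))

index = HashIndex()
index.publish("31d6cfe0d16ae931b73c59d7e0c089c0", "10.0.0.5", 4662)
index.publish("31d6cfe0d16ae931b73c59d7e0c089c0", "10.0.0.9", 4662)
print(index.lookup("31d6cfe0d16ae931b73c59d7e0c089c0"))
print(index.lookup("ffffffffffffffffffffffffffffffff"))
```

The server never needs the file itself: it only answers "who has the file with this hash", and the download then happens directly between clients.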

The hash value of a file in eMule is fixed and unique; it is the equivalent of an information digest of the file. No matter whose machine the file is on and no matter how much time passes, its hash value stays the same. When we download and upload files, this is the value eMule uses to identify the file.
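Why the value stays the same across machines and renames can be seen from a small sketch: the hash is computed from the file's bytes only. MD5 from Python's hashlib stands in here for whatever hash the client really uses, and the file names are made up.

```python
import hashlib
import os
import tempfile

def content_hash(path: str) -> str:
    """Hash only the file's bytes; the name and location are never hashed."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "holiday_video.avi")
    renamed = os.path.join(tmp, "renamed_copy.dat")
    with open(original, "wb") as f:
        f.write(b"same bytes on every machine")
    h1 = content_hash(original)
    os.rename(original, renamed)   # change the name, keep the content
    h2 = content_hash(renamed)

print(h1 == h2)
```

Two users who hold the same content under different file names therefore announce the same hash.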

So what is Userhash?

The principle is the same. When we use eMule for the first time, eMule automatically generates a value. This value is unique and is our mark in the eMule world: as long as you do not uninstall eMule or delete its config, your Userhash never changes. The credit (points) system works through this value; eMule uses it to store credits and to recognize identity. It has nothing to do with your ID or your username: you can change those, and your Userhash stays the same, which also guarantees fairness. It is in fact also an information digest, only it holds information about each of us rather than about a file.
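The behavior described here (generated once, kept in the config, lost if the config is deleted) can be sketched as below. This is an assumption-laden illustration rather than eMule's real storage format: the file name `userhash.dat`, the 16-byte length, and the `credits` dictionary are all hypothetical.

```python
import os
import tempfile

def load_or_create_userhash(path: str) -> bytes:
    """Return the stored 16-byte value, creating it on first use."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return f.read()
    value = os.urandom(16)  # random 128-bit identifier
    with open(path, "wb") as f:
        f.write(value)
    return value

with tempfile.TemporaryDirectory() as config_dir:
    path = os.path.join(config_dir, "userhash.dat")  # hypothetical file name
    first = load_or_create_userhash(path)    # first start: value is generated
    second = load_or_create_userhash(path)   # later start: same value is loaded
    os.remove(path)                          # "deleting the config"
    third = load_or_create_userhash(path)    # a new identity is generated

credits = {first.hex(): 0}     # credits keyed by userhash, not by nickname
credits[second.hex()] += 10    # same user after a restart
print(first == second, first == third, credits[first.hex()])
```

Because credits are keyed by this value rather than by a nickname, changing the username does not affect them, while deleting the stored value starts again from zero.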

So what is a hash file?

We often see eMule "hashing files". Here it is using the file-verification function of the hash algorithm, which was described earlier in this article. This is in fact a fairly complex process, and FTP, BT, and other software currently use the same basic principle. eMule transfers files in blocks, and each block is checked as it is transferred; if a block is wrong, it is downloaded again. During this time the relevant information is written to the .met file. When the whole task is complete, the .part file is renamed and moved to the Incoming folder, and the .met file is deleted automatically.

This is why we sometimes run into a hash failure: the information in the .met file does not match the .part file. At other times eMule hashes furiously at startup. That happens in two situations: the first time you use it, when it hashes the files to extract all their information, and after the computer was shut down improperly, when it has to run an error check.
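The block-by-block check can be sketched as follows. This is a simplified model, not eMule's implementation: the 4-byte block size, the helper names, and the use of MD5 are assumptions chosen to keep the demo small. Each block is hashed separately, so a transfer error is pinned to one block and only that block is fetched again.

```python
import hashlib

CHUNK_SIZE = 4  # tiny, hypothetical block size so the demo is easy to follow

def split(data: bytes, size: int = CHUNK_SIZE):
    """Cut data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def block_hashes(data: bytes):
    """Hash every block separately."""
    return [hashlib.md5(block).hexdigest() for block in split(data)]

def bad_blocks(received: bytes, expected_hashes):
    """Return the indexes of blocks whose hash does not match."""
    actual = block_hashes(received)
    return [i for i, (a, e) in enumerate(zip(actual, expected_hashes)) if a != e]

source = b"0123456789abcdef"
expected = block_hashes(source)

received = bytearray(source)
received[5] ^= 0xFF  # simulate a transfer error inside block 1
damaged = bad_blocks(bytes(received), expected)
print(damaged)       # blocks that must be downloaded again

# Re-download only the damaged blocks from the source.
for i in damaged:
    received[i * CHUNK_SIZE:(i + 1) * CHUNK_SIZE] = split(source)[i]
repaired = bad_blocks(bytes(received), expected)
print(repaired)
```

Checking per block rather than per file means one corrupted block costs one block of retransmission instead of the whole download.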


Research on hash algorithms has always been a frontier of information science. With network technology now so widespread, its importance is more and more prominent. It is present in the security verification of our everyday online information exchange and in the key mechanisms of the operating systems we use. For those interested in studying information security, it is a key that opens the world of information, and it is also a focus of research in the hacker world.
