1. What is MD5?
MD5 is short for Message-Digest Algorithm 5, the fifth version of the Message-Digest algorithm. It is one of the hash algorithms (also called digest algorithms) widely used in computing to check the integrity and consistency of transmitted information. Mainstream programming languages generally provide an MD5 implementation.
2. What is a hash algorithm?
In information security, it is often necessary to verify the integrity of a message. A hash function provides this service: it produces a fixed-length output for input messages of any length. The fixed-length output is called the "hash" or "message digest" of the original input message.
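The fixed-length property can be checked with a short sketch. It assumes Python's standard `hashlib` module; the helper name `md5_hex` and the sample messages are illustrative only.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()

# Inputs of very different lengths (illustrative sample data).
messages = [b"", b"a", b"abc", b"x" * 10_000]
digests = [md5_hex(m) for m in messages]

for message, digest in zip(messages, digests):
    # Every digest is 32 hex characters, i.e. 16 bytes or 128 bits.
    print(len(message), digest, len(digest))
```

Whatever the input length, the digest printed on each line has the same length.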
3. Basic Features of Hash Functions
A hash function must have two basic features: it must be unidirectional (the one-way property) and it must satisfy the collision constraint (collision resistance).
3.1 Unidirectional means that the operation cannot be reversed. For a hash function, the output can only be derived from the input; the input cannot feasibly be computed from the output.
3.2 The collision constraint means that it is infeasible to find an input whose output equals a given known output, and equally infeasible to find two different inputs that produce exactly the same output.
Only a function that strictly has both features can be regarded as a hash function in this sense.
4. Typical use of the unidirectional property:
4.1. Password Storage
Because hash functions are unidirectional, we can store passwords and other sensitive data securely. A database often has to hold a lot of key data such as passwords, but in actual use we only ever compare values, so it is enough to store the hash and compare hash results instead of the original data.
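The compare-the-hashes idea can be sketched as follows. This is only an illustration: the function names are hypothetical, and SHA-256 from `hashlib` is an assumed stand-in for whichever hash a system actually uses. Production systems typically use a salted, deliberately slow scheme rather than a single plain hash.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return a hex digest of the password (SHA-256 is an assumption here)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def check_password(candidate: str, stored_hash: str) -> bool:
    """Compare the hash of the candidate with the stored hash.

    The plaintext password is never stored; only hashes are compared.
    """
    return hmac.compare_digest(hash_password(candidate), stored_hash)

# Only this value would be written to the database.
stored = hash_password("correct horse")

print(check_password("correct horse", stored))
print(check_password("wrong guess", stored))
```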
5. Typical use of collision constraints:
5.1. Dictionary keys (hashable objects)
In Python, the hash value of a dictionary key is used to locate the slot in memory where the corresponding value is stored. Two keys that compare equal must therefore have the same hash and are treated as the same key, and objects that cannot be hashed cannot be used as dictionary keys.
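A small sketch of this behavior, using only built-ins. The helper `is_hashable` is a hypothetical name, and Python's built-in `hash()` is its own function, not MD5.

```python
def is_hashable(obj) -> bool:
    """Return True if `obj` can be passed to the built-in hash()."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

table = {}
table[(1, 2)] = "tuple key"
# An equal key has the same hash, so this replaces the existing entry.
table[(1, 2)] = "same key, value replaced"
print(len(table))

print(is_hashable((1, 2)))  # tuples of hashable items can be keys
print(is_hashable([1, 2]))  # lists are mutable and cannot be keys
```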
5.2. Message Digest and Digital Signature
A message digest is generated for a piece of information so that tampering can be detected. For example, many software packages on UNIX are distributed together with a file that has the same file name and the extension .md5. This file usually contains a single line of text with roughly the following structure:
MD5 (wenjian.tar.gz) = 0ca175b9c0fda-a831d895e269332461
This is the digital signature of the wenjian.tar.gz file. MD5 treats the whole file as one large piece of text and, through its irreversible transformation, produces an MD5 message digest that is intended to be unique to the file. If the content of the file changes in any way while it is being distributed (whether through deliberate modification or through transmission errors caused by an unstable line during download), recomputing the MD5 value of the file gives a different message digest, so you can tell that the file you received is not the correct one. If a third-party certification authority is also involved, MD5 can additionally prevent the file's author from denying authorship (non-repudiation). This is the so-called digital signature application.
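The recompute-and-compare check described here might look like the sketch below. The function names are hypothetical, and the parser assumes the checksum line has the shape `MD5 (name) = <32 hex digits>`; other tools write checksum files in different layouts.

```python
import hashlib
import re

def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Compute the MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def parse_md5_line(line: str) -> tuple[str, str]:
    """Split an assumed 'MD5 (name) = digest' line into (name, digest)."""
    match = re.fullmatch(
        r"\s*MD5\s*\((.+)\)\s*=\s*([0-9a-f]{32})\s*", line, re.IGNORECASE
    )
    if match is None:
        raise ValueError("unrecognized checksum line")
    return match.group(1), match.group(2).lower()

def verify(path: str, checksum_line: str) -> bool:
    """Return True if the file's MD5 matches the digest in the line."""
    _, expected = parse_md5_line(checksum_line)
    return file_md5(path) == expected
```

A call such as `verify("wenjian.tar.gz", line)`, where `line` is the text read from the accompanying .md5 file, returns `False` if the downloaded file differs from the one the digest was computed for.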
6. MD5 Vulnerabilities
The MD5 algorithm is old, and its hash length is fixed at 128 bits. As computing power increases, it has become possible to find a "collision" quickly. MD5 should therefore not be used in scenarios with high security requirements.
In 2004, Wang Xiaoyun showed that "collisions" for MD5 can be generated quickly. In February 2007, Marc Stevens, Arjen K. Lenstra, and Benne de Weger further pointed out that MD5 can be attacked repeatably by forging software signatures. The researchers used the chosen-prefix collision method: malicious code is placed at the front of a program, and the remaining space is padded with junk data until the result has the same MD5 hash value.
In 2007, researchers from Eindhoven University of Technology in the Netherlands used MD5 collisions to produce two executable files that behave differently when run but have the same MD5 value. In December 2008, a group of researchers used an MD5 collision to generate a forged SSL certificate, which makes it possible for a server to present forged root CA signatures in the HTTPS protocol.