A few days ago I saw a forum discussion about cracking MD5, and many people's understanding of MD5 surprised me. Some think that MD5 is an encryption algorithm; others think that, because the plaintext cannot be recovered from an MD5 hash, studying attacks on it is pointless. It is a pity to see Professor Wang Xiaoyun's achievements dismissed as meaningless, so today I will talk about what cracking MD5 actually means, as a small contribution to public understanding. For more information about MD5, see Wikipedia/MD5.
Generally, attacks against a hash algorithm can be divided into three levels:
1. Preimage attack: given a hash value H, find a plaintext M such that H = hash(M). If a practical preimage attack is found against a hash algorithm, that algorithm is finished.
2. Second preimage attack: given a plaintext M1, find another plaintext M2 (not equal to M1) such that hash(M1) = hash(M2).
3. Collision attack: find any two different plaintexts M1 and M2 such that hash(M1) = hash(M2).
With regard to MD5, Professor Wang Xiaoyun's achievement was to find collisions within a feasible amount of computing time. The current progress is that, using improved algorithms based on the one she proposed, a collision can be found on a laptop within a few hours. What does this mean? Many people think that a collision is meaningless because M1 and M2 cannot be chosen at will in real applications. In fact, this view sees only half of the picture. Like other popular hash algorithms, MD5 has a well-known weakness called length extension. For two colliding messages of equal length whose collision lies in the internal hash state (as is the case for the known attacks), it can be stated mathematically as follows:
If MD5(M1) = MD5(M2)
Then MD5(M1 | m') = MD5(M2 | m')
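The implication above can be tried out on a toy hash. The sketch below is not MD5: it assumes a made-up Merkle-Damgård construction (toy_hash, with a 24-bit chaining state and 8-byte blocks, both hypothetical choices) whose compression function is a truncated hashlib.md5 call, so that a collision can be found by brute force in a moment. Like MD5, it processes the message block by block and appends length padding, which is all the argument relies on. The last lines also start the search from the state left behind by a common, block-aligned prefix.

```python
import hashlib
from itertools import count

BLOCK = 8              # toy block size in bytes (MD5 uses 64)
STATE = 3              # toy chaining state of 24 bits (MD5 uses 128)
IV = b"\x00" * STATE   # arbitrary fixed initial state for the toy hash


def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated MD5 of state || block."""
    return hashlib.md5(state + block).digest()[:STATE]


def absorb(state: bytes, data: bytes) -> bytes:
    """Feed block-aligned data through the compression function."""
    assert len(data) % BLOCK == 0
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state


def pad(length: int) -> bytes:
    """MD-style padding: 0x80, zero bytes, then the message length."""
    zeros = -(length + 1 + 4) % BLOCK
    return b"\x80" + b"\x00" * zeros + length.to_bytes(4, "big")


def toy_hash(msg: bytes) -> bytes:
    return absorb(IV, msg + pad(len(msg)))


def find_collision(state: bytes):
    """Birthday search for two distinct blocks that collide from `state`."""
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block
        seen[out] = block


# A collision from the standard initial state survives any common suffix.
m1, m2 = find_collision(IV)
suffix = b"any common suffix m'"
assert m1 != m2
assert toy_hash(m1) == toy_hash(m2)
assert toy_hash(m1 + suffix) == toy_hash(m2 + suffix)

# The same search works from the state reached after a common prefix.
prefix = b"PREAMBLE" * 2
r1, r2 = find_collision(absorb(IV, prefix))
assert toy_hash(prefix + r1 + suffix) == toy_hash(prefix + r2 + suffix)
```

The toy state is deliberately tiny; against real MD5 the brute-force search in find_collision would be replaced by a dedicated collision search algorithm, but the way prefix and suffix carry the collision along is the same.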
Here | denotes string concatenation. Current collision search algorithms can start from an arbitrary initial hash state, which means that any common prefix can be placed in front of the colliding blocks. In addition, length extension means that any common suffix can be appended. Therefore, we can construct two strings with the same MD5 hash, such that:
MD5(preamble | R1 | suffix) = MD5(preamble | R2 | suffix),
where MD5(preamble | R1) = MD5(preamble | R2).
How can such a collision be used in an attack? One example is to use a random collision to construct two applications that behave differently but have the same MD5 value. Applications on Windows use the PE format. A PE file consists of the following parts:
PE header (PE file header)
.text section (code segment)
.data section (data segment)
Other sections (.reloc, .rdata, .tls, etc.)
The construction process is as follows:
1. Write an application that implements the two different functions you want.
2. Manually adjust the PE file so that its layout is as follows:
PE header
reserved block1
.text section
reserved block2
.data section
Other sections
3. With the PE header as the preamble, search for a random collision such that:
X1 = PE header | R1
X2 = PE header | R2
MD5(X1) = MD5(X2)
4. Fill reserved block2 with the content of R1, then fill reserved block1 with R1 and R2 respectively to get two application files. Their layouts are as follows:
File 1               File 2
PE header            PE header
R1                   R2
.text section        .text section
R1                   R1
.data section        .data section
Other sections       Other sections
5. At the beginning of the application code, compare the content of reserved block1 with that of reserved block2. If they are equal, the program continues with function 1; if they are not equal, it runs function 2. In this way, two programs with arbitrary functions are produced and their MD5 values are the same.
6. The two files have the same MD5 value because:
Preamble = PE header
MD5(PE header | R1) = MD5(PE header | R2)
Common suffix = .text section | R1 | .data section | Other sections
You can imagine the consequences if function 1 is something attractive and legitimate while function 2 is a Trojan.
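The layout and the check at program start can be sketched as follows. This is only an illustration of the file structure, not a working attack: HEADER, TEXT, DATA and OTHER are made-up stand-ins for the real PE parts, and R1 and R2 are placeholder byte strings, not an actual collision pair, so the real MD5 values of the two files built here differ. In a real attack R1 and R2 would come from a collision search with the PE header as the preamble.

```python
HEADER = b"PE-HEADER"        # stand-in for the real PE header (the preamble)
TEXT = b".text: code"        # stand-in for the code section
DATA = b".data: data"        # stand-in for the data section
OTHER = b"other sections"    # stand-in for .reloc, .rdata, .tls, ...

# Placeholders only. A real attack needs MD5(HEADER | R1) == MD5(HEADER | R2).
R1 = b"\x11" * 128
R2 = b"\x22" * 128


def build(block1: bytes, block2: bytes) -> bytes:
    """Assemble: header | reserved block1 | .text | reserved block2 | rest."""
    return HEADER + block1 + TEXT + block2 + DATA + OTHER


def run(image: bytes) -> str:
    """Mimic the check at program start: compare the two reserved blocks."""
    off1 = len(HEADER)
    block1 = image[off1:off1 + len(R1)]
    off2 = off1 + len(R1) + len(TEXT)
    block2 = image[off2:off2 + len(R1)]
    return "function 1" if block1 == block2 else "function 2"


file1 = build(R1, R1)   # reserved block1 = R1, reserved block2 = R1
file2 = build(R2, R1)   # reserved block1 = R2, reserved block2 = R1

# Everything after reserved block1 is byte-for-byte identical in both files,
# which is the common suffix that length extension relies on.
suffix_start = len(HEADER) + len(R1)
assert file1[suffix_start:] == file2[suffix_start:]

print(run(file1))   # function 1
print(run(file2))   # function 2
```

The two files differ only in reserved block1, so once that block holds a genuine collision pair, the identical suffix keeps the hashes equal while the comparison selects different behavior.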
This attack might be applicable to the ongoing battle between plug-ins (cheats) and their detection in the famous game Diablo, because maphack, the best-known Diablo plug-in, uses MD5 as a digital signature to check the module that the server sends to detect code signatures. In addition, a collision attack can be used against signed documents. The procedure is as follows:
1. Select a high-level document language (such as PostScript).
2. Construct a preamble that meets PostScript's requirements, and search for a random MD5 collision based on it:
X1 = preamble; put(R1);
X2 = preamble; put(R2);
MD5(X1) = MD5(X2)
3. Append the same suffix S to X1 and X2. Obviously:
MD5(X1 | S) = MD5(X2 | S)
4. Prepare two texts T1 and T2. T1 is a normal text (used to obtain a signature) and T2 is a text for other purposes.
5. Form two PostScript documents:
Y1 = preamble; put(R1); put(R1); if (=) then T1 else T2
Y2 = preamble; put(R2); put(R1); if (=) then T1 else T2
In this way, when Y1 is opened, T1 is displayed, and when Y2 is opened, T2 is displayed, yet the MD5 hash values of Y1 and Y2 are the same. In short, as things stand, "MD5 is definitely not practically collision-free". Although this is only a collision attack, it is practical whenever the attacker can control and construct the plaintext at will. Professor Wang Xiaoyun's achievement was to find a practical collision search algorithm. Of course, her work also builds on the work of others.
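As a rough illustration of why a collision attack is the easiest of the three levels listed at the beginning, the sketch below brute-forces both a collision and a second preimage on MD5 truncated to 16 bits. The truncation, the helper names and the message format are assumptions made only so that the search finishes quickly; this says nothing about the cost of attacking full MD5. For an ideal n-bit hash, a generic collision search needs on the order of 2^(n/2) hash evaluations, while a generic second preimage search needs on the order of 2^n.

```python
import hashlib
from itertools import count

BITS = 16  # keep only 16 bits of MD5 so that brute force finishes quickly


def h(msg: bytes) -> int:
    """Truncated MD5, used here as a deliberately weak hash."""
    return int.from_bytes(hashlib.md5(msg).digest()[:BITS // 8], "big")


def msg(n: int) -> bytes:
    """Hypothetical message format: one distinct message per integer."""
    return b"message-%d" % n


def collision():
    """Find ANY two messages with the same hash; returns (i, j, tries)."""
    seen = {}
    for n in count():
        digest = h(msg(n))
        if digest in seen:
            return seen[digest], n, n + 1
        seen[digest] = n


def second_preimage(target_index: int):
    """Find another message hashing like a GIVEN one; returns (j, tries)."""
    target = h(msg(target_index))
    tries = 0
    for n in count():
        if n == target_index:
            continue
        tries += 1
        if h(msg(n)) == target:
            return n, tries


i, j, collision_tries = collision()
k, preimage_tries = second_preimage(0)
print("collision found after", collision_tries, "hashes")
print("second preimage found after", preimage_tries, "hashes")
```

In a typical run the collision search stops after a few hundred hashes while the second preimage search needs tens of thousands, which mirrors the gap between the two attack levels.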
This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.