MD5 Algorithm Description
Author: rufi 2004.06.22
When I set out to write an MD5 program, I found that both the Chinese and the English descriptions of the algorithm contained inaccuracies, and that some details were unclear or confusing. In the end I had to fall back on the C source code and debug it, which is not a good way to understand an algorithm. So I have summarized the key points I found.
1. Origin
The full name of MD5 is Message-Digest Algorithm 5. It was developed in the early 1990s by Ronald L. Rivest of the MIT Laboratory for Computer Science and RSA Data Security, Inc., as the successor to MD2, MD3, and MD4. The most complete reference is RFC 1321 (http://www.ietf.org/rfc/rfc1321.txt), which Ronald L. Rivest submitted to the IETF in April 1992.
2. Purpose
MD5 generates a message digest for a piece of information, and the digest is, for practical purposes, unique to that information. It can be used in digital signatures, to verify the integrity of a file (whether data has been lost or damaged), to store user passwords in hashed form, and to compute hash values in hash functions.
3. Features
The input is a byte string of any length and the output is a 128-bit integer. Because the algorithm is irreversible in practice, it provides fairly good security. In addition, no licence fee has to be paid for using the MD5 algorithm.
4. Description
Uniqueness and irreversibility are not absolute. In theory the mapping is many-to-one, but the probability that two different pieces of information produce the same digest is very small. Irreversibility means that deriving the input from the output requires too much computation and time, while a brute-force dictionary approach requires too much storage space.
5. Algorithm Description
The input to the algorithm is a byte string; each byte is 8 bits.
The algorithm is executed in the following steps:
Step 1: Padding:
The MD5 algorithm first pads the input so that its length in bytes leaves a remainder of 56 when divided by 64.
That is, the data is extended to Len = K * 64 + 56 bytes, where K is an integer.
Padding method: append a single 1 bit, then append 0 bits until the requirement above is met. This is equivalent to appending one byte 0x80 followed by as many 0x00 bytes as needed.
Padding is always added, even when the length is already 56 modulo 64, so the total number of bytes added in this step is between 1 and 64.
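As an illustration, the padding rule can be sketched in a few lines of Python. The function name pad_step1 is my own and not part of any library; it performs Step 1 only and does not append the length.

```python
def pad_step1(data: bytes) -> bytes:
    """Append 0x80, then zero bytes, until len(result) % 64 == 56."""
    # Number of zero bytes needed after the mandatory 0x80 byte.
    zeros = (55 - len(data)) % 64
    return data + b"\x80" + b"\x00" * zeros


# Show the input length, the padded length, and the number of bytes added.
for n in (0, 55, 56, 63, 64):
    padded = pad_step1(b"a" * n)
    print(n, len(padded), len(padded) - n)
```

A 55-byte input receives a single padding byte, while a 56-byte input receives a full 64 bytes. The second case is the one that is easy to get wrong.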
Step 2: Append the data length:
Represent the original length of the data (in bits) as a 64-bit integer, and append its 8 bytes to the padded data with the low-order byte first and the high-order byte last. The total length after this step is:
Len = K * 64 + 56 + 8 = (K + 1) * 64 bytes.
※ Note that the 64-bit integer is the length of the original input, not the length after padding. I tripped up on this myself.
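A small Python sketch of this step, assuming the data has already been padded to 56 bytes modulo 64. The helper name append_length is hypothetical; struct.pack with the "<Q" format writes an unsigned 64-bit integer with the low-order byte first.

```python
import struct


def append_length(padded: bytes, original_len_bytes: int) -> bytes:
    """Append the ORIGINAL message length in bits as 8 little-endian bytes."""
    bit_length = (original_len_bytes * 8) & 0xFFFFFFFFFFFFFFFF  # keep the low 64 bits
    return padded + struct.pack("<Q", bit_length)


# "abc" is 3 bytes = 24 bits = 0x18; here it is already padded to 56 bytes.
padded = b"abc" + b"\x80" + b"\x00" * 52
block = append_length(padded, 3)
print(len(block))        # 64
print(block[56:].hex())  # 1800000000000000
```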
Step 3: Initialize the MD5 parameters:
Four 32-bit integer variables (A, B, C, D) are used to compute the message digest. Each is initialized to the following hexadecimal value, shown with the low-order byte first:
Word A: 01 23 45 67
Word B: 89 AB CD EF
Word C: FE DC BA 98
Word D: 76 54 32 10
※ Note that "low-order byte first" describes the byte layout in memory on a little-endian platform.
In a program the values should be written as:
A = 0x67452301
B = 0xefcdab89
C = 0x98badcfe
D = 0x10325476
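The relationship between the two notations can be checked with Python's struct module. This is only an illustrative sketch; the "<I" format means an unsigned 32-bit integer stored with the low-order byte first.

```python
import struct

A, B, C, D = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476

# Print the in-memory (little-endian) byte sequence of each word.
for name, word in zip("ABCD", (A, B, C, D)):
    print(name, struct.pack("<I", word).hex())
# A 01234567
# B 89abcdef
# C fedcba98
# D 76543210
```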
Step 4: Define the four basic MD5 bitwise functions:
X, Y, and Z are 32-bit integers.
F(X, Y, Z) = (X and Y) or (not(X) and Z)
G(X, Y, Z) = (X and Z) or (Y and not(Z))
H(X, Y, Z) = X xor Y xor Z
I(X, Y, Z) = Y xor (X or not(Z))
Then define four operations, one for each of the four rounds of the transformation.
Let Mj denote the j-th 32-bit sub-block of the message (j from 0 to 15), and let <<< s denote a circular left shift by s bits. All additions are modulo 2^32. The four operations are:
FF(a, b, c, d, Mj, s, ti) means a = b + ((a + F(b, c, d) + Mj + ti) <<< s)
GG(a, b, c, d, Mj, s, ti) means a = b + ((a + G(b, c, d) + Mj + ti) <<< s)
HH(a, b, c, d, Mj, s, ti) means a = b + ((a + H(b, c, d) + Mj + ti) <<< s)
II(a, b, c, d, Mj, s, ti) means a = b + ((a + I(b, c, d) + Mj + ti) <<< s)
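In a language without fixed-width integers, the 32-bit wrap-around has to be made explicit. The Python sketch below shows one way to do this. The names MASK, rotl, and step are my own, and step stands in for FF, GG, HH, and II by taking the basic function as a parameter.

```python
MASK = 0xFFFFFFFF  # keeps every intermediate result within 32 bits


def F(x, y, z):
    return (x & y) | (~x & MASK & z)


def G(x, y, z):
    return (x & z) | (y & ~z & MASK)


def H(x, y, z):
    return x ^ y ^ z


def I(x, y, z):
    return y ^ (x | (~z & MASK))


def rotl(x, s):
    """Circular left shift of a 32-bit value by s bits (0 < s < 32)."""
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK


def step(func, a, b, c, d, mj, s, ti):
    """One FF/GG/HH/II operation; returns the new value of a."""
    return (b + rotl(a + func(b, c, d) + mj + ti, s)) & MASK


print(hex(rotl(0x80000001, 1)))  # 0x3: the top bit wraps around to bit 0
```

Note that the rotation is applied to the whole sum a + func(b, c, d) + mj + ti, and b is added only afterwards.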
Step 5: Transform the input data:
Let N be the total number of 32-bit words in the padded data (a multiple of 16). The data is processed in groups of 64 bytes (16 words); each group goes through one pass of the main loop, and each pass performs four rounds of operations.
The padded data is viewed as an array M[0..N-1] of 32-bit integers, low-order byte first, and the 64 bytes of the current group are copied into a 16-element array X[0..15]. The array T[1..64] holds a set of constants:
T[i] is the integer part of 4294967296 * abs(sin(i)), where i is in radians and ranges from 1 to 64.
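The constants do not have to be typed in by hand. As a quick sketch, they can be generated directly from the formula in Python with double-precision floating point; index 0 of the list is a placeholder of my own so that T[i] matches the 1-based notation used here.

```python
import math

# T[i] = integer part of 2**32 * |sin(i)|, with i in radians, for i = 1..64.
# T[0] is an unused placeholder so the list can be indexed from 1.
T = [0] + [int(abs(math.sin(i)) * 2**32) for i in range(1, 65)]

print(hex(T[1]))   # 0xd76aa478
print(hex(T[64]))  # 0xeb86d391
```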
The specific process is as follows:
/* Main loop: process each 16-word group. */
For i = 0 to N/16-1 do
  /* Copy the current group into the 16-element array X. */
  For j = 0 to 15 do
    Set X[j] to M[i*16+j].
  end /* of loop on j */
  /* Save A as AA, B as BB, C as CC, and D as DD. */
  AA = A
  BB = B
  CC = C
  DD = D
  /* Round 1. */
  /* Let [abcd k s i] denote the operation
       a = b + ((a + F(b, c, d) + X[k] + T[i]) <<< s). */
  /* Do the following 16 operations. */
  [ABCD  0  7  1]  [DABC  1 12  2]  [CDAB  2 17  3]  [BCDA  3 22  4]
  [ABCD  4  7  5]  [DABC  5 12  6]  [CDAB  6 17  7]  [BCDA  7 22  8]
  [ABCD  8  7  9]  [DABC  9 12 10]  [CDAB 10 17 11]  [BCDA 11 22 12]
  [ABCD 12  7 13]  [DABC 13 12 14]  [CDAB 14 17 15]  [BCDA 15 22 16]
  /* Round 2. */
  /* Let [abcd k s i] denote the operation
       a = b + ((a + G(b, c, d) + X[k] + T[i]) <<< s). */
  /* Do the following 16 operations. */
  [ABCD  1  5 17]  [DABC  6  9 18]  [CDAB 11 14 19]  [BCDA  0 20 20]
  [ABCD  5  5 21]  [DABC 10  9 22]  [CDAB 15 14 23]  [BCDA  4 20 24]
  [ABCD  9  5 25]  [DABC 14  9 26]  [CDAB  3 14 27]  [BCDA  8 20 28]
  [ABCD 13  5 29]  [DABC  2  9 30]  [CDAB  7 14 31]  [BCDA 12 20 32]
  /* Round 3. */
  /* Let [abcd k s i] denote the operation
       a = b + ((a + H(b, c, d) + X[k] + T[i]) <<< s). */
  /* Do the following 16 operations. */
  [ABCD  5  4 33]  [DABC  8 11 34]  [CDAB 11 16 35]  [BCDA 14 23 36]
  [ABCD  1  4 37]  [DABC  4 11 38]  [CDAB  7 16 39]  [BCDA 10 23 40]
  [ABCD 13  4 41]  [DABC  0 11 42]  [CDAB  3 16 43]  [BCDA  6 23 44]
  [ABCD  9  4 45]  [DABC 12 11 46]  [CDAB 15 16 47]  [BCDA  2 23 48]
  /* Round 4. */
  /* Let [abcd k s i] denote the operation
       a = b + ((a + I(b, c, d) + X[k] + T[i]) <<< s). */
  /* Do the following 16 operations. */
  [ABCD  0  6 49]  [DABC  7 10 50]  [CDAB 14 15 51]  [BCDA  5 21 52]
  [ABCD 12  6 53]  [DABC  3 10 54]  [CDAB 10 15 55]  [BCDA  1 21 56]
  [ABCD  8  6 57]  [DABC 15 10 58]  [CDAB  6 15 59]  [BCDA 13 21 60]
  [ABCD  4  6 61]  [DABC 11 10 62]  [CDAB  2 15 63]  [BCDA  9 21 64]
  /* Then add the values saved at the start of this pass back to each of the four variables. */
  A = A + AA
  B = B + BB
  C = C + CC
  D = D + DD
end /* of loop on i */
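For comparison, here is a compact Python sketch of the whole procedure, from padding to output. It is not the reference implementation, and md5_sketch is a name of my own. Instead of writing out all 64 operations, it rotates the roles of the four variables after each operation and generates the k column of the four tables with the formulas i, (5i + 1) mod 16, (3i + 5) mod 16, and 7i mod 16 (with i counted from 0), which can be checked against the tables.

```python
import math
import struct

MASK = 0xFFFFFFFF
# Shift amounts s for the 64 operations, row by row as in the four tables.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# The constants T[1..64], stored here with 0-based indices.
T = [int(abs(math.sin(i + 1)) * 2**32) for i in range(64)]


def rotl(x, s):
    x &= MASK
    return ((x << s) | (x >> (32 - s))) & MASK


def md5_sketch(message: bytes) -> str:
    # Steps 1 and 2: pad to 56 mod 64, then append the original bit length.
    data = message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
    data += struct.pack("<Q", (len(message) * 8) & 0xFFFFFFFFFFFFFFFF)
    # Step 3: initial values.
    A, B, C, D = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    # Step 5: one pass of the main loop per 64-byte group.
    for offset in range(0, len(data), 64):
        X = struct.unpack("<16I", data[offset:offset + 64])
        a, b, c, d = A, B, C, D
        for i in range(64):
            if i < 16:    # round 1 uses F
                f, k = (b & c) | (~b & d), i
            elif i < 32:  # round 2 uses G
                f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:  # round 3 uses H
                f, k = b ^ c ^ d, (3 * i + 5) % 16
            else:         # round 4 uses I
                f, k = c ^ (b | (~d & MASK)), (7 * i) % 16
            updated = (b + rotl(a + f + X[k] + T[i], S[i])) & MASK
            # Rotate roles so the next operation updates the next variable.
            a, b, c, d = d, updated, b, c
        A, B, C, D = (A + a) & MASK, (B + b) & MASK, (C + c) & MASK, (D + d) & MASK
    # Step 6: A, B, C, D as 16 bytes, low-order byte of each word first.
    return struct.pack("<4I", A, B, C, D).hex()


print(md5_sketch(b""))     # d41d8cd98f00b204e9800998ecf8427e
print(md5_sketch(b"abc"))  # 900150983cd24fb0d6963f7d28e17f72
```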
Step 6: Output the result:
A, B, C, and D are stored consecutively as 16 bytes (128 bits), each word with its low-order byte first. Output these 16 bytes in order, in hexadecimal.
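The byte order matters here as well. The Python sketch below uses the four state words that correspond to the digest of the empty message and shows both the correct output and a common mistake, printing each word as an ordinary hexadecimal number.

```python
import struct

# Final state words for the empty message; they correspond to the
# well-known digest d41d8cd98f00b204e9800998ecf8427e.
A, B, C, D = 0xD98C1DD4, 0x04B2008F, 0x980980E9, 0x7E42F8EC

# Correct: serialize each word with its low-order byte first.
digest = struct.pack("<4I", A, B, C, D)
print(digest.hex())                       # d41d8cd98f00b204e9800998ecf8427e

# Wrong: each word printed high-order byte first.
print("%08x%08x%08x%08x" % (A, B, C, D))  # d98c1dd404b2008f980980e97e42f8ec
```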
Finally, once the algorithm has been implemented in a programming language, the following inputs provide a simple test for checking the program for errors.
MD5 ("") = d41d8cd98f00b204e9800998ecf8427e
MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661
MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72
MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0
MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b
MD5 ("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789") =
d174ab98d277d9f5a5611c2c9f419d9f
MD5 ("12345678901234567890123456789012345678901234567890123456789012345678901234567890") =
57edf4a22be3c955ac49da2e2107b67a
C# Program for the MD5 Algorithm
MD5 is a rather special algorithm and is best suited to assembly language; many high-level languages either cannot express it easily or are inefficient at it.
For example, I first tried writing it in Python and Euphoria and found it was not easy. By comparison, C#, a rising star of the C family and a .NET language, is fairly complete in its features.
It took me one evening, but C# was the first language in which I finally got MD5 working.
The main reason is that I had overlooked some details of the algorithm, so the output kept coming out wrong and debugging took a long time.
[Code]
// Source file: md5.cs
// MD5 algorithm
// By rufi 2004.6.20 http://rufi.yculblog.com/
using System;
using System.Collections;
using System.IO;

public class MD5 {
    // Static state variables
    private static UInt32 A;
    private static UInt32 B;
    private static UInt32 C;
    private static UInt32 D;
    // Number of bits to rotate in each transformation step
    private const int S11 = 7;
    private const int S12 = 12;
    private const int S13 = 17;
    private const int S14 = 22;
    private const int S21 = 5;
    private const int S22 = 9;
    private const int S23 = 14;
    private const int S24 = 20;
    private const int S31 = 4;
    private const int S32 = 11;
    private const int S33 = 16;
    private const int S34 = 23;
    private const int S41 = 6;
    private const int S42 = 10;
    private const int S43 = 15;
    private const int S44 = 21;
    /* F, G, H and I are basic MD5 functions.
     * Four non-linear functions:
     *
     * F(x, y, z) = (x & y) | ((~x) & z)
     * G(x, y, z) = (x & z) | (y & (~z))
     * H(x, y, z) = x ^ y ^ z
     * I(x, y, z) = y ^ (x | (~z))
     *
     * (& AND, | OR, ~ NOT, ^ XOR)
     */
    private static UInt32 F(UInt32 x, UInt32 y, UInt32 z) {
        return (x & y) | ((~x) & z);
    }
    private static UInt32 G(UInt32 x, UInt32 y, UInt32 z) {
        return (x & z) | (y & (~z));
    }
    private static UInt32 H(UInt32 x, UInt32 y, UInt32 z) {
        return x ^ y ^ z;
    }
    private static UInt32 I(UInt32 x, UInt32 y, UInt32 z) {
        return y ^ (x | (~z));
    }
    /* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4.
     * Rotation is separate from addition to prevent recomputation.
     */
    private static void FF(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti) {
        a = a + F(b, c, d) + mj + ti;
        a = a << s | a >> (32 - s); // circular left shift by s bits
        a += b;
    }
    private static void GG(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti) {
        a = a + G(b, c, d) + mj + ti;
        a = a << s | a >> (32 - s);
        a += b;
    }
    private static void HH(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti) {
        a = a + H(b, c, d) + mj + ti;
        a = a << s | a >> (32 - s);
        a += b;
    }
    private static void II(ref UInt32 a, UInt32 b, UInt32 c, UInt32 d, UInt32 mj, int s, UInt32 ti) {
        a = a + I(b, c, d) + mj + ti;
        a = a << s | a >> (32 - s);
        a += b;
    }
    private static void MD5_Init() {
        A = 0x67452301; // bytes in memory: 01 23 45 67
        B = 0xefcdab89; // bytes in memory: 89 ab cd ef
        C = 0x98badcfe; // bytes in memory: fe dc ba 98
        D = 0x10325476; // bytes in memory: 76 54 32 10
    }
    private static UInt32[] MD5_Append(byte[] input) {
        int n = input.Length;
        int m = n % 64;
        // Padding is always added: one 0x80 byte, then enough zero bytes to
        // bring the length to 56 mod 64 (1 to 64 padding bytes in total).
        // This also covers m == 56, which needs a full 64 bytes of padding.
        int zeros = (m < 56) ? 55 - m : 119 - m;
        int size = n + 1 + zeros + 8;

        ArrayList bs = new ArrayList(input);
        bs.Add((byte)0x80); // 0x80 = binary 10000000
        for (int i = 0; i < zeros; i++) {
            bs.Add((byte)0);
        }
        // Original length in bits, appended as 8 bytes, low-order byte first.
        UInt64 bitLength = (UInt64)n * 8;
        for (int i = 0; i < 8; i++) {
            bs.Add((byte)((bitLength >> (8 * i)) & 0xff));
        }
        byte[] ts = (byte[])bs.ToArray(typeof(byte));

        /* Decodes input (byte[]) into output (UInt32[]). Assumes len is
         * a multiple of 4.
         */
        UInt32[] output = new UInt32[size / 4];
        for (int i = 0, j = 0; i < size; j++, i += 4) {
            output[j] = (UInt32)(ts[i] | ts[i + 1] << 8 | ts[i + 2] << 16 | ts[i + 3] << 24);
        }
        return output;
    }
    private static UInt32[] MD5_Trasform(UInt32[] x) {
        UInt32 a, b, c, d;
        for (int k = 0; k < x.Length; k += 16) {
            // Work on local copies of the running state.
            a = A;
            b = B;
            c = C;
            d = D;
            /* Round 1 */
            FF(ref a, b, c, d, x[k + 0], S11, 0xd76aa478);  /* 1 */
            FF(ref d, a, b, c, x[k + 1], S12, 0xe8c7b756);  /* 2 */
            FF(ref c, d, a, b, x[k + 2], S13, 0x242070db);  /* 3 */
            FF(ref b, c, d, a, x[k + 3], S14, 0xc1bdceee);  /* 4 */
            FF(ref a, b, c, d, x[k + 4], S11, 0xf57c0faf);  /* 5 */
            FF(ref d, a, b, c, x[k + 5], S12, 0x4787c62a);  /* 6 */
            FF(ref c, d, a, b, x[k + 6], S13, 0xa8304613);  /* 7 */
            FF(ref b, c, d, a, x[k + 7], S14, 0xfd469501);  /* 8 */
            FF(ref a, b, c, d, x[k + 8], S11, 0x698098d8);  /* 9 */
            FF(ref d, a, b, c, x[k + 9], S12, 0x8b44f7af);  /* 10 */
            FF(ref c, d, a, b, x[k + 10], S13, 0xffff5bb1); /* 11 */
            FF(ref b, c, d, a, x[k + 11], S14, 0x895cd7be); /* 12 */
            FF(ref a, b, c, d, x[k + 12], S11, 0x6b901122); /* 13 */
            FF(ref d, a, b, c, x[k + 13], S12, 0xfd987193); /* 14 */
            FF(ref c, d, a, b, x[k + 14], S13, 0xa679438e); /* 15 */
            FF(ref b, c, d, a, x[k + 15], S14, 0x49b40821); /* 16 */
            /* Round 2 */
            GG(ref a, b, c, d, x[k + 1], S21, 0xf61e2562);  /* 17 */
            GG(ref d, a, b, c, x[k + 6], S22, 0xc040b340);  /* 18 */
            GG(ref c, d, a, b, x[k + 11], S23, 0x265e5a51); /* 19 */
            GG(ref b, c, d, a, x[k + 0], S24, 0xe9b6c7aa);  /* 20 */
            GG(ref a, b, c, d, x[k + 5], S21, 0xd62f105d);  /* 21 */
            GG(ref d, a, b, c, x[k + 10], S22, 0x02441453); /* 22 */
            GG(ref c, d, a, b, x[k + 15], S23, 0xd8a1e681); /* 23 */
            GG(ref b, c, d, a, x[k + 4], S24, 0xe7d3fbc8);  /* 24 */
            GG(ref a, b, c, d, x[k + 9], S21, 0x21e1cde6);  /* 25 */
            GG(ref d, a, b, c, x[k + 14], S22, 0xc33707d6); /* 26 */
            GG(ref c, d, a, b, x[k + 3], S23, 0xf4d50d87);  /* 27 */
            GG(ref b, c, d, a, x[k + 8], S24, 0x455a14ed);  /* 28 */
            GG(ref a, b, c, d, x[k + 13], S21, 0xa9e3e905); /* 29 */
            GG(ref d, a, b, c, x[k + 2], S22, 0xfcefa3f8);  /* 30 */
            GG(ref c, d, a, b, x[k + 7], S23, 0x676f02d9);  /* 31 */
            GG(ref b, c, d, a, x[k + 12], S24, 0x8d2a4c8a); /* 32 */
            /* Round 3 */
            HH(ref a, b, c, d, x[k + 5], S31, 0xfffa3942);  /* 33 */
            HH(ref d, a, b, c, x[k + 8], S32, 0x8771f681);  /* 34 */
            HH(ref c, d, a, b, x[k + 11], S33, 0x6d9d6122); /* 35 */
            HH(ref b, c, d, a, x[k + 14], S34, 0xfde5380c); /* 36 */
            HH(ref a, b, c, d, x[k + 1], S31, 0xa4beea44);  /* 37 */
            HH(ref d, a, b, c, x[k + 4], S32, 0x4bdecfa9);  /* 38 */
            HH(ref c, d, a, b, x[k + 7], S33, 0xf6bb4b60);  /* 39 */
            HH(ref b, c, d, a, x[k + 10], S34, 0xbebfbc70); /* 40 */
            HH(ref a, b, c, d, x[k + 13], S31, 0x289b7ec6); /* 41 */
            HH(ref d, a, b, c, x[k + 0], S32, 0xeaa127fa);  /* 42 */
            HH(ref c, d, a, b, x[k + 3], S33, 0xd4ef3085);  /* 43 */
            HH(ref b, c, d, a, x[k + 6], S34, 0x04881d05);  /* 44 */
            HH(ref a, b, c, d, x[k + 9], S31, 0xd9d4d039);  /* 45 */
            HH(ref d, a, b, c, x[k + 12], S32, 0xe6db99e5); /* 46 */
            HH(ref c, d, a, b, x[k + 15], S33, 0x1fa27cf8); /* 47 */
            HH(ref b, c, d, a, x[k + 2], S34, 0xc4ac5665);  /* 48 */
            /* Round 4 */
            II(ref a, b, c, d, x[k + 0], S41, 0xf4292244);  /* 49 */
            II(ref d, a, b, c, x[k + 7], S42, 0x432aff97);  /* 50 */
            II(ref c, d, a, b, x[k + 14], S43, 0xab9423a7); /* 51 */
            II(ref b, c, d, a, x[k + 5], S44, 0xfc93a039);  /* 52 */
            II(ref a, b, c, d, x[k + 12], S41, 0x655b59c3); /* 53 */
            II(ref d, a, b, c, x[k + 3], S42, 0x8f0ccc92);  /* 54 */
            II(ref c, d, a, b, x[k + 10], S43, 0xffeff47d); /* 55 */
            II(ref b, c, d, a, x[k + 1], S44, 0x85845dd1);  /* 56 */
            II(ref a, b, c, d, x[k + 8], S41, 0x6fa87e4f);  /* 57 */
            II(ref d, a, b, c, x[k + 15], S42, 0xfe2ce6e0); /* 58 */
            II(ref c, d, a, b, x[k + 6], S43, 0xa3014314);  /* 59 */
            II(ref b, c, d, a, x[k + 13], S44, 0x4e0811a1); /* 60 */
            II(ref a, b, c, d, x[k + 4], S41, 0xf7537e82);  /* 61 */
            II(ref d, a, b, c, x[k + 11], S42, 0xbd3af235); /* 62 */
            II(ref c, d, a, b, x[k + 2], S43, 0x2ad7d2bb);  /* 63 */
            II(ref b, c, d, a, x[k + 9], S44, 0xeb86d391);  /* 64 */
            // Add this block's result to the running state.
            A += a;
            B += b;
            C += c;
            D += d;
        }
        return new UInt32[] { A, B, C, D };
    }
    public static byte[] MD5Array(byte[] input) {
        MD5_Init();
        UInt32[] block = MD5_Append(input);
        UInt32[] bits = MD5_Trasform(block);
        /* Encodes bits (UInt32[]) into output (byte[]), low-order byte
         * of each word first.
         */
        byte[] output = new byte[bits.Length * 4];
        for (int i = 0, j = 0; i < bits.Length; i++, j += 4) {
            output[j] = (byte)(bits[i] & 0xff);
            output[j + 1] = (byte)((bits[i] >> 8) & 0xff);
            output[j + 2] = (byte)((bits[i] >> 16) & 0xff);
            output[j + 3] = (byte)((bits[i] >> 24) & 0xff);
        }
        return output;
    }
    public static string ArrayToHexString(byte[] array, bool uppercase) {
        string hexString = "";
        string format = "x2"; // two lowercase hex digits per byte
        if (uppercase) {
            format = "X2";    // two uppercase hex digits per byte
        }
        foreach (byte b in array) {
            hexString += b.ToString(format);
        }
        return hexString;
    }
    public static string MDString(string message) {
        // Each char is truncated to its low 8 bits, so this suits ASCII text.
        char[] c = message.ToCharArray();
        byte[] b = new byte[c.Length];
        for (int i = 0; i < c.Length; i++) {
            b[i] = (byte)c[i];
        }
        byte[] digest = MD5Array(b);
        return ArrayToHexString(digest, false);
    }
    public static string MDFile(string fileName) {
        // Reads the whole file into memory before hashing it.
        FileStream fs = File.Open(fileName, FileMode.Open, FileAccess.Read);
        byte[] array = new byte[fs.Length];
        fs.Read(array, 0, (int)fs.Length);
        byte[] digest = MD5Array(array);
        fs.Close();
        return ArrayToHexString(digest, false);
    }
    public static string Test(string message) {
        return "\r\nMD5 (\"" + message + "\") = " + MD5.MDString(message);
    }
    public static string TestSuite() {
        string s = "";
        s += Test("");
        s += Test("a");
        s += Test("abc");
        s += Test("message digest");
        s += Test("abcdefghijklmnopqrstuvwxyz");
        s += Test("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789");
        s += Test("12345678901234567890123456789012345678901234567890123456789012345678901234567890");
        return s;
    }
}
[/Code]