Introduction to MD5 encryption principles

Source: Internet
Author: User

MD5 Overview

MD5 stands for Message-Digest Algorithm 5. It was developed in the early 1990s by MIT's Laboratory for Computer Science and RSA Data Security Inc., and it evolved from MD2, MD3, and MD4.

A message digest is a hash transformation of a message: it converts a byte string of any length into a fixed-length large integer. Note that I say "byte string" rather than "string" because the transformation depends only on the byte values and has nothing to do with the character set or encoding.

MD5 converts a byte string of any length into a 128-bit integer, and the conversion is irreversible. In other words, even with the source program and the algorithm description in hand, you cannot convert an MD5 value back into the original string. Mathematically, this is because infinitely many original strings map to each MD5 value, a bit like a mathematical function that has no inverse.
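
To make the "byte string" point concrete, here is a minimal sketch. It uses the JDK's built-in java.security.MessageDigest rather than the MD5 class this article builds, and the class name DigestDemo and its helper methods are made up for illustration. It shows that the digest is always 16 bytes (128 bits) and that the same text encoded in two different ways yields two different digests.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestDemo {
    // Returns the 16-byte (128-bit) MD5 digest of the given bytes.
    static byte[] md5(byte[] input) {
        try {
            return MessageDigest.getInstance("MD5").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Formats a digest as lowercase hexadecimal, two characters per byte.
    static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "abc";
        // Same characters, different byte strings, hence different digests.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16LE);
        System.out.println(md5(utf8).length * 8 + " bits");
        System.out.println(toHex(md5(utf8)));
        System.out.println(toHex(md5(utf16)));
    }
}
```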

A typical application of MD5 is to generate a fingerprint for a message (byte string) in order to detect tampering. For example, you can write some text in a readme.txt file, generate an MD5 value for that file, and record it. You can then distribute the file to others. If someone modifies anything in the file, you will notice as soon as you recalculate its MD5 value. If a third-party certification authority is also involved, MD5 can additionally prevent the author of a file from repudiating it. This is the so-called digital signature application.
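
The readme.txt scenario can be sketched as follows. This is only an illustration: it relies on the JDK's MessageDigest instead of the class developed in this article, and the class name FileFingerprint, the method names, and the temporary file are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class FileFingerprint {
    // Computes the MD5 fingerprint of a file's bytes as 32 hex characters.
    static String md5Hex(Path file) throws IOException {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // True if the file still matches the fingerprint recorded earlier.
    static boolean isUnmodified(Path file, String recordedMd5) throws IOException {
        return md5Hex(file).equalsIgnoreCase(recordedMd5);
    }

    public static void main(String[] args) throws IOException {
        Path readme = Files.createTempFile("readme", ".txt");
        Files.write(readme, "original text".getBytes(StandardCharsets.UTF_8));
        String recorded = md5Hex(readme);              // record the fingerprint
        System.out.println(isUnmodified(readme, recorded));
        Files.write(readme, "tampered text".getBytes(StandardCharsets.UTF_8));
        System.out.println(isUnmodified(readme, recorded));
    }
}
```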

MD5 is also widely used in encryption and decryption technology. In many operating systems, user passwords are stored as MD5 values (or as values from a similar algorithm). When a user logs in, the system computes the MD5 value of the password entered and compares it with the MD5 value saved in the system. The system never "knows" what the user's password actually is.

Some hackers crack such passwords with a method called "running a dictionary". There are two ways to obtain the dictionary: one is a table of strings collected from everyday use as passwords, and the other is generated by permutation and combination. The attacker uses an MD5 program to compute the MD5 value of every dictionary entry and then looks up the target MD5 value in that dictionary.

Suppose the maximum password length is 8 and a password may contain only letters and digits, 26 + 26 + 10 = 62 characters in total. The number of dictionary entries is then P(62,1) + P(62,2) + ... + P(62,8), which is already an astronomical number. Storing this dictionary requires a TB-level disk array, and the method has a precondition: the attacker must first obtain the MD5 value of the target account's password.
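
The size of such a dictionary can be worked out with a few lines of arithmetic. The sketch below evaluates the sum of P(62,k) as written in the text, and, as an assumption added here, also the sum of 62^k, which is the count if a password may repeat characters. The class and method names are hypothetical.

```java
import java.math.BigInteger;

class DictionarySize {
    // P(n, k): ordered selections of k distinct characters out of n.
    static BigInteger permutations(int n, int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 0; i < k; i++) {
            result = result.multiply(BigInteger.valueOf(n - i));
        }
        return result;
    }

    // P(n,1) + P(n,2) + ... + P(n,maxLen), the expression used in the text.
    static BigInteger sumPermutations(int n, int maxLen) {
        BigInteger sum = BigInteger.ZERO;
        for (int k = 1; k <= maxLen; k++) {
            sum = sum.add(permutations(n, k));
        }
        return sum;
    }

    // n^1 + n^2 + ... + n^maxLen, the count when characters may repeat.
    static BigInteger sumPowers(int n, int maxLen) {
        BigInteger sum = BigInteger.ZERO;
        for (int k = 1; k <= maxLen; k++) {
            sum = sum.add(BigInteger.valueOf(n).pow(k));
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("distinct characters only: " + sumPermutations(62, 8));
        System.out.println("repeats allowed:          " + sumPowers(62, 8));
    }
}
```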

In many e-commerce and community applications, managing user accounts is one of the most common basic functions. Although many application servers provide these basic components, many application developers prefer to manage users in a relational database for greater flexibility. The lazy approach is to store user passwords in the database in plain text or after some trivial transformation, so these passwords are no secret to the software developers or system administrators. The purpose of this article is to introduce an MD5 Java Bean implementation and to give an example of using MD5 to process users' account passwords. With this method, administrators and program designers cannot see a user's password, although they can still reset it. What matters is that the password the user has chosen is protected.

If you are interested, you can obtain the MD5 specification, that is, the text of RFC 1321: http://www.ietf.org/rfc/rfc1321.txt

Implementation Policy

RFC 1321 already provides a C implementation of the MD5 algorithm, and at least two ways of implementing it in Java come to mind immediately. The first is to rewrite the algorithm in Java, or more simply to port the C program to Java. The second is to use JNI (Java Native Interface): the core algorithm stays in the C program, and a Java class wraps it as a shell.

However, I personally think JNI is a means for Java to solve certain kinds of problems it cannot handle itself (such as applications closely tied to the operating system or I/O devices) and to interoperate with other languages. The biggest problem with JNI is that it introduces platform dependence, which breaks Sun's promised Java benefit of "write once, run anywhere". I therefore decided to adopt the first approach: partly to share with you the benefit of "write once, run anywhere", and partly to test how efficient Java 2 is at compute-intensive work.

Implementation Process

To keep this article short, and to let more readers focus on the problem itself, I will not describe how to build this Java Bean in any particular Java integrated development environment. When I describe a method, the steps and commands are clear enough that, I believe, anyone with more than three days of experience in a Java IDE will know how to compile and run the code there. Describing a problem in terms of an IDE takes many screenshots, which is also why IDEs give me a headache. I used an ordinary text editor and Sun's standard JDK 1.3.0 for Windows NT.

In fact, porting C to Java is not difficult for a programmer with some grounding in C. The basic syntax of the two languages is almost identical. It took me about an hour to complete the port, and I mainly did the following:

Converted the #define macros that are actually needed into final static values in the class, so that multiple instances in one process space can share the data.
Deleted some useless #if defined blocks, because I only care about MD5. The reference C implementation also implements MD2, MD3, and MD4, and some of the #if defined blocks relate to different C compilers.
Converted some computation macros into final static member functions.
Kept all variable names the same as in the original C implementation, with capitalization adjusted to Java conventions; the C functions used in the computation became private methods (member functions).
Adjusted the bit length of key variables.
Defined the class and its methods.
Note that int is 16 bits in many early C compilers, and the MD5 code uses unsigned long int, which it assumes to be a 32-bit unsigned integer. In Java, int is 32 bits and long is 64 bits. The C implementation of MD5 uses a large number of bit operations. Java does provide bitwise operations, but it has no unsigned types, so it adds an unsigned right-shift operator, >>>, which is equivalent to >> applied to an unsigned number in C.

Because Java has no unsigned arithmetic, adding two large int values overflows and can produce a negative number. I therefore changed some key variables to Java's long type (64 bits). In my opinion this is more convenient than defining a new set of unsigned-number classes with their own operations, and it is also far more efficient and keeps the code readable. Abusing object orientation leads to inefficiency.
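
The two points above, the difference between >> and >>> and the use of long to hold unsigned 32-bit values, can be sketched like this. The helper names add32 and rotl32 are invented for this example and are not taken from the author's source; the mask 0xFFFFFFFFL keeps only the low 32 bits of a long.

```java
class UnsignedDemo {
    static final long MASK32 = 0xFFFFFFFFL;

    // Adds two unsigned 32-bit values held in longs, wrapping modulo 2^32.
    static long add32(long a, long b) {
        return (a + b) & MASK32;
    }

    // Rotates the low 32 bits of x left by s positions (0 < s < 32).
    static long rotl32(long x, int s) {
        x &= MASK32;
        return ((x << s) | (x >>> (32 - s))) & MASK32;
    }

    public static void main(String[] args) {
        int i = 0x80000000;                 // top bit set, negative as a Java int
        System.out.println(i >> 28);        // signed shift copies the sign bit
        System.out.println(i >>> 28);       // unsigned shift fills with zeros
        System.out.println(Long.toHexString(add32(0xFFFFFFFFL, 2L)));
        System.out.println(Long.toHexString(rotl32(0x80000001L, 1)));
    }
}
```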

Due to limited space, the original C code is not reproduced here; readers who want to compare can refer to RFC 1321. MD5.java source code

Test

RFC 1321 provides a test suite for verifying that your implementation is correct:

MD5 ("") = d41d8cd98f00b204e9800998ecf8427e

MD5 ("a") = 0cc175b9c0f1b6a831c399e269772661

MD5 ("abc") = 900150983cd24fb0d6963f7d28e17f72

MD5 ("message digest") = f96b697d7cb7938d525a2f31aaf161d0

MD5 ("abcdefghijklmnopqrstuvwxyz") = c3fcd3d76192e4007dfb496cca67e13b

......

These results mean that the MD5 value of the empty string is d41d8cd98f00b204e9800998ecf8427e, the MD5 value of "a" is 0cc175b9c0f1b6a831c399e269772661, and so on.
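
A small harness for these vectors might look like the sketch below. It accepts any function from String to hex digest, so a home-grown implementation can be plugged in. The class name Rfc1321Vectors is hypothetical, and jdkMd5Hex, which wraps the JDK's MessageDigest, is only a stand-in for the implementation under test.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class Rfc1321Vectors {
    // Input string -> expected digest, as listed above.
    static final Map<String, String> VECTORS = new LinkedHashMap<>();
    static {
        VECTORS.put("", "d41d8cd98f00b204e9800998ecf8427e");
        VECTORS.put("a", "0cc175b9c0f1b6a831c399e269772661");
        VECTORS.put("abc", "900150983cd24fb0d6963f7d28e17f72");
        VECTORS.put("message digest", "f96b697d7cb7938d525a2f31aaf161d0");
        VECTORS.put("abcdefghijklmnopqrstuvwxyz", "c3fcd3d76192e4007dfb496cca67e13b");
    }

    // Returns how many vectors the given implementation gets wrong.
    static int countFailures(Function<String, String> md5Hex) {
        int failures = 0;
        for (Map.Entry<String, String> e : VECTORS.entrySet()) {
            String actual = md5Hex.apply(e.getKey());
            // Compare case-insensitively so uppercase hex output also passes.
            if (!e.getValue().equalsIgnoreCase(actual)) {
                System.out.println("MISMATCH for \"" + e.getKey() + "\": " + actual);
                failures++;
            }
        }
        return failures;
    }

    // Stand-in implementation based on the JDK's MessageDigest.
    static String jdkMd5Hex(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.US_ASCII));
            StringBuilder sb = new StringBuilder();
            for (byte b : d) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("failures: " + countFailures(Rfc1321Vectors::jdkMd5Hex));
    }
}
```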
Compile and run our program:
javac -d . MD5.java
java beartool.MD5
To avoid conflicts with other programs of the same name in the future, I put package beartool; on the first line of my program.

The compile command javac -d . MD5.java therefore automatically creates a beartool directory under the working directory, and that directory holds the compiled MD5.class.

We get the same results as the test suite. Of course, you can go on to test other MD5 conversions you are interested in, such as:

java beartool.MD5 1234

This prints the MD5 value of 1234.

Perhaps because my computer knowledge started with the Apple II and the Z80, I have a preference for uppercase hexadecimal code. If you prefer a lowercase digest string, you only need to change A, B, C, D, E, and F in the byteHEX function to a, b, c, d, e, and f.
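
The body of byteHEX is not shown in this article, so the following is only a guess at what such a function looks like: it converts one byte into two hexadecimal characters, with the digit table deciding between uppercase and lowercase. The class name HexDemo and the upperCase parameter are assumptions made for the example.

```java
class HexDemo {
    private static final char[] UPPER = "0123456789ABCDEF".toCharArray();
    private static final char[] LOWER = "0123456789abcdef".toCharArray();

    // Converts one byte into its two-character hexadecimal form.
    static String byteHex(byte b, boolean upperCase) {
        char[] digits = upperCase ? UPPER : LOWER;
        char high = digits[(b >>> 4) & 0x0f]; // upper four bits
        char low = digits[b & 0x0f];          // lower four bits
        return new String(new char[] { high, low });
    }

    public static void main(String[] args) {
        System.out.println(byteHex((byte) 0xAB, true));
        System.out.println(byteHex((byte) 0xAB, false));
    }
}
```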

MD5 is said to be a fairly time-consuming computation, yet our Java version produces its result in a flash without any hitch. To the naked eye, the Java MD5 is not noticeably slower than the C version.

To test its compatibility, I copied the MD5.class file to another machine running Linux and IBM JDK 1.3. Running it there gave the same result, so it really is "write once, run anywhere".

Java Bean

We have now completed and tested this Java class. The title of this article, however, promises a Java Bean.

In fact, an ordinary Java Bean is very simple. It is not a brand-new or grand concept, just a Java class. Sun specifies some methods that should be implemented, but they are not mandatory. EJB (Enterprise Java Bean), by contrast, defines some methods that must be implemented (much like event handlers); these methods are used (called) by the EJB container.

Using this bean in a Java application or applet is very simple. The simplest way is to create a beartool directory under the working directory of the source code that uses the class, copy the class file into it, and then import beartool.MD5 in your program. Finally, when you package everything into a .jar or .war, keep this relative directory structure.

Another small benefit of Java is that you do not need to remove the main method from our MD5 class: it is already a working Java Bean. This is a major advantage of Java: it lets several forms of execution coexist conveniently in the same code. For example, you can write a class that is at once a console application, a GUI application, an applet, and a Java Bean, which is a great convenience for testing, maintaining, and releasing programs. The main test method here could also be moved into an inner class; interested readers can refer to: http://help.liangjing.org/Tools/MD5.aspx
That reference describes the advantages of putting test and sample code in a static inner class, which is a good engineering technique.
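
As a rough illustration of the inner-class idea, the sketch below keeps the bean method free of console code and moves the test entry point into a nested static class. Checksum, sumOfBytes, and SelfTest are invented names used only to show the layout; they are not part of the MD5 bean.

```java
class Checksum {
    // The bean part: an ordinary public method with no console dependencies.
    public int sumOfBytes(byte[] data) {
        int sum = 0;
        for (byte b : data) {
            sum += b & 0xff;
        }
        return sum;
    }

    // Test and sample code lives in a nested static class. It is compiled to
    // its own file, Checksum$SelfTest.class, which can be left out of a release.
    static class SelfTest {
        public static void main(String[] args) {
            System.out.println(new Checksum().sumOfBytes(new byte[] { 1, 2, 3 }));
        }
    }
}
```

The nested class is started by its binary name, for example java 'Checksum$SelfTest' on a UNIX shell.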

Install Java Bean in JSP

As described at the beginning of this article, this bean is applied to user management. Here we simulate the user login process of a virtual community, where user information is stored in a database table named users. The table has two fields: userid: char(20) and pwdmd5: char(32). userid is the primary key of the table, and pwdmd5 stores the MD5 string of the password. An MD5 value is a 128-bit integer, so its hexadecimal ASCII representation needs 32 characters.

Two files are provided here: login.html accepts the form input from the user, and login.jsp simulates the login process using the MD5 bean.

To keep our test environment simple, the JSP uses the JDBC-ODBC bridge driver built into the JDK, and community is the ODBC DSN name. If you use another JDBC driver, just replace the following line in login.jsp:
Connection con = DriverManager.getConnection("jdbc:odbc:community", "", "");

login.jsp works very simply. It receives the userid and password entered by the user via POST, converts the password into an MD5 string, and then looks up userid and pwdmd5 in the users table. Because userid is the primary key of the users table, the SQL query returns an empty result set if the converted pwdmd5 does not match the record in the table.
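
The login logic can be sketched without a database. In the example below an in-memory map stands in for the users table (userid mapped to pwdmd5), and the JDK's MessageDigest stands in for the MD5 bean; both substitutions, and the names LoginCheck, addUser, and login, are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

class LoginCheck {
    // Stand-in for the users table: userid -> pwdmd5 (32 hex characters).
    private final Map<String, String> users = new HashMap<>();

    // Converts a password into its MD5 string.
    static String md5Hex(String password) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : d) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Only the MD5 string is stored, never the password itself.
    void addUser(String userid, String password) {
        users.put(userid, md5Hex(password));
    }

    // Plays the role of a query on both userid and pwdmd5:
    // a wrong password or an unknown userid finds no matching record.
    boolean login(String userid, String password) {
        return md5Hex(password).equals(users.get(userid));
    }

    public static void main(String[] args) {
        LoginCheck community = new LoginCheck();
        community.addUser("alice", "1234");
        System.out.println(community.login("alice", "1234"));
        System.out.println(community.login("alice", "12345"));
    }
}
```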

To use this bean, you only need to create a beartool directory under WEB-INF/classes of your JSP application and copy MD5.class into it. If you use an integrated development environment, refer to the description of its deploy tool. The key statement for using a Java Bean is line 2 of the program:

<jsp:useBean id='oMD5' scope='request' class='beartool.MD5'/>
This is a standard tag that the JSP specification requires every JSP container developer to provide.

id= actually tells the JSP container the name of the instance variable to use when it creates the bean instance. You can reference that name in the Java code between <% and %>. In the program you can see that pwdmd5 = oMD5.getMD5ofStr(password) calls the only public method our MD5 Java Bean provides: getMD5ofStr.

A Java application server executes a .jsp by first precompiling it into .java (the tags become Java statements during precompilation) and then compiling that into .class. The system does all of this automatically. The resulting .class is also called a servlet. Of course, if you want to, you can do the application server's job for it and write the servlet directly, but using a servlet to output HTML is simply a return to the nightmare era of writing CGI programs in C.

If your output is a complex table, a more convenient approach is to build a "template" in an HTML editor you are familiar with and then "embed" the JSP code in it. Some experts criticize this style of JSP code as "spaghetti code", and it does have the drawback that the code is hard to manage and reuse, but program design always needs this kind of trade-off. I personally think that for small and medium-sized projects the ideal structure is to write the data presentation (loosely speaking, the web interface) in JSP and put everything unrelated to the interface into beans. In general you do not need to write servlets directly.

If you think this approach is not very OO (object-oriented), you can inherit from (extend) the bean and write another bean that wraps the user-management functionality.

Is it compatible?

I tested three Java application server environments: Resin 1.2.3, Sun J2EE 1.2, and IBM WebSphere 3.5. Happily, this Java Bean had no problems in any of them, because it is purely a computing program and does not touch the operating system or I/O devices. In fact, it could easily be made just as compatible in other languages. Java's only advantage is that you need to provide only one form of runnable code. Note the word "form": many computing architectures and operating systems today define a large number of code forms beyond the language itself. Turning a very simple piece of core C code into these different forms raises many issues, requires many tools, and is subject to many restrictions; sometimes learning a new "form" takes more effort than solving the problem itself. On Windows, for example, there are EXE, service, ordinary DLL, COM DLL, and formerly OCX, among others. Things are simpler on UNIX, but you still have to provide a .h file that defines a large number of macros and take into account the bit lengths used by compiler versions on different platforms. For me, this is a very important part of Java's appeal.

MD5 Algorithm Description

I. Padding
II. Appending the length
III. Initializing the MD5 parameters
IV. Bitwise operation functions
V. Main transformation process
VI. Output

Padding:
The MD5 algorithm first pads the input data so that its length in bits, modulo 512, equals 448. That is, the data is extended to K * 512 + 448 bits, or K * 64 + 56 bytes, where K is an integer.
The padding works as follows: append a single 1 bit, then append 0 bits until the requirement above is met.
Appending the length:
The original length of the data, b bits, is represented as a 64-bit number stored as two 32-bit words, and this number is appended to the padded data. After this step
the data has been filled out to a multiple of 512 bits.
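
The two steps above can be sketched for whole-byte input as follows. The class name Md5Padding and the method pad are hypothetical; the 64-bit length is written low-order byte first, which is how I read RFC 1321.

```java
import java.util.Arrays;

class Md5Padding {
    // Pads a message so that its length becomes a multiple of 64 bytes (512 bits).
    static byte[] pad(byte[] message) {
        long bitLength = (long) message.length * 8;
        // Zero bytes needed so that (length + 1 + zeros) is 56 modulo 64.
        int zeros = (56 - (message.length + 1) % 64 + 64) % 64;
        // copyOf fills the new tail with zero bytes.
        byte[] padded = Arrays.copyOf(message, message.length + 1 + zeros + 8);
        padded[message.length] = (byte) 0x80;       // a single 1 bit, then 0 bits
        for (int i = 0; i < 8; i++) {               // 64-bit length, low byte first
            padded[padded.length - 8 + i] = (byte) (bitLength >>> (8 * i));
        }
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(pad(new byte[0]).length);
        System.out.println(pad(new byte[56]).length);
    }
}
```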
Initializing the MD5 parameters:
Four 32-bit integer variables (A, B, C, D) are used to compute the message digest. RFC 1321 lists their initial bytes low-order first (01 23 45 67, 89 ab cd ef, fe dc ba 98, 76 54 32 10), so as 32-bit hexadecimal
numbers they are:
A = 0x67452301
B = 0xefcdab89
C = 0x98badcfe
D = 0x10325476
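
A note on byte order may help here. As I read RFC 1321, the initial values are listed as bytes with the low-order byte first, so the byte sequence 01 23 45 67 corresponds to the 32-bit value 0x67452301. The sketch below reads the sixteen bytes as little-endian integers to show the register values an implementation works with; the class name InitialState is made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class InitialState {
    // The sixteen initialisation bytes, low-order byte of each word first.
    static final byte[] INIT_BYTES = {
        0x01, 0x23, 0x45, 0x67,
        (byte) 0x89, (byte) 0xab, (byte) 0xcd, (byte) 0xef,
        (byte) 0xfe, (byte) 0xdc, (byte) 0xba, (byte) 0x98,
        0x76, 0x54, 0x32, 0x10
    };

    // Reads the bytes as four little-endian 32-bit words: A, B, C, D.
    static int[] registers() {
        ByteBuffer buf = ByteBuffer.wrap(INIT_BYTES).order(ByteOrder.LITTLE_ENDIAN);
        return new int[] { buf.getInt(), buf.getInt(), buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        for (int r : registers()) {
            System.out.println("0x" + Integer.toHexString(r));
        }
    }
}
```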

Bitwise operation functions:
X, Y, and Z are 32-bit integers.
F(X, Y, Z) = (X & Y) | (not(X) & Z)
G(X, Y, Z) = (X & Z) | (Y & not(Z))
H(X, Y, Z) = X xor Y xor Z
I(X, Y, Z) = Y xor (X | not(Z))
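
In Java the four functions translate almost one-to-one, since the bitwise operators work the same way on signed 32-bit int values. This sketch uses int rather than the long variables the author chose, and the class name Md5Functions is hypothetical.

```java
class Md5Functions {
    // Where a bit of x is 1 take the bit from y, otherwise from z.
    static int F(int x, int y, int z) { return (x & y) | (~x & z); }

    // Where a bit of z is 1 take the bit from x, otherwise from y.
    static int G(int x, int y, int z) { return (x & z) | (y & ~z); }

    // Bitwise parity of the three inputs.
    static int H(int x, int y, int z) { return x ^ y ^ z; }

    static int I(int x, int y, int z) { return y ^ (x | ~z); }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(F(0xffff0000, 0x12345678, 0x9abcdef0)));
    }
}
```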

Main transformation process:
The process uses a constant table T[1 ... 64], where each T[i] is a 32-bit integer written in hexadecimal, and the data is held in an array M of N 32-bit
integers that is processed in blocks of 16.
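
RFC 1321 defines T[i] as the integer part of 4294967296 times abs(sin(i)), with i in radians, so the table can be generated instead of typed in. The sketch below does that; the class name SineTable is hypothetical, and index 0 is left unused so that t[i] lines up with T[i].

```java
class SineTable {
    // Builds T[1..64]; element 0 is unused.
    static int[] build() {
        int[] t = new int[65];
        for (int i = 1; i <= 64; i++) {
            // The value fits in 32 unsigned bits; go through long, then keep the low 32 bits.
            t[i] = (int) (long) Math.floor(Math.abs(Math.sin(i)) * 4294967296.0);
        }
        return t;
    }

    public static void main(String[] args) {
        int[] t = build();
        System.out.println(Integer.toHexString(t[1]));
        System.out.println(Integer.toHexString(t[64]));
    }
}
```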
The specific process is as follows:

/* Process the original data in blocks of 16 words */
For i = 0 to N/16-1 do

/* Copy block i of the original data into the 16-element array X. */
For j = 0 to 15 do
Set X[j] to M[i*16+j].
End /* of the loop on j */

/* Save A as AA, B as BB, C as CC, and D as DD. */
AA = A
BB = B
CC = C
DD = D

/* Round 1 */
/* Let [abcd k s i] denote the operation
a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s),
where <<< s is a left rotation by s bits. */

/* Do the following 16 operations. */
[ABCD 0 7 1] [DABC 1 12 2] [CDAB 2 17 3] [BCDA 3 22 4]
[ABCD 4 7 5] [DABC 5 12 6] [CDAB 6 17 7] [BCDA 7 22 8]
[ABCD 8 7 9] [DABC 9 12 10] [CDAB 10 17 11] [BCDA 11 22 12]
[ABCD 12 7 13] [DABC 13 12 14] [CDAB 14 17 15] [BCDA 15 22 16]

/* Round 2 */
/* Let [abcd k s i] denote the operation
a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
/* Do the following 16 operations. */
[ABCD 1 5 17] [DABC 6 9 18] [CDAB 11 14 19] [BCDA 0 20 20]
[ABCD 5 5 21] [DABC 10 9 22] [CDAB 15 14 23] [BCDA 4 20 24]
[ABCD 9 5 25] [DABC 14 9 26] [CDAB 3 14 27] [BCDA 8 20 28]
[ABCD 13 5 29] [DABC 2 9 30] [CDAB 7 14 31] [BCDA 12 20 32]

/* Round 3 */
/* Let [abcd k s i] denote the operation
a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
/* Do the following 16 operations. */
[ABCD 5 4 33] [DABC 8 11 34] [CDAB 11 16 35] [BCDA 14 23 36]
[ABCD 1 4 37] [DABC 4 11 38] [CDAB 7 16 39] [BCDA 10 23 40]
[ABCD 13 4 41] [DABC 0 11 42] [CDAB 3 16 43] [BCDA 6 23 44]
[ABCD 9 4 45] [DABC 12 11 46] [CDAB 15 16 47] [BCDA 2 23 48]

/* Round 4 */
/* Let [abcd k s i] denote the operation
a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
/* Do the following 16 operations. */
[ABCD 0 6 49] [DABC 7 10 50] [CDAB 14 15 51] [BCDA 5 21 52]
[ABCD 12 6 53] [DABC 3 10 54] [CDAB 10 15 55] [BCDA 1 21 56]
[ABCD 8 6 57] [DABC 15 10 58] [CDAB 6 15 59] [BCDA 13 21 60]
[ABCD 4 6 61] [DABC 11 10 62] [CDAB 2 15 63] [BCDA 9 21 64]

/* Then add the saved values back in. */
A = A + AA
B = B + BB
C = C + CC
D = D + DD

End /* of the loop on i */

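Output:
The message digest is A, B, C, D, written out starting with the low-order byte of A and ending with the high-order byte of D.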
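
Putting the steps together, the whole algorithm fits in a short Java class. This is a compact sketch and not the author's beartool.MD5 source: it uses int arithmetic (which wraps modulo 2^32) instead of long variables, computes T[i] from the sine on the fly, and replaces the 64 written-out operations with a loop that derives the word index k and the rotation amount from the step number. The names Md5Sketch, digest, and toHex are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class Md5Sketch {
    // Rotation amounts: four per round, repeated four times within the round.
    private static final int[] S = {
        7, 12, 17, 22,   // round 1
        5, 9, 14, 20,    // round 2
        4, 11, 16, 23,   // round 3
        6, 10, 15, 21    // round 4
    };

    static byte[] digest(byte[] message) {
        // Steps I and II: 0x80, zero fill, then the 64-bit bit length, low byte first.
        int paddedLen = ((message.length + 8) / 64 + 1) * 64;
        ByteBuffer buf = ByteBuffer.allocate(paddedLen).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(message).put((byte) 0x80);
        buf.putLong(paddedLen - 8, (long) message.length * 8);

        // Step III: initial register values.
        int a = 0x67452301, b = 0xefcdab89, c = 0x98badcfe, d = 0x10325476;
        int[] x = new int[16];
        for (int block = 0; block < paddedLen / 64; block++) {
            for (int j = 0; j < 16; j++) {
                x[j] = buf.getInt(block * 64 + j * 4);
            }
            int aa = a, bb = b, cc = c, dd = d;
            for (int i = 0; i < 64; i++) {
                int round = i / 16;
                int f, k;
                switch (round) {
                    case 0:  f = (b & c) | (~b & d); k = i;                break; // F
                    case 1:  f = (b & d) | (c & ~d); k = (5 * i + 1) % 16; break; // G
                    case 2:  f = b ^ c ^ d;          k = (3 * i + 5) % 16; break; // H
                    default: f = c ^ (b | ~d);       k = (7 * i) % 16;     break; // I
                }
                int t = (int) (long) (Math.abs(Math.sin(i + 1)) * 4294967296.0);
                int rotated = Integer.rotateLeft(a + f + x[k] + t, S[round * 4 + i % 4]);
                // Same effect as cycling through [ABCD], [DABC], [CDAB], [BCDA].
                a = d; d = c; c = b; b = b + rotated;
            }
            a += aa; b += bb; c += cc; d += dd;
        }
        // Step VI: A, B, C, D, each written low-order byte first.
        ByteBuffer out = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        out.putInt(a).putInt(b).putInt(c).putInt(d);
        return out.array();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte v : bytes) {
            sb.append(String.format("%02x", v & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(digest("abc".getBytes(StandardCharsets.US_ASCII))));
    }
}
```

Running main should print the digest listed for "abc" in the test suite section, which is a quick way to check the sketch against RFC 1321.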
