Breaking the RSA encryption of Renren's login password

Source: Internet
Author: User

There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files.
-- Bruce Schneier, Applied Cryptography

The notorious "plaintext password" problem takes two forms: plaintext transmission and plaintext storage. A password transmitted in plaintext is not necessarily stored in plaintext, and a password stored in plaintext is not necessarily transmitted in plaintext. Last year's widely publicized plaintext password incident was about plaintext storage: once a website's database is stolen, its users' passwords are stolen with it. Plaintext transmission is just as dangerous. Sniffers may be installed at many points on the network, and passwords sent in plaintext are no secret to them. This article focuses on the security of password transmission.

What is "plaintext"? If the password is sent directly as ASCII characters, it is plaintext to anyone. If the password is base64 encoded (for example, 123456 becomes MTIzNDU2), most people may take it for ciphertext, but to any professional programmer it is still plaintext. Some people believe that making an "encryption" algorithm more complicated and running it through a code obfuscator means nobody can analyze it. That is obscurity, not security, and it only reaches the level of stopping your kid sister from peeking at your files. Real security relies on public, widely used cryptographic algorithms, and on keys rather than secret algorithms.

Unfortunately, cryptographic algorithms and protocols do not become secure just by being pieced together.
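To make the base64 point concrete, here is a minimal Python sketch (the article itself does not include this code). It only shows that the encoding is reversible by anyone, with no key involved.

```python
import base64

# base64 is an encoding, not encryption: no key is needed to undo it.
encoded = base64.b64encode(b"123456").decode("ascii")
decoded = base64.b64decode(encoded).decode("ascii")

print(encoded)  # MTIzNDU2
print(decoded)  # 123456
```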

Observing Renren's password transmission

Long ago, Renren transmitted passwords in plaintext without any encoding. Today Renren's login password is encrypted. Inspect the HTTP POST request (/ajaxLogin/login) sent when logging in to Renren in the browser's developer tools, and you will find that the password has become a long string.

My first thought was a hash. Might the site combine the user-entered password with the rkey in some calculation and send the hash value as the password? I tried string concatenation with common hash algorithms and found that this was not the case. Where does the rkey come from? The /getEncryptionKey HTTP request returns data that includes the rkey. In that response I could see e: "10001", which made me guess at RSA encryption, but I set that aside for the moment.

Note: there is a replay attack vulnerability here. After a successful login, the rkey is not invalidated. Replaying the same POST request from a machine on the other side of the ocean still logs in successfully, and this keeps working even after the original user logs out. A sniffed POST request only becomes invalid when the password is changed or after a long enough time has passed. That may be enough for script kiddies who want to snoop on other people's privacy. But our goal is to recover the plaintext password. Can that be done?

Analyzing the messy JS code

It seems we have to look at the source code. Renren's login is handled by login-v6.js. This JavaScript code has been obfuscated; after running it through JSBeautifier it looks much better. (Figure: the processed code snippet.)

When we were learning to program, the teacher told us that "a program should be read starting from the main function". For any program of real size, though, if I start reading from main, my hair will have turned white by the time I find the code I care about. More likely, my mental stack will overflow first. If you don't believe it, try reading the following code (part of the RSA encryption library).

A powerful weapon for analyzing large programs is string search (grep). Strings, magic numbers, and on-screen prompts tell us what a piece of code does, or identify the library it comes from.

Searching from the first code snippet shown above (where the user's password is read) leads to the second snippet. The second snippet looks like a series of mathematical operations. One group of magic numbers gives the secret away:

Searching Google for these two magic numbers suggests that they are part of a library:

Further searching shows that this is an open-source RSA library implemented in pure JavaScript. Its homepage is http://www.ohdave.com/rsa/. A one-to-one comparison with Renren's obfuscated code confirms the match. The general process is as follows:

  1. The browser issues the getEncryptionKey request. The server generates a 256-bit RSA public/private key pair, generates a random rkey, and stores the key pair indexed by the rkey.
  2. The server sends e (fixed at 10001 in hexadecimal), n (the 256-bit public key, in hexadecimal), and the rkey back to the browser.
  3. When the user enters the password and clicks log in, the browser encrypts the password with the public key n and the exponent e to obtain the encrypted password (256 bits, in hexadecimal) and sends it to the server together with the rkey.
  4. The server looks up the private key by the rkey, decrypts the ciphertext to obtain the plaintext password, and then performs the usual password check.

On seeing the word "RSA", many people may conclude that there is no hope of cracking it. Having analyzed this far, however, I felt I had seen the dawn of victory.

Why does RSA require a long key?

In 1976, Diffie and Hellman published the world's first public-key protocol for establishing a shared secret, opening a new chapter in the history of cryptography. Until then, people had always believed that secure communication over an eavesdropped channel was impossible unless the two sides already shared a secret. The Diffie-Hellman protocol is an algorithm that securely generates a shared secret over an eavesdropped channel. One year later, in 1977, Rivest, Shamir, and Adleman published the RSA public-key encryption algorithm.

Public-key cryptography is fundamentally based on long-standing hard mathematical problems, such as the knapsack problem, the discrete logarithm problem, and the integer factorization problem. The Diffie-Hellman algorithm is based on the discrete logarithm problem, while the RSA algorithm is based on the factorization problem. Every public key is a mathematical puzzle. An attacker who solves the puzzle obtains the private key and can decrypt the information. This is completely different from what we usually call "password encryption" (namely, symmetric encryption).

The security of symmetric encryption depends on keeping the key secret. For an ideal symmetric cipher, the only way to decrypt is to try every possible key. A random password of 16 uppercase letters, lowercase letters, and digits is roughly equivalent to a 96-bit binary key, which means about 4 × 10^28 guesses on average, more than can be tried in a lifetime. A 96-bit mathematical problem, by contrast, is no problem at all for a computer: an integer of that size can be factored in about a second.
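The "roughly 96 bits" estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the alphabet size of 62 is my reading of "uppercase/lowercase letters and numbers".

```python
import math

# 26 uppercase + 26 lowercase + 10 digits, 16 characters long
alphabet_size = 26 + 26 + 10
password_length = 16

keyspace = alphabet_size ** password_length
bits = math.log2(keyspace)          # a little over 95 bits
average_guesses = 2 ** 96 // 2      # half of a 96-bit key space

print(f"{bits:.2f} bits of entropy")
print(f"{average_guesses:.2e} guesses on average for a 96-bit key")
```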

We commonly use RSA keys of 2048 or 4096 bits, while symmetric keys of 128 or 256 bits are enough. Public keys are often much longer than symmetric keys.

Let's take a look at how RSA works. First, find two large primes p and q with the same number of digits, and compute their product pq as the public key n. Compute phi(n) = phi(p) * phi(q) = (p - 1)(q - 1). Choose e as the exponent. For a message m, m^e mod n is the ciphertext. In practice e is usually a fixed value, for example e = hex 10001 = 65537, so that encryption needs only 17 multiplications.

Find d such that d * e mod phi(n) = 1, which can be computed quickly with the Euclidean algorithm. Raising the ciphertext c to the d-th power, c^d mod n, restores the plaintext m. Here d is the private key. The mathematical principle is that m^(ed - 1) mod n = 1, which follows from the properties of Euler's function phi(n). For more information, see Wikipedia.
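The steps above can be traced with a toy key. The sketch below is my own illustration, not code from the article: it reuses the tiny primes 13 and 157 that appear later in the factoring example, and the message value 1234 is arbitrary. It needs Python 3.8 or newer, where pow(e, -1, phi) computes a modular inverse.

```python
# Toy textbook RSA (no padding), for illustration only.
p, q = 13, 157
n = p * q                    # public modulus, 2041
phi = (p - 1) * (q - 1)      # Euler's function of n
e = 0x10001                  # public exponent, 65537

d = pow(e, -1, phi)          # private exponent: d * e mod phi(n) = 1

m = 1234                     # arbitrary message, must be smaller than n
c = pow(m, e, n)             # encryption: m^e mod n
recovered = pow(c, d, n)     # decryption: c^d mod n

print(n, phi, d, c, recovered)
```

A real key differs only in the size of p and q, which is exactly why the key length matters.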

We already know that the RSA public key n is the product of two primes p and q. Anyone who can factor n into these two primes can compute the private key d and decrypt the information. Renren uses a 256-bit RSA public key, which is a 77-digit decimal integer. It can be factored in two or three minutes on a personal PC. This is the danger of a short public key.

Prime factorization

The prime factorization method we all learned is trial division, trying divisors from 2 up to sqrt(n). For a 77-digit decimal integer, that means up to 10^38 trial divisions, which is obviously unacceptable.
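For reference, trial division fits in a few lines of Python. This is a minimal sketch of the schoolbook method described above; the function name is mine.

```python
import math

def smallest_factor(n):
    """Return the smallest prime factor of n (n itself if n is prime)."""
    d = 2
    while d * d <= n:        # only divisors up to sqrt(n) need to be tried
        if n % d == 0:
            return d
        d += 1
    return n

print(smallest_factor(2041))   # 13

# Worst case for a 77-digit n: about sqrt(10**76) = 10**38 divisions.
print(math.isqrt(10**76) == 10**38)
```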

Fermat's difference of squares

Fermat's factorization method was the first major improvement over trial division and sieving. Its starting point is the difference-of-squares identity (a - b)(a + b) = a^2 - b^2. Suppose we find two positive integers u > v, with u + v smaller than n, such that u^2 - v^2 is divisible by n. Then (u + v)(u - v) is divisible by n. Since n is the product of two primes p and q, and neither u + v nor u - v is itself divisible by n, p and q must be split between u + v and u - v. The greatest common divisor of u - v and n is therefore p or q, and computing a greatest common divisor is very fast.

For example, take n = 2041, u = 1416, v = 311. You can verify that u^2 - v^2 is divisible by n. Taking the greatest common divisor of u - v and n gives gcd(1416 - 311, 2041) = 13, so 13 must be a prime factor of n. Indeed, 2041 = 13 × 157.

The problem is where to find such u and v. One way is to start with u = ceil(sqrt(n)) (the square root of n rounded up, that is, the smallest positive integer not less than sqrt(n)) and check whether u^2 - n is a perfect square. If it is, then u^2 - n = v^2. If not, try successively larger u. For example, with n = 2041 and ceil(sqrt(n)) = 46, the search goes as follows:

  • 46*46 - 2041 = 75
  • 47*47 - 2041 = 168
  • 48*48 - 2041 = 263
  • 49*49 - 2041 = 360
  • 50*50 - 2041 = 459
  • 51*51 - 2041 = 560
  • ......

Clearly, this road does not lead far.
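A direct implementation of this search shows how slowly it converges even for n = 2041. The sketch below is my own illustration of Fermat's method as described above; it also counts how many values of u had to be skipped.

```python
import math

def fermat_factor(n):
    """Search upward from ceil(sqrt(n)) for u with u*u - n a perfect square.

    Returns (u - v, u + v, steps), where steps counts the increments of u.
    Assumes n is an odd composite number.
    """
    u = math.isqrt(n)
    if u * u < n:
        u += 1                      # round the square root up
    steps = 0
    while True:
        t = u * u - n
        v = math.isqrt(t)
        if v * v == t:              # u^2 - n = v^2, so n = (u - v)(u + v)
            return u - v, u + v, steps
        u += 1
        steps += 1

print(fermat_factor(2041))   # (13, 157, 39)
```

For 2041 the first hit is u = 85, v = 72, reached only after 39 failed attempts. The search length grows with the gap between the two prime factors, which is why the method is impractical on its own.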

Products that are perfect squares

If the product of several of the values u^2 - n above is a perfect square, that works just as well. In the example above, suppose we notice that 75*168*360*560 is a perfect square. Then (46^2 - 2041) * (47^2 - 2041) * (49^2 - 2041) * (51^2 - 2041) = 75*168*360*560 = 50400^2.

Note that the square relation we are looking for holds mod n, so 46^2 * 47^2 * 49^2 * 51^2 mod 2041 = 50400^2 mod 2041, that is, (46*47*49*51)^2 mod 2041 = 50400^2 mod 2041. Taking u = 46*47*49*51 mod 2041 = 311 and v = 50400 mod 2041 = 1416 gives exactly the pair of numbers used in the earlier example (with the names u and v swapped, which makes no difference).

The next problem: how do we find which of 75, 168, 263, 360, 459, 560, ... multiply to a perfect square? Astute readers may have realized that we should factor these numbers into primes and then pick several of them so that every prime appears to an even total power across their factorizations. The product of those numbers is then a perfect square. For example:

  • 75 = 3 * 5^2
  • 168 = 2^3 * 3 * 7
  • 360 = 2^3 * 3^2 * 5
  • 560 = 2^4 * 5 * 7

Across these four numbers, the exponents of the primes 2, 3, 5, and 7 add up to 10, 4, 4, and 2 respectively, all of which are even. Their product is therefore (2^5 * 3^2 * 5^2 * 7^1)^2, a perfect square.
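The exponent bookkeeping above is easy to check mechanically. The following Python sketch (helper name and structure are mine) factors each value over the primes 2, 3, 5, 7, adds up the exponents, and confirms that the product is 50400 squared.

```python
from math import isqrt

N = 2041
FACTOR_BASE = [2, 3, 5, 7]

def exponent_vector(x):
    """Exponents of x over FACTOR_BASE, or None if other primes remain."""
    vector = []
    for p in FACTOR_BASE:
        k = 0
        while x % p == 0:
            x //= p
            k += 1
        vector.append(k)
    return vector if x == 1 else None

us = [46, 47, 49, 51]
vectors = [exponent_vector(u * u - N) for u in us]
totals = [sum(column) for column in zip(*vectors)]

product = 1
for u in us:
    product *= u * u - N
root = isqrt(product)

print(vectors)   # [[0, 1, 2, 0], [3, 1, 0, 1], [3, 2, 1, 0], [4, 0, 1, 1]]
print(totals)    # [10, 4, 4, 2]
print(root)      # 50400
```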

Sieving out the "easy to factor" numbers

Prime factorization is not easy. Even for numbers like u^2 - n, which are close to sqrt(n) in size, factoring each one is still quite time-consuming. But we are close to the end! We only need to consider the numbers that factor completely into relatively small primes. How do we find the numbers in the range 1 to N that are divisible only by primes smaller than B? The most common tool is the sieve, the same method used to find the primes in the range 1 to N: first cross out all multiples of 2, then all multiples of 3, and so on. After the multiples of every prime have been crossed out, the numbers that remain are the primes.

What we need now are the numbers in the sequence u^2 - n (for successive values of u) that factor completely over small primes. The sieve still works after a slight modification. When sieving for multiples of a prime p, we do not need to try every u one by one. If u^2 - n is divisible by p, then u^2 mod p = n mod p. Since p is prime, there are at most two such u smaller than p (and possibly none). Once a fast algorithm has found one such u, (p - u)^2 mod p = n mod p gives the other, which can be written as -u in the mod p sense. Adding a multiple of p to ±u leaves the square congruent to n mod p. Therefore we only need to step through the array indices u, u + p, u + 2p, u + 3p, ... and -u + p, -u + 2p, -u + 3p, ... to hit every multiple of p.

Because we want the prime factorization, we do not cross out these multiples of p. Instead we divide each of them by p repeatedly and record the power of p in that element.

For example, for n = 2041, take B = 10, that is, use only the primes smaller than 10: 2, 3, 5, and 7. If a value u^2 - n has been reduced to 1 after sieving, it factors completely into powers of 2, 3, 5, and 7. Otherwise, the number is not considered.
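The modified sieve can be sketched as follows for n = 2041 and B = 10. This is an illustrative toy, not msieve's implementation: the range u = 46..54 is my choice, and the roots of u^2 = n (mod p) are found by brute force, which is only reasonable for such tiny primes.

```python
N = 2041
FACTOR_BASE = [2, 3, 5, 7]     # primes below B = 10
U_START, COUNT = 46, 9         # sieve the values u = 46 .. 54

# residue[i] starts as u^2 - N and is divided down by the small primes.
residue = [(U_START + i) ** 2 - N for i in range(COUNT)]
exponents = [[0] * len(FACTOR_BASE) for _ in range(COUNT)]

for j, p in enumerate(FACTOR_BASE):
    # At most two roots r < p satisfy r^2 = N (mod p).
    roots = [r for r in range(p) if (r * r - N) % p == 0]
    for r in roots:
        first = (r - U_START) % p          # first index with u = r (mod p)
        for i in range(first, COUNT, p):   # step through u, u + p, u + 2p, ...
            while residue[i] % p == 0:     # divide out p and record its power
                residue[i] //= p
                exponents[i][j] += 1

# Values reduced to 1 factor completely over the factor base.
smooth = {U_START + i: exponents[i] for i in range(COUNT) if residue[i] == 1}
for u, vector in smooth.items():
    print(u, u * u - N, vector)
```

Only the positions that are known to be divisible by p are ever touched, which is what makes sieving so much cheaper than factoring each value separately.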

Extract the numbers in the above table that factor completely, and record their exponents of 2, 3, 5, and 7:

Finding a product that is a perfect square

Our goal is to find several numbers whose product is a perfect square, that is, whose exponents add up to even numbers. We only care about the parity of the exponents, so take everything mod 2 to get the following table:

We need to find several columns whose sum is even in every position, in other words, columns whose sum mod 2 is the zero vector. Isn't this a standard system of linear equations? Yes, and Gaussian elimination can solve it (in practice, the more efficient Lanczos algorithm is used).
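Here is a small Gaussian elimination over GF(2) in Python, as a sketch of the idea rather than what a production tool does. The six parity vectors are the exponents mod 2 of u^2 - 2041 for u = 46, 47, 49, 51, 53, 54; the function name and the bitmask representation are my own choices.

```python
def find_dependencies(rows):
    """Return index subsets of rows (0/1 lists) that sum to zero mod 2."""
    basis = {}            # pivot bit -> (vector bits, combination bits)
    dependencies = []
    for index, row in enumerate(rows):
        vector = sum(bit << k for k, bit in enumerate(row))
        combo = 1 << index                 # which input rows were combined
        while vector:
            pivot = vector.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (vector, combo)
                break
            basis_vector, basis_combo = basis[pivot]
            vector ^= basis_vector         # addition mod 2 is XOR
            combo ^= basis_combo
        else:
            # The row reduced to zero: combo marks a dependent subset.
            dependencies.append([i for i in range(len(rows)) if combo >> i & 1])
    return dependencies

us = [46, 47, 49, 51, 53, 54]
parity = [
    [0, 1, 0, 0],   # 75  = 3 * 5^2
    [1, 1, 0, 1],   # 168 = 2^3 * 3 * 7
    [1, 0, 1, 0],   # 360 = 2^3 * 3^2 * 5
    [0, 0, 1, 1],   # 560 = 2^4 * 5 * 7
    [0, 1, 0, 0],   # 768 = 2^8 * 3
    [0, 0, 1, 1],   # 875 = 5^3 * 7
]
for subset in find_dependencies(parity):
    print([us[i] for i in subset])
```

The solution set is a vector space, so the basis returned depends on the elimination order. Adding two dependencies mod 2 gives another one: for instance, combining (46, 47, 49, 51) with (46, 47, 49, 54) yields (51, 54).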

We get three nontrivial solutions: (46, 47, 49, 51), (51, 54), and (46, 53). In fact, a single solution is enough. Each of the three solutions yields a pair (u, v), and from it a prime factor of n = 2041:

  • (46, 47, 49, 51): 46*47*49*51 mod 2041 = 311, 2^5 * 3^2 * 5^2 * 7 mod 2041 = 1416, gcd(1416 - 311, 2041) = 13
  • (51, 54): 51*54 mod 2041 = 713, 2^2 * 5^2 * 7 mod 2041 = 700, gcd(713 - 700, 2041) = 13
  • (46, 53): 46*53 mod 2041 = 397, 2^4 * 3 * 5 mod 2041 = 240, gcd(397 - 240, 2041) = 157
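The three results can be reproduced with a short Python check. The helper below is hypothetical (not from the article); it takes a set of u values whose u^2 - n multiply to a perfect square and turns it into a factor of n. It uses math.prod, available from Python 3.8.

```python
from math import gcd, isqrt, prod

N = 2041

def factor_from_subset(us):
    """Return (u, v, factor) for a subset whose values u^2 - N multiply to a square."""
    u = prod(us) % N                      # product of the chosen u, mod N
    square = prod(x * x - N for x in us)  # product of the chosen u^2 - N
    root = isqrt(square)
    assert root * root == square          # must be a perfect square
    v = root % N
    return u, v, gcd(abs(u - v), N)

print(factor_from_subset([46, 47, 49, 51]))   # (311, 1416, 13)
print(factor_from_subset([51, 54]))           # (713, 700, 13)
print(factor_from_subset([46, 53]))           # (397, 240, 157)
```

Not every dependency is useful: if u and v happen to be equal mod n (or negatives of each other), the gcd is n or 1 and another solution has to be tried.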

In actual computation, a factorization such as 459 = 3^3 * 17 need not be thrown away; the leftover 17 can be kept in a temporary table. If another number also leaves a 17, for example u = 52 with 663 = 3*13*17 (assuming the set of primes used for sieving contains 2, 3, 5, 7, 11, 13), the two 17s match and both numbers become usable. Numbers whose leftover finds no match can only be discarded.

The algorithm above rests on elementary number theory and is not complicated. Yet this algorithm, called the quadratic sieve, is still the fastest algorithm for factoring integers of fewer than 100 decimal digits.

Factoring an RSA key in practice

Many scientific computing packages have large-integer factorization built in, so there is no need to implement arbitrary-precision arithmetic yourself. Here I use msieve, a small open-source tool that implements several integer factorization algorithms, including the quadratic sieve.

Factor Renren's public key:

$ ./msieve 0x8ca2ddaf2a8da3c3c6ed795b87eca0d33827c82b31b6282a5045bc75e9b83153

On a Core i7 personal PC (msieve is single-threaded), this takes about 2 minutes 55 seconds (in tests with other 256-bit RSA keys, the factoring time was between 2 and 4 minutes). The complete output is as follows.

The first 170 seconds are spent sieving for the "easy to factor" numbers, and the last 4 seconds are spent solving the linear system. From the output we can see that, to factor this 77-digit integer, msieve chose B = 919223 and sieved with 36471 primes below B. The sieved range of u is 12*32768 = 393216. Among these values of u, 19614 numbers factored completely over the primes below B. Another 188575 could not be factored completely, but after all primes below B were divided out, their leftover "tails" could be paired up, giving 16981 matched "tails". After some optimization we obtain a matrix of order about 20,000 and, from it, the prime factorization of n: 220575968563521666214098235031226805271 and 288388433467298454747300239713114570277 (decimal). Both are 39-digit integers.

It's time to witness a miracle. First, calculate phi(n) = (p - 1)(q - 1), then calculate d, the multiplicative inverse of e = 65537 modulo phi(n):
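The computation can be sketched in Python as follows. The two primes are the ones quoted above; the function name decrypt is mine, and pow with a negative exponent requires Python 3.8 or newer.

```python
# Prime factors of Renren's 256-bit public key, as reported by msieve above.
p = 220575968563521666214098235031226805271
q = 288388433467298454747300239713114570277
e = 0x10001                      # public exponent, 65537

n = p * q
phi = (p - 1) * (q - 1)          # Euler's function of n
d = pow(e, -1, phi)              # multiplicative inverse of e modulo phi(n)

def decrypt(c):
    """Recover the integer message from ciphertext c: c^d mod n."""
    return pow(c, d, n)

print(hex(n))                    # compare with the public key passed to msieve
print(hex(d))
```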

Then take the ciphertext (the password field in the HTTP POST request) as c and compute c^d mod n. This yields the plaintext password 0x64726f7773736150656854746f4e734973696854.

Displayed as ASCII characters, this is drowssaPehTtoNsIsihT. When the RSA library used by Renren converts the password string to an integer, the start of the string goes into the low-order bytes and the end of the string into the high-order bytes. Therefore the password that was entered is ThisIsNotThePassword. Task completed!
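The byte order described above corresponds to a little-endian conversion. The following Python sketch (function names are mine, and it only models the byte order, not the rest of the JavaScript library) shows the round trip between the password string and the integer.

```python
def password_to_int(password):
    """First character in the lowest byte, last character in the highest."""
    return int.from_bytes(password.encode("ascii"), "little")

def int_to_password(m):
    length = (m.bit_length() + 7) // 8
    return m.to_bytes(length, "little").decode("ascii")

m = password_to_int("ThisIsNotThePassword")

# Reading the hex digits from left to right shows the string reversed.
as_read_from_hex = m.to_bytes(20, "big").decode("ascii")

print(hex(m))
print(as_read_from_hex)      # drowssaPehTtoNsIsihT
print(int_to_password(m))    # ThisIsNotThePassword
```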

The vulnerability does not end there

Even if Renren's programmers changed the RSA key to 2048 bits, the encryption would still be vulnerable. Suppose the user's password is ruby1111 (frankly, a great many people use four letters followed by four digits). In the RSA library used by Renren, a string that is too short is padded with zeros at the end; viewed as an integer, the zeros go into the high-order positions. The message m that gets encrypted is therefore 0x3131313179627572.

Factor m = 0x3131313179627572: m = 2*17*397*193679*1355887523. These factors can be grouped into two integers of similar size: m = 2614279142*1355887523, where both integers are smaller than 2^32. Let a = 1355887523 and b = 2614279142. Since c = m^e mod n = (a^e mod n) * (b^e mod n) mod n, we have (c / (a^e mod n)) mod n = b^e mod n. In other words, we only need to find a and b below 2^32 that satisfy this equation, and m can then be recovered.
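The numbers in this example are easy to verify in Python. The snippet below is only a consistency check of the values quoted above, assuming the little-endian byte order described earlier.

```python
from math import prod

# "ruby1111" with the first character in the lowest byte
m = int.from_bytes(b"ruby1111", "little")

factors = [2, 17, 397, 193679, 1355887523]
a = 1355887523
b = 2 * 17 * 397 * 193679

print(hex(m))                 # 0x3131313179627572
print(prod(factors) == m)     # True
print(a * b == m, a < 2**32, b < 2**32)
```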

With the public n (2048 bits) and e = 0x10001, compute b^e mod n for b = 1, 2, ..., 2^32 and store the results in a large hash table. Then, for a = 1, 2, ..., 2^32, compute (c / (a^e mod n)) mod n, where the division is done with the multiplicative inverse described earlier. As soon as a result is found in the hash table, we have a pair a and b that satisfies the condition, and multiplying them gives the plaintext m.

Why does this attack work? First, a considerable proportion of large integers can be written as the product of two integers of similar size, which makes the table-based attack above possible. Second, recovering a plaintext of L bits this way needs only O(2^(L/2)) time and space; that is, the effective length of the password is cut in half.
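The table-based attack can be demonstrated at toy scale. In the sketch below, everything except the method itself is an assumption made for illustration: the modulus is the product of two known Mersenne primes instead of a 2048-bit key, the search bound is 2^10 instead of 2^32, and the secret message 797 * 911 is chosen so that it splits into two factors below the bound. Note that the attack uses only the public values n, e, and c.

```python
def meet_in_the_middle(c, e, n, bound):
    """Find m = a * b with 1 <= a, b <= bound such that m^e mod n == c."""
    # Table of b^e mod n for every candidate b.
    table = {pow(b, e, n): b for b in range(1, bound + 1)}
    for a in range(1, bound + 1):
        a_e = pow(a, e, n)
        # c / (a^e mod n) mod n, using the multiplicative inverse (Python 3.8+).
        target = c * pow(a_e, -1, n) % n
        b = table.get(target)
        if b is not None:
            return a * b
    return None

# Toy public key: two Mersenne primes stand in for a real RSA modulus.
n = (2**61 - 1) * (2**89 - 1)
e = 0x10001

secret = 797 * 911               # unpadded "password" integer
c = pow(secret, e, n)            # what an eavesdropper would see

recovered = meet_in_the_middle(c, e, n, 1 << 10)
print(recovered == secret)       # True
```

The table holds about 2^10 entries here; for the article's example it would hold 2^32 entries, which is large but within reach, whereas trying all 2^64 candidate messages directly is not.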

Do not pass plaintext directly into the RSA algorithm

Cryptography textbooks and Wikipedia use the plaintext directly as m and take m^e mod n as the ciphertext. Widely used encryption standards such as PKCS do otherwise: either the plaintext is padded with random bytes to the length of n, or a separate random string is generated as m and used as a symmetric key to encrypt the plaintext.

The RSA library used by Renren should defend against the attack above. The simplest fix is to imitate the original version of PKCS. Instead of padding the plaintext password with zeros, first add a 0xFF byte, then fill the message up to the length of n with random bytes other than 0xFF (for a 2048-bit public key, pad to 256 bytes). The pseudo-random number generator used here must be well seeded and must never produce a predictable sequence. After decrypting, the server searches from the high-order end for the first 0xFF byte; what follows it is the original plaintext password. Of course, the search for 0xFF opens the door to a timing attack. However, exploiting that information to recover the plaintext requires sending a very large number of requests, which is not realistic over the Internet.
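The padding scheme just described might look like the following Python sketch. It illustrates the article's simplified scheme, not PKCS itself, and the function names are hypothetical. One detail is my own assumption: the block is one byte shorter than the modulus so that the padded integer is guaranteed to be smaller than n. The random bytes come from the secrets module rather than a seeded pseudo-random generator.

```python
import secrets

def pad(password: bytes, modulus_bytes: int) -> int:
    """Build [random non-0xFF bytes][0xFF][password] and return it as an integer."""
    # Assumption: leave the top byte of the modulus length unused so that m < n.
    fill = modulus_bytes - 1 - 1 - len(password)
    if fill < 8:
        raise ValueError("password too long for this modulus")
    # randbelow(255) yields 0..254, so the filler never contains 0xFF.
    filler = bytes(secrets.randbelow(255) for _ in range(fill))
    return int.from_bytes(filler + b"\xff" + password, "big")

def unpad(m: int, modulus_bytes: int) -> bytes:
    """Find the first 0xFF from the high-order end; the password follows it."""
    block = m.to_bytes(modulus_bytes - 1, "big")
    marker = block.index(b"\xff")
    return block[marker + 1:]

m1 = pad(b"ruby1111", 256)       # 256 bytes for a 2048-bit modulus
m2 = pad(b"ruby1111", 256)
print(unpad(m1, 256))            # b'ruby1111'
print(m1 != m2)                  # same password, different message each time
```

Because the message now fills almost the whole modulus and changes on every login, it no longer splits into two small factors, and a sniffed ciphertext cannot be matched against a precomputed table. In practice a standard, reviewed padding scheme from an established library is the safer choice.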

Conclusion

I hope this article does not cause any trouble for Renren's programmers. Using public-key cryptography to encrypt passwords in transit is a good idea. However, cryptography really is a specialized field. Issues such as key length and message padding only come to mind once you understand the mathematical principles behind the algorithms.

Examples of weak encryption protocols created by large companies are everywhere. We are not cryptographers. Do not design encryption protocols yourself, and do not cleverly "optimize" encryption algorithms. Use well-tested cryptographic libraries such as OpenSSL and GnuPG. Because browsers have no mature cryptographic library, Renren's programmers cannot be blamed for using a library whose code quality looked good.

Finally, here is something to show off with: the time complexity of the quadratic sieve described in this article is exp((1 + o(1)) * sqrt(log N * log log N)), where N is the number to be factored. This grows faster than any polynomial in the number of digits of N, but slower than an exponential. The next time someone asks "is there an algorithm that is slower than polynomial but faster than the best known algorithms for NP-complete problems", you can show off with this one. In fact, in the asymptotic sense, the best known factoring algorithm (the number field sieve) also has a time complexity above any polynomial and below exponential. The time complexity of factorization is an open question.

References
  1. msieve, SourceForge project: http://sourceforge.net/projects/msieve/
  2. Carl Pomerance, A Tale of Two Sieves, Notices of the AMS, December 1996: http://www.ams.org/notices/199612/pomerance.pdf
  3. Carl Pomerance, Smooth numbers and the quadratic sieve, Algorithmic Number Theory, MSRI Publications, Volume 44, 2008: http://library.msri.org/books/Book44/files/03carl.pdf
  4. Dan Boneh, Twenty Years of Attacks on the RSA Cryptosystem: http://crypto.stanford.edu/~dabo/pubs/papers/RSA-survey.pdf
  5. Bruce Schneier, Applied Cryptography, 2nd Edition
  6. Wikipedia
  7. StackOverflow
