I. Overview
The RSA algorithm is a public-key encryption algorithm proposed in 1977 by Ron Rivest, Adi Shamir, and Leonard Adleman in their paper A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Because encryption and decryption use different keys, it avoids the key distribution problem, and it can also be used for digital signatures. The algorithm was largely inspired by the paper New Directions in Cryptography (co-authored by Whitfield Diffie and Martin Hellman); the interesting story behind its birth is told in How the RSA Algorithm Was Born.
The original paper is clearly written and easy to follow, which makes it very good English-language material for learning the algorithm. It first reviews the characteristics of public-key cryptosystems and digital signatures and the requirements they must meet (this part is borrowed from the ideas of Whitfield Diffie and Martin Hellman). It then explains how distinct encryption and decryption keys are used to carry out encryption and decryption, which is the core of how RSA works. Next it introduces the underlying mathematics and proves the correctness of the algorithm, drawing mainly on basic number theory (Euler's function, Fermat's little theorem, Euler's theorem, and so on). To make the algorithm practical, it also shows how to compute modular powers quickly by repeated squaring so that encryption and decryption are fast, and how to choose the other parameters (selecting the large primes $p$ and $q$, and selecting and computing $e$ and $d$). A small worked example simulates the flow of the algorithm. The last major topic is security: the paper considers several ways of attacking the algorithm and explains, from a computational perspective, why breaking it is hard, along with various other details.
In short, the original paper covers almost every aspect of the RSA algorithm without losing readability, and it is well worth studying. Some books available in Chinese also explain the algorithm accessibly; the relevant chapters of Cryptography and Network Security: Principles and Practice and of Illustrated Cryptography are both very good learning material.
II. Public-key encryption system
The main feature of a public-key cryptosystem is that it uses different keys for encryption and decryption. Each user generates their own encryption key and decryption key. The encryption key is published so that anyone can obtain it, and is therefore called the public key; the decryption key must be kept safe and is called the private key. To communicate, the sender encrypts the message with the receiver's encryption key (the public key) and sends the resulting ciphertext; the receiver decrypts it with their own decryption key (the private key) to recover the plaintext. Encrypting and decrypting this way avoids the key distribution problem of traditional cryptosystems.
In cryptography, the algorithms themselves are usually public, and trying to achieve security by keeping the algorithm secret is discouraged (this is known as security through obscurity). Inventing a sound cryptographic algorithm is hard, and it is good practice to expose an algorithm to the test of time and of a large number of users. Encryption algorithms encrypt and decrypt with the help of keys, so given that the algorithm is public, the security of encryption depends on the security of the key.
Treat the encryption procedure and the decryption procedure each as a function, denoted $E$ and $D$ respectively, and denote the plaintext message and the ciphertext message by $m$ and $c$. A public-key cryptosystem has the following four properties:
(a) Decrypting the ciphertext $c=E(m)$ recovers the plaintext: $D(c)=D(E(m))=m$.
(b) Both $E$ and $D$ are easy to compute.
(c) Publishing $E$ does not reveal an easy way to compute $D$, so messages encrypted with $E$ can only be decrypted by the holder of $D$.
(d) Applying the decryption procedure to a plaintext message and then the encryption procedure also recovers the message, that is, the reverse order works as well: $E(D(m))=m$.
Property (a) guarantees the basic purpose of encryption: if the recipient could not restore the ciphertext to the plaintext, the two parties could not communicate at all. Properties (a) and (b) also hold in traditional symmetric cryptosystems. Property (d) is what digital signatures rely on. It should not be surprising that a plaintext can be "decrypted": setting aside the notions of encryption and confidentiality, $E$ and $D$ are simply mappings from input to output. The distinction between plaintext and ciphertext is drawn by people; to a computer both are bit sequences with no other difference, so applying $D$ to a plaintext is nothing more than evaluating a function.
III. Key generation, encryption, and decryption
1. Key generation
Each user generates their own public and private keys as follows:
1) Select two large primes $p$ and $q$.
2) Calculate the product $n=p \times q$ of $p$ and $q$.
3) Select a number $e$ that is coprime with $\phi(n)=(p-1)\times(q-1)$, that is, $\gcd(e,(p-1)\times(q-1))=1$. It can be chosen at random, but applications usually pick the fixed value $65537$.
4) Compute the modular inverse $d$ of $e$ modulo $\phi(n)$, that is, the $d$ satisfying $e\cdot d \equiv 1\;(mod\;\phi(n))$, or equivalently $e\cdot d \equiv 1\;(mod\;(p-1)\cdot(q-1))$.
5) Publish $(e,n)$ as the public key, which anyone can obtain; keep $(d,n)$ as the private key and store it securely.
In the RSA algorithm, both the public key and the private key take the form of a pair of positive integers.
Key generation produces the following values:
$p,q,n,\phi(n),d,e$
Only the two integers $e$ and $n$ need to be published; all the other values must be kept strictly confidential.
(The process above differs slightly from the original paper, which selects $d$ first and then computes $e$. In many public-key certificates today, however, $e$ is essentially always the same value, so in practice $e$ is chosen first and $d$ is then computed.)
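The five steps above can be sketched in Java with `java.math.BigInteger`. This is only an illustration under stated assumptions: the class name `RsaKeyGenSketch`, the method `generate`, and the retry loop are hypothetical, and `BigInteger.probablePrime` returns probable primes rather than proven ones.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

class RsaKeyGenSketch {
    // The fixed public exponent commonly used in practice.
    static final BigInteger E = BigInteger.valueOf(65537);

    /** Returns {n, e, d} for a modulus of roughly `bits` bits. */
    static BigInteger[] generate(int bits, SecureRandom rnd) {
        while (true) {
            // Steps 1 and 2: two large (probable) primes and their product.
            BigInteger p = BigInteger.probablePrime(bits / 2, rnd);
            BigInteger q = BigInteger.probablePrime(bits - bits / 2, rnd);
            if (p.equals(q)) {
                continue;
            }
            BigInteger n = p.multiply(q);
            BigInteger phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
            // Step 3: e must be coprime with phi(n); otherwise draw new primes.
            if (!E.gcd(phi).equals(BigInteger.ONE)) {
                continue;
            }
            // Step 4: d is the modular inverse of e modulo phi(n).
            BigInteger d = E.modInverse(phi);
            // Step 5: (e, n) is published, (d, n) stays private.
            return new BigInteger[] { n, E, d };
        }
    }

    public static void main(String[] args) {
        BigInteger[] key = generate(512, new SecureRandom());
        System.out.println("public  (e, n) = (" + key[1] + ", " + key[0] + ")");
        System.out.println("private d has " + key[2].bitLength() + " bits");
    }
}
```

Because $e$ is fixed here, the sketch rejects prime pairs whose $\phi(n)$ shares a factor with $e$ instead of searching for a different $e$.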
2. Encryption and decryption
Encryption and decryption are explained with Alice and Bob, the two celebrities of cryptography textbooks. Suppose Alice wants to send a message to Bob and has obtained Bob's public key $(e,n)$.
1) Alice first breaks the plaintext into blocks, each of which can be represented as an integer $m$ with $1 \leqslant m \leqslant n-1$ (a long bit sequence is split into short bit sequences, each of which is naturally an integer). For convenience, only the encryption of a single block is considered here.
2) Alice uses Bob's public key $(e,n)$ to compute the ciphertext $c$ as follows and sends it to Bob over the public network channel.
$c = m^{e}\;mod\;n$
3) Bob receives the ciphertext and uses his own private key $(d,n)$ to compute the plaintext $m$ as follows.
$m = c^{d}\;mod\;n$
That completes encryption and decryption; the process is simple and clear.
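The three steps can be sketched as follows. The key is the small one from the worked example later in this article, the class name `RsaBlockSketch` is hypothetical, and using one byte per block is an assumption made only to keep the block-splitting trivial; real systems use much larger blocks together with padding.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

class RsaBlockSketch {
    // Toy key from this article's worked example (far too small for real use).
    static final BigInteger N = BigInteger.valueOf(808057);
    static final BigInteger E = BigInteger.valueOf(65537);
    static final BigInteger D = BigInteger.valueOf(158233);

    // Alice: split the message into one-byte blocks and compute c = m^e mod n for each.
    static BigInteger[] encrypt(byte[] message) {
        BigInteger[] blocks = new BigInteger[message.length];
        for (int i = 0; i < message.length; i++) {
            BigInteger m = BigInteger.valueOf(message[i] & 0xFF); // block value is below n
            blocks[i] = m.modPow(E, N);
        }
        return blocks;
    }

    // Bob: compute m = c^d mod n for each block and reassemble the bytes.
    static byte[] decrypt(BigInteger[] blocks) {
        byte[] message = new byte[blocks.length];
        for (int i = 0; i < blocks.length; i++) {
            message[i] = (byte) blocks[i].modPow(D, N).intValue();
        }
        return message;
    }

    public static void main(String[] args) {
        byte[] plain = "hello RSA".getBytes(StandardCharsets.UTF_8);
        byte[] back = decrypt(encrypt(plain));
        System.out.println(new String(back, StandardCharsets.UTF_8));
    }
}
```

Note that equal plaintext blocks produce equal ciphertext blocks in this sketch, which is one reason unpadded block-by-block RSA is not used in practice.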
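To summarize key generation, encryption, and decryption: Bob generates $p$, $q$, $n$, $\phi(n)$, $e$, and $d$, publishes $(e,n)$, and keeps $(d,n)$; Alice computes $c = m^{e}\;mod\;n$; Bob computes $m = c^{d}\;mod\;n$.
The next question is whether Bob really recovers the plaintext $m$, and why. The answer is yes, and the principle behind it is exactly what the next section explains.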
IV. Mathematical principles
Proving that Bob recovers the plaintext $m$ requires some basic number theory, such as Euler's function, Fermat's little theorem, and Euler's theorem. This material is relatively easy and has already been summarized in the earlier post on the basics of number theory in modern cryptography.
To make clear exactly what needs to be shown, we may as well abstract the problem into the following statement:
Given:
$p,q$ are primes and $n=p \times q$
$e$ is coprime with $(p-1)\cdot(q-1)$, and $e,d$ satisfy $e\cdot d \equiv 1\;(mod\;\phi(n))$, where $\phi(n)=(p-1)\cdot(q-1)$
$c = m^{e}\;mod\;n$, $1 \leqslant m \leqslant n-1$
To prove:
$m = c^{d}\;mod\;n$
Proof:
From $e\cdot d \equiv 1\;(mod\;\phi(n))$ we get $e\cdot d=k\phi(n)+1$ ($k$ a positive integer).
Raising both sides of $c \equiv m^{e}\;(mod\;n)$ to the power $d$ gives:
$c^{d} \equiv m^{e\cdot d} \equiv m^{k\phi(n)+1}\;(mod\;n)$
So it remains to prove $m^{k\phi(n)+1} \equiv m\;(mod\;n)$.
(1) When $m$ and $n$ are coprime, Euler's theorem gives:
$m^{\phi(n)}\equiv m^{(p-1)(q-1)}\equiv 1\;(mod\;n)$
Raising this to the power $k$ and multiplying by $m$:
$(m^{\phi(n)})^{k}\cdot m\equiv m^{k\phi(n)+1}\equiv m\;(mod\;n)$, which proves this case.
(2) When $m$ and $n$ are not coprime: since the only factorization of $n$ is $n=p \times q$ and $m<n$, $\gcd(m,n)$ is either $p$ or $q$, so $m=hp$ or $m=hq$ for some positive integer $h$.
Assume $m=hq$. Then $h<p$ because $m<n$, so $m$ is necessarily coprime with $p$.
By Fermat's little theorem:
$m^{p-1}\equiv 1\;(mod\;p)$
Raising this to the power $k(q-1)$ and multiplying by $m$:
$(m^{p-1})^{k(q-1)}\cdot m\equiv m^{k\phi(n)+1}\equiv m\;(mod\;p)$
That is:
$(hq)^{k\phi(n)+1}= jp+hq$ ($j$ an integer)
The left side is an integer multiple of $q$ and so is $hq$, hence $jp$ must be an integer multiple of $q$. Because $p$ and $q$ are coprime, $j$ itself must be a multiple of $q$, that is, $j=t\cdot q$. Substituting:
$(hq)^{k\phi(n)+1}= tqp+hq=tn+hq$ ($t$ an integer)
This gives: $m^{k\phi(n)+1}\equiv m\;(mod\;n)$
The same conclusion follows in the same way when $m=hp$.
Combining (1) and (2), $m^{k\phi(n)+1}\equiv m\;(mod\;n)$ always holds, and since $1 \leqslant m \leqslant n-1$, $m=c^{d}\;mod\;n$ follows. This completes the proof.
Part (2) of the proof is a little harder to follow. The proof in the original paper is more terse but follows essentially the same idea, so this article spells out the steps as carefully as possible.
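As a sanity check of the result (not a substitute for the proof), the identity can be tested exhaustively for a tiny modulus, including every $m$ that is not coprime with $n$. The parameters $p=5$, $q=11$, $e=3$, $d=27$ and the class name `RsaCorrectnessCheck` are assumptions chosen for illustration; here $\phi(n)=40$ and $3\times 27=81=2\times 40+1$.

```java
class RsaCorrectnessCheck {
    // Square-and-multiply on long values; safe from overflow here because n is tiny.
    static long modPow(long base, long exp, long mod) {
        long result = 1 % mod;
        base %= mod;
        while (exp > 0) {
            if ((exp & 1) == 1) {
                result = result * base % mod;
            }
            base = base * base % mod;
            exp >>= 1;
        }
        return result;
    }

    /** True if (m^e)^d mod n == m for every m in [0, n-1], where n = p * q. */
    static boolean holdsForAll(long p, long q, long e, long d) {
        long n = p * q;
        for (long m = 0; m < n; m++) {
            if (modPow(modPow(m, e, n), d, n) != m) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Covers case (1) and case (2): the multiples of 5 and of 11 below 55 are included.
        System.out.println(holdsForAll(5, 11, 3, 27));
    }
}
```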
V. A concrete example
A more specific example is used to simulate the algorithm.
1) First Bob generates the public and private keys
The parameters chosen are $p=887$, $q=911$, $n=p \times q=808057$, $\phi(n)=(p-1)(q-1)=806260$, $e=65537$.
Solving $65537d-y\phi(n)=1$ with the extended Euclidean algorithm gives $65537\times 158233+(-12862)\times 806260=1$, so $d=158233$.
So Bob's public key is $(e,n)=(65537,808057)$ and his private key is $(d,n)=(158233,808057)$.
2) Alice encrypts the plaintext message $m=123456$ with Bob's public key to obtain the ciphertext $c=m^{e}\;mod\;n=123456^{65537}\;mod\;808057=147690$, and sends $c$ to Bob.
3) Bob receives the ciphertext and decrypts it with his private key: $m=c^{d}\;mod\;n=147690^{158233}\;mod\;808057=123456$. Bob's decryption succeeds.
In real-world applications $n$ is very large, often a 1024-bit binary number ($1024\times \log_{10}2 \approx 308.3$, that is, 309 decimal digits), or 2048 bits, or even 4096 bits.
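The numbers in this example can be reproduced with a few lines of Java. The sketch relies on the standard `BigInteger.modInverse` and `BigInteger.modPow` methods; the class name `RsaWorkedExample` and its helper methods are hypothetical.

```java
import java.math.BigInteger;

class RsaWorkedExample {
    static final BigInteger P = BigInteger.valueOf(887);
    static final BigInteger Q = BigInteger.valueOf(911);
    static final BigInteger E = BigInteger.valueOf(65537);
    static final BigInteger N = P.multiply(Q);
    static final BigInteger PHI = P.subtract(BigInteger.ONE).multiply(Q.subtract(BigInteger.ONE));
    // d is the inverse of e modulo phi(n).
    static final BigInteger D = E.modInverse(PHI);

    static BigInteger encrypt(BigInteger m) {
        return m.modPow(E, N); // c = m^e mod n
    }

    static BigInteger decrypt(BigInteger c) {
        return c.modPow(D, N); // m = c^d mod n
    }

    public static void main(String[] args) {
        BigInteger m = BigInteger.valueOf(123456);
        BigInteger c = encrypt(m);
        System.out.println("n = " + N + ", phi(n) = " + PHI + ", d = " + D);
        System.out.println("c = " + c + ", decrypted = " + decrypt(c));
    }
}
```

Running it should print the same $n$, $\phi(n)$, $d$, and $c$ as the example, and the decrypted value should equal the original $m$.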
VI. More details
The content above roughly explains the flow and principle of the algorithm, but many details are worth thinking about: how to compute modular powers quickly, how to select the two large primes $p$ and $q$ efficiently, how to compute the modular inverse $d$, and so on. These details matter a great deal when implementing the algorithm.
1. Fast modular exponentiation
Encryption and decryption are both essentially modular exponentiation, that is, computing $a^b\;mod\;n$. In practice every parameter is a huge integer; is there an efficient algorithm?
There is: the power can be computed efficiently by the "repeated squaring" algorithm, which is also described in Introduction to Algorithms. The algorithm scans the binary representation of $b$ from the highest bit to the lowest; in each round it squares the running remainder, and when the current bit is 1 it also multiplies the remainder by $a$, reducing modulo $n$ after every step.
A demo implementation of repeated squaring:
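    public static long doModularExponentiation(long a, long b, long n) {
        // Number of bits in the binary representation of b.
        int digit = Long.toBinaryString(b).length();
        // 1L keeps the shift in 64-bit arithmetic; a plain int 1 would wrap for exponents of 32 bits or more.
        long mask = 1L << (digit - 1);
        long remainder = 1;
        for (int i = digit - 1; i >= 0; i--) {
            // These products overflow long once n exceeds about 3 * 10^9; use BigInteger for larger moduli.
            remainder = (remainder * remainder) % n;
            if (mask == (b & mask)) { // the current bit is 1: one extra multiplication is needed
                remainder = (remainder * a) % n;
            }
            mask >>= 1;
        }
        return remainder;
    }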
The mask determines whether the bit examined in the current round (counting from the highest bit) of the binary representation of the exponent $b$ is 1; it is shifted right by one position after each round.
For example, for $147690^{158233}\;mod\;808057$, in the call doModularExponentiation(147690, 158233, 808057) the binary representation of the exponent $b$ and the mask of each round are as follows:
    100110101000011001  (exponent b)
    100000000000000000  (mask, round 1)
     10000000000000000
      1000000000000000
       100000000000000
        10000000000000
         1000000000000
          100000000000
           10000000000
            1000000000
             100000000
              10000000
               1000000
                100000
                 10000
                  1000
                   100
                    10
                     1  (mask, round 18)
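For moduli of realistic size the products no longer fit in a `long`, so the same left-to-right scan can be written with `BigInteger`. This is a sketch only: the class name `SquareAndMultiply` is hypothetical, and in real code the built-in `BigInteger.modPow` would normally be used instead.

```java
import java.math.BigInteger;

class SquareAndMultiply {
    /** Left-to-right square-and-multiply: returns a^b mod n for b >= 0 and n >= 1. */
    static BigInteger modPow(BigInteger a, BigInteger b, BigInteger n) {
        BigInteger remainder = BigInteger.ONE.mod(n);
        // Scan the exponent from its highest set bit down to bit 0.
        for (int i = b.bitLength() - 1; i >= 0; i--) {
            remainder = remainder.multiply(remainder).mod(n); // square in every round
            if (b.testBit(i)) {
                remainder = remainder.multiply(a).mod(n);     // extra multiply when the bit is 1
            }
        }
        return remainder;
    }

    public static void main(String[] args) {
        BigInteger a = BigInteger.valueOf(147690);
        BigInteger b = BigInteger.valueOf(158233);
        BigInteger n = BigInteger.valueOf(808057);
        // Compare the sketch with the library implementation.
        System.out.println(modPow(a, b, n) + " " + a.modPow(b, n));
    }
}
```

Here `testBit(i)` plays the role of the mask: it reports whether bit `i` of the exponent is 1.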
2. Selecting large primes and primality testing
The distribution of the primes is still not fully understood, but there are some interesting results. For example, the prime number theorem states:
$\lim_{n \to \infty}\frac{\pi(n)}{n/\ln(n)}=1$
where $\pi(n)$ is the prime-counting function, the number of primes less than or equal to $n$.
The larger $n$ is, the smaller the relative error of the estimate $\frac{n}{\ln(n)}$. The theorem implies that an integer chosen at random near $n$ is prime with probability about $\frac{1}{\ln(n)}$; in terms of density, roughly one in every $\ln(n)$ integers near $n$ is prime. Therefore, finding a prime of the same length as $n$ takes about $\ln(n)$ primality tests on integers near $n$. For example, to find a 2048-bit prime, it is enough to test about $\ln 2^{2048}\approx 1420$ randomly chosen 2048-bit integers.
Primality testing is harder than choosing candidates. The Fermat test is the basic idea, but it is flawed and gives false positives: for example, $7^{560}\equiv 1\;(mod\;561)$ although 561 is composite ($561=3\times 11\times 17$). Composites that pass the test for some base, such as 341 and 645 for base 2, are called Fermat pseudoprimes; composites such as 561 and 1105 that pass for every base coprime to them are called Carmichael numbers. The Miller-Rabin test is more accurate. In practice, testing large primes is probabilistic.
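The difference between the two tests can be seen on the numbers mentioned above. The sketch below runs each test for a single base; the class and method names are hypothetical, and a real implementation would repeat Miller-Rabin with several random bases (or simply call `BigInteger.isProbablePrime`).

```java
import java.math.BigInteger;

class PrimalitySketch {
    // Fermat test for one base a: passes if a^(n-1) is congruent to 1 modulo n.
    static boolean fermatPasses(long n, long a) {
        BigInteger bn = BigInteger.valueOf(n);
        return BigInteger.valueOf(a).modPow(bn.subtract(BigInteger.ONE), bn).equals(BigInteger.ONE);
    }

    // Miller-Rabin test for one base a, assuming n is odd and greater than 2.
    static boolean millerRabinPasses(long n, long a) {
        BigInteger bn = BigInteger.valueOf(n);
        BigInteger nMinus1 = bn.subtract(BigInteger.ONE);
        // Write n - 1 = 2^s * d with d odd.
        int s = nMinus1.getLowestSetBit();
        BigInteger d = nMinus1.shiftRight(s);
        BigInteger x = BigInteger.valueOf(a).modPow(d, bn);
        if (x.equals(BigInteger.ONE) || x.equals(nMinus1)) {
            return true;
        }
        // Square up to s - 1 times, looking for n - 1.
        for (int r = 1; r < s; r++) {
            x = x.multiply(x).mod(bn);
            if (x.equals(nMinus1)) {
                return true;
            }
        }
        return false; // a is a witness that n is composite
    }

    public static void main(String[] args) {
        System.out.println("Fermat, n=561, a=7:       " + fermatPasses(561, 7));
        System.out.println("Miller-Rabin, n=561, a=7: " + millerRabinPasses(561, 7));
        System.out.println("Miller-Rabin, n=887, a=7: " + millerRabinPasses(887, 7));
    }
}
```

The composite 561 passes the Fermat test for base 7 but is rejected by Miller-Rabin for the same base, while the prime 887 passes.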
3. Selecting $e$ and computing $d$
In many real-world public-key certificates, $e$ is typically chosen as $(010001)_{16}=(65537)_{10}$.
Computing $d$ from $e$ is not difficult: since $ed\equiv 1\;(mod\;\phi(n))$, $d$ is a solution of the linear equation $ed+(-y)\phi(n)=1=\gcd(e,\phi(n))$, which can be solved quickly with the extended Euclidean algorithm.
A demo implementation of the extended Euclidean algorithm:
    public static long[] gcdExt(long a, long b) {
        // result = {gcd(a, b), x, y} with a * x + b * y = gcd(a, b)
        long ans;
        long[] result = new long[3];
        if (b == 0) {
            result[0] = a;
            result[1] = 1;
            result[2] = 0;
            return result;
        }
        long[] temp = gcdExt(b, a % b);
        ans = temp[0];
        result[0] = ans;
        result[1] = temp[2];
        result[2] = temp[1] - (a / b) * temp[2];
        return result;
    }
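The coefficient returned for $e$ may be negative, so it has to be reduced into the range $[0,\phi(n))$ before it can serve as $d$. The sketch below shows one way to do that; the class name `ModInverseSketch` and the helper `modInverse` are hypothetical, and `gcdExt` is a compact restatement of the demo above.

```java
class ModInverseSketch {
    // Returns {g, x, y} with a * x + b * y = g = gcd(a, b).
    static long[] gcdExt(long a, long b) {
        if (b == 0) {
            return new long[] { a, 1, 0 };
        }
        long[] t = gcdExt(b, a % b);
        return new long[] { t[0], t[2], t[1] - (a / b) * t[2] };
    }

    // Modular inverse of e modulo phi, normalized into [0, phi).
    static long modInverse(long e, long phi) {
        long[] r = gcdExt(e, phi);
        if (r[0] != 1) {
            throw new ArithmeticException("e and phi are not coprime");
        }
        return ((r[1] % phi) + phi) % phi;
    }

    public static void main(String[] args) {
        // Values from the worked example: e = 65537, phi(n) = 806260.
        System.out.println(modInverse(65537, 806260));
    }
}
```

With the values from the worked example, the result should be $d=158233$.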
VII. Digital signature
A digital signature is the electronic counterpart of a handwritten signature on paper.
A digital signature must address two things. First, the recipient must be able to confirm the source of the message, that is, that it really was sent by the claimed person. Second, the recipient must be able to convince a third party that the message was signed by the signer and not forged by someone else, which prevents the sender from repudiating it. A digital signature must therefore depend both on the message (so that tampering can be detected) and on the signer. Its significance is that it binds a particular signer to a particular message.
Digital signatures rely on property (d): the decryption procedure is applied to the unencrypted plaintext message, which is in effect the encryption process used in reverse.
So the signature $s$ is computed by the following rule:
$s=D(m)$
Bob sends this digital signature to Alice, who can verify it with Bob's public key, and Alice can also have a third party verify it. Because the signature can only be computed with Bob's private key, neither Alice nor anyone else can forge it, so Bob cannot deny having signed.
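Signing and verifying can be sketched with the same modular exponentiation used for encryption, with the roles of $d$ and $e$ swapped. The key is the toy key from the worked example and the class name `RsaSignatureSketch` is hypothetical; signing the message integer directly is a simplification, since practical schemes sign a padded hash of the message.

```java
import java.math.BigInteger;

class RsaSignatureSketch {
    // Toy key from this article's worked example (far too small for real use).
    static final BigInteger N = BigInteger.valueOf(808057);
    static final BigInteger E = BigInteger.valueOf(65537);
    static final BigInteger D = BigInteger.valueOf(158233);

    // Bob: s = D(m) = m^d mod n, computed with the private key.
    static BigInteger sign(BigInteger m) {
        return m.modPow(D, N);
    }

    // Anyone: accept if E(s) = s^e mod n equals the message, using only the public key.
    static boolean verify(BigInteger m, BigInteger s) {
        return s.modPow(E, N).equals(m);
    }

    public static void main(String[] args) {
        BigInteger m = BigInteger.valueOf(123456);
        BigInteger s = sign(m);
        System.out.println("valid signature accepted:  " + verify(m, s));
        System.out.println("altered message accepted:  " + verify(m.add(BigInteger.ONE), s));
    }
}
```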
VIII. Attacks
Key generation exposes only the public key $(e,n)$, so an attacker can only try to derive the private key $(d,n)$ from these two values.
Breaking RSA can be attempted in the following ways:
(1) Factoring $n$
(2) Computing $\phi(n)$ without factoring $n$
(3) Brute-forcing $d$ without factoring $n$ and without computing $\phi(n)$
However, none of these problems is easy, and no very efficient algorithm is known for any of them.
In theory, as long as the key space is finite, a key can be found given enough computation and time; but if the key is long enough, breaking the algorithm within a useful time frame becomes impractical, so the security of RSA is really a question of time. As hardware gets faster, key lengths once considered safe become breakable and must be increased further. RSA keys are in any case generally longer than the keys used by symmetric and elliptic-curve encryption. Put another way, the security level of an encryption algorithm does not depend on key length alone; it also depends heavily on the characteristics and principles of the algorithm itself.
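To see why key length matters, attack (1) can be carried out in full against the toy key from the worked example. The sketch uses plain trial division, which is an assumption made for brevity (the class and method names are hypothetical); it finishes instantly for a 20-bit $n$ but is hopeless for moduli of 1024 bits or more.

```java
import java.math.BigInteger;

class FactorAttackSketch {
    // Trial division: smallest prime factor of n (returns n itself when n is prime).
    static long smallestFactor(long n) {
        if (n % 2 == 0) {
            return 2;
        }
        for (long f = 3; f * f <= n; f += 2) {
            if (n % f == 0) {
                return f;
            }
        }
        return n;
    }

    // Attack (1): factor n, rebuild phi(n), then invert e exactly as the key owner did.
    static long recoverPrivateExponent(long e, long n) {
        long p = smallestFactor(n);
        long q = n / p;
        long phi = (p - 1) * (q - 1);
        return BigInteger.valueOf(e).modInverse(BigInteger.valueOf(phi)).longValue();
    }

    public static void main(String[] args) {
        // Only the public key (e, n) = (65537, 808057) is used as input.
        System.out.println(recoverPrivateExponent(65537, 808057));
    }
}
```

The output should be the private exponent $d=158233$ from the worked example, recovered from public information alone.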
IX. Other
Public keys are open to man-in-the-middle attacks: the public key received may not be the other party's but the attacker's own, which threatens the confidentiality of the message. To prevent this, public keys are authenticated: a public-key certificate is issued by a trusted certification authority (CA), which vouches for the public key, and the CA's own trustworthiness rests on its reputation.
RSA public-key cryptography is much slower than symmetric encryption, so it is usually not used to encrypt messages directly; instead it is used for digital signatures and, combined with symmetric encryption, in hybrid cryptosystems.
A hybrid cryptosystem combines symmetric and public-key cryptography as follows:
(1) The sender generates a temporary key, called the session key, and encrypts the message with it using a symmetric encryption algorithm.
(2) The sender encrypts the session key with the recipient's public key using the public-key algorithm, and sends the encrypted message together with the encrypted session key to the recipient.
(3) The recipient separates out the encrypted session key, decrypts it with their own private key, and then decrypts the encrypted message with the session key.
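The three steps can be sketched with the standard Java cryptography API. The concrete algorithm choices (AES-GCM as the symmetric cipher, RSA with OAEP padding, a 2048-bit RSA key) and all names in `HybridSketch` are assumptions for illustration; the steps themselves do not depend on them.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class HybridSketch {
    // Assumed algorithm choices; the article only says "symmetric" and "public key".
    static final String RSA = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";
    static final String AES = "AES/GCM/NoPadding";

    /** Sender: returns {encrypted session key, IV, encrypted message}. */
    static byte[][] seal(byte[] message, PublicKey recipient) throws GeneralSecurityException {
        // (1) Fresh session key, used to encrypt the message symmetrically.
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey session = kg.generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.ENCRYPT_MODE, session, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal(message);
        // (2) Encrypt the session key with the recipient's public key.
        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.ENCRYPT_MODE, recipient);
        byte[] wrappedKey = rsa.doFinal(session.getEncoded());
        return new byte[][] { wrappedKey, iv, ciphertext };
    }

    /** Recipient: (3) recover the session key with the private key, then the message. */
    static byte[] open(byte[][] sealed, PrivateKey own) throws GeneralSecurityException {
        Cipher rsa = Cipher.getInstance(RSA);
        rsa.init(Cipher.DECRYPT_MODE, own);
        SecretKey session = new SecretKeySpec(rsa.doFinal(sealed[0]), "AES");
        Cipher aes = Cipher.getInstance(AES);
        aes.init(Cipher.DECRYPT_MODE, session, new GCMParameterSpec(128, sealed[1]));
        return aes.doFinal(sealed[2]);
    }

    /** Runs all three steps once with a fresh RSA key pair and returns the recovered text. */
    static String roundTrip(String text) {
        try {
            KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
            kpg.initialize(2048);
            KeyPair bob = kpg.generateKeyPair();
            byte[][] sealed = seal(text.getBytes(StandardCharsets.UTF_8), bob.getPublic());
            return new String(open(sealed, bob.getPrivate()), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("attack at dawn"));
    }
}
```

Only the short session key goes through the slow RSA operation; the message itself, however long, is handled by the fast symmetric cipher.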
X. References
1. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems
2. Cryptography and Network Security: Principles and Practice
3. Illustrated Cryptography
When reprinting, please cite the original source: https://www.cnblogs.com/qcblog/p/9011834.html