HTTPS practices for large Web sites (i)--HTTPS protocols and principles

Source: Internet
Author: User
Tags sha1 asymmetric encryption

Disclaimer: This series of articles (about four in total) is reposted from the cool network, with some of my own changes and comments added along the way.

Objective

Baidu recently launched full-site HTTPS secure search, and by default redirects HTTP requests to HTTPS. This article focuses on the HTTPS protocol and briefly describes the significance of deploying HTTPS across a whole site.

HTTPS Protocol Overview

HTTPS can be regarded as HTTP + TLS. The HTTP protocol is familiar to everyone; most web applications and websites today transmit data over HTTP.

TLS is a transport-layer encryption protocol. Its predecessor is the SSL protocol, first released by Netscape in 1995; in 1999, after discussion and standardization by the IETF, it was renamed TLS. Unless otherwise stated, SSL and TLS refer to the same protocol in this article.

[Figure: the position of HTTP and TLS in the protocol stack, and the composition of the TLS protocol]

The TLS protocol has five main parts: the application data protocol, the handshake protocol, the alert protocol, the change cipher spec protocol, and the heartbeat protocol.

TLS messages themselves are carried by the record protocol, whose format is shown on the far right of the figure.

The commonly used HTTP version is HTTP1.1, and the commonly used TLS versions are TLS1.2, TLS1.1, TLS1.0 and SSL3.0. SSL3.0 has been proved insecure by the POODLE attack, yet statistics show that just under 1% of browsers still use it. TLS1.0 also has some security vulnerabilities, such as the RC4 and BEAST attacks.

TLS1.2 and TLS1.1 have no known security vulnerabilities for now and are relatively secure. They also offer a large number of extensions that improve speed and performance, so they are the recommended versions.
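As a rough illustration of this recommendation, a client written with Python's standard ssl module can refuse anything older than TLS1.2. This is only a minimal sketch, not Baidu's configuration, and the variable name ctx is arbitrary.

```python
import ssl

# Default client-side context; certificate verification is already enabled.
ctx = ssl.create_default_context()

# Refuse SSL3.0, TLS1.0 and TLS1.1: only TLS1.2 or newer may be negotiated.
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
```

Any socket wrapped with this context will fail the handshake if the server can only speak an older protocol version.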

One thing worth watching is TLS1.3, which will be a very significant overhaul of the TLS protocol and should bring a qualitative improvement in both security and user access speed. However, there is no definite release date yet.

HTTP2 has also been formally finalized. It evolved from the SPDY protocol and is a very significant change compared with HTTP1.1; it can noticeably improve the efficiency of application-layer data transfer.

HTTPS Feature Introduction

Baidu uses the HTTPS protocol mainly to protect user privacy and prevent traffic hijacking.

HTTP is transmitted in plaintext, without any security processing. For example, when a user searches Baidu for a keyword such as "Apple phone", an intermediary can see this information in full and may even phone the user to harass them. Some users have also complained that, when using Baidu, a very long and large advertisement floats over the home page or the results page; this is certainly advertising content that an intermediary has injected into the page. If the hijacking technique is poor, users may not be able to access Baidu at all.

The intermediary mentioned here mainly refers to the network nodes that user data must pass through on its way between the browser and the Baidu server, such as WiFi hotspots, routers, firewalls, reverse proxies and cache servers.

Under HTTP, an intermediary can sniff the user's search content at will, steal private information and even tamper with web pages. HTTPS is the nemesis of this kind of hijacking and can defend against it effectively.

Overall, the HTTPS protocol provides three powerful features to combat the hijacking described above:

    1. Content encryption. Content between the browser and the Baidu server is transmitted in encrypted form, so an intermediary cannot directly view the original content.

    2. Identity authentication. This ensures that the user is really accessing Baidu's service. Even if DNS is hijacked and the user is sent to a third-party site, the browser will warn the user that the site is not Baidu and that the connection may have been hijacked.

    3. Data integrity. This prevents content from being impersonated or tampered with by a third party.
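On the client side, the identity check in point 2 corresponds to certificate and host-name verification. The following is a small sketch using Python's standard ssl module; it illustrates the idea and is not a description of how browsers implement it.

```python
import ssl

ctx = ssl.create_default_context()

# The server must present a certificate that chains to a trusted CA ...
cert_required = ctx.verify_mode == ssl.CERT_REQUIRED

# ... and that certificate must match the host name the client asked for.
hostname_checked = ctx.check_hostname
```

With both checks on, a site reached through hijacked DNS that cannot present a valid certificate for the requested name causes the handshake to fail instead of silently serving content.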

How does HTTPS achieve these three points? The principles are briefly introduced below.

HTTPS Principle Introduction

Content Encryption

Encryption algorithms are generally divided into two types: symmetric and asymmetric. Symmetric encryption (also known as secret-key encryption) means that encryption and decryption use the same key. Asymmetric encryption (also known as public-key encryption) means that encryption and decryption use different keys.

Symmetric and Asymmetric encryption

Symmetric content encryption is very strong and generally cannot be cracked. Its big problem is that keys cannot be generated and kept safely. If every session between the client software and the server used the same fixed key for encryption and decryption, there would be a serious security risk: anyone who obtained the symmetric key from the client side could read all of the content, and managing a huge number of client-side keys would also be complicated.

Asymmetric encryption is mainly used for key exchange (also called key negotiation), which solves this problem well. For each new session, the browser and the server use an asymmetric key exchange algorithm to negotiate symmetric keys, and then use those symmetric keys to encrypt, decrypt and verify application data. The session keys are generated and held only in memory, and each session uses different symmetric keys (unless the session is reused), so an intermediary cannot steal them.

Asymmetric key exchange is secure, but it is also the "culprit" behind the severe drop in HTTPS performance and speed. To understand why HTTPS affects speed and why it consumes resources, you must understand the whole asymmetric key exchange process.

The following focuses on the mathematical principles of asymmetric key exchange and how it is used in the TLS handshake.

Asymmetric key exchange

Before asymmetric key exchange algorithms appeared, a big problem with symmetric encryption was not knowing how to generate and store the key safely. Asymmetric key exchange exists mainly to solve this problem, making the generation and use of symmetric keys more secure.

Key exchange algorithms are themselves very complex; the key exchange process involves random number generation, modular exponentiation, padding, encryption, signing and other operations.

Common key exchange algorithms include RSA, ECDHE, DH and DHE. Their characteristics are as follows:

RSA: The algorithm is simple to implement, dates from 1977 and has a long history; it has withstood a long period of cracking attempts and offers high security. The disadvantage is that it needs relatively large primes (2048-bit keys are commonly used at present) to guarantee security strength, which consumes a lot of CPU. RSA is currently the only algorithm that can be used for both key exchange and certificate signing.
DH: The Diffie-Hellman key exchange algorithm. It appeared early (1977) but was only made public in 1999. The disadvantage is that it is relatively CPU intensive.
ECDHE: A DH algorithm based on elliptic curves (ECC). Its advantage is that it achieves the same security level as RSA with a smaller prime (256 bits). The disadvantage is that the algorithm is complex to implement, its history in key exchange is short, and it has not yet been tested by long-term security attacks.
ECDH: Does not support PFS, offers lower security, and cannot implement false start.
DHE: Does not support ECC. Very CPU intensive.

It is recommended to give priority to supporting the RSA and ECDHE_RSA key exchange algorithms. The reasons are:

1, ECDHE supports ECC acceleration and is faster to compute. It supports PFS and is more secure. It supports false start, so user access is faster.

2, at least 20% of clients do not support ECDHE. For them we recommend RSA rather than DH or DHE, because the DH family of algorithms is very CPU intensive (equivalent to doing two RSA calculations).

Note that what is usually called ECDHE key exchange means ECDHE_RSA by default: ECDHE generates the public and private keys required by the DH algorithm, RSA is then used to sign them, and finally the symmetric key is computed.
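The idea shared by the DH family can be sketched with classic finite-field Diffie-Hellman. This is a toy under stated assumptions: the parameters P = 23 and G = 5 are illustrative and far too small for real use, the helper name dh_keypair is our own, ECDHE performs the same kind of exchange over an elliptic curve instead, and the RSA signature over the server's public value is omitted.

```python
import secrets

# Public parameters, known to everyone (toy sizes, for illustration only).
P = 23  # prime modulus
G = 5   # generator

def dh_keypair():
    """Return a fresh (private, public) pair for one session."""
    private = secrets.randbelow(P - 3) + 2   # 2 <= private <= P - 2
    public = pow(G, private, P)
    return private, public

# Each side generates an ephemeral key pair and sends only the public half.
client_private, client_public = dh_keypair()
server_private, server_public = dh_keypair()

# Each side combines its own private key with the peer's public key.
client_shared = pow(server_public, client_private, P)
server_shared = pow(client_public, server_private, P)
```

Both sides end up with the same shared value although it never crosses the network, and because the key pairs are discarded after the session, recording the traffic and later obtaining the server's long-term key does not reveal it. That is the PFS property mentioned above.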

Asymmetric encryption is more secure than symmetric encryption, but it has two obvious drawbacks:

1, it consumes a great deal of CPU. In a full TLS handshake, the asymmetric decryption in the key exchange accounts for more than 90% of the computation of the whole handshake. The cost of symmetric encryption is only about 0.1% of that of asymmetric encryption; if application-layer data were also encrypted and decrypted asymmetrically, the performance overhead would be unbearable.

2, asymmetric encryption algorithms limit the length of the content to be encrypted, which cannot exceed the public key length. For example, the commonly used public key length today is 2048 bits, which means the content to be encrypted cannot exceed 256 bytes.

Therefore, public-key encryption can only be used for key exchange or content signing, and is not suitable for encrypting and decrypting the content transmitted at the application layer.
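The length limit in point 2 is simple arithmetic. The sketch below uses a helper name of our own, rsa_max_plaintext_bytes; the 11-byte overhead is the minimum that PKCS#1 v1.5 encryption padding adds, which makes the usable size slightly smaller than the raw key length.

```python
def rsa_max_plaintext_bytes(key_bits, padding_overhead=0):
    """Largest message, in bytes, that fits in a single RSA operation."""
    return key_bits // 8 - padding_overhead

raw_limit = rsa_max_plaintext_bytes(2048)          # limited by the modulus size
pkcs1_limit = rsa_max_plaintext_bytes(2048, 11)    # with PKCS#1 v1.5 padding
```

Either way, a short secret used for key exchange fits comfortably in one operation, while a web page does not.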

The asymmetric key exchange algorithm is the cornerstone of HTTPS security; fully understanding it is the key to understanding the HTTPS protocol and its features.

The following briefly introduces RSA and ECDHE and their application in the key exchange process.

RSA Key Negotiation

RSA Algorithm Introduction

The security of the RSA algorithm rests on the fact that multiplication is hard to reverse, in other words that large numbers are difficult to factor. The derivation and implementation of RSA involve Euler's function, Fermat's theorem and the concept of the modular inverse; interested readers can look these up on Baidu.

RSA is one of the most important algorithms that rule the world, and at present it is also the most important algorithm in the HTTPS system, bar none.

The RSA calculation steps are as follows:

    1. Randomly pick two prime numbers p and q; assume p = 13 and q = 19. Then n = p * q = 13 * 19 = 247.

    2. φ(n) denotes the number of positive integers not exceeding n that are coprime to n. If n is the product of two primes, then φ(n)=(p-1)(q-1). Pick a number e that satisfies 1< e <φ(n) and is coprime to φ(n); assume e = 17.

    3. Compute d, the modular inverse of e with respect to φ(n), that is ed≡1 mod φ(n). With e = 17 and φ(n) = 216 we get d = 89.

    4. With e and d found, assume the plaintext is m = 135 and denote the ciphertext by c. Encryption and decryption are then calculated as c = m^e mod(n) and m = c^d mod(n).
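The four steps can be reproduced in a few lines of Python using the same toy numbers. This is only a sketch of textbook RSA: real deployments use keys of about 2048 bits together with a padding scheme, and the three-argument pow with exponent -1 requires Python 3.8 or later.

```python
# Toy RSA with the numbers from the steps above.
p, q = 13, 19
n = p * q                      # 247
phi = (p - 1) * (q - 1)        # 216
e = 17
d = pow(e, -1, phi)            # modular inverse of e modulo phi

m = 135                        # plaintext
c = pow(m, e, n)               # encryption: c = m^e mod n
recovered = pow(c, d, n)       # decryption: m = c^d mod n
```

Here d evaluates to 89, in agreement with step 3, and decrypting the ciphertext returns the original 135.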

In practice, (n,e) forms the public key pair and (n,d) forms the private key pair, where n and d are large numbers close to 2^2048. Even with today's powerful CPUs, computing m≡c^d mod(n) still takes considerable computing resources and time.

The public key pair (n, e) is generally recorded in the certificate, where anyone can view it directly. For example, in the public key of Baidu's certificate, the last 6 hexadecimal digits (010001) convert to 65537 in decimal, which is e in the public key pair. Choosing a relatively small e has two advantages:

1, since c=m^e mod(n), the smaller e is, the fewer CPU resources the client consumes.

2, it makes cracking the server side harder. Because e is relatively small, d in the private key pair must be very large, so the space of possible values of d is very large, which increases the difficulty of cracking.
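The conversion of the certificate's last six hex digits, and the reason such an exponent is cheap for the client, can be checked directly. The variable names below are our own.

```python
exponent_hex = "010001"             # last six hex digits of the public key
e = int(exponent_hex, 16)           # the public exponent in decimal

# e = 2^16 + 1 has only two 1-bits, so square-and-multiply exponentiation
# needs 16 squarings and a single extra multiplication.
one_bits = bin(e).count("1")
```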

Why is it safe to publish (n,e) as the public key, even though anyone can read it directly from the certificate? The analysis is as follows:

Because ed≡1 mod φ(n), someone who knows e and n and wants to find the private key d must know φ(n). Since φ(n)=(p-1)*(q-1), p and q must be computed before d can be determined. But when n is large enough (for example close to 2^2048), even today's fastest CPUs cannot perform this factorization, that is, they cannot find which two numbers p and q were multiplied to give n. So even though the public key is known, the whole encryption and decryption process remains very secure.
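For the toy modulus used earlier, the attack described here is trivial, which is exactly why real keys are so large. The sketch below (function name our own) factors n by trial division and rebuilds d; it relies on n being a product of two primes and needs Python 3.8 or later for the modular inverse.

```python
def recover_private_exponent(n, e):
    """Factor n by trial division, then rebuild d from p and q."""
    p = next(k for k in range(2, n) if n % k == 0)   # smallest prime factor
    q = n // p
    phi = (p - 1) * (q - 1)
    return p, q, pow(e, -1, phi)

p, q, d = recover_private_exponent(247, 17)
```

With n = 247 the search ends after a dozen candidates. With n close to 2^2048 the same loop would have to test an astronomically large number of candidates, and no known factoring method makes the task practical.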

RSA Key Negotiation in the Handshake Process

How is the symmetric key needed for the session finally generated, and what does it have to do with RSA?

Taking TLS1.2 as an example, here is a brief description that omits the handshake messages unrelated to key exchange. The process is as follows:

1, the browser sends client_hello, which contains a random number random1.

2, the server replies with server_hello, which contains a random number random2, and also replies with certificate, which carries the certificate public key P.

3, after receiving random2, the browser can generate premaster_secret and master_secret. The premaster_secret is 48 bytes long: the first 2 bytes are the protocol version number and the remaining 46 bytes are filled with a random number. The structure is as follows:

struct { byte version[2]; byte random[46]; }

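The master secret generation algorithm is summarized as follows:

Master_key = PRF(premaster_secret, "master secret", random1 + random2)

where PRF is a pseudorandom function defined as follows:

PRF(secret, label, seed) = P_MD5(S1, label + seed) XOR P_SHA-1(S2, label + seed)

This MD5/SHA-1 form is the definition used by TLS1.0 and TLS1.1; TLS1.2 replaces it with a single P_hash, using SHA-256 for the standard cipher suites.

As can be seen from the above, with premaster_secret passed as secret, "master secret" passed as label, and the two random numbers from the browser and the server as the seed, a 48-byte random number can be determined.

The master secret yields six parts: the keys used to verify content consistency, the keys used for symmetric encryption and decryption of content, and the initialization vectors (for CBC mode), with one of each for the client and one for the server.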

At this point, the browser-side key has been negotiated.

4, the browser encrypts premaster_secret with the certificate public key P and sends it to the server.

5, the server decrypts it with its private key to obtain premaster_secret. Because the server has already received random1, it obtains the same master secret by applying the same generation algorithm to the same input parameters.
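The derivation in steps 3 to 5 can be sketched with the standard library. Note the assumptions: this sketch uses the single-hash PRF that RFC 5246 defines for TLS1.2 (P_SHA256) rather than the MD5/SHA-1 XOR form quoted in the text; the RSA encryption of premaster_secret in step 4 is omitted, so both sides simply share the value; and the sizes used to split the key block (20-byte MAC keys, 16-byte cipher keys, 16-byte IVs) are chosen only for illustration. The helper names p_sha256 and prf are our own.

```python
import hashlib
import hmac
import os

def p_sha256(secret, seed, length):
    """P_hash expansion from RFC 5246, section 5, using HMAC-SHA256."""
    out = b""
    a = seed                                               # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()   # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]

def prf(secret, label, seed, length):
    """TLS1.2 PRF: P_SHA256(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)

# Steps 1 and 2: the hello messages carry 32-byte random values.
random1 = os.urandom(32)   # from client_hello
random2 = os.urandom(32)   # from server_hello

# Step 3: 2 bytes of protocol version (3, 3 means TLS1.2) + 46 random bytes.
premaster_secret = bytes([3, 3]) + os.urandom(46)

# Browser side; after steps 4 and 5 the server repeats the same computation.
browser_master = prf(premaster_secret, b"master secret", random1 + random2, 48)
server_master = prf(premaster_secret, b"master secret", random1 + random2, 48)

# Expand the master secret into the six parts (sizes are assumptions).
MAC_LEN, KEY_LEN, IV_LEN = 20, 16, 16
key_block = prf(browser_master, b"key expansion", random2 + random1,
                2 * (MAC_LEN + KEY_LEN + IV_LEN))
names = ["client_mac", "server_mac", "client_key", "server_key",
         "client_iv", "server_iv"]
sizes = [MAC_LEN, MAC_LEN, KEY_LEN, KEY_LEN, IV_LEN, IV_LEN]
parts, offset = {}, 0
for name, size in zip(names, sizes):
    parts[name] = key_block[offset:offset + size]
    offset += size
```

Because both sides feed identical inputs into a deterministic function, they arrive at the same 48-byte master secret without it ever being transmitted.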

[Figure: the RSA key negotiation handshake process]

As you can see, key negotiation requires 2 RTTs, which is an important reason why HTTPS is slow. The key role RSA plays is to encrypt and decrypt premaster_secret. Since an intermediary cannot break the RSA algorithm, it cannot learn premaster_secret, which ensures the security of the key negotiation process.

[Figure: the ECDHE key negotiation process]

Symmetric Content Encryption

The asymmetric key exchange process ends by producing the symmetric key required for this session. Symmetric encryption comes in two modes: stream encryption and block encryption. The stream cipher in common use today is RC4, but RC4 is no longer secure, and Microsoft also recommends that websites avoid RC4 stream encryption wherever possible.
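To make the notion of a stream cipher concrete, here is RC4 in pure Python. It is shown only to illustrate the mechanism, a key-driven keystream XORed with the data so that the same call both encrypts and decrypts; as stated above, RC4 is no longer secure and should not be deployed. The function name rc4 and the sample key and message are our own.

```python
def rc4(key, data):
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    # Key-scheduling: permute the state array according to the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: XOR one keystream byte into each data byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
roundtrip = rc4(b"Key", ciphertext)
```

Applying the function twice with the same key returns the original bytes, which is what "symmetric" means in practice.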

A newer stream cipher that replaces RC4 is ChaCha20, a faster and more secure encryption algorithm pushed by Google. It has been adopted by Android and Chrome and has been built into BoringSSL, Google's open source fork of OpenSSL; Nginx 1.7.4 also supports building against BoringSSL.

The previously common mode for block encryption was AES-CBC, but CBC has been shown to be vulnerable to the BEAST and LUCKY13 attacks. The currently recommended block encryption mode is AES-GCM, but its drawback is that it is computationally heavy, costly in performance and power, and not well suited to mobile phones and tablets.

Identity verification

Identity authentication mainly involves PKI and digital certificates. A PKI (Public Key Infrastructure) typically contains the following parts:

End entity: the end entity, which can be a piece of terminal hardware or a website.
CA: the certificate issuing authority.
RA: the certificate registration and review authority, which for example reviews the authenticity of the applying website or company.
CRL issuer: responsible for publishing and maintaining the certificate revocation list.
Repository: responsible for storing and distributing digital certificates and CRLs.

Applying for a trusted digital certificate usually involves the following process:

1, the end entity generates a public/private key pair and a certificate request.

2, the RA verifies the legitimacy of the entity. This step is not required for individuals or small websites.

3, the CA issues a certificate and sends it to the applicant.

4, the certificate is published to the repository; the end entity subsequently updates the certificate from the repository, queries the certificate status, and so on.

The certificate currently used by Baidu is in x509v3 format and consists of the following three parts:

1, tbsCertificate (to-be-signed certificate, the certificate content to be signed). This part contains 10 elements: version number, serial number, signature algorithm identifier, issuer name, validity period, subject name, subject public key information, issuer unique identifier, subject unique identifier, and extensions.

2, signatureAlgorithm, the signature algorithm identifier, which specifies the algorithm used to sign the tbsCertificate.

3, signatureValue (the signature value), the signature computed over the tbsCertificate using signatureAlgorithm.

Digital certificates have two functions:

1, identity authorization. This ensures that the website the browser is accessing is a trusted site verified by a CA.

2, distributing the public key. Each digital certificate contains the public key generated by the registrant. During the SSL handshake it is sent to the client in the certificate message. The RSA certificate public key encryption and the ECDHE signature mentioned earlier both use this public key.

After the applicant obtains the certificate from the CA and deploys it on the website's servers, the browser receives the certificate during the handshake. How does it confirm that the certificate was really issued by the CA, and how is a third party prevented from forging it?

The answer is the digital signature. A digital signature is the anti-forgery label of a certificate. The most widely used scheme today is SHA-RSA, and its signatures are produced and verified as follows:

1, issuing the digital signature. First, a hash function is applied to the content to be signed to produce a message digest; the digest is then encrypted with the CA's own private key.

2, verifying the digital signature. The signature is decrypted with the CA's public key; the same hash function is then applied to the content of the certificate being checked, and the result is compared with the digest recovered from the signature. If they are the same, verification succeeds.
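The two steps can be sketched with textbook RSA over a SHA-256 digest. Everything below is an illustrative assumption rather than a real CA setup: the key is built from two Mersenne primes and is far too small, there is no signature padding, the digest is reduced modulo the toy modulus so that it fits, and the certificate bytes are a made-up placeholder. Python 3.8 or later is needed for the modular inverse.

```python
import hashlib

# Toy CA key (assumption for illustration; real keys are about 2048 bits).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # CA private exponent

def digest(content):
    """Message digest of the content, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N

def sign(content):
    """Issuing: the CA transforms the digest with its private key."""
    return pow(digest(content), D, N)

def verify(content, signature):
    """Verifying: recover the digest with the CA public key and compare."""
    return pow(signature, E, N) == digest(content)

certificate = b"subject=www.example.com, public key=..."
signature = sign(certificate)
```

A third party can check the signature with the public pair (N, E) alone, but cannot produce a valid signature for altered content without D.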

Data integrity

This part is easier to understand. It is similar to the familiar MD5 checksum, but with much higher security requirements. OpenSSL currently uses two kinds of integrity check algorithms: MD5 or SHA. Because MD5 has a relatively high chance of collisions in practice, try not to use MD5 to verify content consistency. Within the SHA family, SHA0 and SHA1 should not be used either; in 2005 Professor Wang Xiaoyun of Shandong University in China announced a break of the full SHA-1 algorithm.

Both Microsoft and Google have announced that they will stop supporting SHA1-signed certificates after 2016 and 2017.
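A keyed integrity check in this spirit can be sketched with HMAC over SHA-256 from the standard library. The key, the page content and the helper names make_tag and is_intact are hypothetical; in TLS the integrity keys come out of the handshake rather than being hard-coded.

```python
import hashlib
import hmac

def make_tag(key, message):
    """Compute an integrity tag over the message with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).digest()

def is_intact(key, message, tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"session integrity key"        # hypothetical shared key
page = b"<html>search results</html>"
tag = make_tag(key, page)

# An intermediary injecting an advertisement changes the content ...
tampered = page.replace(b"results", b"results<ad>")
```

The receiver accepts the original page and rejects the tampered one, because the intermediary cannot recompute a matching tag without the key.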

HTTPS usage Costs

The only problem with HTTPS at the moment is that it has not yet been deployed on a large scale and has received relatively little attention and research. As for usage costs and extra overhead, there is no need to worry too much.

Generally speaking, before adopting HTTPS people are most concerned about the following issues:

Certificate fees and renewal and maintenance. People tend to feel that applying for a certificate is troublesome and that certificates are expensive, but they are not: cheap ones cost a few dozen yuan a year, and at most a few hundred. There are now also free certificate authorities, such as the well-known free certificate project sponsored by Mozilla, Let's Encrypt (https://letsencrypt.org/), which supports free certificate installation and automatic renewal. The project is due to go into formal use in the middle of this year.
The cost of digital certificates is actually not high. Small and medium-sized websites can use cheap or even free digital certificate services (which may carry security risks), whereas certificates from the well-known VeriSign company generally cost several thousand to tens of thousands of yuan a year. Of course, if a company needs a large number of certificates with heavy customization, it can set up its own CA, as Google has done, and issue its own certificates for free.

HTTPS reduces user access speed. HTTPS does reduce speed somewhat, but as long as it is properly optimized and deployed, its impact on speed is entirely acceptable. In many scenarios HTTPS is just as fast as HTTP, and with SPDY, HTTPS can even be faster than HTTP.
Everyone is now using Baidu's HTTPS secure search; does it feel slow?

HTTPS consumes CPU resources and requires many more machines. As described earlier, asymmetric key exchange is a major consumer of CPU resources, and symmetric encryption and decryption also need CPU time.
Likewise, with reasonable optimization the machine cost of HTTPS will not increase noticeably. Small and medium-sized websites can meet their performance requirements without adding any machines.

Postscript

Many large foreign Internet companies have already enabled full-site HTTPS, which is the future trend of the Internet. Large domestic Internet companies have mostly not deployed full-site HTTPS; they enable HTTPS only on certain sub-pages or sub-requests involving accounts or transactions. Baidu Search's full-site HTTPS deployment, the first of its kind, will play a huge role in promoting full-site HTTPS across the domestic Internet.

At present there is relatively little Chinese-language material about HTTPS on the Internet. This article focuses on the important points of the HTTPS protocol and on the blind spots that are usually hard to understand, in the hope that it helps readers understand the protocol. Baidu's HTTPS performance optimization covers a great deal of ground: a lot of work has been done on front-end pages, back-end architecture, protocol features, encryption algorithms, traffic scheduling, architecture and operations, security and other areas. The articles in this series will describe these in turn.

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
