"Illustrated https" reading notes.
The HTTP protocol is exposed to security problems such as eavesdropping and identity spoofing; communicating over HTTPS effectively prevents them.
1. Disadvantages of HTTP
HTTP has the following main shortcomings:
Communication is in plaintext, so the content may be eavesdropped on;
The identity of the communicating party is not verified, so impersonation is possible;
The integrity of a message cannot be proven, so it may have been tampered with.
These problems occur not only in HTTP but also in other unencrypted protocols. HTTP itself has other drawbacks as well. In addition, real-world software such as particular Web servers and Web browsers has weaknesses (also called vulnerabilities or security holes), and Web applications developed in languages such as Java and PHP may contain security vulnerabilities too.
1.1 Plaintext communication may be eavesdropped on
HTTP itself has no encryption capability, so it cannot encrypt the communication as a whole (the content of the requests and responses exchanged over HTTP); in other words, HTTP messages are sent in plaintext. Why is unencrypted communication a disadvantage? Because of the way the TCP/IP protocol suite works, content can be observed on any communication line. The Internet is made up of networks that reach all over the world. Wherever a server and a client are located, the network equipment, optical cables, and computers along the line between them are not anyone's private property, so malicious snooping at some hop cannot be ruled out. Even encrypted communication can be observed, just like unencrypted communication. The difference is that encryption may make it impossible to work out the meaning of a message, while the encrypted message itself can still be seen.
Eavesdropping on communication on the same network segment is not difficult: it only requires collecting the packets (frames) flowing over the line. The collected packets can then be handed to a packet capture or sniffer tool for parsing. A widely used packet capture tool is Wireshark. It can capture and parse the requests and responses of the HTTP protocol, so everything from a request sent with the GET method to the returned 200 OK response and the entire content of the HTTP response message can be viewed.
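To make the risk concrete, the sketch below treats a byte string as if it were an HTTP request captured off the wire and parses it roughly the way a sniffer might. The host, path, and form fields are invented for the example, and `sniff` is a hypothetical helper, not part of any capture tool.

```python
from urllib.parse import parse_qs

# A hypothetical HTTP request as it would appear in a captured packet.
# Host, path, and credentials are made up for illustration.
captured = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: www.example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 29\r\n"
    b"\r\n"
    b"user=alice&password=s3cret123"
)


def sniff(raw: bytes) -> dict:
    """Parse a raw plaintext HTTP request the way an eavesdropper could."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    # Form fields are sent as key=value pairs, readable without any key.
    form = {k: v[0] for k, v in parse_qs(body.decode("ascii")).items()}
    return {"method": method, "path": path, "headers": headers, "form": form}


seen = sniff(captured)
print(seen["method"], seen["path"], seen["form"])
```

Nothing in the parsing step needs a secret: whoever can see the bytes can read the password.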
Among the countermeasures being studied to protect information from eavesdropping, the most widespread is encryption. There are several things that can be encrypted. One approach is to encrypt the communication itself. HTTP has no encryption mechanism, but it can be combined with SSL (Secure Sockets Layer) or TLS (Transport Layer Security) to encrypt the content of HTTP communication. Once a secure communication line has been established with SSL, HTTP communication can take place over that line. HTTP used in combination with SSL is called HTTPS (HTTP Secure) or HTTP over SSL.
Another approach is to encrypt the content being communicated. Because HTTP has no encryption mechanism, the content carried by HTTP is encrypted on its own; that is, the content contained in the HTTP message is encrypted. In this case, the client must encrypt the content of the HTTP message before sending the request.
Of course, effective content encryption requires that both the client and the server have encryption and decryption mechanisms. This approach is mainly used in Web services. Note that, unlike SSL or TLS, it does not encrypt the entire communication line, so the content is still at risk of being tampered with. This is explained later.
1.2 Not verifying the communicating party makes impersonation possible
HTTP requests and responses do not confirm who the communicating party is. That leaves open questions such as whether the server is really the host specified by the URI in the request, and whether the response really went back to the client that actually made the request. Because HTTP has no step for confirming the communicating party, anyone can send a request. And when a server receives a request, it returns a response no matter who the sender is (provided that the sender's IP address and port number are not subject to access restrictions set on the Web server).
HTTP is very simple to implement, and a response is returned regardless of who sent the request. Not confirming the communicating party therefore creates the following risks:
It is not possible to determine whether the Web server a request reaches is the server that was genuinely meant to return the response; it may be a spoofed Web server;
It is not possible to determine whether the client a response reaches is the client that was genuinely meant to receive it; it may be a spoofed client;
It is not possible to determine whether the other party has access rights, which matters because some Web servers store important information and should grant communication only to specific users;
It is not possible to determine where a request came from or who sent it;
Even meaningless requests are accepted without question, so DoS (Denial of Service) attacks based on massive numbers of requests cannot be blocked.
Although HTTP cannot identify the communicating party, SSL can. SSL provides not only encryption but also a means of identifying the other party, known as a certificate. Certificates are issued by trusted third-party organizations to prove that a server or client really exists. Forging a certificate is also technically very difficult. So as long as the certificate held by the communicating party (server or client) can be confirmed, that party's true identity can be established.
Using a certificate proves that the communicating party is the expected server, which also reduces the risk of the user's personal information leaking. In addition, a client that holds a certificate can confirm its own identity, so certificates can also be used in the authentication step of a Web site.
1.3 Message integrity cannot be proven, so messages may have been tampered with
Integrity refers to the accuracy of information. If integrity cannot be proven, there is usually no way to judge whether the information is accurate. Because HTTP cannot prove the integrity of the messages exchanged, there is no way to know whether a request or response was tampered with between the time it was sent and the time the other party received it. In other words, there is no way to confirm that the request or response that was sent is identical to the one that was received.
For example, when content is downloaded from a Web site, there is no way to determine whether the file the client downloaded matches the file stored on the server. The file may have been altered into something else in transit, and even if it was, the client on the receiving end would not notice. An attack like this, in which an attacker intercepts a request or response in transit and tampers with it, is called a man-in-the-middle attack (MITM).
HTTP does offer ways to check message integrity, but in practice they are neither convenient nor reliable. The common ones are hash values such as MD5 and SHA-1, and digital signatures that confirm a file's origin. Web sites that offer file downloads often also publish a digital signature created with PGP (Pretty Good Privacy) and a hash value generated with the MD5 algorithm. A PGP signature proves who created the file, and an MD5 value is a hash produced by a one-way function. Whichever method is used, the user operating the client has to check in person that the downloaded file is the file on the original server; the browser does not check automatically on the user's behalf. Unfortunately, even these methods cannot guarantee a correct result, because if the PGP signature or MD5 value itself has been rewritten, the user has no way of noticing. Effectively preventing these problems requires HTTPS. Ensuring integrity with HTTP alone is very difficult, so it is achieved by combining HTTP with other protocols. The next section describes HTTPS.
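The weakness of a published hash can be shown in a few lines. The sketch below uses SHA-256 from Python's `hashlib` (an assumption; the notes mention MD5 and SHA-1, and the logic is the same), with made-up file contents. It shows that the check catches a file altered in transit, but passes again if the attacker can also rewrite the published hash delivered over the same HTTP channel.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def verify_download(data: bytes, published_hash: str) -> bool:
    """Manual check a user would perform after downloading a file."""
    return sha256_hex(data) == published_hash


original = b"installer v1.0"            # file stored on the server (made up)
published = sha256_hex(original)         # hash shown on the download page

# Case 1: only the file is altered in transit, so the check fails.
tampered = b"installer v1.0 + malware"
print(verify_download(original, published))    # True
print(verify_download(tampered, published))    # False

# Case 2: the attacker also rewrites the hash shown on the page.
forged_hash = sha256_hex(tampered)
print(verify_download(tampered, forged_hash))  # True, tampering goes unnoticed
```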
2. HTTP + encryption + authentication + integrity protection = HTTPS
2.1 HTTP plus encryption, authentication, and integrity protection is HTTPS
If HTTP communication uses unencrypted plaintext, then, for example, a credit card number entered in a Web page is exposed if the communication line is tapped. In addition, with HTTP neither the server nor the client can confirm the communicating party, so there is a real chance of not actually communicating with the intended party. The possibility that a received message was tampered with in transit must also be considered. To solve these problems together, encryption, authentication, and integrity protection have to be added to HTTP. HTTP with these mechanisms added is called HTTPS (HTTP Secure).
HTTPS is commonly used on Web login pages and shopping checkout screens. With HTTPS, the URL begins with https:// instead of http://. In addition, when a browser visits a Web site over a valid HTTPS connection, a lock icon appears in the browser's address bar. How HTTPS is indicated varies from browser to browser.
2.2 HTTPS is HTTP wearing an SSL shell
HTTPS is not a new application-layer protocol. It is simply HTTP with its communication interface replaced by the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. Normally HTTP communicates directly with TCP. When SSL is used, HTTP first communicates with SSL, and SSL then communicates with TCP. In short, HTTPS is HTTP wrapped in a shell of the SSL protocol.
With SSL, HTTP gains the encryption, certificate, and integrity protection features of HTTPS. SSL is independent of HTTP, so other application-layer protocols, such as SMTP and Telnet, can also be used together with SSL. SSL can be said to be the most widely used network security technology in the world today.
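The layering described in this section can be sketched with Python's standard `socket` and `ssl` modules: a TCP connection is opened, wrapped in an SSL/TLS layer, and an ordinary HTTP request is written into the wrapped socket. The helper names `build_request` and `fetch_status_line` are hypothetical, and the host passed in is whatever the caller chooses; this is an illustration of the layering, not a full HTTP client.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """Build a plain HTTP/1.1 request; HTTPS does not change this part."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_status_line(host: str, port: int = 443) -> str:
    """Send the request over TCP -> SSL/TLS -> HTTP and return the status line."""
    context = ssl.create_default_context()
    # Layer 1: ordinary TCP connection.
    with socket.create_connection((host, port), timeout=10) as tcp:
        # Layer 2: SSL/TLS wraps the TCP socket (handshake happens here).
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            # Layer 3: the same HTTP bytes, now written into the TLS layer.
            tls.sendall(build_request(host))
            reply = b""
            while b"\r\n" not in reply:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                reply += chunk
    return reply.split(b"\r\n", 1)[0].decode("ascii", "replace")
```

Calling `fetch_status_line("www.example.com")` needs network access. Without the `wrap_socket` step and with port 80, the same request bytes would travel in plaintext.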
2.3 Public key encryption for exchanging keys
Before discussing SSL, let's look at encryption methods. SSL uses an encryption method called public-key cryptography. In modern encryption, the algorithm is public while the key is kept secret, and this is how security is maintained. Both encryption and decryption use a key: without the key the ciphertext cannot be decrypted, and conversely, anyone who holds the key can decrypt it. If the key falls into an attacker's hands, encryption becomes meaningless. Encryption in which the same key is used for both encryption and decryption is called shared key encryption (common key cryptosystem), also known as symmetric key encryption.
With shared key encryption, the key must also be sent to the other party. But how can it be transferred safely? If the key is sent over the Internet while the communication is being monitored, it can fall into an attacker's hands, and encryption loses its purpose. The receiver must also find a way to keep the received key safe.
Public key encryption solves this problem of shared key encryption. It uses a pair of asymmetric keys: one is called the private key and the other the public key. As the names imply, the private key must not be known to anyone else, while the public key can be published freely and obtained by anyone. With public key encryption, the sender encrypts the message with the recipient's public key, and the recipient decrypts the received ciphertext with their own private key. This way there is no need to send the key used for decryption, and no need to worry about it being intercepted and stolen by an attacker. Furthermore, recovering the original message from the ciphertext and the public key is extremely difficult, because it amounts to evaluating a discrete logarithm, which is not easy to do. To put it another way, if very large integers could be factored quickly, there would still be hope of breaking the encryption, but with current technology that is not realistic.
HTTPS uses a hybrid mechanism that combines shared key encryption and public key encryption. If keys could be exchanged securely, one might consider communicating with public key encryption alone. However, public key encryption is slower than shared key encryption. It therefore makes sense to combine the strengths of both: public key encryption is used in the key exchange phase, and shared key encryption is used in the subsequent phase in which messages are exchanged.
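The hybrid idea can be sketched with toy primitives from the standard library only. Everything here is an assumption made for illustration: the key pair is textbook RSA built from two small known primes, and the "shared key cipher" is a simple XOR with a SHA-256 keystream. Neither is secure or what SSL actually uses; the point is only the division of labor between the two kinds of encryption.

```python
import hashlib
import secrets

# Toy RSA key pair for the server. These primes are far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1           # two known Mersenne primes
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (needs Python 3.8+)


def keystream_xor(key: int, data: bytes) -> bytes:
    """Toy shared key cipher: XOR data with a SHA-256 based keystream.

    The same call both encrypts and decrypts.
    """
    key_bytes = key.to_bytes(32, "big")
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key_bytes + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# Client side: pick a random shared key and send it under the server's public key.
session_key = secrets.randbelow(N - 2) + 2
wrapped_key = pow(session_key, E, N)              # slow public key step, done once
ciphertext = keystream_xor(session_key, b"GET /account HTTP/1.1")

# Server side: recover the shared key with the private key, then decrypt.
recovered_key = pow(wrapped_key, D, N)
plaintext = keystream_xor(recovered_key, ciphertext)
print(plaintext)
```

Only the short session key goes through the slow public key operation; all message data uses the fast shared key cipher.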
2.4 Certificates that prove a public key is genuine
Unfortunately, public key encryption still has a problem: it cannot prove that a public key is itself genuine. For example, when preparing to communicate with a server using public key encryption, how can the client prove that the public key it received is the one that server originally issued? The real public key may have been replaced by an attacker while in transit. To solve this problem, public key certificates issued by a digital certificate authority (CA, Certificate Authority) and its affiliated organizations can be used. The certificate authority stands in the position of a third party that both the client and the server can trust. Its workflow is as follows. First, the operator of the server submits its public key to the certificate authority. After verifying the applicant's identity, the certificate authority digitally signs the submitted public key, then issues the signed public key, bound into a public key certificate. The server sends this public key certificate to the client so that the two can communicate with public key encryption. A public key certificate is also called a digital certificate or simply a certificate. The client that receives the certificate verifies the digital signature on it using the certificate authority's public key. Once verification succeeds, the client knows two things: first, that the authority that certified the server's public key is a genuine, valid certificate authority; second, that the server's public key is trustworthy. This requires that the certificate authority's public key be delivered to the client securely.
Delivering it securely over the network is very difficult, so most browser vendors embed the public keys of commonly used certificate authorities in the browser in advance when they release a version.
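The sign-then-verify flow can be sketched with a toy signature scheme. The assumptions are loud ones: the certificate authority's key pair is textbook RSA over two small known primes, the "certificate" is a plain dictionary, and the names `ca_sign` and `client_verify` are hypothetical. Real certificates use X.509 and much larger keys; only the logic of the check is the same.

```python
import hashlib

# Toy RSA key pair standing in for the certificate authority (not secure).
CA_P, CA_Q = 2**61 - 1, 2**89 - 1
CA_N = CA_P * CA_Q                                  # CA public modulus
CA_E = 65537                                        # CA public exponent
CA_D = pow(CA_E, -1, (CA_P - 1) * (CA_Q - 1))       # CA private exponent


def digest_as_int(subject: str, public_key: bytes) -> int:
    """Hash the certificate contents and reduce the hash into the RSA range."""
    h = hashlib.sha256(subject.encode("utf-8") + b"\x00" + public_key).digest()
    return int.from_bytes(h, "big") % CA_N


def ca_sign(subject: str, public_key: bytes) -> dict:
    """CA side: bind subject and public key together with a signature."""
    signature = pow(digest_as_int(subject, public_key), CA_D, CA_N)
    return {"subject": subject, "public_key": public_key, "signature": signature}


def client_verify(cert: dict) -> bool:
    """Client side: check the signature using only the CA's public key."""
    expected = digest_as_int(cert["subject"], cert["public_key"])
    return pow(cert["signature"], CA_E, CA_N) == expected


cert = ca_sign("www.example.com", b"server-public-key-bytes")
# An attacker swaps in their own key but cannot produce a matching signature.
forged = dict(cert, public_key=b"attacker-public-key-bytes")

print(client_verify(cert))    # True
print(client_verify(forged))  # False
```

The client needs only `CA_N` and `CA_E`, which is why shipping the certificate authority's public key inside the browser is enough to check any certificate that authority has signed.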
One function of a certificate is to prove that the server acting as the communicating party is legitimate; another is to confirm that the company operating the server really exists. A certificate with the latter feature is an EV SSL certificate (Extended Validation SSL Certificate). EV SSL certificates are issued according to internationally standardized validation guidelines. These guidelines strictly define how to confirm that the operating organization really exists, so a Web site certified this way earns a higher level of trust. In the browser of a visitor to a Web site holding an EV SSL certificate, the address bar has a green background, and the organization name recorded in the certificate, together with the name of the issuing certificate authority, is displayed at the left of the address bar. Many users, however, know little about EV SSL certificates and so pay no attention to them. Client certificates can also be used in HTTPS. Client authentication with a client certificate proves that the party the server is communicating with is the expected client, playing the same role as a server certificate. Client certificates still have several problems, though. One is obtaining and distributing the certificate. To use a client certificate, the user has to install it personally. Client certificates must be paid for, and each certificate corresponds to one user, so the cost grows with the number of users. On top of that, getting users with varying levels of technical knowledge to install certificates themselves is a challenge in its own right. In practice, therefore, client certificates issued by highly secure certificate authorities are used only for special-purpose services, such as those that can bear the cost of the certificates.
For example, online banking uses client certificates. At login, the bank not only checks the user's ID and password but also requires the user's client certificate, to confirm that the user is accessing online banking from a specific terminal. Another problem is that a client certificate can only prove that the client really exists; it cannot prove that the user is who they claim to be. That is, anyone who gains access to a computer with the client certificate installed also gains the ability to use that certificate. Involving certificate authorities in the SSL mechanism works only on the premise that their trustworthiness is absolute. However, in July 2011 a Dutch certificate authority called DigiNotar was broken into by attackers, and forged certificates were issued for Web sites such as google.com and twitter.com. The incident fundamentally shook the credibility of SSL. Because a forged certificate carries the digital signature of a legitimate certificate authority, the browser judges it to be valid. When a forged certificate is used to impersonate a server, users simply cannot detect it. Countermeasures exist, such as the certificate revocation list (CRL) mechanism that invalidates certificates and removing the root certificate authority (RCA) from the client, but they take time to come into effect, and there is no telling how many users suffer losses in the meantime. With the open source program OpenSSL, anyone can set up their own certificate authority and issue server certificates for themselves. But such a server certificate is not accepted as a certificate on the Internet and seems to be of little use.
An independently built certificate authority is called a self-signed certificate authority, and the "useless" certificates it issues are dubbed self-signed certificates. When a browser accesses such a server, it displays a warning such as "Unable to confirm connection security" or "There is a problem with this website's security certificate".
A server certificate issued by a self-signed certificate authority is of no use because it cannot rule out impersonation. The most a self-signed authority can do is have the server claim "I am so-and-so". Even with a self-signed certificate, the browser may occasionally indicate that the communication is secure once it is encrypted with SSL, but that is misleading, because even with encrypted communication one may still be talking to a spoofed server. Only when a trusted third-party organization takes part in certification can the certificate authority public keys embedded in the browser do their job and prove the server's authenticity. Most browsers ship with the certificates of trusted certificate authorities pre-installed, but a few browsers also embed certificates from intermediate certificate authorities. Server certificates issued by an intermediate certificate authority are treated as legitimate certificates by some browsers and as self-signed certificates by others.
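The difference between "encrypted" and "encrypted and authenticated" shows up directly in client configuration. As a rough illustration with Python's standard `ssl` module, the sketch below compares a default client context, which verifies certificates against the platform's trusted authorities, with a context that has verification switched off, which is roughly what clicking through a self-signed certificate warning amounts to. The variable names are arbitrary.

```python
import ssl

# Default client context: certificates must chain to a trusted authority,
# and the certificate must match the host name being contacted.
strict = ssl.create_default_context()
print(strict.verify_mode == ssl.CERT_REQUIRED)  # True
print(strict.check_hostname)                    # True

# Verification switched off: the connection is still encrypted, but the
# client no longer knows who is on the other end.
lax = ssl.create_default_context()
lax.check_hostname = False        # must be disabled before verify_mode
lax.verify_mode = ssl.CERT_NONE
print(lax.verify_mode == ssl.CERT_NONE)         # True
```

A client that genuinely needs to trust a privately issued certificate can keep verification on and add that authority explicitly with `load_verify_locations`, rather than disabling the checks.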
2.5 The secure communication mechanism of HTTPS
To better understand HTTPS, let's look at the communication steps for HTTPS.
Step 1: The client starts SSL communication by sending a Client Hello message. The message contains the SSL version the client supports and a list of cipher suites (the encryption algorithms used, key lengths, and so on).
Step 2: If the server can communicate over SSL, it answers with a Server Hello message. Like the client's message, it contains the SSL version and a cipher suite. The server's cipher suite is selected from the cipher suites it received from the client.
Step 3: The server then sends a Certificate message, which contains its public key certificate.
Step 4: Finally, the server sends a Server Hello Done message to tell the client that the initial phase of the SSL handshake negotiation is over.
Step 5: After this first SSL handshake exchange, the client responds with a Client Key Exchange message. The message contains a random value called the pre-master secret, which is used for encrypting the communication. The message is encrypted with the public key obtained in step 3.
Step 6: The client then sends a Change Cipher Spec message. It tells the server that communication after this message will be encrypted with keys derived from the pre-master secret.
Step 7: The client sends a Finished message. It contains a checksum over all the messages exchanged on the connection so far. Whether the handshake negotiation succeeds depends on whether the server can decrypt this message correctly.
Step 8: The server likewise sends a Change Cipher Spec message.
Step 9: The server likewise sends a Finished message.
Step 10: Once the server and client have exchanged Finished messages, the SSL connection is established, and communication from here on is protected by SSL. Application-layer communication begins at this point with the client sending an HTTP request.
Step 11: Application-layer communication continues with the server sending the HTTP response.
Step 12: Finally, the client closes the connection by sending a close_notify message. Some details are omitted here; after this step a TCP FIN message is sent to close the TCP connection.
Throughout the process above, the application layer appends a message digest called a MAC (Message Authentication Code) when sending data. The MAC makes it possible to detect whether a message has been tampered with, which protects message integrity. The diagram of the whole process illustrates how HTTPS communication is established when only a server-side public key certificate (server certificate) is used.
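How a MAC detects tampering can be sketched with Python's `hmac` module. HMAC-SHA256 is used here as a stand-in, and the key and message are placeholders; the actual MAC construction and key derivation inside SSL involve more inputs than this simplified version shows.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute a MAC over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"key agreed during the handshake"   # placeholder value
message = b"HTTP/1.1 200 OK\r\n\r\nbalance=100"
tag = make_tag(shared_key, message)

print(check_tag(shared_key, message, tag))                          # True
# An attacker who alters the message cannot produce a matching tag
# without knowing the shared key.
print(check_tag(shared_key, message.replace(b"100", b"999"), tag))  # False
```

Unlike a plain hash published next to a file, the tag depends on a key the attacker does not have, so rewriting both message and tag is not possible.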
HTTPS uses two protocols, SSL (Secure Sockets Layer) and TLS (Transport Layer Security). SSL was originally pioneered by the browser vendor Netscape Communications, which developed the versions up to SSL 3.0. Control has since passed to the IETF (Internet Engineering Task Force). Using SSL 3.0 as a baseline, the IETF went on to define TLS 1.0, TLS 1.1, and TLS 1.2. TLS is a protocol based on SSL, and the two are sometimes referred to collectively as SSL. The mainstream versions described in the book are SSL 3.0 and TLS 1.0. SSL 1.0 was found to have problems at the design stage and was never actually put into use. SSL 2.0 was also found to have problems, so many browsers have dropped that version outright. HTTPS has a problem of its own: using SSL makes processing slower.
SSL is slow in two ways: communication itself is slow, and processing is slow because resources such as CPU and memory are heavily consumed. Compared with HTTP, the network load may be 2 to 100 times slower. Besides the TCP connection and the HTTP request and response, the SSL communication must also take place, so the total amount of communication to be handled inevitably increases. The other point is that SSL requires encryption: both the server and the client have to perform encryption and decryption. This consumes more server and client hardware resources than HTTP and increases load. There is no fundamental solution to the slowness, but SSL accelerator hardware (a dedicated server) can alleviate it. This hardware is dedicated to SSL communication and can perform SSL computation several times faster than software. The SSL accelerator is used only for SSL processing, taking that load off the server. Since HTTPS is so safe and reliable, why don't all Web sites use it all the time? One reason is that encrypted communication consumes more CPU and memory than plaintext communication. If every communication were encrypted, a considerable amount of resources would be consumed, and the number of requests a single machine can handle would drop accordingly. Therefore, HTTP is used for non-sensitive information, and HTTPS is used only when sensitive data such as personal information is involved. For Web sites with heavy traffic in particular, the load of encrypting everything is not to be underestimated. So instead of encrypting all content, such sites encrypt only the information that needs to be hidden, to conserve resources. Another reason is saving the cost of purchasing a certificate. A certificate is essential for HTTPS communication.
The certificate must be purchased from a certificate authority, and the price may vary somewhat between authorities. Services for which buying a certificate is not cost-effective, as well as some personal sites, may choose to communicate over HTTP only.