Relevant background knowledge
To understand how the HTTPS protocol is implemented, you need at least the following background knowledge:
A rough understanding of what a few basic terms mean (HTTPS, SSL, TLS)
A rough understanding of the relationship between HTTP and TCP (especially "short connections" and "long connections")
A rough understanding of the concepts behind encryption algorithms (especially "symmetric encryption" and "asymmetric encryption")
A rough understanding of what CA certificates are for
Since many technical beginners may not know this background, I will describe it in the simplest terms I can. If you are not a beginner, please skip this part and go straight to the section on the requirements of HTTPS.
1. What is HTTP?
First, HTTP is a network protocol designed specifically for transferring web content. Even if you don't know the details, you have at least heard of it.
For example, in http://www.baidu.com the leading "http" names the protocol. Most websites use the HTTP protocol to transfer web pages, as well as the various things on those pages.
2. What are SSL and TLS?
SSL is short for "Secure Sockets Layer". It was designed by Netscape. By the way, Netscape not only invented SSL, it also contributed a good deal of the Web's early infrastructure (JS scripts, for example).
Why was the SSL protocol invented? Because the original HTTP protocol sends everything in plaintext, which has many shortcomings: the content in transit can be eavesdropped on (sniffed) and tampered with. SSL was invented to solve these problems.
By 1999, SSL had become the de facto standard on the Internet because it was so widely used, and that year the IETF standardized it. After standardization the name was changed to TLS, short for "Transport Layer Security".
Many related articles refer to the two together as "SSL/TLS", because they can be seen as different stages of the same thing.
3. What does HTTPS mean?
Having explained HTTP and SSL/TLS, we can now explain HTTPS. What we usually call the HTTPS protocol is simply the combination of the HTTP protocol and the SSL/TLS protocol.
You can roughly read HTTPS as "HTTP over SSL" or "HTTP over TLS" (for our purposes SSL and TLS are the same thing).
Some features of HTTP
As background, you also need to know a little about the characteristics of the HTTP protocol itself. HTTP has many features; given the limited space, I will only cover the ones that are related to HTTPS.
1. Versions and history of HTTP
HTTP today includes version 2.0. Before that there were three versions: 0.9, 1.0, and 1.1. HTTP 0.9 was never widely used, while HTTP 1.0 and 1.1 were.
2. The relationship between HTTP and TCP
Simply put, TCP is the foundation underneath HTTP: the HTTP protocol relies on the TCP protocol to transmit its data.
In the layered model of networking, TCP is called a "transport-layer protocol" and HTTP is called an "application-layer protocol".
Many common application-layer protocols are built on TCP, such as FTP, SMTP, POP, and IMAP.
TCP is called a "connection-oriented" transport-layer protocol. I will not go into the details here. You only need to know that the transport layer has two main protocols, TCP and UDP, and that TCP is more reliable than UDP. You can think of a TCP connection as a pipe: the sender pours data in at one end, and the data sent first arrives first (UDP does not guarantee this).
In short, TCP provides a reliable connection and does not lose packets, while UDP gives no such guarantee.
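The "pipe" picture of TCP can be tried out on a single machine. The following is a minimal sketch, assuming a loopback connection on 127.0.0.1; the function name `send_chunks_over_tcp` is made up for this illustration. It sends several chunks over one TCP connection and collects what comes out the other end.

```python
import socket
import threading


def send_chunks_over_tcp(chunks):
    """Send byte chunks over a loopback TCP connection and return what arrives."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = bytearray()

    def receive_all():
        conn, _ = server.accept()
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:  # empty read means the sender closed the pipe
                    break
                received.extend(data)

    receiver = threading.Thread(target=receive_all)
    receiver.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        for chunk in chunks:
            client.sendall(chunk)
    receiver.join()
    server.close()
    return bytes(received)
```

Note that TCP is a byte stream: the receiver sees the bytes in the order they were sent, but not necessarily split into the same chunks.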
3. How does HTTP use TCP connections?
HTTP uses TCP connections in one of two ways: "short connections" and "long connections" (persistent connections, known in English as Keep-Alive).
Suppose a web page contains many images, CSS files, and JS files.
With short connections, the browser first opens a TCP connection to fetch the page's HTML source (and the connection is closed once the HTML has been fetched). The browser then parses the HTML and finds that the page references many external resources: images, CSS, JS. For each external resource it opens a separate TCP connection and downloads the file (again, each connection is closed as soon as its resource has been fetched).
With long connections, by contrast, the browser also opens a TCP connection to fetch the page, but after the page is fetched the connection is not closed immediately. It is kept open for a while. When the browser parses the HTML and finds many external resources, it reuses that same TCP connection to fetch them.
In HTTP 1.0, short connections are the default. Those were the early days of the Web, pages were fairly simple, and short connections were not a big problem.
By the end of 1995, when drafting of HTTP 1.1 began, web pages had started to become more complex (more and more scripts and styles), and short connections had become too inefficient, because setting up a TCP connection costs both time and CPU. So in HTTP 1.1, Keep-Alive is the default.
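The difference between the two modes can be counted directly. Below is a minimal sketch, assuming a throwaway local server built with Python's standard `http.server` module; `CountingHandler` and `count_connections` are names invented for this illustration. The server counts how many TCP connections it accepts while a client fetches the same list of resources in each mode.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 keeps connections alive by default
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        super().setup()
        with CountingHandler.lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"resource " + self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def count_connections(paths, keep_alive):
    """Fetch every path and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        if keep_alive:
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for path in paths:  # reuse one connection for every resource
                conn.request("GET", path)
                conn.getresponse().read()
            conn.close()
        else:
            for path in paths:  # one fresh connection per resource
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", path, headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

For a page with an HTML file, a stylesheet, and a script, the short-connection loop opens three connections and the Keep-Alive loop opens one.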
For more about Keep-Alive, you can search online (Baidu, for example).
1. What are encryption and decryption? In layman's terms, you can think of "encryption" and "decryption" as a pair of mathematical operations that are inverses of each other, just like addition and subtraction, or multiplication and division. "Encryption" turns plaintext into ciphertext; conversely, "decryption" turns ciphertext back into plaintext.
Both processes need one crucial thing: the key.
2. What is "symmetric encryption"?
So-called "symmetric encryption" means that encryption and decryption use the same key. This one is easy to understand.
It is like using 7zip or WinRAR to create a password-protected archive. When you want to unpack the archive later, you have to enter the same password. In this example, the password plays the role of the key just mentioned.
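To make "the same key in both directions" concrete, here is a toy sketch. It is only an illustration and must not be used for real security: it builds a keystream from SHA-256 and XORs it with the data, which is an assumption of this example and not the algorithm that 7zip, WinRAR, or SSL actually use. The names `keystream` and `xor_cipher` are invented here.

```python
import hashlib


def keystream(key, length):
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_cipher(key, data):
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))
```

Calling `xor_cipher` with the same key a second time undoes the first call, which is exactly the symmetric property: whoever can encrypt can also decrypt.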
3. What is "asymmetric encryption"?
So-called "asymmetric encryption" means that encryption and decryption use different keys. This is harder to understand and harder to picture. When it was invented, asymmetric encryption was hailed as a revolution in the history of cryptography.
Given the limited space, I will not go deeper into asymmetric encryption here.
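For readers who want to see "different keys" in action, here is a toy sketch based on textbook RSA with tiny primes. RSA is my choice for the illustration, since the text above does not name an algorithm, and numbers this small offer no security at all. The names `make_toy_keypair` and `apply_key` are invented, and the code needs Python 3.8 or later for `pow(e, -1, phi)`.

```python
def make_toy_keypair(p=61, q=53, e=17):
    """Return (public_key, private_key) for textbook RSA with tiny primes."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e: the private exponent
    return (e, n), (d, n)


def apply_key(key, number):
    """Raise `number` to the key's exponent modulo n (encrypts or decrypts)."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


public_key, private_key = make_toy_keypair()
ciphertext = apply_key(public_key, 65)          # encrypt with the public key
recovered = apply_key(private_key, ciphertext)  # decrypt with the private key
```

The public key can be handed to anyone, because only the holder of the private key can turn the ciphertext back into 65.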
4. What are their strengths and weaknesses?
From the definitions above, it is clear that asymmetric encryption can do more than symmetric encryption. That is its advantage. However, implementations of asymmetric encryption usually rest on complex mathematical problems, so its performance is usually much worse than that of symmetric encryption.
The pros and cons of the two also influenced the design of the SSL protocol.
What are the requirements of HTTPS?
Now we come to the main point. First, let's talk about which requirements HTTPS was originally designed to meet.
Many articles that introduce HTTPS jump straight into implementation details. Personally, I think that is not a good approach.
HTTP existed first and HTTPS came afterwards, so the designers of HTTPS had to consider compatibility with the original HTTP.
Compatibility here covers many aspects. For example, existing web applications should be able to migrate to HTTPS as seamlessly as possible, and browser vendors should need to change as little as possible. From these compatibility considerations it is easy to draw the following conclusions:
1. HTTPS still has to be transmitted over TCP
(If the transport layer were changed to UDP, both web servers and browser clients would have to change far too much.)
2. A separate new protocol wraps the HTTP protocol
So-called "HTTP over SSL" really just adds a layer of SSL encapsulation on top of the original HTTP. The original mechanisms of HTTP, such as GET and POST, stay basically untouched.
To make an analogy: if the original HTTP is a plastic water pipe, it is now wrapped in a layer of metal pipe. First, the original plastic pipe works as usual; second, with the metal reinforcement it is not easy to break.
As I said earlier, HTTPS is equivalent to "HTTP over SSL".
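The wrapping can be seen in code. In the sketch below, which uses Python's standard `socket` and `ssl` modules, the HTTP request bytes are built exactly the same way in both cases, and the only difference is whether the TCP socket is wrapped in TLS before the request is written to it. The helper names are invented for this illustration, and the default ports 80 and 443 are the usual conventions.

```python
import socket
import ssl


def build_get_request(host, path="/"):
    """Build a plain HTTP/1.1 GET request; identical for HTTP and HTTPS."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def open_channel(host, use_tls):
    """Open a TCP connection and, for HTTPS, wrap it in a TLS layer."""
    port = 443 if use_tls else 80
    sock = socket.create_connection((host, port), timeout=10)
    if use_tls:
        context = ssl.create_default_context()
        # The "metal pipe": same TCP socket, now with TLS around it.
        sock = context.wrap_socket(sock, server_hostname=host)
    return sock


def fetch_status_line(host, use_tls):
    """Send the same GET request over either channel and return the status line."""
    with open_channel(host, use_tls) as sock:
        sock.sendall(build_get_request(host))
        status = sock.makefile("rb").readline()
    return status.decode("iso-8859-1").strip()
```

With network access, `fetch_status_line("example.com", use_tls=True)` and the same call with `use_tls=False` both send the same GET request; the host name here is only a placeholder.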
If the SSL protocol is designed well enough in terms of extensibility, it can be paired with other application-layer protocols besides HTTP. Wouldn't that be ideal?
As it turns out, the original designers of SSL really did do an excellent job: today SSL/TLS can be combined with many common application-layer protocols to make them more secure.
To continue the analogy: if SSL/TLS is the reinforcing metal pipe, it can be used to reinforce not only water pipes but also gas pipes.
Confidentiality
HTTPS needs to provide good enough confidentiality.
Speaking of confidentiality, the first thing to defend against is the sniffer. A sniffer, put simply, monitors your network traffic. If you browse over plaintext HTTP, whoever is sniffing can see which pages you are visiting.
Sniffing is the lowest-level attack technique. Beyond sniffing, HTTPS also has to resist somewhat more advanced attacks, such as "replay attacks".
Integrity: besides confidentiality, an equally important goal is to ensure integrity.
Before HTTPS was invented, HTTP traffic was plaintext, so it was not only easy to sniff but also easy to tamper with.
Here is an example:
Some network operators (ISPs) behave like rogues. Netizens often complain that a website which carries no ads suddenly pops up lots of China Telecom ads when they visit it. Why does this happen? Because your network traffic has to pass through your ISP's lines to reach the public Internet. If you use plaintext HTTP, it is easy for the ISP to insert ads into the pages you visit.
Therefore, one of the design requirements of HTTPS was to ensure that the content carried by the HTTP protocol is not tampered with.
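The tampering scenario and its detection can be sketched as follows. This is only an illustration of the requirement: it assumes the two ends already share a secret key, and it uses an HMAC tag from Python's standard library, which is one common integrity technique and not necessarily how any particular SSL/TLS version does it. The page content, key, and function names are made up.

```python
import hashlib
import hmac


def make_tag(key, body):
    """Compute an integrity tag over the body using the shared key."""
    return hmac.new(key, body, hashlib.sha256).digest()


def is_untampered(key, body, tag):
    """Return True only if the body still matches the tag sent with it."""
    return hmac.compare_digest(make_tag(key, body), tag)


def inject_ad(body):
    """What a rogue middlebox might do to a plaintext page."""
    return body.replace(b"</body>", b"<div>AD</div></body>")


shared_key = b"hypothetical shared key"
page = b"<html><body>Hello</body></html>"
tag = make_tag(shared_key, page)
tampered_page = inject_ad(page)
```

The receiver accepts `page` with `tag` but rejects `tampered_page`, and a middlebox without the key cannot forge a matching tag for its modified page.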
When people discuss the requirements of HTTPS, "authenticity" is often overlooked. In fact, authenticity is no less important than the "confidentiality" and "integrity" discussed above.
Here is an example:
Suppose you use online banking and therefore need to visit your bank's website. How do you make sure that the site you are visiting really is the site you meant to visit?
Some readers will say: by looking at the domain name. But the DNS system itself is unreliable (especially in the era when SSL was designed), so the domain name you see for a site may not be genuine.
Therefore, the HTTPS protocol must have some mechanism that meets the "authenticity" requirement.
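As a small illustration of how this requirement surfaces in practice, a TLS client built with Python's standard `ssl` module checks the server's certificate and host name by default. The sketch below only inspects those default settings; it does not explain the certificate mechanism itself, and the function name is invented.

```python
import ssl


def describe_default_checks():
    """Report which authenticity checks a default client context enables."""
    context = ssl.create_default_context()
    return {
        # The server must present a certificate that chains to a trusted CA.
        "certificate_required": context.verify_mode == ssl.CERT_REQUIRED,
        # The certificate must also match the host name the client asked for.
        "hostname_checked": context.check_hostname,
    }
```

Both checks are on by default, so a site that merely answers under the right domain name, without a valid certificate for it, is rejected.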
Performance
One more important requirement is performance.
After HTTPS is introduced, performance must not become too poor; otherwise, who would still use it?
To ensure performance, the designers of SSL had to consider at least the following points:
How to choose the encryption algorithm
How to cope with HTTP's "short connection" way of using TCP
SSL was designed before 1995, when HTTP was still at version 1.0 and used short TCP connections by default (Keep-Alive was not the default).
Summary: the above are the requirements that had to be taken into account when the SSL protocol was designed.
Main difficulties in designing the HTTPS protocol
There are several difficulties in designing the HTTPS protocol.
I personally think the biggest difficulty is "key exchange".
In the traditional cryptography scenario, suppose Zhang San wants to set up an encrypted communication channel with Li Si. The two sides must agree beforehand on which encryption algorithm to use, and they must also agree on which key to use. In this scenario it does not matter much if other people learn the encryption algorithm, but the key must never be revealed. Once someone else knows the key, they can decrypt the ciphertext and recover the plaintext.
When you visit a public website and your browser wants to set up encrypted communication with the site's server, the two sides likewise have to negotiate which algorithm and which key to use. In networking terminology, this process is called the "handshake".
During the handshake the encryption method has not yet been agreed, so the handshake itself has to be in plaintext. Since it is in plaintext, a third party may be able to eavesdrop on it. And remember that the two sides are separated by the entire Internet, so anything can happen along the way.
Therefore, how to exchange key information securely during the handshake, without letting bystanders see it, is the hardest part of designing HTTPS. Keep this in mind and you will understand why the protocol was designed the way it is.
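The problem can be shown in a few lines. The sketch below is a toy model, not real SSL: it assumes a naive handshake that sends the secret key in plaintext, uses a simple repeating-key XOR as a stand-in cipher, and represents "the wire" as a list that an eavesdropper can read. All names and sample values are invented.

```python
from itertools import cycle


def xor_bytes(key, data):
    """Toy symmetric cipher: XOR the data with the key, repeated as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def naive_session(secret_key, message):
    """Return everything that crosses the wire in a naive session."""
    wire = []
    wire.append(("handshake", secret_key))  # the key travels in plaintext
    wire.append(("data", xor_bytes(secret_key, message)))  # then encrypted data
    return wire


def eavesdrop(wire):
    """A third party who saw the wire recovers the message with no effort."""
    captured = dict(wire)
    return xor_bytes(captured["handshake"], captured["data"])
```

Because the key itself crossed the wire in the clear, encrypting the data afterwards protects nothing: `eavesdrop(naive_session(b"k3y", b"PIN=1234"))` hands back the original message.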
On the background and foundation of HTTPS and SSL/TLS protocol