To clarify the implementation principle of the HTTPS protocol, at least the following background knowledge is required.
1. A general understanding of several basic terms (HTTPS, SSL, TLS)
2. A general understanding of the relationship between HTTP and TCP (especially "short connections" vs. "long connections")
3. A general understanding of cryptographic algorithms (especially the difference between symmetric and asymmetric encryption)
4. A general understanding of the purpose of CA certificates
Since many newcomers may not have this background, I will first cover it as briefly as I can. If you are not a beginner, skip this part and go straight to "What are the requirements of the HTTPS protocol?".
First, clarify a few terms: HTTPS, SSL, TLS
1. What is "HTTP" for?
First, HTTP is a network protocol designed specifically for transferring Web content. Even if you know nothing about this protocol, you have at least heard of it. For example, when you visit my blog's homepage, the following URL appears in the browser's address bar
The part I put in bold is the HTTP protocol. Most websites use HTTP to transfer web pages, along with the various things contained in them (images, CSS styles, JS scripts).
2. What is "SSL/TLS" for?
SSL is short for "Secure Sockets Layer". It was designed by Netscape in the mid-1990s. (Incidentally, Netscape invented not only SSL but also other pieces of Web infrastructure, such as JS scripts.)
Why was SSL invented? Because the HTTP protocol originally used on the Internet is plaintext, it has a number of drawbacks: for example, the transmitted content can be eavesdropped on (sniffed) and tampered with. SSL was invented to solve these problems.
By 1999, SSL had become a de facto standard on the Internet because of its wide use. The IETF standardized SSL that year. After standardization, the name was changed to TLS (short for "Transport Layer Security").
Many articles mention these two terms together (SSL/TLS) because they can be regarded as different stages of the same thing.
3. What does "HTTPS" mean?
Now that HTTP and SSL/TLS have been explained, HTTPS can be explained too. What we usually call the HTTPS protocol is, plainly put, the combination of the "HTTP protocol" and the "SSL/TLS protocol". You can roughly think of HTTPS as "HTTP over SSL" or "HTTP over TLS" (SSL and TLS being much the same thing for this purpose).
Next, the characteristics of the HTTP protocol
As background, we need to talk a little about the characteristics of HTTP itself. HTTP has many characteristics; since space is limited, I will only cover those related to HTTPS.
1. Version and History of HTTP
The HTTP protocol we use today is version 1.1 (that is, HTTP 1.1). Drafting of version 1.1 began around the end of 1995; it was first published as RFC 2068 in January 1997, and the revised specification was formally released in 1999 (RFC 2616).
Before 1.1 there were two versions, 0.9 and 1.0. HTTP 0.9 was "not" widely used, while HTTP 1.0 was.
In addition, it is said that the IETF will release the HTTP 2.0 standard next year. I'll wait and see.
2. The relationship between HTTP and TCP
Simply put, the TCP protocol is the cornerstone of the HTTP protocol: HTTP relies on TCP to transfer data.
In the layered network model, TCP is a transport-layer protocol and HTTP is an application-layer protocol.
There are many common application-layer protocols that are based on TCP, such as "FTP, SMTP, POP, IMAP," and so on.
TCP is known as a "connection-oriented" transport-layer protocol. I will not go into the details here (otherwise this post would get out of hand). All you need to know is this: the transport layer has two main protocols, TCP and UDP, and TCP is more reliable than UDP. You can think of a TCP connection as a water pipe: water poured in at the sending end flows out at the receiving end. TCP also guarantees that data sent first arrives first (UDP, in contrast, does not guarantee this).
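The water-pipe picture can be tried out with a minimal Python sketch. It assumes a loopback connection on 127.0.0.1, and the helper name send_over_tcp is made up for illustration. Several chunks are poured into one end of a TCP connection and read back at the other end.

```python
import socket


def send_over_tcp(chunks):
    """Send byte chunks through a loopback TCP connection and return what arrives."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    sender = socket.create_connection(listener.getsockname())
    receiver, _ = listener.accept()
    try:
        for chunk in chunks:
            sender.sendall(chunk)     # pour water in at the sending end
        sender.close()                # closing signals "no more data" to the receiver
        received = bytearray()
        while True:
            data = receiver.recv(1024)
            if not data:              # empty read: the sender has closed the pipe
                break
            received += data
    finally:
        receiver.close()
        listener.close()
    return bytes(received)


result = send_over_tcp([b"first,", b"second,", b"third"])
```

The bytes come out in the order they went in. Note that TCP delivers a continuous stream: the boundaries between the three chunks are not preserved, only their order.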
3. How does the HTTP protocol use TCP connections?
HTTP uses TCP connections in one of two ways, commonly called "short connections" and "long connections" (a "long connection" is also known as "keep-alive" or a "persistent connection").
Suppose a Web page contains many images and also references many "external" CSS and JS files. In "short connection" mode, the browser first opens a TCP connection and fetches the page's HTML source (the TCP connection is closed once the HTML has been received). The browser then parses the source and finds that the page references many external resources (images, CSS, JS). For "each" external resource, it opens another TCP connection and downloads the file (again, the TCP connection is closed after each resource has been fetched).
With a "long connection", by contrast, the browser also opens a TCP connection to fetch the page. After the page has been fetched, however, the TCP connection is not closed immediately but is kept open for a while (the so-called "keep-alive"). When the browser parses the HTML source and finds many external resources, it reuses that same TCP connection to fetch them.
HTTP 1.0 uses "short connections" by "default" (the Web had only just been born and pages were relatively simple, so short connections were not much of a problem).
By the end of 1995, when HTTP 1.1 was being drafted, Web pages were becoming complex (with more and more images and scripts). Short connections were too inefficient at that point, because establishing a TCP connection has both a "time cost" and a "CPU cost". So HTTP 1.1 uses "keep-alive" by "default".
For more on "keep-alive", see the Wikipedia entry on HTTP persistent connections.
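The difference between the two modes can be counted directly. The following is a minimal sketch using only Python's standard library; the handler class CountingHandler and the function count_connections are hypothetical names introduced here. It starts a tiny local HTTP 1.1 server and counts how many TCP connections the server sees when a client fetches several resources.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # lets the server keep connections alive
    connections = 0                 # number of TCP connections accepted so far

    def setup(self):
        super().setup()
        type(self).connections += 1  # one handler object is created per TCP connection

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):    # keep the demo quiet
        pass


def count_connections(n_resources, keep_alive):
    """Fetch n_resources from a local server; return how many TCP connections were used."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        if keep_alive:
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for i in range(n_resources):          # reuse one connection for everything
                conn.request("GET", f"/res{i}")
                conn.getresponse().read()
            conn.close()
        else:
            for i in range(n_resources):          # a fresh connection per resource
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", f"/res{i}", headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


short_mode = count_connections(5, keep_alive=False)
long_mode = count_connections(5, keep_alive=True)
```

In this sketch, short_mode counts one connection per resource, while long_mode counts a single connection shared by all five requests.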
Next, the concepts of symmetric and asymmetric encryption
1. What are "encryption" and "decryption"?
In layman's terms, you can think of "encryption" and "decryption" as a pair of "inverse" mathematical operations, just as addition and subtraction are inverse operations, and so are multiplication and division.
"Encryption" is the process of turning "plaintext" into "ciphertext"; "decryption" is the process of turning "ciphertext" back into "plaintext". Both processes need one crucial thing, called the "key", to take part in the mathematical operation.
2. What is "symmetric encryption"?
"Symmetric encryption" means that "encryption" and "decryption" use the same key. This is the easier one to understand. It is like using 7zip or WinRAR to create an encrypted archive protected by a password. The next time you want to unpack the archive, you have to enter the "same" password. In this example, the password plays the role of the "key" mentioned above.
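As a toy illustration only, here is a minimal Python sketch of the "same key in both directions" idea. It uses a repeating-key XOR, which is NOT a secure cipher and is not what SSL/TLS uses; the function name xor_cipher and the sample key are made up for this example.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR every byte with the repeating key.

    Applying it twice with the same key returns the original data,
    so the same function both encrypts and decrypts.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plaintext = b"attack at dawn"
key = b"s3cret"                                 # hypothetical shared password

ciphertext = xor_cipher(plaintext, key)         # encrypt
recovered = xor_cipher(ciphertext, key)         # decrypt with the SAME key
wrong_guess = xor_cipher(ciphertext, b"wrong!")  # a different key does not recover it
```

Real symmetric algorithms are far more elaborate, but they share this property: whoever holds the key can both encrypt and decrypt.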
3. What is "asymmetric encryption"?
"Asymmetric encryption" means that "encryption" and "decryption" use "different" keys. This is harder to understand, and was also harder to come up with. The invention of "asymmetric encryption" is regarded as a revolution in the history of cryptography.
Because space is limited, I will not go deeper into "asymmetric encryption" here. If I find the time, I will write a separate introductory post about it.
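Without going into the theory, a toy numeric example may still make the "different keys" idea concrete. The sketch below is textbook RSA with deliberately tiny primes; the choice of RSA and of these numbers is an assumption made for illustration, and the code is far too weak for real use (it needs Python 3.8 or later for the modular inverse).

```python
# Toy textbook RSA with tiny primes; for illustration only, never for real use.
p, q = 61, 53
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent: inverse of e modulo phi (Python 3.8+)

public_key = (e, n)           # can be handed to anyone
private_key = (d, n)          # must be kept secret


def encrypt(message: int, key) -> int:
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key) -> int:
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


message = 65                               # the message must be a number smaller than n
cipher = encrypt(message, public_key)      # anyone can encrypt with the public key
recovered = decrypt(cipher, private_key)   # only the private key undoes it
```

The point is only that the key used to encrypt, (e, n), differs from the key used to decrypt, (d, n).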
4. What are the advantages and disadvantages of each?
From the definitions it is clear that "asymmetric encryption" is (functionally) more capable than "symmetric encryption". That is its advantage. However, implementations of "asymmetric encryption" usually rely on "hard mathematical problems", so its performance is usually much worse than that of "symmetric encryption".
These pros and cons also influenced the design of the SSL protocol.
The principle and application of CA certificates
On this topic, see "An introduction to digital certificates and CAs", which I wrote 4 years ago. I will not repeat it here, to keep this post from getting too long.
What are the requirements of the HTTPS protocol?
After all that talk, the background knowledge is finally done. Now for the main topic. First: what needs was HTTPS designed to meet?
Many articles about HTTPS start with the implementation details right away. Personally, I think that is bad practice. Back in 2009, when I had just started this blog, I published "Learning Technology trilogy: WHAT, HOW, WHY", which discusses the importance of "WHY-type questions". If I jumped straight into the details, you would at best learn the WHAT and the HOW, but could not understand the WHY. The previous chapter covered the "background" and this chapter covers the "requirements", which helps you understand:
Why was it designed this way? That is a WHY-type question.
Compatibility
Because HTTP came first and HTTPS came later, the designers of HTTPS had to consider compatibility with the existing HTTP.
Compatibility here covers many aspects. For example, existing Web applications should be able to migrate to HTTPS as seamlessly as possible, and browser vendors should need to change as little as possible.
From these "compatibility" considerations, the following conclusions are easy to draw:
1. HTTPS must still be transmitted over TCP
(If the transport layer were changed to UDP, both Web servers and browser clients would need major changes; that would be too disruptive.)
2. A separate, new protocol is used to wrap the HTTP protocol
(The so-called "HTTP over SSL" is really a layer of SSL encapsulation around the original HTTP data. The original GET, POST, and other mechanisms of HTTP remain basically unchanged.)
An analogy: if the original HTTP is a plastic water pipe that is easy to puncture, then the newly designed HTTPS is like wrapping a layer of metal pipe around the original plastic pipe. First, the original plastic pipe keeps working as before; second, reinforced with metal, it is no longer easy to puncture.
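The wrapping idea can be sketched with Python's standard socket and ssl modules. This is a rough sketch, not a full HTTP client: the function names build_get_request, open_channel, and fetch are hypothetical, and fetch needs network access to a real host. The HTTP request bytes are built the same way in both cases; the only difference is whether the TCP socket is wrapped in TLS before they are sent.

```python
import socket
import ssl


def build_get_request(host: str, path: str = "/") -> bytes:
    """The HTTP part: identical whether or not TLS is used underneath."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def open_channel(host: str, use_tls: bool):
    """The transport part: a plain TCP socket, optionally wrapped in TLS."""
    port = 443 if use_tls else 80
    sock = socket.create_connection((host, port), timeout=10)   # the "plastic pipe"
    if use_tls:
        context = ssl.create_default_context()
        sock = context.wrap_socket(sock, server_hostname=host)  # the "metal pipe" around it
    return sock


def fetch(host: str, path: str = "/", use_tls: bool = True) -> bytes:
    """Send the same GET request over either channel and return the raw response."""
    with open_channel(host, use_tls) as channel:
        channel.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = channel.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


request = build_get_request("example.com", "/index.html")
```

Calling fetch(host, use_tls=False) and fetch(host, use_tls=True) sends exactly the same request bytes; only the pipe they travel through changes.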
Extensibility
As mentioned earlier, HTTPS is equivalent to "HTTP over SSL".
If the SSL protocol is designed to be "extensible" enough, it can be paired with other application-layer protocols besides HTTP. Wouldn't that be nice?
In hindsight, the people who designed SSL really were impressive. Today's SSL/TLS can be combined with many common application-layer protocols (such as FTP, SMTP, POP, Telnet) to strengthen their security.
To continue the analogy: if SSL/TLS is a reinforcing metal pipe, it can be used to reinforce not only water pipes but also gas pipes.
Confidentiality (Protection against leaks)
HTTPS needs to provide good enough confidentiality.
When it comes to confidentiality, the first requirement is resistance to sniffing. "Sniffing", in layman's terms, means monitoring your network traffic. If you browse with plaintext HTTP, an eavesdropper can tell which pages you are visiting simply by sniffing.
Sniffing is the most basic attack technique. Besides sniffing, HTTPS also needs to withstand some slightly more advanced attacks, such as "replay attacks" (discussed later, in the part on how the protocol works).
Integrity (protection against tampering)
Besides "confidentiality", an equally important goal is to "ensure integrity". The concept of integrity was covered in an earlier post, "An introduction to file integrity checks: hash values and digital signatures". Readers who have forgotten it can go back and review.
Before HTTPS was invented, HTTP, being plaintext, was not only easy to sniff but also easy to tamper with.
An example:
Network operators (ISPs) in China can be rather unscrupulous. Users often complain that when they visit a website that carries no ads of its own, lots of China Telecom ads pop up anyway. Why does this happen? Because your network traffic has to pass through the ISP's lines to reach the public Internet. If you use plaintext HTTP, the ISP can easily inject ads into the pages you visit.
Therefore, one of the design requirements of HTTPS is to "ensure that the content of the HTTP protocol is not tampered with".
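How tampering can be detected at all may be easier to see with a small sketch. The one below uses an HMAC from Python's standard library; it assumes the two ends already share a secret key, the names make_tag and is_untampered are made up here, and it is only an illustration of the idea, not the exact mechanism SSL/TLS uses.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute a keyed checksum (HMAC-SHA256) over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_untampered(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag on the receiving side and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = b"shared-secret-key"                       # hypothetical key known to both ends
page = b"<html><body>Hello</body></html>"
tag = make_tag(key, page)                        # sent along with the page

# Someone in the middle injects an ad but cannot produce a matching tag without the key.
tampered = page.replace(b"Hello", b"Hello <img src='ad.png'>")

original_ok = is_untampered(key, page, tag)
tampered_ok = is_untampered(key, tampered, tag)
```

The unmodified page passes the check, while the page with the injected ad fails it, so the receiver knows the content was changed in transit.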
Authenticity (protection against impersonation)
Among the requirements of HTTPS, "authenticity" is often overlooked. In fact, "authenticity" is no less important than "confidentiality" and "integrity".
An example:
Suppose you use online banking and need to visit your bank's website. How do you make sure that the site you are visiting really is the site you meant to visit? (A bit of a tongue twister.)
Some naive readers will say: by checking the domain name in the URL. Why call them "naive"? Because the DNS system itself is unreliable (especially in the era when SSL was designed, before DNSSEC had even been invented). Because DNS is unreliable ("domain name spoofing" and "domain name hijacking" both exist), the domain name you see in the URL is "not necessarily" the real one.
(Readers unfamiliar with "domain name spoofing" and "domain name hijacking" can read my earlier post, "An introduction to how DNS works, plus domain name hijacking and domain name spoofing/pollution".)
Therefore, the HTTPS protocol must have some mechanism to guarantee "authenticity" (how it does so will be covered later).
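As one concrete illustration of this requirement, here is how Python's standard ssl module behaves by default. This is a sketch of one client library's settings, not a description of the protocol itself, and the helper name describe_context is made up. The default client context refuses to talk to a server unless it presents a certificate that chains to a trusted CA and matches the host name being visited.

```python
import ssl


def describe_context(context: ssl.SSLContext) -> dict:
    """Report the two settings that enforce server authenticity on the client side."""
    return {
        "requires_certificate": context.verify_mode == ssl.CERT_REQUIRED,
        "checks_hostname": context.check_hostname,
    }


# The default client-side context verifies the server's certificate chain
# against the system's trusted CAs and checks that it matches the host name.
default_settings = describe_context(ssl.create_default_context())
```

With both settings on, a DNS trick alone is not enough to impersonate a site: the impostor would also need a trusted certificate for that domain name.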
Performance
The last requirement is performance.
Introducing HTTPS must "not" make performance too poor. Otherwise, who would be willing to use it?
To ensure performance, the designers of SSL had to consider at least the following:
1. How to choose the encryption algorithm ("symmetric" or "asymmetric")?
2. How to accommodate the "short connection" TCP usage of HTTP?
(SSL was designed before 1995, when HTTP was still at version 1.0 and used "short connections" by default; keep-alive was not enabled by default.)
HTTPS and SSL/TLS protocols