HTTP protocol and the nature of the Web

Source: Internet
Author: User

As a developer, and especially as a web developer, I think you need to understand the whole process behind a page visit: how exactly do the browser and the server cooperate? How does the server handle the request? How does the browser display the web page to the user?

There are many such questions and details. Frankly, understanding every one of them thoroughly would take at least ten books; as the saying goes, "the lower layers have no bottom". Different web servers and server-side programming languages also implement and process things differently (though the essence is the same). In this article I explain some of the essentials of web development based on knowledge of the HTTP protocol. Whether you develop in .NET, Java EE, PHP, or something else, you cannot get away from these essentials. I hope that after reading this article you will come away with something new. My knowledge and experience are limited, so mistakes are inevitable; I ask readers to forgive them.

What is the HTTP protocol (Hypertext Transfer Protocol)?

A protocol is a set of rules that both parties follow. The HTTP protocol is the specification for "communication" between a browser and a server. When we browse a social space page or scroll through Weibo, we are using the HTTP protocol, and its uses go far beyond these applications.

We often hear that HTTP is an "application layer protocol" built on top of TCP/IP. This is not hard to understand: if you took a "computer networks" course in college, you will know the OSI seven-layer reference model (I memorized it by rote). If you have done socket network programming, you will know that TCP and UDP are both widely used communication protocols (connection establishment, the three-way handshake, and so on, which are not the focus of this article).

Since TCP and UDP are already widely used network communication protocols, why do we need an additional HTTP protocol?

I once wrote a simple web server program, and what follows is my own inference (not necessarily accurate). UDP is unreliable: it guarantees neither delivery nor ordering, so it clearly has difficulty meeting the needs of web applications.

TCP is connection-oriented and uses a three-way handshake. It is reliable, but it also has a drawback. Consider: ordinary C/S (client/server) software has at most a few thousand clients connected at the same time, whereas for a B/S (browser/server) website, 100,000 simultaneous users is quite common. If 100,000 clients all kept their connections to the server open, how could the server carry the load?

This is where HTTP comes in. It is built on TCP's reliable connections. Put simply, once a request has been answered, the server closes the connection immediately and frees its resources. (This is the default behavior of HTTP/1.0; HTTP/1.1 keeps connections open for reuse by default, which the Connection header controls.) This keeps server resources available while still benefiting from TCP's reliability.
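The request, response, close cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `socket.socketpair()` stands in for a real TCP connection, and the names `serve_one_request` and `fetch_once` and the host `example.test` are hypothetical, not part of the original article.

```python
import socket
import threading


def serve_one_request(conn: socket.socket) -> None:
    """Read one request, send one response, then close the connection."""
    data = b""
    # A request without a body ends at the first blank line.
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    body = b"hello"
    head = (
        "HTTP/1.1 200 OK\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    conn.sendall(head.encode("ascii") + body)
    conn.close()  # free the connection as soon as the response is sent


def fetch_once() -> bytes:
    """Send one GET request and read until the server closes the connection."""
    # Assumption: a connected socket pair replaces a real TCP connection.
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_one_request, args=(server,))
    worker.start()
    client.sendall(b"GET / HTTP/1.1\r\nHost: example.test\r\n\r\n")
    received = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:  # empty read: the server has closed its end
            break
        received += chunk
    worker.join()
    client.close()
    return received


if __name__ == "__main__":
    print(fetch_once().decode("ascii"))
```

The server side holds no resources for this client once `conn.close()` returns, which is the property the paragraph above attributes to HTTP.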

For this reason, HTTP is often described as "stateless", meaning the server does not remember what a given client has done before. This design is largely driven by performance considerations, and it is why mechanisms such as sessions were introduced later.
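One common way to layer a session on top of a stateless protocol is to hand the client an identifier in a cookie and keep the per-client data on the server. The sketch below assumes a cookie named `sid` and an in-memory dictionary as the session store; both are illustrative choices, not something the original article specifies.

```python
import uuid
from typing import Dict, Optional, Tuple

# Server-side store: session id -> data kept for that client.
SESSIONS: Dict[str, Dict[str, int]] = {}


def handle(cookie_header: Optional[str]) -> Tuple[str, int]:
    """Handle one request and return (session_id, visit_count)."""
    session_id = None
    if cookie_header:
        # A Cookie header looks like "a=1; sid=abc123".
        for part in cookie_header.split(";"):
            name, _, value = part.strip().partition("=")
            if name == "sid":
                session_id = value
    if session_id not in SESSIONS:
        # Unknown or missing id: start a new session.
        session_id = uuid.uuid4().hex
        SESSIONS[session_id] = {"visits": 0}
    SESSIONS[session_id]["visits"] += 1
    return session_id, SESSIONS[session_id]["visits"]


if __name__ == "__main__":
    sid, visits = handle(None)            # first request, no cookie yet
    print(visits)
    sid, visits = handle(f"sid={sid}")    # browser sends the cookie back
    print(visits)
```

Each call to `handle` is independent, just like each HTTP request; only the identifier carried in the cookie links two requests to the same stored state.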

Hands-on preparation:

For monitoring network traffic, the Windows platform has a well-known tool called Sniffer, a packet sniffer that many "hackers" also use. For studying the HTTP protocol, however, I recommend a tool called HttpWatch. (Unfortunately, it is commercial software.) After installation, you can open it directly from the Tools menu in IE (Firefox is also currently supported).

Click Record to start monitoring and logging HTTP messages. The Stop, Clear, and other buttons are not covered here. As an example, I recorded a visit to the Main.aspx page, and the recording clearly shows the details of each HTTP message.

To learn the HTTP protocol, you mainly need to understand the HTTP request and the HTTP response (and, of course, request methods such as GET and POST, status codes, URIs, MIME types, and so on).


First, look at the HTTP request message (what the browser sends to the server):


An HTTP request is the data that the client browser sends to the server. A complete HTTP request message contains a request line, a number of message headers (request headers), a blank line, and the entity content.

Request line: describes the request method, the name of the requested resource, and the HTTP version in use. For example: GET /book/java.html HTTP/1.1

The request headers (message headers) carry information such as the host name the client is requesting and the client's environment:
Accept: tells the server which data types the client supports (example: Accept: text/html,image/*)
Accept-Charset: tells the server which character encodings the client accepts
Accept-Encoding: tells the server which data compression formats the client supports
Accept-Language: the client's language environment
Host: the host name of the server the client wants to access
If-Modified-Since: tells the server the last-modified time of the copy of the resource that the client has cached
Referer: tells the server which resource the client came from when making this request (used for hotlink protection)
User-Agent: tells the server about the client's software environment (operating system, browser version, etc.)
Cookie: carries the client's cookie information to the server
Connection: tells the server whether to keep the connection open after the request completes
Date: tells the server the time of the current request

(Blank line)
Entity content:
The entity data that the browser sends to the server over HTTP. Example: name=dylan&id=110
(With a GET request, values are passed to the server in the URL. With a POST request, values are sent to the server in the entity content, typically as form data.)
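To make the layout of a request message concrete, here is a minimal Python sketch that builds a request and splits it back into request line, headers, and entity content. The function names, the path `/login`, and the host `www.example.com` are assumptions for illustration; the form data is the example from the text above. It is not a complete HTTP parser.

```python
from urllib.parse import parse_qs


def build_request(method: str, target: str, host: str, body: bytes = b"") -> bytes:
    """Assemble request line, headers, blank line, and entity content."""
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append("Content-Type: application/x-www-form-urlencoded")
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the headers
    return head.encode("ascii") + body


def parse_request(raw: bytes):
    """Split a raw request into (method, target, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return method, target, version, headers, body


if __name__ == "__main__":
    raw = build_request("POST", "/login", "www.example.com", b"name=dylan&id=110")
    print(raw.decode("ascii"))
    method, target, version, headers, body = parse_request(raw)
    print(method, target, version)
    print(parse_qs(body.decode("ascii")))
```

Printing `raw` shows the four parts in order: the request line, the headers, an empty line, and the form data.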

Next, look at the HTTP response message (what the server returns to the browser):

An HTTP response is the data that the server sends back to the client. It includes:
a status line, a number of message headers, a blank line, and the entity content

The response headers (message headers) include:
Location: used together with a 302 status code to tell the client which address to request instead
Server: tells the browser the type of server software
Content-Encoding: tells the browser which compression format the server applied to the data
Content-Length: tells the browser the length of the returned data
Content-Type: tells the browser the type of the returned data
Last-Modified: tells the browser when the resource was last modified
Refresh: tells the browser how long to wait before refreshing
Content-Disposition: tells the browser to handle the data as a download. For example: Context.Response.AddHeader("Content-Disposition", "attachment; filename=aa.jpg"); Context.Response.WriteFile("aa.jpg");
Transfer-Encoding: tells the browser which transfer encoding was applied to the data
ETag: a cache-related header (lets changes to the resource be detected in real time)
Expires: tells the browser how long it may cache the returned resource. A value of -1 or 0 means do not cache
Cache-Control: tells the browser not to cache the data, for example: no-cache
Pragma: tells the browser not to cache the data, for example: no-cache

Connection: whether the connection is closed after the response completes: close or keep-alive
Date: tells the browser the time at which the server sent the response
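As a rough illustration of how these headers fit together, the following Python sketch assembles a response message by hand. The function name `build_response` and the particular set of headers are assumptions chosen for the example; a real server would normally rely on its framework to do this.

```python
from email.utils import formatdate


def build_response(status: int, reason: str, body: bytes, content_type: str) -> bytes:
    """Assemble status line, headers, blank line, and entity content."""
    headers = [
        ("Date", formatdate(usegmt=True)),      # current time in HTTP date format
        ("Content-Type", content_type),
        ("Content-Length", str(len(body))),     # length of the entity content in bytes
        ("Cache-Control", "no-cache"),
        ("Connection", "close"),
    ]
    head = f"HTTP/1.1 {status} {reason}\r\n"
    head += "".join(f"{name}: {value}\r\n" for name, value in headers)
    return head.encode("iso-8859-1") + b"\r\n" + body


if __name__ == "__main__":
    raw = build_response(200, "OK", b"<h1>hi</h1>", "text/html; charset=utf-8")
    print(raw.decode("utf-8"))
```

The browser reads the status line first, then uses Content-Type and Content-Length to decide how to interpret the bytes that follow the blank line.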

Having understood the HTTP request and response messages above, you should now have a reasonably deep understanding of the HTTP protocol. For more specific details, you can refer to the HTTP RFC documentation.

The approximate steps are: the browser first sends a request to the server; the server receives the request, does the corresponding processing, builds the response message, and returns it to the browser. After the browser receives the response message, its rendering engine renders the web page and parses the DOM tree, the JavaScript engine parses and executes scripts, and plug-ins do their own work. For the principles of browser rendering and parsing, you can refer to http://kb.cnblogs.com/page/129756/

Frankly speaking, the nature of the so-called Web is nothing more than request, processing, and response. No web server and no server-side programming language can depart from this essence. On the browser side, the browser parses the HTML, images, and other static content and presents them to the user, while the script engine executes script code and carries out whatever that code does (DOM operations, CSS property changes, sending AJAX requests, and so on).
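The request, processing, response cycle can be seen end to end with Python's standard library. This is a minimal sketch, not how any particular production server works: the handler class `HelloHandler` and the helper `round_trip` are hypothetical names, and the server runs on a local port only for the duration of one request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Processing: build a body from the requested path.
        body = f"You requested {self.path}".encode("utf-8")
        # Response: status line, headers, blank line, entity content.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def round_trip(path: str):
    """Start a local server, send one GET request, return (status, text)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)             # request
        response = conn.getresponse()         # response
        result = (response.status, response.read().decode("utf-8"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(round_trip("/index.html"))
```

Whatever the language or server, the `do_GET` method marks the same three stages: read the request, do some processing, write the response.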

In my view, the browser is really a special kind of client, and the B/S architecture is a special kind of C/S architecture. It is worth asking how different web servers and programming languages receive a user's HTTP request, how they process it, and how they respond. Taking ASP.NET, which I know well, as an example, one can view the source code with a decompiler (Microsoft has wrapped things up very thoroughly) and analyze it from the bottom up.

Due to space constraints, the details of ASP.NET, the IIS web server, and their underlying implementation cannot be dissected further here, because Microsoft's ASP.NET technology stack is large and complex. I will continue to update this series of articles, and readers are welcome to follow along.

Comment, floor 5, Shandaaiwo2, 2012-02-15: Great article.
Comment, floor 6, Jallin, 2012-02-17: The explanation of "Last-Modified" above is easily misleading:
1) What is "Last-Modified"?
When the browser requests a URL for the first time, the server returns status 200, the content is the resource you requested, and the response carries a Last-Modified header that records when the file was last modified on the server side, in a format similar to:
Last-Modified: Fri, 2006 18:53:33 GMT
When the client requests this URL a second time, according to the HTTP protocol the browser sends an If-Modified-Since header to the server, asking whether the file has been modified since that time:
If-Modified-Since: Fri, 2006 18:53:33 GMT
If the resource on the server has not changed, the server returns the HTTP 304 (Not Modified) status code with an empty body, which reduces the amount of data transferred.


The client asks the server to validate its (the client's) cache by passing this token back to the server.
The process is as follows:

The client requests a page (A).
The server returns page A and adds a Last-Modified/ETag to it.
The client displays the page and caches it together with the Last-Modified/ETag.
The client requests page A again and sends back the Last-Modified/ETag that the server returned with the previous response.
The server checks the Last-Modified or ETag, determines that the page has not been modified since the client's last request, and directly returns a 304 response with an empty response body.
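The validation steps above can be sketched as a single server-side function. This is a simplified illustration under several assumptions: the ETag is derived from a hash of the content, the client echoes the ETag in an If-None-Match header (the standard counterpart to ETag, not named in the text above), and only an exact single-value match is handled. The function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def make_validators(content: bytes, mtime: float):
    """Return (etag, last_modified) for a resource."""
    etag = '"' + hashlib.sha256(content).hexdigest()[:16] + '"'
    last_modified = formatdate(mtime, usegmt=True)  # HTTP date format, in GMT
    return etag, last_modified


def respond(request_headers: dict, content: bytes, mtime: float):
    """Return (status, headers, body): 304 if the client's cached copy is still valid."""
    etag, last_modified = make_validators(content, mtime)
    validators = {"ETag": etag, "Last-Modified": last_modified}
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_none_match is not None:
        # The ETag check takes precedence when the client sends one.
        not_modified = if_none_match == etag
    elif if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        modified = datetime.fromtimestamp(int(mtime), tz=timezone.utc)
        not_modified = modified <= since
    else:
        not_modified = False  # first visit: nothing to validate
    if not_modified:
        return 304, validators, b""   # empty body, client reuses its cache
    return 200, validators, content


if __name__ == "__main__":
    page, mtime = b"page A", 1_000_000_000.0
    status, headers, _ = respond({}, page, mtime)
    print(status, headers)
    status, _, body = respond({"If-None-Match": headers["ETag"]}, page, mtime)
    print(status, body)
```

The first call plays the role of the initial request and returns the validators; the second call replays them and receives the 304 with an empty body.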

Original address: http://www.reader8.cn/jiaocheng/20120901/1357825.html
