HTTP (HyperText Transfer Protocol) is the application-layer protocol used to transfer hypertext (HTML) from a Web server to the client browser. Before studying HTTP, it helps to understand HTML first.
First, HTML
HTML (HyperText Markup Language): "hypertext" means that a page can contain non-text elements such as pictures, links, and even music or programs.
1. An HTML document consists of a head and a body: the head provides information about the Web page, and the body provides the page's actual content.
HTML skeleton:
<html>
<head>
<title>TITLE</title>
</head>
<body>
<p></p>
<p> <a href="admin.html">ToGoogle</a> </p>
</body>
</html>
2. How HTML documents are generated:
Static: the HTML document is written by hand
Dynamic: a program written in a language such as PHP, JSP, ASP, or .NET outputs HTML-formatted results
Executing these programs depends on the corresponding interpreter or runtime:
PHP: the PHP interpreter
JSP: the JVM
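As a rough illustration of dynamic generation, the sketch below builds an HTML page from data at run time. It uses Python rather than PHP or JSP, and the function name render_page is a hypothetical helper invented for this example.

```python
from html import escape


def render_page(title, paragraphs):
    """Build an HTML document as a string, escaping the supplied text."""
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n"
        "</html>"
    )


if __name__ == "__main__":
    # The same program produces a different page for different input.
    print(render_page("TITLE", ["Hello", "1 < 2"]))
```

A static page would be the same text saved in a file ahead of time; here the text only exists once the program has run.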
Second, the HTTP protocol
HTTP uses a request/response model. The HTTP client initiates a request by establishing a TCP connection to a given port on the server (port 80 by default), and the HTTP server listens on that port for client requests. Once it receives a request message, the server returns a response message to the client, which may contain the requested file, a success or error code, or other information.
HTTP transaction: one request and its corresponding response
1. HTTP message format:
Request message:
<method> <request-URL> <version>
<headers>
# A blank line is required between the headers and the body
<entity-body>
Response message:
<version> <status> <reason-phrase>
<headers>
# A blank line is required here as well
<entity-body>
Description:
method: the request method
request-URL: the requested resource; it can be a relative path such as /images/log.jpg, or an absolute URL such as http://www.magedu.com/images.banner.jpg
version: the HTTP protocol version, in the form HTTP/<major>.<minor>, for example HTTP/1.0 or HTTP/1.1
headers: the header lines, zero or more
status: the status code
reason-phrase: a short human-readable description of the status code
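The request layout above can be sketched as a small parser. This is a minimal illustration rather than a complete HTTP parser: parse_request is a hypothetical helper, and the /login path and form data in the sample are made up.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, url, version, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Start line: <method> <request-URL> <version>
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


# Sample request message (hypothetical path and payload).
RAW = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: www.magedu.com\r\n"
    b"Content-Length: 8\r\n"
    b"\r\n"
    b"user=tom"
)

if __name__ == "__main__":
    print(parse_request(RAW))
```

Header names are lowercased because HTTP header names are case-insensitive.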
2. HTTP protocol versions:
HTTP 0.9: only transfers HTML documents
HTTP 1.0: introduces MIME and the keep-alive mechanism
HTTP 1.1: more request methods and finer-grained cache control
HTTP 2.0
3. Web resources:
HTTP 1.0 introduced the MIME framework, which lets HTTP carry many kinds of content rather than only plain text. MIME (Multipurpose Internet Mail Extensions) is an Internet standard that extends the e-mail standard to support messages in multiple formats, such as non-ASCII characters and non-text attachments (binary, sound, image, and so on).
Resource type (Content-Type): major/minor
text/html, text/plain, image/jpeg, image/gif, video/mp4, application/vnd.ms-powerpoint ...
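One way a server might pick the Content-Type for a file is to map its extension to a MIME type. The sketch below uses Python's standard mimetypes module; content_type_for is a hypothetical helper, and the exact mapping table can vary slightly between systems.

```python
import mimetypes


def content_type_for(filename, default="application/octet-stream"):
    """Guess the major/minor MIME type from a file name's extension."""
    ctype, _encoding = mimetypes.guess_type(filename)
    return ctype or default


if __name__ == "__main__":
    for name in ("index.html", "logo.jpg", "banner.gif"):
        major, _, minor = content_type_for(name).partition("/")
        print(name, "->", major, "/", minor)
```

Falling back to application/octet-stream for unknown extensions is a common convention for arbitrary binary data.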
Resource name:
URL (Uniform Resource Locator): describes the specific location of a resource on a specific server
Format: scheme://server:port/path/to/resource
For example: http://www.magedu.com:80/download/bash-4.3.1-1.rpm
It is divided into three parts:
Scheme: http://
Server: www.magedu.com:80
Resource on that server: /download/bash-4.3.1-1.rpm
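The same three-part split can be reproduced with Python's standard urllib.parse module, shown here on the example URL from the text.

```python
from urllib.parse import urlsplit

URL = "http://www.magedu.com:80/download/bash-4.3.1-1.rpm"
parts = urlsplit(URL)

if __name__ == "__main__":
    print("scheme:", parts.scheme)    # the protocol
    print("server:", parts.netloc)    # host and port together
    print("host:  ", parts.hostname)
    print("port:  ", parts.port)
    print("path:  ", parts.path)      # the resource on that server
```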
A page may contain multiple resources, and they may not all be on the same server. The response returned by the first server, which acts as the entry point, may contain hyperlinks to the other resources.
4. HTTP request methods:
GET: requests a resource, which the server sends back
HEAD: like GET, but the server returns only the response headers, not the requested resource
POST: submits data to the server, typically from an HTML form; the server usually stores the data (location: typically a relational database)
PUT: the opposite of GET; sends a resource to the server, which usually stores it (location: usually the file system)
DELETE: requests deletion of the resource that the URL points to
OPTIONS: asks the server which request methods the requested URL supports
TRACE: traces the proxy servers, firewalls, and gateways that a request passes through on its way to the server
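To see a few of these methods in action, the following sketch starts a throwaway server on the loopback interface with Python's standard http.server and queries it with http.client. The handler, the page content, and the try_methods helper are all assumptions made for this example. It implements only GET, HEAD, and OPTIONS; Python's base handler answers the unimplemented DELETE with 501 Not Implemented.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<p>hello</p>"


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        # GET: headers plus the resource itself.
        self._send_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        # HEAD: the same headers as GET, but no body.
        self._send_headers()

    def do_OPTIONS(self):
        # OPTIONS: report which methods this URL supports.
        self.send_response(204)
        self.send_header("Allow", "GET, HEAD, OPTIONS")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def try_methods():
    """Send several methods to a local server; return status, Allow, body."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD", "OPTIONS", "DELETE"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
            conn.request(method, "/")
            resp = conn.getresponse()
            results[method] = (resp.status, resp.getheader("Allow"), resp.read())
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for method, (status, allow, body) in try_methods().items():
        print(method, status, allow, len(body))
```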
5. HTTP status codes:
1XX: informational status codes
2XX: success status codes
200: OK
201: Created
3XX: redirection status codes
301: Moved Permanently, permanent redirect
302: Found, temporary redirect; the response message gives the new address in a "Location: <new location>" header
304: Not Modified
4XX: client errors
403: Forbidden
404: Not Found
405: Method Not Allowed
5XX: server errors
500: Internal Server Error
502: Bad Gateway, a proxy server received an invalid response from the upstream server
503: Service Unavailable, the service is temporarily unavailable
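The first digit of a status code gives its class, and each code has a standard reason phrase. A small sketch using Python's standard http.HTTPStatus enum (the describe helper and the class labels are just for illustration):

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return 'code reason-phrase (class)' for a standard status code."""
    status = HTTPStatus(code)
    return f"{code} {status.phrase} ({CLASSES[code // 100]})"


if __name__ == "__main__":
    for code in (200, 301, 302, 304, 404, 502, 503):
        print(describe(code))
```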
6. HTTP headers
Name: Value
① General headers:
Connection: options for the connection between client and server
Connection: keep-alive
Cache-Control: cache control
② Request headers:
Client-IP:
Host: the host being requested
Referer: the URL of the resource from which the current request originated; can be used for hotlink protection
User-Agent: the user agent, typically a browser
Headers beginning with Accept:
Accept: media types the client can accept
Accept-Charset:
Accept-Encoding:
Accept-Language:
Conditional request headers:
Security-related request headers:
Authorization:
Cookie:
③ Response headers:
Age: how long, in seconds, the response has been held in a cache
Server: the name and version of the server program
Negotiation headers:
Vary: a list of request headers; the server uses them to choose the most suitable version of the resource for the client
Security-related:
WWW-Authenticate:
Set-Cookie:
④ Entity headers:
Location: the new location of the resource
Allow: the request methods allowed on this resource
Content-related headers:
Content-Encoding:
Content-Language:
Content-Length:
Content-Location:
Content-Type:
Cache-related:
ETag: entity tag
Expires:
Last-Modified:
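Because headers are plain "Name: value" lines, they are easy to inspect programmatically. The sketch below parses a made-up block of response headers with Python's standard email.parser, which accepts the same line syntax. The header values are invented for the example.

```python
from email.parser import Parser

# Hypothetical response headers, ending with the usual blank line.
RAW_HEADERS = (
    "Server: nginx\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Content-Length: 1024\r\n"
    "Set-Cookie: sid=abc123; Path=/\r\n"
    "\r\n"
)

headers = Parser().parsestr(RAW_HEADERS, headersonly=True)

if __name__ == "__main__":
    # Header names are case-insensitive.
    print(headers["content-type"])
    print(headers.get_content_type())      # major/minor only
    print(headers.get_content_charset())   # the charset parameter
    print(int(headers["Content-Length"]))
    print(headers.get_all("Set-Cookie"))
```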
Press F12 in Google Chrome to open the developer tools and view this information for a web page.
Third, HTTP statelessness and sessions
Here, "state" refers to information retained between two related user operations, typically used to store workflow data or user status information.
HTTP is stateless: neither the client nor the server remembers previous state (the server does not automatically maintain client context), and each request is treated as unique and independent. Each request carries all the information the server needs to respond to it. For example, we can reach the news page through the "News" hyperlink on the NetEase homepage (www.163.com), or we can go to http://news.163.com directly without passing through the homepage; in other words, a request for www.163.com is unrelated to a request for http://news.163.com.
HTTP was designed to be stateless mainly for the scalability of Web servers and the flexibility of user access. Take load balancing: in a stateful design, a user's requests must go to the server that holds the associated state, otherwise they may not be understood, so the server side cannot dispatch requests freely. A stateless design makes load balancing and horizontal scaling easy.
However, some transactions require state. When shopping online, for example, we must go through login, ordering, payment, and other steps in order, and these requests are related to one another. A session is the mechanism that provides this state.
How sessions work:
① When a client sends a request to the server, the server creates a session for it and assigns the session an identifier (the session ID)
② Every subsequent request from this client includes the identifier, and the server checks it to determine which session the request belongs to.
There are two ways to carry the session ID: cookies and URL rewriting
Cookie: generated by the server and sent to the user agent (typically a browser), which saves the cookie's key/value pairs in a text file in some directory. The next time the same site is requested, the browser sends the cookie back to the server (provided cookies are enabled in the browser). The cookie's name and value are defined by the server-side developers.
The most typical use of cookies is to determine whether a registered user has logged in to the site, and to handle transactions such as a shopping cart.
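The two steps of the session mechanism can be sketched with an in-memory session table and Python's standard http.cookies module. The cookie name sid, the shopping-cart state, and the handle_request helper are assumptions made for this example, not something the protocol prescribes.

```python
import uuid
from http.cookies import SimpleCookie

SESSIONS = {}  # session ID -> per-user state, kept on the server


def handle_request(cookie_header):
    """Return (session_state, Set-Cookie value or None) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    sid = cookie["sid"].value if "sid" in cookie else None
    if sid in SESSIONS:
        # Step 2: a known session ID identifies the existing session.
        return SESSIONS[sid], None
    # Step 1: no valid ID, so create a session and hand out a new ID.
    sid = uuid.uuid4().hex
    SESSIONS[sid] = {"cart": []}
    out = SimpleCookie()
    out["sid"] = sid
    out["sid"]["path"] = "/"
    return SESSIONS[sid], out["sid"].OutputString()


if __name__ == "__main__":
    state, set_cookie = handle_request(None)       # first visit
    state["cart"].append("book")
    echoed = set_cookie.split(";")[0]              # what the browser sends back
    state_again, _ = handle_request(echoed)        # later request
    print(set_cookie, state_again)
```

The requests themselves stay stateless; only the ID travels with each request, and the state lives in the server-side table.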
Fourth, the HTTP keep-alive mechanism
In early HTTP versions, every HTTP request required a TCP three-way handshake to establish the connection between client and server and a four-way teardown to close it afterwards, which is inefficient. HTTP 1.0 introduced the keep-alive mechanism, which keeps the client-to-server connection open so that subsequent requests do not have to re-establish it. This is very useful for sites that serve static resources. For heavily loaded sites, however, there is a trade-off: keeping connections open saves setup cost, but resources that could have been released while a connection sits idle stay occupied. It is therefore important to decide whether to enable keep-alive and to set a reasonable keep-alive timeout.
In HTTP 1.0, keep-alive is off by default and must be enabled explicitly with the "Connection: keep-alive" header. In HTTP 1.1, keep-alive is enabled by default.
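The effect of keep-alive can be observed by looking at which client port the server sees for each request: with a reused connection the port stays the same, while a new connection per request normally gets a new port. The sketch below is an illustration on the loopback interface using Python's standard library; the handler and the client_ports helper are assumptions for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # persistent connections by default

    def do_GET(self):
        # Reply with the client's source port as seen by the server.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def client_ports(reuse, n=3):
    """Issue n GET requests and return the client port seen for each."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    ports = []
    try:
        for _ in range(n):
            conn.request("GET", "/")
            ports.append(int(conn.getresponse().read()))
            if not reuse:
                conn.close()  # the next request() opens a new TCP connection
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print("keep-alive:     ", client_ports(reuse=True))
    print("new connections:", client_ports(reuse=False))
```

The Content-Length header matters here: on a persistent connection the client needs it (or chunked encoding) to know where one response ends and the next begins.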
Fifth, how a Web resource request is processed
[Figure: Basic steps for Web server requests]
Establish connection ==> receive request ==> process request ==> Access resource ==> build response ==> Send response ==> log (asynchronous write)
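The steps above can be walked through in a few lines of code. The following is a toy sketch, not how any real Web server is implemented: the in-memory DOCROOT dictionary stands in for files on disk, the log is a plain list written synchronously, and a socket pair replaces a real network connection.

```python
import socket

DOCROOT = {"/index.html": b"<p>hello</p>"}  # stand-in for files on disk
ACCESS_LOG = []


def serve_one(conn):
    """Handle one request on an already established connection."""
    # Receive the request (this sketch reads the headers only).
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    # Process the request: parse the request line.
    request_line = data.split(b"\r\n", 1)[0].decode("iso-8859-1")
    method, path, _version = request_line.split(" ", 2)
    # Access the resource and decide on the status.
    if method != "GET":
        status, body = "405 Method Not Allowed", b"method not allowed"
    elif path in DOCROOT:
        status, body = "200 OK", DOCROOT[path]
    else:
        status, body = "404 Not Found", b"not found"
    # Build the response.
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n\r\n"
    )
    # Send the response.
    conn.sendall(head.encode("ascii") + body)
    conn.close()
    # Log the request.
    ACCESS_LOG.append(f"{request_line} -> {status}")


def demo(path="/index.html"):
    """Play the client side over a connected socket pair."""
    client, server_side = socket.socketpair()  # the established connection
    request = f"GET {path} HTTP/1.1\r\nHost: localhost\r\n\r\n"
    client.sendall(request.encode("ascii"))
    serve_one(server_side)
    response = b""
    while chunk := client.recv(4096):
        response += chunk
    client.close()
    return response


if __name__ == "__main__":
    print(demo().decode())
    print(ACCESS_LOG)
```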
Sixth, Web servers
1. Web server I/O models:
Single-process model: serial; each request is fully processed before the next one starts, which is too inefficient
Multi-process model: each process responds to one user request, so requests are handled concurrently
The Web server's master process listens on 80/TCP by default. For each client request it receives, it creates a child process (or takes one from a pre-created process pool) to handle the request; the result is handed back to the master process and is still sent to the client through port 80
Pros: concurrent responses to multiple requests
Cons: heavy process context switching (saving and restoring the execution context) wastes resources
Multiplexed I/O models:
① One process creates multiple threads, and each thread responds to one user request
Switching between threads also consumes system resources, but threads are more lightweight than processes, and because all threads in a process share its memory space, they can reduce the amount of I/O.
② Multiple threads, with each thread responding to multiple user requests
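As one possible reading of "one thread responding to multiple requests", the sketch below uses Python's standard selectors module so that a single thread services several connections as each becomes readable. It is a toy echo service over socket pairs, not an HTTP server; the function names and the "echo:" reply format are invented for this example.

```python
import selectors
import socket


def serve_multiplexed(connections):
    """Answer one message on each connection using a single thread."""
    sel = selectors.DefaultSelector()
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    pending = len(connections)
    while pending:
        # Block until at least one connection has data to read.
        for key, _events in sel.select(timeout=1.0):
            conn = key.fileobj
            data = conn.recv(4096)
            conn.sendall(b"echo:" + data)
            sel.unregister(conn)
            pending -= 1
    sel.close()


def demo(n=3):
    """Create n client/server socket pairs and serve them all in one thread."""
    pairs = [socket.socketpair() for _ in range(n)]
    for i, (client, _server_side) in enumerate(pairs):
        client.sendall(f"request {i}".encode())
    serve_multiplexed([server_side for _client, server_side in pairs])
    replies = [client.recv(4096) for client, _server_side in pairs]
    for client, server_side in pairs:
        client.close()
        server_side.close()
    return replies


if __name__ == "__main__":
    print(demo())
```

No process or thread is created per request here, so there is no context switching between workers; the cost is that one slow handler delays every other connection served by the same thread.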
2. Web servers:
httpd, nginx, lighttpd, GWS
3. Application servers:
IIS, (Tomcat, Jetty, Resin), WebLogic, WebSphere
Principles of HTTP protocol and Web services