Introduction to HTTP
HTTP is the abbreviation of Hypertext Transfer Protocol. It is a transfer protocol used to deliver hypertext from a World Wide Web (WWW) server to a local browser.
HTTP is a communication protocol built on TCP/IP for transferring data (HTML files, image files, query results, and so on).
HTTP is an object-oriented protocol belonging to the application layer. Because it is simple and fast, it is well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continuously improved and extended over years of use and development. At the time this description was written, the sixth edition of HTTP/1.0 was in use on the WWW, standardization of HTTP/1.1 was in progress, and HTTP-NG (Next Generation of HTTP) had been proposed.
HTTP works on a client-server architecture. Acting as the HTTP client, the browser sends every request by URL to the HTTP server, that is, the web server. The web server then sends a response back to the client based on the request it received.
[Figure: HTTP request-response model]
Main features
1. Simple and fast: When a client requests a service from the server, it only needs to send the request method and the path. Commonly used request methods are GET, HEAD, and POST. Each method specifies a different kind of interaction between client and server. Because the protocol is simple, an HTTP server program can be small, so communication is fast.
2. Flexible: HTTP allows any type of data object to be transferred. The type being transferred is marked by Content-Type.
3. Connectionless: Connectionless means that each connection handles only one request. Once the server has processed the client's request and received the client's acknowledgement, the connection is closed. This saves transmission time.
4. Stateless: HTTP is a stateless protocol. Stateless means the protocol has no memory of previous transactions. If later processing needs earlier information, that information must be retransmitted, which may increase the amount of data sent per connection. On the other hand, the server responds faster when it does not need prior information.
5. Supports both B/S (browser/server) and C/S (client/server) modes.
URLs in HTTP
HTTP uses a Uniform Resource Identifier (URI) to transmit data and establish a connection. A URL is a special type of URI that contains enough information to locate a resource.
URL stands for Uniform Resource Locator. It is the address used on the Internet to identify a resource. Take the following URL as an example to introduce the components of a typical URL:
http://www.aspxfans.com:8080/news/index.asp?boardID=5&ID=24618&page=1#name
As you can see from the URL above, a complete URL includes the following sections:
1. Protocol part: The protocol part of this URL is "http:", which means the page uses the HTTP protocol. Many protocols can be used on the Internet, such as HTTP and FTP; this example uses HTTP. The "//" after "http:" is a separator.
2. Domain name part: The domain name part of this URL is "www.aspxfans.com". An IP address can also be used in place of the domain name.
3. Port part: The port follows the domain name, separated from it by ":". The port is not a required part of a URL; if it is omitted, the default port is used. The port in this example is "8080".
4. Virtual directory part: From the first "/" after the domain name to the last "/" is the virtual directory part. The virtual directory is also not a required part of a URL. The virtual directory in this example is "/news/".
5. File name part: From the last "/" after the domain name up to "?" is the file name part. If there is no "?", it runs from the last "/" up to "#". If there is neither "?" nor "#", it runs from the last "/" to the end of the URL. The file name in this example is "index.asp". The file name is not a required part of a URL; if it is omitted, the default file name is used.
6. Anchor part: From "#" to the end is the anchor part. The anchor in this example is "name". The anchor is not a required part of a URL.
7. Parameter part: The part between "?" and "#" is the parameter part, also called the search part or query part. The parameters in this example are "boardID=5&ID=24618&page=1". Multiple parameters are allowed, separated by "&".
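The seven parts listed above can be pulled out of the example URL with Python's standard urllib.parse module. This is only a minimal sketch: the helper name split_url and the dictionary keys are made up for illustration, and the anchor is assumed to be written as "#name".

```python
from urllib.parse import urlsplit, parse_qs

def split_url(url):
    """Break a URL into the seven parts described in the list above."""
    parts = urlsplit(url)
    # Everything up to the last "/" is the directory, the rest is the file name.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        "port": parts.port,                  # None when the URL omits the port
        "virtual_directory": directory + "/",
        "filename": filename,
        "anchor": parts.fragment,
        "parameters": parse_qs(parts.query),  # each value is a list of strings
    }

info = split_url(
    "http://www.aspxfans.com:8080/news/index.asp?boardID=5&ID=24618&page=1#name"
)
for key, value in info.items():
    print(f"{key}: {value}")
```

Note that parse_qs returns a list for every parameter, because the same key may appear more than once in a query string.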
(Original: http://blog.csdn.net/ergouge/article/details/8185219)
The difference between URI and URL
URI stands for Uniform Resource Identifier. It is used to uniquely identify a resource.
Every resource available on the Web, such as an HTML document, image, video clip, or program, is located by a URI.
The URI is generally composed of three parts:
① The naming mechanism used to access the resource
② The name of the host where the resource is stored
③ The name of the resource itself, represented by a path, with the emphasis on the resource
URL stands for Uniform Resource Locator. It is a specific kind of URI that not only identifies a resource but also indicates how to locate it.
URLs are strings used on the Internet to describe information resources. They are used mainly in WWW client and server programs, most famously Mosaic.
A URL describes all kinds of information resources in one uniform format, including files, server addresses, and directories. A URL generally consists of three parts:
① The protocol (or service method)
② The IP address of the host where the resource is stored (sometimes including the port number)
③ The specific address of the resource on the host, such as the directory and file name
URN stands for Uniform Resource Name. It identifies a resource by name, for example mailto:java-net@java.sun.com.
A URI defines a uniform resource identifier as an abstract, high-level concept, while URLs and URNs are concrete ways of identifying resources. URLs and URNs are both URIs. Generally speaking, every URL is a URI, but not every URI is a URL, because URIs also include another subclass, the Uniform Resource Name (URN), which names a resource without specifying how to locate it. The mailto URI above, like news and ISBN URIs, is an example of a URN.
In Java, a URI instance can be absolute or relative, as long as it conforms to URI syntax rules. A URL instance, however, must not only conform to the syntax but also contain the information needed to locate the resource, so it cannot be relative.
In the Java class library, the URI class has no methods for accessing the resource; its only job is parsing.
The URL class, by contrast, can open a stream to the resource.
HTTP request message (Request)
The request message that a client sends to the server consists of four parts: request line, request headers, blank line, and request data.
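The four-part layout described above can be sketched as a small Python parser. This is an illustration under stated assumptions, not a complete HTTP parser: the function name parse_request and the sample request (host example.com, path /submit) are hypothetical, and lines are assumed to end with CRLF as they do on the wire.

```python
def parse_request(raw):
    """Split a raw HTTP request into request line, headers, and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    # Request line: method, request URI, and protocol version, space separated.
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "target": target, "version": version,
            "headers": headers, "body": body}

# Hypothetical sample request with explicit CRLF line endings.
sample = (
    "POST /submit HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 7\r\n"
    "\r\n"
    "a=1&b=2"
)
parsed = parse_request(sample)
print(parsed["method"], parsed["target"], parsed["version"])
print(parsed["headers"]["host"])
print(repr(parsed["body"]))
```

Header names are lowercased in the sketch because HTTP header names are case-insensitive.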
[Figure: HTTP request message structure]
The request line starts with a method token, followed by the request URI and the protocol version, separated by spaces.
GET request example, captured with Charles:
GET /562f25980001b1b106000338.jpg HTTP/1.1
Host: img.mukewang.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36
Accept: image/webp,image/*,*/*;q=0.8
Referer: http://www.imooc.com/
Accept-Encoding: gzip, deflate, sdch
Accept-Language: zh-CN,zh;q=0.8
Part one: the request line, which describes the request type, the resource to access, and the HTTP version used.
GET indicates that the request type is GET, /562f25980001b1b106000338.jpg is the resource to access, and the last part of the line indicates that HTTP version 1.1 is used.
Part two: the request headers, which immediately follow the request line (the first line) and describe additional information for the server to use.
From the second line onward are the request headers. Host indicates the destination of the request. User-Agent, which both server-side and client-side scripts can read, is an important basis for browser-type detection logic. This information is defined by your browser and is sent automatically with every request.
Part three: the blank line. A blank line after the request headers is required.
Even if the request data in part four is empty, the blank line must still be there.
Part four: the request data, also called the body, which can carry any other data.
The request data in this example is empty.
POST request example, captured with Charles:
POST / HTTP/1.1
Host: www.wrox.com
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)
Content-Type: application/x-www-form-urlencoded
Content-Length: 40
Connection: Keep-Alive

name=Professional%20Ajax&publisher=Wiley
Part one: the request line. The first line shows that this is a POST request using HTTP version 1.1.
Part two: the request headers, lines 2 through 6.
Part three: the blank line, line 7.
Part four: the request data, line 8.
HTTP response message (Response)
Typically, after receiving and processing a client's request, the server returns an HTTP response message. An HTTP response also consists of four parts: status line, message headers, blank line, and response body.
[Figure: HTTP response message format]
Example
HTTP/1.1 200 OK
Date: Fri, May 2009 06:07:21 GMT
Content-Type: text/html; charset=UTF-8

<html>...</html>
Part one: the status line, which consists of three parts: the HTTP protocol version, the status code, and the status message.
The first line is the status line: (HTTP/1.1) indicates that the HTTP version is 1.1, the status code is 200, and the status message is (OK).
Part two: the message headers, which describe additional information for the client to use.
The second and third lines are message headers.
Date: the date and time the response was generated. Content-Type: specifies the MIME type, here HTML (text/html), with UTF-8 as the encoding.
Part three: the blank line. The blank line after the message headers is required.
Part four: the response body, the text that the server returns to the client.
The HTML after the blank line is the response body.
HTTP status codes
The status code consists of three digits. The first digit defines the class of the response, and there are five classes:
1xx: Informational. The request has been received and processing continues.
2xx: Success. The request has been successfully received, understood, and accepted.
3xx: Redirection. Further action is required to complete the request.
4xx: Client error. The request has a syntax error or cannot be fulfilled.
5xx: Server error. The server failed to fulfill a valid request.
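Since the class is determined by the first digit alone, it can be computed with integer division. A minimal sketch follows; the names STATUS_CLASSES and status_class are made up for this illustration.

```python
# Class names keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}

def status_class(code):
    """Return the class name for a three-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]

for code in (200, 301, 404, 503):
    print(code, status_class(code))
```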
Common Status Codes:
200 OK //The client request succeeded
400 Bad Request //The client request has a syntax error and cannot be understood by the server
401 Unauthorized //The request is unauthorized; this status code must be used together with the WWW-Authenticate header field
403 Forbidden //The server received the request but refuses to serve it
404 Not Found //The requested resource does not exist, e.g. a wrong URL was entered
500 Internal Server Error //The server encountered an unexpected error
503 Service Unavailable //The server is currently unable to handle the client's request; it may recover after a period of time
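The standard reason phrases for these codes do not need to be memorized. As a small sketch, Python's standard library exposes them through the http.HTTPStatus enum; the variable name phrases is only illustrative.

```python
from http import HTTPStatus

COMMON_CODES = (200, 400, 401, 403, 404, 500, 503)

# Map each numeric code to its standard reason phrase.
phrases = {code: HTTPStatus(code).phrase for code in COMMON_CODES}

for code, phrase in phrases.items():
    print(code, phrase)
```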
More status codes: http://www.runoob.com/http/http-status-codes.html
HTTP request methods
According to the HTTP standard, an HTTP request can use one of several request methods.
HTTP/1.0 defines three request methods: GET, POST, and HEAD.
HTTP/1.1 adds five new request methods: OPTIONS, PUT, DELETE, TRACE, and CONNECT.
GET: Requests the specified page information and returns the entity body.
HEAD: Similar to a GET request, except that the response contains no body; it is used to obtain the headers.
POST: Submits data to the specified resource for processing (for example, submitting a form or uploading a file). The data is included in the request body. A POST request may create new resources and/or modify existing ones.
PUT: Replaces the content of the specified document with data sent from the client to the server.
DELETE: Requests that the server delete the specified page.
CONNECT: Reserved in HTTP/1.1 for proxy servers that can switch the connection to tunnel mode.
OPTIONS: Allows the client to query the server's capabilities.
TRACE: Echoes back the request received by the server; used mainly for testing or diagnostics.
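As a rough illustration of how a client chooses among these methods, the sketch below builds request objects with Python's urllib.request.Request without sending anything over the network. The URL is a hypothetical placeholder. When no method is given, the library falls back to GET, or to POST if a body is supplied.

```python
from urllib.request import Request

url = "http://www.example.com/resource"  # hypothetical URL, never contacted

get_req = Request(url)                                  # no body: defaults to GET
post_req = Request(url, data=b"name=value")             # body present: defaults to POST
head_req = Request(url, method="HEAD")                  # explicit method
put_req = Request(url, data=b"new content", method="PUT")
delete_req = Request(url, method="DELETE")

for req in (get_req, post_req, head_req, put_req, delete_req):
    print(req.get_method(), req.full_url, req.data)
```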
How HTTP works
HTTP defines how a web client requests a web page from a web server and how the server delivers the page to the client. HTTP uses a request/response model. The client sends the server a request message containing the request method, URL, protocol version, request headers, and request data. The server responds with a status line containing the protocol version and a success or error code, followed by server information, response headers, and response data.
The steps of an HTTP request/response are as follows:
1. The client connects to the web server
An HTTP client, usually a browser, establishes a TCP socket connection to the web server's HTTP port (80 by default). For example, http://www.oakcms.cn.
2. Send the HTTP request
Through the TCP socket, the client sends a text request message to the web server. A request message consists of four parts: request line, request headers, blank line, and request data.
3. The server accepts the request and returns an HTTP response
The web server parses the request and locates the requested resource. The server writes a copy of the resource to the TCP socket, where it is read by the client. A response consists of four parts: status line, response headers, blank line, and response data.
4. Release the TCP connection
If the connection mode is close, the server actively closes the TCP connection and the client closes it passively, releasing the TCP connection. If the connection mode is keep-alive, the connection is kept open for a period of time, during which further requests can be received.
5. The client browser parses the HTML content
The client browser first parses the status line to check the status code that indicates whether the request succeeded. It then parses each response header; the headers tell it that an HTML document of a certain number of bytes follows, along with the document's character set. The browser reads the response data (HTML), formats it according to HTML syntax, and displays it in the browser window.
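The five steps above can be tried end to end on one machine. The sketch below is an illustration under assumptions, not how a real browser is implemented: it starts a throwaway server from Python's standard http.server module on a free local port, then plays the client with a raw TCP socket. The names HelloHandler, fetch, and demo are made up for this example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves one small HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.send_header("Connection", "close")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

def fetch(host, port, path="/"):
    # Step 1: open a TCP socket to the server's HTTP port.
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: send request line, headers, and the blank line (no body).
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # Steps 3 and 4: read until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    raw = b"".join(chunks)
    # Step 5: split the response into status line, headers, and body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    return status_line, header_lines, body

def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    status_line, header_lines, body = demo()
    print(status_line)
    print(header_lines)
    print(body)
```

The client sends "Connection: close" so that the end of the response is simply the point where the server closes the socket; with keep-alive the client would instead have to rely on Content-Length to know where the body ends.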
For example, typing a URL into the browser's address bar and pressing Enter triggers the following process:
1. The browser asks a DNS server to resolve the IP address of the domain name in the URL;
2. Once the IP address is resolved, the browser establishes a TCP connection with the server using that IP address and the default port 80;
3. The browser sends an HTTP request to read the file (the file named by the part of the URL after the domain name); the request message is sent to the server together with the third segment of the TCP three-way handshake;
4. The server responds to the browser's request and sends the corresponding HTML text to the browser;
5. The TCP connection is released;
6. The browser renders the HTML text and displays the content.
The difference between GET and POST requests
GET request
GET /books/?sex=man&name=Professional HTTP/1.1
Host: www.wrox.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050225 Firefox/1.0.1
Connection: Keep-Alive

Note that the last line is a blank line.
POST request
POST / HTTP/1.1
Host: www.wrox.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050225 Firefox/1.0.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 40
Connection: Keep-Alive

name=Professional%20Ajax&publisher=Wiley
1. GET submission: The request data is appended to the URL (that is, the data travels in the head of the HTTP message), with "?" separating the URL from the transmitted data and "&" joining multiple parameters, for example: login.action?name=hyddd&password=idontknow&verify=%E4%BD%A0%E5%A5%BD. English letters and digits are sent as is, a space is converted to "+", and Chinese or other characters are percent-encoded (URL encoding, not Base64), giving a string such as %E4%BD%A0%E5%A5%BD, where the XX in %XX is the hexadecimal value of one byte of the character.
POST submission: The submitted data is placed in the body of the HTTP message. In the POST example above, the last line is the actual data transmitted. As a result, data submitted with GET is displayed in the address bar, while a POST submission leaves the address bar unchanged.
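The encoding rules described above can be reproduced with Python's urllib.parse module. This is a minimal sketch: the "comment" field and its value are hypothetical and are there only to show how a space is handled.

```python
from urllib.parse import urlencode, unquote_plus

# Hypothetical form fields: plain ASCII, a value with a space, and Chinese text.
params = {"name": "hyddd", "comment": "hello world", "verify": "你好"}

# Letters and digits pass through, the space becomes "+",
# and each UTF-8 byte of the Chinese text becomes %XX.
query = urlencode(params)
get_url = "login.action?" + query
print(get_url)

# Decoding reverses the process; the hex digits are case-insensitive.
decoded = unquote_plus("%e4%bd%a0%E5%A5%BD")
print(decoded)
```

The same query string could be sent unchanged as the body of a POST request with Content-Type application/x-www-form-urlencoded.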
2. Size of transmitted data: First of all, the HTTP protocol does not limit the size of the data transmitted, and the HTTP specification does not limit the length of a URL.
In practice, the main limits are:
GET: Specific browsers and servers limit the length of a URL. IE, for example, limits URLs to 2083 bytes (2K+35). Other browsers, such as Netscape and Firefox, have no length limit in theory; the limit depends on what the operating system supports.
So for a GET submission, the transmitted data is limited by the URL length.
POST: In theory the data is unlimited, because it is not passed through the URL. In practice, each web server sets its own limit on the size of POST data; Apache and IIS6 each have their own configuration.
3. Security
POST is more secure than GET. For example, when data is submitted with GET, the user name and password appear in plain text in the URL. Because (1) the login page may be cached by the browser and (2) other people can view the browser's history, others can obtain your account and password. In addition, submitting data with GET can also lead to Cross-site request forgery attacks.
4. GET, POST, and SOAP all run over HTTP
(1) GET: The request parameters are appended to the URL as a sequence of key/value pairs (the query string).
The length of the query string is limited by the web browser and web server (IE, for example, supports at most 2083 characters in a URL), so GET is not suitable for transferring large data sets; it is also insecure.
(2) POST: The request parameters are transmitted in a separate part of the HTTP message (called the entity body). This part is used to transfer form information, so Content-Type must be set to application/x-www-form-urlencoded. POST was designed to support user fields on web forms, and its parameters are also transmitted as key/value pairs.
However, it does not support complex data types, because POST does not define the semantics or rules of the transmitted data structure.
(3) SOAP: A special form of HTTP POST that follows a special XML message format.
Content-Type is set to text/xml. Any data can be expressed as XML.
HTTP defines a number of methods for interacting with the server. The four most basic are GET, POST, PUT, and DELETE. A URL describes a resource on the network, and GET, POST, PUT, and DELETE in HTTP correspond to the four operations on that resource: query, update, create, and delete. The ones we see most often are GET and POST. GET is generally used to obtain or query resource information, and POST is generally used to update resource information.
Let's look at the differences between GET and POST:
Data submitted with GET is placed after the URL, with "?" separating the URL from the transmitted data and "&" joining the parameters, as in editposts.aspx?name=test1&id=123456. The POST method places the submitted data in the body of the HTTP message.
The size of data submitted with GET is limited (because browsers limit the length of a URL), while data submitted with POST has no such limit.
On the server side, data submitted with GET is read with Request.QueryString, while data submitted with POST is read with Request.Form.
Submitting data with GET raises security issues. On a login page, for example, submitting with GET puts the user name and password in the URL; if the page can be cached or other people can access the machine, they can obtain the user's account and password from the browser history.
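The practical difference in where the parameters travel can be seen by reading them back on the receiving side. The sketch below is a simplified Python stand-in for the Request.QueryString and Request.Form lookups mentioned above: the function name request_params is hypothetical, the two sample requests are shortened versions of the GET and POST examples earlier in this article, and the body is assumed to be form-encoded without checking Content-Type.

```python
from urllib.parse import urlsplit, parse_qs

def request_params(raw):
    """Return (query_params, form_params) for a raw HTTP request string."""
    head, _, body = raw.partition("\r\n\r\n")
    request_line = head.split("\r\n")[0]
    _method, target, _version = request_line.split(" ", 2)
    query_params = parse_qs(urlsplit(target).query)  # data carried in the URL
    form_params = parse_qs(body)                     # data carried in the body
    return query_params, form_params

get_request = (
    "GET /books/?sex=man&name=Professional HTTP/1.1\r\n"
    "Host: www.wrox.com\r\n"
    "\r\n"
)
post_request = (
    "POST / HTTP/1.1\r\n"
    "Host: www.wrox.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 40\r\n"
    "\r\n"
    "name=Professional%20Ajax&publisher=Wiley"
)
print(request_params(get_request))
print(request_params(post_request))
```

For the GET request only the first dictionary is filled, and for the POST request only the second, which mirrors the split between the query string and the form body.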
from: http://www.cnblogs.com/ranyonsue/p/5984001.html