HTTP Concise Basics

Source: Internet
Author: User

HTTP, the Hypertext Transfer Protocol, is one of the most widely used network protocols on the Internet. All WWW documents must comply with this standard. It is a request and response standard between client and server, carried over TCP. The client is the end user and the server side is the Web site. Using a Web browser, crawler, or other tool, the client initiates an HTTP request to a specified port on the server (the default port is 80), and the server processes the request and returns a response message. This article briefly describes some of the basics of HTTP and Web sites for your reference.

First, what is HTTP
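HTTP, the Hypertext Transfer Protocol, is the most widely used network protocol on the Internet. It is a request and response standard between a client and a server, carried over TCP. The client is the end user and the server is the Web site. The client (user agent) uses a tool such as a Web browser or a Web crawler to initiate an HTTP request to a specified port on the server (port 80 by default). The responding server stores resources such as HTML files and images, and is called the origin server. Typically, the HTTP client initiates a request by opening a TCP connection to the specified port on the server (port 80 by default), and the HTTP server listens on that port for requests from clients. Once it receives a request, the server sends back a status line, such as "HTTP/1.1 200 OK", followed by the response message. The body of the message may be the requested file, an error message, or some other information.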
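The request and response exchange described above can be tried locally with Python's standard library. This is only a sketch under stated assumptions: `HelloHandler` and `fetch_once` are hypothetical names, the server binds to 127.0.0.1 on a port chosen by the operating system instead of port 80, and the body text is made up for the illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical origin server: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello from the origin server\n"
        self.send_response(200)  # writes the status line with code 200
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path="/"):
    """Start a local server, send one GET request, return (status, reason, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: let the OS choose
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)        # client opens a TCP connection and sends the request
        response = conn.getresponse()    # server replies with status line, headers, body
        result = (response.status, response.reason, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once()` returns the status code, the reason phrase, and the message body of one complete transaction.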
Second, the HTTP protocol version
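0.9: Supports only one request method, GET.
1.0: The first HTTP version to specify a version number in the communication. It is still widely used, especially in proxy servers.
1.1: The current version. Persistent connections are used by default, and it works well with proxy servers. It also supports pipelining, in which multiple requests are sent without waiting for each response, to reduce line load and improve transfer speed.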
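The different connection defaults of 1.0 and 1.1 can be sketched as a small decision function. This is a simplified illustration, not a complete statement of the connection rules: the name `is_persistent` is hypothetical, and it assumes the common convention that a 1.0 client opts in with `Connection: keep-alive` while a 1.1 client opts out with `Connection: close`.

```python
def is_persistent(version, connection_header=None):
    """Return True if the connection should stay open after the response.

    version: the protocol version string from the start line, e.g. "HTTP/1.1".
    connection_header: the value of the Connection header, or None if absent.
    """
    tokens = {
        token.strip().lower()
        for token in (connection_header or "").split(",")
        if token.strip()
    }
    if version == "HTTP/1.1":
        # Persistent by default; the sender must ask to close.
        return "close" not in tokens
    if version == "HTTP/1.0":
        # Not persistent by default; the sender must ask to keep it open.
        return "keep-alive" in tokens
    return False
```

For example, `is_persistent("HTTP/1.1")` is true with no header at all, while `is_persistent("HTTP/1.0")` is false unless the header asks for keep-alive.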
Third, HTTP related concepts
HTML: Hypertext Markup Language. "Hypertext" means that a page can contain pictures, links, and even music, programs, and other non-text elements.
URL: A Uniform Resource Locator is a concise representation of the location of a resource available on the Internet and of the method used to access it. It is the address of a standard resource on the Internet. Each file on the Internet has a unique URL, which contains information that indicates the location of the file and how the browser should handle it.
Commonly used form: protocol://username:password@subdomain.domain.tld:port/directory/filename.suffix?parameter=value#fragment
The most common URL protocol is http. Other protocols include https, ftp, mailto, ldap, file, news, gopher, telnet, and so on.
URI: A Uniform Resource Identifier is a string that identifies the name of an Internet resource. The common format is protocol://domain.root-domain/directory/filename.suffix. This identifier allows users to interact with any resource, local or on the Internet, through a specific protocol.
URI example: http://www.baidu.com/photo/abc.gif
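The parts of a URL can be pulled apart with `urllib.parse` from the Python standard library. The sketch below is only an illustration: `describe_url` is a hypothetical helper, and the sample URL in the usage note is made up to exercise every component.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Split a URL into the components named in the text above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,            # None when the URL gives no port
        "path": parts.path,
        "parameters": parse_qs(parts.query),
        "fragment": parts.fragment,
    }
```

With a made-up URL such as `http://user:secret@www.example.com:8080/photo/abc.gif?size=large#top`, the result separates the protocol (`http`), the credentials, the host, the port (`8080`), the path (`/photo/abc.gif`), the parameters, and the fragment.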
Fourth, Web resources
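Web resources are files or programs stored on the Internet for outside access. According to how they are presented and how they work, they are divided into static resources and dynamic resources.
Static resources: resources that the browser can open directly. For example, if the browser can open a js file directly without any problem, that file is a static resource. html files, css files, and js files are all static resources.
Dynamic resources: resources that the browser cannot open directly but can open after they have been translated. For example, jsp files, servlets, php, and ASP are all dynamic resources.
Difference: when the browser requests a static resource, the server sends it to the browser directly. When the browser requests a dynamic resource, the server first translates or converts it into a static resource and then sends the result to the browser.
Web resource types:
    html        text/html
    txt         text/plain
    jpeg        image/jpeg
    gif         image/gif
    mov, flv    video resource types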
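The mapping from a file to its resource type can be looked up with Python's `mimetypes` module. The sketch below also labels a resource as static or dynamic purely by file extension, which is a simplification assumed for this illustration (a real server decides this from its configuration); `classify_resource` and `DYNAMIC_EXTENSIONS` are hypothetical names.

```python
import mimetypes
import os

# Assumption for this sketch: these extensions are handled by a server-side engine.
DYNAMIC_EXTENSIONS = {".jsp", ".php", ".asp"}


def classify_resource(path):
    """Return (kind, content_type) for a resource path.

    kind is "dynamic" when the extension is in DYNAMIC_EXTENSIONS, else "static".
    content_type is the guessed media type, or None if it is unknown.
    """
    extension = os.path.splitext(path)[1].lower()
    kind = "dynamic" if extension in DYNAMIC_EXTENSIONS else "static"
    content_type, _encoding = mimetypes.guess_type(path)
    return kind, content_type
```

For example, `classify_resource("index.html")` reports a static resource of type `text/html`, while `classify_resource("cart.php")` reports a dynamic one.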
Fifth, HTTP messages
Request message

Syntax format:
    Start line: <Method> <URL> <Protocol version>
    Request headers: Headers
    Body: the content of the request (the data the client sends to the server)

Method: the action the client wants the server to perform on the resource
    GET: get a copy of the Web resource from the server; the server needs to send it
    HEAD: get only the headers of the document from the server
    POST: send data to the server (usually a form submission)
    PUT: the opposite of GET; send a resource to the server, which typically needs to store it (usually in a file system)
    DELETE: delete the resource that the URL points to
    OPTIONS: probe which request methods the server supports for the requested URL
    TRACE: trace the proxy servers, firewalls, or gateways that the request passes through

General headers: additional information attached to the message, usable in both requests and responses
    Connection: defines options for the connection between client and server, for example Connection: keep-alive
    Cache-Control: cache control
    Via: shows the intermediate nodes the message passed through

Request headers:
    Client-IP: the client IP
    Host: the requested host
    Referer: the URL of the resource from which the current resource was requested
    User-Agent: the user agent

Accept headers:
    Accept: the media types the client can accept from the server
    Accept-Charset: the character sets the client supports
    Accept-Encoding: the encodings the client supports
    Accept-Language: the languages the client supports

Conditional request headers:
    Expect: the server behavior the client requires for this request
    If-Modified-Since: return the resource only if it has been modified since the specified time
    If-None-Match: return the document only if the provided entity tag does not match the entity tag of the current document

Security-related request headers:
    Authorization: authentication data that the client submits to the server, such as an account and password
    Cookie: the identity token the client sends to the server

Response message

Syntax format:
    Start line: <Protocol version> <Response status code> <Reason phrase>
    Response headers: Headers
    Body: the content of the response

Response status codes:
    100-199: informational
    200-299: success
    300-399: redirection
    400-499: client error
    500-599: server error
    401: authentication is required or has failed
    403: access to the resource is forbidden
    404: the requested resource could not be found

Response headers:
    Date: when the message was generated
    Age: how long the response has been held in a cache, in seconds
    Server: the server program's name and version, reported to the client
    ETag: entity tag, an opaque validator for the resource
    Location: an alternate URL for the resource
    Content-Length: the length of the entity
    Content-Type: the media type of the entity

Negotiation headers:
    Vary: the list of request headers the server consults to choose the most appropriate version to send to the client
    Accept-Ranges: the range types the server accepts for the current resource

Security-related response headers:
    WWW-Authenticate: asks the client for an account and password
    Set-Cookie: the server sends a token to the client, typically on the client's first request
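The request syntax and the status code ranges above can be made concrete with a small parser. This is a minimal sketch, not a complete HTTP parser: `parse_request` and `status_class` are hypothetical names, header names are lowercased for lookup, and repeated headers or folded lines are not handled.

```python
def parse_request(raw):
    """Split a raw request message into start line fields, headers, and body."""
    head, _separator, body = raw.partition("\r\n\r\n")  # blank line ends the headers
    lines = head.split("\r\n")
    method, url, version = lines[0].split(" ", 2)        # <Method> <URL> <Protocol version>
    headers = {}
    for line in lines[1:]:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "url": url,
        "version": version,
        "headers": headers,
        "body": body,
    }


def status_class(code):
    """Map a response status code to the class named in the list above."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```

Feeding it `"GET /images/logo.jpg HTTP/1.1\r\nHost: www.baidu.com\r\n\r\n"` yields the method, URL, and version from the start line and a `host` entry in the headers.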
Sixth, HTTP interaction process (HTTP/Web transaction)

The interaction process of one Web request and response (details of what the server actually does):

1. Establish connection: accept the client's connection request.
2. Receive request: read the request for a specific resource from the request message arriving over the network. Connection input/output processing structures:
    Single-process Web server: starts one process to receive requests and handles only one request at a time; it receives and handles the next request only after the current one is finished.
    Multi-process Web server: starts multiple processes, each handling one request. It can spawn a process per request, or use a pre-forked model that creates a pool of idle child processes (or threads) in advance.
    Multiplexed I/O Web server: one process responds to multiple requests, implemented with an event-driven pattern.
    Multiplexed I/O Web server with multiple processes: start m processes, each responding to n requests; the number of requests handled simultaneously is n*m.
3. Process request: parse the request message and determine which resource the user requested from the request line and headers. The relevant details include:
    Header: Host: www.baidu.com
    URL: /images/logo.jpg
    Method: GET
4. Access resource: get the resource specified in the message. The Web server is a Web resource server, responsible for sending pre-created or dynamically generated content. The location where this content is stored is called the DocRoot. For example, with DocRoot = /var/www/html, a file such as /var/www/html/a.html lives under the DocRoot, and the file /var/www/html/imags/jpgs/a.jpg is served as http://www.baidu.com/imags/jpgs/a.jpg.
   The Web server supports several resource mapping methods (mapping file system paths to URL paths):
    (1) DocRoot
    (2) Virtual host DocRoot
    (3) User home directory DocRoot
    (4) Alias
   Note: access control mechanisms may affect access rights to resources.
5. Build the response message.
6. Send the response message: over a long connection (keep-alive) or a short connection.
7. Logging.
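The DocRoot mapping in the resource access step can be sketched as a function from a URL to a file system path. This is an illustration under stated assumptions, not how any particular server implements it: `map_to_docroot` is a hypothetical name, only the plain DocRoot method is covered (no virtual hosts, home directories, or aliases), and POSIX-style paths are assumed.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def map_to_docroot(url, docroot="/var/www/html"):
    """Map the path of a URL to a file path under docroot.

    The path is normalized first so that ".." segments cannot
    climb out of the document root.
    """
    url_path = unquote(urlsplit(url).path)
    # Anchor at "/" and collapse "." and ".." segments.
    normalized = posixpath.normpath("/" + url_path.lstrip("/"))
    return posixpath.join(docroot, normalized.lstrip("/"))
```

With the default DocRoot, `map_to_docroot("http://www.baidu.com/imags/jpgs/a.jpg")` gives `/var/www/html/imags/jpgs/a.jpg`, matching the example in the text. Normalizing before joining is a deliberate choice here: a request path full of `..` segments still resolves to a location inside the DocRoot.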
Seventh, HTTP stateless

HTTP stateless:
Stateless means that the protocol has no memory of earlier transactions. Because of this lack of state, any earlier information that later processing needs must be retransmitted, which may increase the amount of data transferred per connection. In addition, with the advent of Web applications in which the client interacts with the server dynamically, such as a shopping cart program that needs to know which products the user has already chosen, statelessness became a problem that had to be addressed. To support this interaction between client and server, the state must be stored by other techniques, namely cookies and sessions.

Cookie Solutions:
With cookies, the state is kept on the client. The server sends special information to the client, which stores it as a text file; from then on, every request the client sends to the server carries this information. When a user connects to a Web site that supports cookies and submits personal information such as a user name, the server sends that information back to the client together with the returned hypertext, and the client saves it. Typical uses of information saved in a cookie include remembering items added to a shopping cart and remembering the password at login.
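The two directions of the cookie exchange can be sketched with `http.cookies` from the Python standard library. This is only an illustration: `build_set_cookie` and `parse_cookie_header` are hypothetical helper names, and the cookie names and values in the usage note are made up.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value):
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"  # send the cookie back for every path on the site
    return cookie[name].OutputString()


def parse_cookie_header(header_value):
    """Server side: read the Cookie request header the client sends back."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}
```

The server would send the result of `build_set_cookie("cart", "book42")` in a Set-Cookie header, and on later requests it would recover the stored values with `parse_cookie_header("cart=book42; user=alice")`.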

Session Solution:
When the client accesses the server, the server creates a session as needed, saves the session information on the server, and passes the SessionID that identifies the session to the client browser. The browser keeps this SessionID in memory, as a cookie without an expiration time. When the browser is closed, this cookie is cleared; it is never written to the user's temporary cookie files. Every later request carries this value, and the server uses the SessionID to look up the client's data. If the client browser shuts down unexpectedly, the session data saved by the server is not released immediately. The data still exists, and anyone who knows the SessionID can continue to request this session's information. Because the session remains on the server side, a session timeout can be set: once the specified time passes without a request from the client, the server clears the session information for that SessionID.
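A server-side session store with a timeout, as described above, can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `SessionStore` and its methods are hypothetical names, the default timeout of 1800 seconds is an arbitrary assumption, and the `clock` parameter exists only so the timeout can be demonstrated without waiting.

```python
import secrets
import time


class SessionStore:
    """Keeps session data on the server, keyed by a random SessionID."""

    def __init__(self, timeout_seconds=1800, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._sessions = {}  # session_id -> (last_access_time, data dict)

    def create(self):
        """Create an empty session and return the SessionID to send to the client."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = (self._clock(), {})
        return session_id

    def get(self, session_id):
        """Return the session data, or None if the ID is unknown or timed out."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_access, data = entry
        if self._clock() - last_access > self._timeout:
            del self._sessions[session_id]  # timed out: clear the session
            return None
        self._sessions[session_id] = (self._clock(), data)  # refresh on each request
        return data
```

The SessionID returned by `create()` is what the server would place in a cookie; each later request presents it to `get()`. Data survives as long as requests keep arriving within the timeout, and it is cleared on the first lookup after the timeout has passed.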

Copyright notice: This is an original article by the blogger. You are welcome to share it; when you do, please be sure to credit the source.
