How the "Golang" Web Works

Source: Internet
Author: User

When we browse the Web, we open the browser, enter a URL and press Enter, and the page we want appears. What is hidden behind this seemingly simple action?

For an ordinary page visit, the system actually does the following. The browser itself is a client. When you enter a URL, the browser first asks a DNS server to resolve the domain name to its IP address, then uses that IP address to locate the server and establishes a TCP connection to it. The browser then sends an HTTP request packet. The server receives the request packet, processes it by calling its own services, and returns an HTTP response packet. The client receives the response and starts rendering the body of the response; once all the content has been received, it closes the TCP connection to the server.

Figure 3.1 The process of a user accessing a Web site

A Web server is also called an HTTP server; it communicates with clients over the HTTP protocol. The client is usually a Web browser (in fact, mobile clients are also implemented internally with a browser).

How a Web server works can be summed up simply as:

    • The client establishes a TCP connection to the server over TCP/IP
    • The client sends an HTTP request packet to the server, requesting a resource document from it
    • The server sends an HTTP response packet to the client. If the requested resource contains dynamic-language content, the server invokes that language's interpreter to process the dynamic content and returns the processed data to the client
    • The client disconnects from the server. The client then interprets the HTML document and renders the result on the client's screen

That is all there is to a simple HTTP transaction: it looks complicated, but the principle is quite simple. Note that in this basic model the connection between client and server is non-persistent: once the server has sent its response, it disconnects from the client and waits for the next request.
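The four steps above can be sketched by hand in Go. This is only an illustration under some assumptions: the server is a throwaway local test server from `net/http/httptest` standing in for a real Web server, and the function names `rawExchange` and `demoExchange` are invented for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
)

// rawExchange performs one HTTP transaction by hand over a TCP connection
// and returns the status code and body of the response.
func rawExchange(addr string) (int, string, error) {
	// Step 1: establish a TCP connection to the server.
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return 0, "", err
	}
	// Step 4: disconnect once the exchange is over.
	defer conn.Close()

	// Step 2: send an HTTP request packet asking for a resource.
	_, err = fmt.Fprintf(conn, "GET / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n", addr)
	if err != nil {
		return 0, "", err
	}

	// Step 3: read the HTTP response packet sent back by the server.
	resp, err := http.ReadResponse(bufio.NewReader(conn), nil)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, "", err
	}
	return resp.StatusCode, string(body), nil
}

// demoExchange starts a local test server (a stand-in for a real Web
// server) and runs one transaction against it.
func demoExchange() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html><body>hello</body></html>")
	}))
	defer srv.Close()
	return rawExchange(srv.Listener.Addr().String())
}

func main() {
	code, body, err := demoExchange()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(code, body)
}
```

In everyday code `http.Get` hides all of this; writing the request by hand is only meant to make the individual steps visible.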

URL and DNS resolution

We access Web pages through URLs, so what exactly is a URL?

URL is the abbreviation of Uniform Resource Locator. It describes a resource on a network and has the following basic format:

scheme://host[:port#]/path/.../[?query-string][#anchor]
scheme         the protocol used by the underlying layer (for example: http, https, ftp)
host           the IP address or domain name of the HTTP server
port#          the default port of an HTTP server is 80, in which case the port number can be omitted. If a different port is used, it must be specified, for example http://www.cnblogs.com:8080/
path           the path to the resource being accessed
query-string   the data sent to the HTTP server
anchor         the anchor

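As a small sketch, Go's standard `net/url` package can split a URL into the components listed above. The path, query string and anchor in the sample URL are made up for illustration, and `describeURL` is a name invented for this example.

```go
package main

import (
	"fmt"
	"net/url"
)

// describeURL splits a URL into the components named in the format above.
func describeURL(raw string) (map[string]string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	return map[string]string{
		"scheme":       u.Scheme,
		"host":         u.Hostname(),
		"port":         u.Port(), // empty when the port is omitted
		"path":         u.Path,
		"query-string": u.RawQuery,
		"anchor":       u.Fragment,
	}, nil
}

func main() {
	// Everything after the port in this URL is a made-up example.
	parts, err := describeURL("http://www.cnblogs.com:8080/path/index.html?name=test1&id=123456#top")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, key := range []string{"scheme", "host", "port", "path", "query-string", "anchor"} {
		fmt.Printf("%-13s %q\n", key, parts[key])
	}
}
```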
DNS is the abbreviation of Domain Name System. It is a hierarchical naming system for computers and network services, organized into domains, and is used on TCP/IP networks to convert host names or domain names into actual IP addresses. DNS acts as this kind of "translator"; its basic working principle is shown in the figure below.

Figure 3.2 How DNS works

The more detailed DNS resolution process is as follows; it helps us understand how DNS works.

  1. When you enter the domain name www.qq.com in the browser, the operating system first checks whether its local hosts file contains a mapping for this name. If it does, that IP address is used and the domain name resolution is complete.

  2. If the hosts file has no mapping for the domain name, the local DNS resolver cache is checked. If it contains a mapping, that is returned directly and the resolution is complete.

  3. If neither the hosts file nor the local DNS resolver cache has a mapping, the query goes to the preferred DNS server configured in the TCP/IP settings, which we call the local DNS server here. When this server receives the query, if the domain name falls within a zone it is configured to serve, it returns the result to the client and the resolution is complete. This answer is authoritative.

  4. If the domain name is not in a zone served by the local DNS server, but the server has cached a mapping for it, the cached IP address is returned and the resolution is complete. This answer is not authoritative.

  5. If neither the local DNS server's zone files nor its cache can answer, what happens next depends on the local DNS server's settings (whether forwarders are configured). If forwarding is not used, the local DNS server sends the request to a root DNS server. The root DNS server determines who is authorized to administer the top-level domain (.com) and returns the IP address of a server responsible for it. The local DNS server then contacts the server responsible for the .com domain. If that server cannot resolve the name itself, it returns the address of the next-level DNS server, the one that manages qq.com, to the local DNS server. The local DNS server then contacts the qq.com server and repeats the process until the www.qq.com host is found.

  6. If forwarding is used, the local DNS server forwards the request to an upstream DNS server, which tries to resolve it. If the upstream server cannot resolve it, it either queries the root DNS servers or forwards the request further upstream, and so on. Whether the local DNS server uses forwarding or root hints, the result always comes back to the local DNS server, which then returns it to the client.

Figure 3.3 The entire process of DNS resolution

In a recursive query, the party submitting the query changes; in an iterative query, the party submitting the query stays the same.

For example, suppose you want the phone number of a girl who takes a law class with you, and you have secretly taken a photo of her. Back in the dorm you tell a very loyal friend, who without hesitation thumps his chest and says: don't worry, I'll find out for you (this is a recursive query: the role of inquirer has changed hands). He then takes the photo to a senior in the college, who tells him the girl is in department XX. He goes straight on to the assistant in the department XX office, who says she is in class YY of department XX. Finally, your friend gets the girl's number from the monitor of class YY in department XX. (These are several iterative queries: the inquirer stays the same, but the party being asked keeps changing.) In the end he hands you the number, which completes the whole query process.

Through the steps above we finally obtain the IP address; in other words, when the browser eventually sends its request, it communicates with the server by IP address.
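The iterative part of the lookup (the local DNS server asking the root, then the .com server, then the qq.com server) can be mimicked with a toy in-memory model. This is a simulation only, not real DNS: the server names, the `servers` table and the IP address (taken from a range reserved for documentation) are all invented for this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// answer is what a simulated name server returns: either the final IP
// address, or a referral to the server responsible for the next level.
type answer struct {
	ip       string
	referral string
}

// servers is a toy stand-in for the DNS hierarchy. Every name and the
// IP address below are invented for illustration.
var servers = map[string]map[string]answer{
	"root":       {"com": {referral: "com-server"}},
	"com-server": {"qq.com": {referral: "qq-server"}},
	"qq-server":  {"www.qq.com": {ip: "203.0.113.10"}},
}

// resolveIterative plays the local DNS server: it remains the one asking
// and follows each referral until some server returns an address. It also
// returns the list of servers it asked, in order.
func resolveIterative(name string) (string, []string, error) {
	labels := strings.Split(name, ".")
	server := "root"
	var asked []string
	for i := len(labels) - 1; i >= 0; i-- {
		asked = append(asked, server)
		zone := strings.Join(labels[i:], ".")
		ans, ok := servers[server][zone]
		if !ok {
			return "", asked, errors.New("no answer for " + zone)
		}
		if ans.ip != "" {
			return ans.ip, asked, nil
		}
		server = ans.referral
	}
	return "", asked, errors.New("no address found for " + name)
}

func main() {
	ip, asked, err := resolveIterative("www.qq.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("asked:", asked)
	fmt.Println("address:", ip)
}
```

For a real lookup, a Go program would normally just call `net.LookupHost`, which hands the whole job to the system resolver.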

HTTP protocol in detail

HTTP is the core of how the Web works, so to understand how the Web works you need a detailed understanding of how HTTP works.

HTTP is a protocol that lets Web servers and browsers (clients) send and receive data over the Internet. It is built on top of TCP and typically uses TCP port 80. It is a request/response protocol: the client makes a request and the server responds to it. In HTTP, the client always initiates a transaction by establishing a connection and sending an HTTP request. The server cannot contact the client on its own initiative, nor can it open a callback connection to the client. Either the client or the server can terminate a connection early. For example, while a browser is downloading a file, you can click the "Stop" button to interrupt the download and close the HTTP connection to the server.

HTTP is stateless: there is no relationship between one request and the previous request from the same client, and the HTTP server cannot tell whether two requests come from the same client. To solve this problem, Web applications introduced the cookie mechanism to maintain state across requests.

Because HTTP is built on top of TCP, attacks on TCP also affect HTTP traffic. A common example is the SYN flood, currently one of the most popular forms of DoS (denial of service) and DDoS (distributed denial of service) attack. It exploits a weakness in the TCP protocol by sending a large number of forged TCP connection requests, causing the attacked host to run out of resources (CPU saturated or memory exhausted).
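A minimal sketch of how a cookie carries state between otherwise unrelated requests is shown here. The cookie name `visits` and the functions `countVisits` and `simulate` are assumptions made for this example; `httptest` is used so that no real network is involved.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// countVisits reads a "visits" cookie (a name chosen for this sketch),
// increments it, and sends the new value back to the client.
func countVisits(w http.ResponseWriter, r *http.Request) {
	visits := 0
	if c, err := r.Cookie("visits"); err == nil {
		if n, err := strconv.Atoi(c.Value); err == nil {
			visits = n
		}
	}
	visits++
	http.SetCookie(w, &http.Cookie{Name: "visits", Value: strconv.Itoa(visits)})
	fmt.Fprintf(w, "visit number %d", visits)
}

// simulate sends n requests to the handler, replaying the cookies from
// each response on the next request the way a browser would, and
// returns the response bodies.
func simulate(n int) []string {
	var bodies []string
	var cookies []*http.Cookie
	for i := 0; i < n; i++ {
		req := httptest.NewRequest(http.MethodGet, "/", nil)
		for _, c := range cookies {
			req.AddCookie(c)
		}
		rec := httptest.NewRecorder()
		countVisits(rec, req)
		cookies = rec.Result().Cookies()
		bodies = append(bodies, rec.Body.String())
	}
	return bodies
}

func main() {
	for _, body := range simulate(3) {
		fmt.Println(body)
	}
}
```

The server itself remembers nothing here; the count survives only because the client sends the cookie back with each request.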

HTTP request packet (browser information)

Let's look at the structure of a request packet. It has three parts: the first is the request line, the second is the request headers, and the third is the body. A blank line separates the headers from the body. A sample request packet looks like this:

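GET /domains/example/ HTTP/1.1        // request line: request method, request URI, HTTP protocol/version
Host:www.iana.org                // host name of the server
User-Agent:Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4            // browser information
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8    // MIME types the client can accept
Accept-Encoding:gzip,deflate,sdch        // whether stream compression is supported
Accept-Charset:UTF-8,*;q=0.5        // character sets the client accepts
// blank line, separating the request headers from the message body
// message body: request resource parameters, for example parameters passed by POST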
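To see how the parts of the sample request map onto Go's types, the sketch below feeds the same request (annotations removed, lines terminated with CRLF) to `http.ReadRequest` from the standard library. The names `rawRequest` and `parseRequest` are invented for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// rawRequest is the sample request shown above without the annotations.
// On the wire every line ends in CRLF and an empty line ends the headers.
const rawRequest = "GET /domains/example/ HTTP/1.1\r\n" +
	"Host: www.iana.org\r\n" +
	"User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/22.0.1229.94 Safari/537.4\r\n" +
	"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n" +
	"Accept-Encoding: gzip,deflate,sdch\r\n" +
	"Accept-Charset: UTF-8,*;q=0.5\r\n" +
	"\r\n"

// parseRequest parses a raw HTTP request the way a server would.
func parseRequest(raw string) (*http.Request, error) {
	return http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
}

func main() {
	req, err := parseRequest(rawRequest)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Request line.
	fmt.Println("method: ", req.Method)
	fmt.Println("URI:    ", req.RequestURI)
	fmt.Println("version:", req.Proto)
	// Request headers. Go moves the Host header into req.Host.
	fmt.Println("host:   ", req.Host)
	fmt.Println("accept-charset:", req.Header.Get("Accept-Charset"))
}
```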

HTTP defines many request methods for interacting with the server. The four basic ones are GET, POST, PUT and DELETE. A URL describes a resource on the network, and GET, POST, PUT and DELETE correspond to the four operations of querying, changing, adding and deleting that resource. The most common are GET and POST. GET is generally used to fetch or query resource information, while POST is generally used to update resource information.

The following request information can be seen by capturing packets with Fiddler:

Figure 3.4 GET request captured by Fiddler

Figure 3.5 POST request captured by Fiddler

Let's look at the differences between GET and POST:

    1. A GET request has an empty message body, while a POST request has a message body.
    2. Data submitted with GET is appended to the URL: a ? separates the URL from the data, and parameters are joined with &, as in EditPosts.aspx?name=test1&id=123456. The POST method puts the submitted data in the body of the HTTP packet.
    3. The amount of data that GET can submit is limited (because browsers limit the length of a URL), whereas the POST method places no such limit on the submitted data.
    4. Submitting data with GET can cause security problems. For example, if a login page submits its data with GET, the user name and password appear in the URL. If the page can be cached, or other people can access the machine, they can obtain the user's account and password from the browser history.
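The second difference can be made concrete with a short sketch that builds the same form data once as a GET and once as a POST, without sending anything. The host `example.com` is a placeholder, and `buildBoth` is a name invented for this example; note that `url.Values.Encode` sorts parameters by key, so the order differs from the sample in the list.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// buildBoth builds a GET and a POST request carrying the same data and
// reports where the data ended up: the GET URL, the POST URL, and the
// POST body.
func buildBoth() (getURL, postURL, postBody string, err error) {
	base := "http://example.com/EditPosts.aspx" // placeholder host
	data := url.Values{"name": {"test1"}, "id": {"123456"}}

	// GET: the data travels in the URL after "?", parameters joined by "&".
	get, err := http.NewRequest(http.MethodGet, base+"?"+data.Encode(), nil)
	if err != nil {
		return "", "", "", err
	}

	// POST: the same encoded data travels in the message body instead.
	post, err := http.NewRequest(http.MethodPost, base, strings.NewReader(data.Encode()))
	if err != nil {
		return "", "", "", err
	}
	post.Header.Set("Content-Type", "application/x-www-form-urlencoded")

	body, err := io.ReadAll(post.Body)
	if err != nil {
		return "", "", "", err
	}
	return get.URL.String(), post.URL.String(), string(body), nil
}

func main() {
	getURL, postURL, postBody, err := buildBoth()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("GET URL:  ", getURL)
	fmt.Println("POST URL: ", postURL)
	fmt.Println("POST body:", postBody)
}
```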

HTTP response packet (server information)

Let's look at the HTTP response packet, which has the following structure:

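HTTP/1.1 200 OK                        // status line
Server: nginx/1.0.8                    // name and version of the Web server software
Date: Tue, 30 Oct 2012 04:14:25 GMT        // time the response was sent
Content-Type: text/html                // type of the content sent by the server
Transfer-Encoding: chunked            // the HTTP body is sent in chunks
Connection: keep-alive                // keep the connection open
Content-Length: 90                    // length of the body
// blank line, separating the message headers from the body
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"... // message body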

The first line of the response packet is called the status line. It consists of the HTTP protocol version, the status code, and the status message.

The status code tells the HTTP client whether the HTTP server produced the expected response. HTTP/1.1 defines five classes of status code. A status code consists of three digits, and the first digit defines the class of the response.

    • 1XX Informational: the request has been received and processing continues
    • 2XX Success: the request has been successfully received, understood and accepted
    • 3XX Redirection: further action is required to complete the request
    • 4XX Client error: the request has a syntax error or cannot be fulfilled
    • 5XX Server error: the server failed to fulfil a valid request
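Since the class is decided by the first digit alone, classifying a status code takes only an integer division. The sketch below is one possible way to write it; `statusClass` and its labels are invented for this example, while the constants and `http.StatusText` come from Go's standard library.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusClass maps a status code to its class using only the first digit.
func statusClass(code int) string {
	switch code / 100 {
	case 1:
		return "informational"
	case 2:
		return "success"
	case 3:
		return "redirection"
	case 4:
		return "client error"
	case 5:
		return "server error"
	default:
		return "unknown"
	}
}

func main() {
	codes := []int{
		http.StatusOK,
		http.StatusFound,
		http.StatusNotFound,
		http.StatusInternalServerError,
	}
	for _, code := range codes {
		fmt.Printf("%d %s: %s\n", code, http.StatusText(code), statusClass(code))
	}
}
```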

The figure below shows detailed response information. On the left you can see the status codes of many resources: 200 is the most common and indicates a normal response, while 302 indicates a redirect. The response headers show the details.

Figure 3.6 All request information for a visit to one site

The difference between HTTP being stateless and Connection: keep-alive

Stateless means that the protocol has no memory of previous transactions, and the server does not know what state the client is in. Seen another way, opening one page on a server has no connection to the pages you opened on that server before.

HTTP is a stateless, connection-oriented protocol. Stateless does not mean that HTTP cannot keep TCP connections open, and it certainly does not mean that HTTP uses UDP (which is connectionless).

Since HTTP/1.1, keep-alive is enabled by default. Put simply, after a Web page has been loaded, the TCP connection used to carry HTTP data between client and server is not closed; if the client requests another page from the same server, it reuses this established TCP connection.

Keep-alive does not keep the connection open forever. It has a hold time, which can be configured in the server software (for example Apache).
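Connection reuse can be observed from Go with `net/http/httptrace`. The sketch below is an illustration under assumptions: it uses a local `httptest` server rather than Apache, sets the hold time through Go's own `http.Server.IdleTimeout` field (30 seconds is an arbitrary value), and the names `fetch` and `reuseDemo` are invented for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// fetch issues a GET and reports whether it ran over a TCP connection
// that had already been used for an earlier request.
func fetch(client *http.Client, url string) (bool, error) {
	reused := false
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// Read the body to the end so the connection can be kept for reuse.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return false, err
	}
	return reused, nil
}

// reuseDemo makes two requests to a local test server and reports, for
// each one, whether an existing connection was reused.
func reuseDemo() (first, second bool, err error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	}))
	// The keep-alive hold time on the server side (arbitrary value).
	srv.Config.IdleTimeout = 30 * time.Second
	srv.Start()
	defer srv.Close()

	client := srv.Client()
	if first, err = fetch(client, srv.URL); err != nil {
		return false, false, err
	}
	if second, err = fetch(client, srv.URL); err != nil {
		return false, false, err
	}
	return first, second, nil
}

func main() {
	first, second, err := reuseDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("first request reused a connection: ", first)
	fmt.Println("second request reused a connection:", second)
}
```

The first request has to open a new connection; the second should find that connection idle and reuse it, which is the keep-alive behaviour described above.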

Request example

Figure 3.7 Request and response for a single request

The figure above shows the whole communication process. Attentive readers will have noticed something: there was only one URL request, so why are there so many resource requests in the left-hand column? (These are static files; Go has a dedicated way of handling static files.)

This is the browser at work. The first request goes to the URL, and the server returns an HTML page. The browser then starts rendering the HTML: when it finds links to images, CSS and JavaScript in the HTML DOM, it automatically issues HTTP requests for those static resources, fetches them, and renders them. Finally it combines all the resources and displays the complete page on the screen.

One item in Web optimization is reducing the number of HTTP requests: merging CSS and JS resources where possible so that a page requests fewer static resources. This speeds up page loading and reduces load on the server.


Original link: https://astaxie.gitbooks.io/build-web-application-with-golang/content/zh/03.1.html