(Repost) The basic process and principles of a user's access to a website (the most detailed walkthrough)

Source: Internet
Author: User

Original: http://blog.csdn.net/yonggeit/article/details/72857630

Contents

    1. Overall flow of a user's access to a website
    2. How DNS resolution works
    3. TCP/IP three-way handshake
      1. OSI Reference Model
      2. How data is processed in the TCP/IP model
      3. Approximate data-link flow over Ethernet
      4. Route resolution
      5. Static routes
      6. Dynamic routing
      7. Routing algorithms
    4. HTTP protocol principles (WWW service request process): request and message details
    5. Large-scale site cluster architecture details
    6. HTTP protocol principles (WWW service response process): response and message details
      1. HTTP message structure
      2. The HTTP/1.1 specification defines the following 47 header fields
    7. TCP/IP four-way wave (connection teardown)
      1. Persistent connections
      2. Attack patterns for Web apps
      3. Cross-site scripting attacks (XSS)
      4. SQL injection attacks
      5. OS command injection attack
      6. HTTP header injection attack
      7. Security vulnerabilities caused by session management negligence
      8. DoS attacks


Overall flow of a user's access to a website

Step 1: The user enters the URL www.baidu.com in the browser. The system first checks the local hosts file and the DNS cache for a record that maps this domain name to an IP address. If a record exists, the browser uses that IP address directly to visit the site. On a first request there is usually no cached record.

Step 2: If neither the DNS cache nor the hosts file on the client has a record for www.baidu.com, the system forwards the resolution request to the DNS server configured on the client. This server is called the LDNS (local DNS). If the LDNS server has the record in its own cache, it returns the IP address directly. If not, LDNS takes responsibility for querying other DNS servers.

Step 3: LDNS starts at the root of the DNS system (".") and asks for the resolution of www.baidu.com. After a series of queries to the DNS servers at each level, it finally reaches the authoritative DNS server for the domain. The authoritative DNS server is the server that manages resolution for the domain name the company purchased. It holds the IP record for www.baidu.com. If the record is missing, the company's operators have not yet configured resolution for www.baidu.com.

Step 4: The authoritative DNS server for baidu.com sends the final IP record for www.baidu.com to LDNS.

Step 5: LDNS passes the IP record it received from the authoritative DNS server to the client browser, and also caches the domain-to-IP mapping locally so that the same request can be answered faster next time.

Step 6: Once the browser has the IP address of www.baidu.com, it sends its request to the web server at that address. The web server receives the request, processes it, and returns the requested content to the browser.

At this point, the complete process of visiting the web page is finished.

How DNS resolution works

Computers can only communicate with each other through IP addresses. Because IP addresses are hard to remember, DNS servers are used to resolve domain names to the corresponding IP addresses. Take www.baidu.com as an example.

When we type the URL and press Enter, the browser first checks its own cache. Entries in this cache may live for only about one minute. If nothing is found there, the system checks the local DNS cache and the hosts file. If either contains an IP address for www.baidu.com, the browser accesses the web server directly through that IP address.

If the local DNS cache and the hosts file have no record, the request is sent to the DNS servers listed in the network card configuration. Usually two are configured, and DNS2 is used only when DNS1 cannot be reached. The DNS server in the network card configuration is what we call the local DNS.

The local DNS first checks its own cache for a www.baidu.com record and returns it to the user if one exists. If not, it queries a root name server. There are 13 root name server addresses worldwide. The root name server sees that the name ends in .com and returns the IP address of the .com top-level domain name server to the local DNS. The local DNS then queries the .com top-level domain name server, which looks up baidu.com and returns the IP address of the baidu.com name server. The lookup continues downward in this way until the local DNS obtains the A record or CNAME for www.baidu.com from the authoritative DNS server.

The local DNS then sends the IP address of www.baidu.com to the client and stores it in its cache, so that the next time another user asks for www.baidu.com the answer is already cached. When the client receives the IP address from the local DNS, it accesses the server through that address and records the address in its own DNS cache.
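The lookup order described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the nested dictionary `ROOT_ZONE`, the names `iterative_lookup` and `resolve`, and the example domain and address are all hypothetical, and no real DNS traffic is generated.

```python
from typing import Dict, Optional

# Hypothetical delegation tree: root -> "com" -> "example.com" -> address record.
ROOT_ZONE = {"com": {"example.com": {"www.example.com": "192.0.2.10"}}}


def iterative_lookup(name: str, root: dict) -> Optional[str]:
    """Walk from the root zone down to the record, one level per query."""
    labels = name.split(".")
    zone = root
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])  # "com", then "example.com", ...
        zone = zone.get(suffix)
        if zone is None:
            return None  # nothing is delegated or recorded at this level
        if isinstance(zone, str):
            # An address only counts if it belongs to the full name we asked for.
            return zone if i == 0 else None
    return None


def resolve(name: str, hosts: Dict[str, str], client_cache: Dict[str, str],
            ldns_cache: Dict[str, str], root: dict = ROOT_ZONE) -> Optional[str]:
    """hosts file -> client cache -> local DNS cache -> iterative lookup."""
    for table in (hosts, client_cache):
        if name in table:
            return table[name]
    ip = ldns_cache.get(name)
    if ip is None:
        ip = iterative_lookup(name, root)
        if ip is None:
            return None
        ldns_cache[name] = ip  # the local DNS caches the answer
    client_cache[name] = ip    # the client caches it as well
    return ip
```

Calling `resolve("www.example.com", {}, {}, {})` walks the whole tree once; a second call with the same cache dictionaries is answered from the client cache.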

TCP/IP three-way handshake

Once DNS resolution has returned the IP address, the client can send HTTP requests to the server at that address. HTTP works at the application layer (layer 7) and TCP works at the transport layer (layer 4), so a TCP three-way handshake must take place before the HTTP request is sent.
The TCP three-way handshake works as follows. The client first sends the server a packet carrying the SYN flag and a random SEQ number. After the server receives it, it replies with an ACK whose value is the client's SEQ number plus 1. This reply also carries a SYN flag and the server's own random SEQ number. After the client receives the reply, it sends the server an ACK whose value is the server's SEQ number plus 1. Once these three steps are done, the handshake is complete and data can be transmitted.
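The sequence and acknowledgement numbers in the three steps can be written out as a short sketch. The `Segment` type and the function `three_way_handshake` are hypothetical names used for illustration only; this is a model of the numbers, not a TCP implementation.

```python
from typing import List, NamedTuple, Optional

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


class Segment(NamedTuple):
    sender: str
    flags: str
    seq: int
    ack: Optional[int]


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged for the given initial sequence numbers."""
    return [
        # 1. Client -> server: SYN with the client's initial sequence number.
        Segment("client", "SYN", client_isn, None),
        # 2. Server -> client: SYN+ACK, acknowledging client_isn + 1.
        Segment("server", "SYN+ACK", server_isn, (client_isn + 1) % SEQ_SPACE),
        # 3. Client -> server: ACK, acknowledging server_isn + 1.
        Segment("client", "ACK", (client_isn + 1) % SEQ_SPACE,
                (server_isn + 1) % SEQ_SPACE),
    ]
```

For example, with a client number of 100 and a server number of 300, the server acknowledges 101 and the client acknowledges 301.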

OSI Reference Model

How data is processed in the TCP/IP model

Approximate data-link flow over Ethernet

Route resolution:

Static routes

Dynamic routing

Routing algorithms

HTTP protocol principles (WWW service request process): request and message details

At this point the client starts sending the HTTP request message.

An HTTP request message consists mainly of the request line, the request headers, a blank line, and the request body.

The request line contains the request method, the URL, and the protocol version. The main request methods are GET, POST, PUT, and DELETE (MOVE is a WebDAV extension method). The URL is the Uniform Resource Locator, which uniquely identifies a web resource on the server. The mainstream protocol version is HTTP/1.1; the version that was popular at the beginning was HTTP/1.0. Compared with HTTP/1.0, HTTP/1.1 improves extensibility, cache handling, bandwidth optimization, persistent connections, the Host header, error notification, message delivery, content negotiation, and other aspects.
The request headers mainly carry the media type, the language, supported compression, the client type, the host name, and so on. The media type covers text files, image files, video files, and the like. The language tells the server which languages the client accepts. Compression support saves bandwidth. The client type shows the browser version, operating system information, and similar details.
The blank line marks the end of the request headers and the start of the request body.
The request body is typically present only when data is submitted, for example when a form is sent with POST.
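As a rough illustration of the four parts (request line, headers, blank line, body), the following sketch assembles a raw HTTP/1.1 request as bytes. The function name `build_request` and the example host are assumptions made for this example; real clients add more headers and validation.

```python
from typing import Dict, Optional


def build_request(method: str, path: str, host: str,
                  headers: Optional[Dict[str, str]] = None,
                  body: bytes = b"") -> bytes:
    """Assemble request line + headers + blank line + optional body."""
    lines = [f"{method} {path} HTTP/1.1",  # request line: method, URL, version
             f"Host: {host}"]              # Host header is mandatory in HTTP/1.1
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # The empty line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

A GET request built this way ends with the blank line, while a POST request is followed by the form data after it.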

Large-scale site cluster architecture details

There are three common kinds of web resources: static pages, dynamic pages, and pseudo-static pages.
A static page has no backend database and no PHP, JSP, ASP, or similar programs. It is not interactive: whatever the developer wrote is what is shown, and it does not change.
A dynamic page has a backend database and supports more features, such as user registration, login, posting, ordering, and blogging. A dynamic page does not exist on the server as a standalone page file. When the user requests the dynamic program on the server, the server executes the program, queries the database, and returns a complete page. Its URL differs from a static page URL in that it contains special symbols such as ? and &, which causes some problems for search engine indexing. To make indexing easier, dynamic pages often use rewrite rules to disguise the dynamic URL as a static one. This is what pseudo-static means.
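A pseudo-static rule can be pictured as a simple pattern mapping. The sketch below is an assumption-laden illustration: the URL layout `/article/<id>.html`, the target `/article.php?id=<id>`, and the function name `rewrite` are all hypothetical, and real sites usually express such rules in the web server configuration rather than in application code.

```python
import re

# Hypothetical rule: a static-looking URL maps to a dynamic script with a query string.
_ARTICLE_RULE = re.compile(r"^/article/(\d+)\.html$")


def rewrite(path: str) -> str:
    """Translate a pseudo-static path into the dynamic URL that serves it."""
    match = _ARTICLE_RULE.match(path)
    if match:
        return f"/article.php?id={match.group(1)}"
    return path  # paths that do not match the rule are left unchanged
```

The visitor and the search engine only ever see `/article/42.html`, while the server internally handles `/article.php?id=42`.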

Different web resources are opened in different ways. First assume we are visiting a static site:
The client downloads the HTML file from the server over HTTP, then reads it and requests the linked resources from top to bottom, one request per link. Images are rendered while they download. When the browser encounters JS, it loads the JS. If the JS is large or complex, the browser waits and the mouse cursor spins. We call this JS blocking. Once the JS has been downloaded and executed, the page we see is displayed.

When we visit a dynamic page, the server receives the user's request. Assume the server runs Nginx: Nginx passes the request to PHP, PHP queries the database, and a complete page is generated from the values the database returns and sent to the user. The browser then renders while downloading and loads the JS, and after the JS has executed the page we see is displayed.

When a site reaches hundreds of millions of page views (PV), the access process becomes more complex. The user's request first goes to CDN nodes distributed across the country, and the CDN absorbs roughly 80% of the requests. Requests that the CDN cannot serve go on to the server cluster. The cluster generally has a layer-4 proxy at the front: implemented in software this is LVS, and implemented in hardware it is F5. Behind the layer-4 proxy is a layer-7 load balancer, commonly HAProxy or Nginx, and behind that are multiple web servers.
With many web servers, two problems arise. The first is the consistency of user data: data must not get out of sync just because different web servers handle the requests. For this we use NFS shared storage. The second is the session: a session must not be lost just because a different web server handles the next request. For this we use memcached to store and share sessions.
Because the volume of user access is so large, the bottleneck at this point is the load on the database. We generally use distributed caches such as Memcache or Redis, and the database also needs read/write splitting. The rest of the process is similar to accessing a dynamic page.
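The session problem can be illustrated with a toy model in which a plain dictionary stands in for the shared memcached store and requests are handed to web servers in round-robin order. The names `WebServer` and `make_cluster` are hypothetical, and a real deployment would use a memcached or Redis client instead of an in-process dictionary.

```python
import itertools
from typing import Dict, Iterator, List, Optional, Tuple


class WebServer:
    def __init__(self, name: str, session_store: Dict[str, str]) -> None:
        self.name = name
        # All servers receive the same store object, like a shared memcached.
        self.sessions = session_store

    def handle(self, session_id: str, user: Optional[str] = None) -> Tuple[str, Optional[str]]:
        """Log the user in if `user` is given, then report who owns the session."""
        if user is not None:
            self.sessions[session_id] = user
        return self.name, self.sessions.get(session_id)


def make_cluster(names: List[str], session_store: Dict[str, str]) -> Iterator[WebServer]:
    """Round-robin iterator over web servers that share one session store."""
    servers = [WebServer(name, session_store) for name in names]
    return itertools.cycle(servers)
```

If the login request lands on web1 and the next request lands on web2, web2 still finds the session because both read from the same store. With one private dictionary per server, the second lookup would return `None`.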

HTTP protocol principles (WWW service response process): response and message details

After the server receives the request message, it returns a response message.

The response message consists mainly of the status line, the response headers, a blank line, and the response body.

The status line contains the HTTP version, a numeric status code, and a reason phrase.
Common status codes include:
200 OK: the request succeeded
301 Moved Permanently: permanent redirect
403 Forbidden: no permission to access the resource
404 Not Found: the requested file does not exist
500 Internal Server Error: an unspecified error on the server
502 Bad Gateway: gateway error
503 Service Unavailable: server overloaded or down for maintenance
504 Gateway Timeout: the gateway timed out
The response headers include the server's web software and version, the server time, whether the connection is long (persistent) or short, the character set, and so on.
The blank line serves the same purpose as in the request message.
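To make the layout concrete, here is a minimal sketch that splits a raw response into status line, headers, and body. The function name `parse_response` is hypothetical, and the sketch assumes the whole response is already in memory and ignores details such as chunked transfer encoding and repeated header names.

```python
from typing import Dict, Tuple


def parse_response(raw: bytes) -> Tuple[str, int, str, Dict[str, str], bytes]:
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    # The first blank line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: HTTP version, numeric status code, reason phrase.
    parts = lines[0].split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, code, reason, headers, body
```

Feeding it a response that starts with `HTTP/1.1 404 Not Found` yields the version string, the integer 404, and the reason phrase as separate values.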

HTTP message structure

(1) An HTTP message can be roughly divided into two parts: the message header and the message body.

(2) Structure example of a request message and a response message

The HTTP/1.1 specification defines the following 47 header fields

(1) General header fields

(2) Request header fields

(3) Response header fields

(4) Entity header fields

TCP/IP four-way wave (connection teardown)

After the browser has loaded the full page, the connection to the server still has to be closed. This is the TCP four-way wave.
First the client sends a packet carrying the FIN flag and a SEQ number. After the server receives it, it replies with an ACK whose value is that SEQ number plus 1. The server then sends its own packet, which also carries a FIN flag and a SEQ number. After the client receives it, it replies with an ACK whose value is the server's SEQ number plus 1. With that, the four waves between the server and the client are finished.
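The four segments can be listed in the same spirit. This is a simplified model under stated assumptions: the function name `four_way_close` is hypothetical, no data is in flight during teardown, and the ACK flag that real FIN segments normally also carry is left out to match the description above.

```python
from typing import List, Optional, Tuple

SEQ_SPACE = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

# (sender, flag, seq, ack)
Step = Tuple[str, str, int, Optional[int]]


def four_way_close(client_seq: int, server_seq: int) -> List[Step]:
    """Return the four segments of a client-initiated connection teardown."""
    client_next = (client_seq + 1) % SEQ_SPACE  # a FIN consumes one sequence number
    server_next = (server_seq + 1) % SEQ_SPACE
    return [
        ("client", "FIN", client_seq, None),         # 1. client asks to close
        ("server", "ACK", server_seq, client_next),  # 2. server acknowledges
        ("server", "FIN", server_seq, client_next),  # 3. server closes its side
        ("client", "ACK", client_next, server_next), # 4. client acknowledges
    ]
```

Steps 2 and 3 are separate because the server may still have data to send after acknowledging the client's FIN.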

Persistent connections

In the initial version of the HTTP protocol, the TCP connection was closed after every HTTP exchange. Each request therefore caused a TCP connection to be set up and torn down unnecessarily, which increased communication overhead. To solve this problem, HTTP/1.1 introduced persistent connections (also known as HTTP keep-alive): the TCP connection stays open as long as neither end explicitly closes it.
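Whether a connection stays open after a message can be decided from the protocol version and the Connection header. The sketch below encodes the usual rule (HTTP/1.1 is persistent unless `Connection: close` is sent; HTTP/1.0 is persistent only with `Connection: keep-alive`). The function name `is_persistent` is hypothetical.

```python
from typing import Dict


def is_persistent(version: str, headers: Dict[str, str]) -> bool:
    """Decide whether the TCP connection should be kept open after this message."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    tokens = {token.strip().lower()
              for token in normalised.get("connection", "").split(",")
              if token.strip()}
    if "close" in tokens:
        return False             # either side may ask for the connection to close
    if version == "HTTP/1.1":
        return True              # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 must opt in explicitly
```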
  

Attack patterns for Web apps

(1) Active attack: the attacker accesses the web application directly and sends the attack code to it. The most typical examples are SQL injection attacks and OS command injection attacks.

(2) Passive attack: the attacker uses a trap to get the attack code executed. In a passive attack, the attacker does not access the target web application directly.

Cross-site scripting attacks (XSS)

A cross-site scripting attack (Cross-Site Scripting, XSS) runs illegitimate HTML tags or JavaScript in the browser of a user who is visiting a website that has a security vulnerability.
Cross-site scripting attacks can have the following effects:
(1) Using fake input forms to trick users into giving up personal information.
(2) Using a script to steal the user's cookie value, so that the victim unknowingly helps the attacker send malicious requests.
(3) Displaying forged articles or images.
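A common defence, shown here only as a minimal sketch, is to escape user-supplied text before placing it in HTML so that the browser displays it as text instead of executing it. The function name `render_comment` is hypothetical; `html.escape` is from the Python standard library.

```python
import html


def render_comment(user_text: str) -> str:
    """Wrap user input in a paragraph, escaping HTML special characters."""
    # <, >, & and quotes become entities, so injected tags are shown, not run.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"
```

Escaping depends on context: text placed inside a script block, a URL, or an unquoted attribute needs different handling from this HTML-body case.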
  

SQL injection attacks

SQL injection is an attack against the database used by a web application, carried out by making the application run illegitimate SQL. This vulnerability can pose a major threat and sometimes leads directly to the disclosure of personal and confidential information.
SQL injection attacks can have the following effects:
(1) Illegally viewing or tampering with data in the database.
(2) Bypassing authentication.
(3) Executing programs related to operations on the database server, and so on.
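The standard countermeasure is to pass user input as a bound parameter instead of concatenating it into the SQL string. The sketch below uses the standard-library `sqlite3` module with an in-memory database; the table layout and the names `demo_connection` and `find_user` are assumptions made for this example.

```python
import sqlite3
from typing import List, Tuple


def demo_connection() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a small users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


def find_user(conn: sqlite3.Connection, name: str) -> List[Tuple[int, str]]:
    """Look up users by name with a bound parameter."""
    # The ? placeholder keeps `name` as data; it is never parsed as SQL.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()
```

With string concatenation, an input such as `' OR '1'='1` would change the query and return every row. With the bound parameter it is compared as an ordinary string and matches nothing.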

OS Command injection attack

An OS command injection attack executes illegitimate operating system commands through a web application. Any place where the application can invoke a shell is at risk.
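One way to reduce this risk, sketched below, is to avoid the shell entirely and pass the command as an argument list, so that user input reaches the program as a single argument. The function name `echo_argument` is hypothetical, and the child process here is just the Python interpreter printing its argument back.

```python
import subprocess
import sys


def echo_argument(user_input: str) -> str:
    """Run a child process with user_input as one argument, without a shell."""
    result = subprocess.run(
        # A list of arguments plus the default shell=False means no shell
        # parses the input, so characters such as ; | & have no special meaning.
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing `report.txt; echo injected` simply returns that same string. Had the command been built as one string and run through a shell, the part after the semicolon would have been executed as a second command.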

HTTP Header Injection Attack

An HTTP header injection attack is one in which the attacker inserts a line break into a response header field in order to add arbitrary response headers or body content. It is a passive attack.
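Since the attack relies on line breaks, a basic check is to reject any header value that contains a carriage return or line feed. The following is a minimal sketch with hypothetical function names; web frameworks typically perform an equivalent check internally.

```python
def safe_header_value(value: str) -> str:
    """Return value unchanged, or raise if it could start a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("header value must not contain CR or LF")
    return value


def redirect_header(location: str) -> str:
    """Build a Location header line from a user-influenced target."""
    return f"Location: {safe_header_value(location)}\r\n"
```

An input such as `/home` followed by a line break and `Set-Cookie: sid=evil` is rejected instead of producing a second header line.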

Security vulnerabilities caused by session management negligence

(1) Session hijacking: the attacker obtains the user's session ID by some means and uses it illegitimately to impersonate the user.
(2) Session fixation attack: the attacker forces the user to use a session ID that the attacker has chosen. This is a passive attack.
(3) Cross-site request forgery (Cross-Site Request Forgery, CSRF): the attacker sets a trap that forces an authenticated user to perform unintended state changes, such as updating personal information or settings. This is a passive attack.
CSRF may have the following effects:
1. Using the authenticated user's privileges to update settings and similar information;
2. Using the authenticated user's privileges to purchase goods;
3. Using the authenticated user's privileges to post comments on a message board.
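A widely used CSRF countermeasure is a per-session random token that must accompany every state-changing request. The sketch below is an illustration under assumptions: the session is modelled as a plain dictionary, and the names `issue_csrf_token` and `is_valid_csrf` are hypothetical.

```python
import hmac
import secrets
from typing import Dict, Optional


def issue_csrf_token(session: Dict[str, str]) -> str:
    """Generate an unpredictable token and remember it in the user's session."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token  # the page embeds this in a hidden form field


def is_valid_csrf(session: Dict[str, str], submitted: Optional[str]) -> bool:
    """Accept the request only if the submitted token matches the session's."""
    expected = session.get("csrf_token")
    if not expected or not submitted:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

A forged request from another site carries the user's cookie but cannot know the token, so it fails the check.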

DoS attacks

A DoS attack (Denial of Service attack) is an attack that forces a running service into a stopped state. It is sometimes called a service stop attack or denial-of-service attack. There are two main types of DoS attack:
(1) Flooding the service with access requests so that its resources are overloaded and exhausted, which in effect stops the service.
Put simply, the attacker sends a large number of legitimate-looking requests. The server has difficulty telling normal requests from attack requests, so DoS attacks are hard to prevent. A DoS attack launched from many computers is a DDoS attack (Distributed Denial of Service attack). DDoS attacks typically use virus-infected computers as a springboard for the attacker.
(2) Stopping the service by attacking a security vulnerability.
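For the first type, one partial mitigation is to limit how many requests each client may send within a time window. The sketch below is a simple sliding-window limiter with hypothetical names; timestamps are passed in explicitly to keep it deterministic. It does not stop a distributed attack from many addresses, but it shows the idea of shedding excess load per client.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        """Return True if `client` may make a request at time `now` (seconds)."""
        hits = self._hits[client]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject without recording the attempt
        hits.append(now)
        return True
```

With a limit of two requests per second, a third request from the same client inside that second is rejected, while other clients are unaffected.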
