Computer network summary notes (Part 1)

Source: Internet
Author: User

This log summarizes the fourth edition of "Computer Networking: A Top-Down Approach Featuring the Internet." Because the original book is in English, this summary may progress slowly. I will try to keep the language as simple as possible, and a glossary of terms is provided at the end. Since the amount of content is hard to predict, the notes are split into two parts; problem solutions and some experimental data will be added in separate logs. Summary Notes (I) covers the overview, the application layer, and the transport layer.

A. Overview

1.1 Overview. First, we call a collection of interconnected, autonomous computers a computer network; the Internet is one specific such network. The Internet can be described in two ways. From a nuts-and-bolts viewpoint, the Internet interconnects many devices called hosts or end systems. These sit at the edge of the network, and hosts can be divided into clients and servers. The devices are connected by communication links (physical media), and packet switches sit along the paths between them. When hosts exchange data, the data is sent in packets, and packet switches forward those packets. Common packet switches are routers and link-layer switches. Hosts access the Internet through an ISP (Internet Service Provider). All Internet communication must follow certain protocols, whose specifications are known as Internet standards. From a services viewpoint, the Internet is a computer network that provides a service to distributed applications running on end systems, letting them exchange data, as in software downloads and web browsing. It also offers these applications two kinds of service: a connection-oriented reliable service and a connectionless unreliable service.
To put it simply, the connection-oriented service has a handshake procedure and confirms that each packet is correctly sent and received, while the connectionless service does not check on the other side at all: you simply send or receive messages.

1.2 The network core. We know that information is exchanged between hosts through links and switches. The two basic approaches are circuit switching and packet switching; the former is used mainly in telephone systems. A brief comparison of the two:

Circuit switching: each host connects directly to a switch, and the switches are joined by physical links. If two hosts want to communicate, a circuit must be reserved on every switch along the path. Assuming each link carries n circuits, a connection gets 1/n of the link bandwidth for its entire duration. (Think of the telephone system.)

Packet switching: applications exchange messages to do their work, and these messages carry whatever content the protocol requires. The host breaks long messages into packets and sends them toward a packet switch. A switch uses store-and-forward transmission: put simply, it must receive the entire packet before it can begin transmitting any of it onto the outbound link, which produces a store-and-forward delay. For each outbound link the switch also maintains an output buffer (output queue), because only one packet at a time can be transmitted on a link; other packets must wait in the queue, which causes queuing delay. If the queue is full and a newly arriving packet cannot enter it, packet loss occurs. There are also nodal processing delay, propagation delay, and so on; together these make up the total nodal delay.

1.3 Protocol layers. The network is organized as a protocol stack. Top-down, the layers are: application, transport, network, link, and physical.
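The delay components from 1.2 can be illustrated with a small numeric sketch. The packet size, link rate, and distance below are made-up example values, and queuing and processing delay are ignored:

```python
# Sketch: per-hop delay components in packet switching.
# The numbers below are illustrative assumptions, not from the notes.

def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Time to push all bits of a packet onto the link (store-and-forward)."""
    return packet_bits / rate_bps

def propagation_delay(distance_m: float, speed_mps: float = 2e8) -> float:
    """Time for a bit to travel the length of the physical link."""
    return distance_m / speed_mps

# Example: a 12,000-bit packet over a 1 Mbps link spanning 100 km.
d_trans = transmission_delay(12_000, 1e6)   # 0.012 s
d_prop = propagation_delay(100_000)         # 0.0005 s
total = d_trans + d_prop                    # ignoring queuing and processing delay
print(f"transmission={d_trans:.4f}s propagation={d_prop:.4f}s total={total:.4f}s")
```

Note that transmission delay depends on the link rate while propagation delay depends only on distance; a store-and-forward switch pays the full transmission delay again on every hop.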
The main protocols at these layers are HTTP & SMTP, UDP & TCP, IP, PPP, and so on.

----------- B. Application Layer ------------

First we study network protocols from the viewpoint of a web application. When a process communicates, it must hand its packets to a socket. This happens at the application layer; the socket passes the packets down to the transport layer, which is controlled by the operating system. The data is then carried across the Internet to the receiver's transport layer and reaches the destination process through the other side's socket. The socket therefore serves as the API between the application and the network. The services a transport-layer protocol can offer include reliable data transfer, throughput, timing, and security. Different applications need different services, so they can choose different transport-layer protocols. The common ones are TCP and UDP. TCP is a connection-oriented, reliable data transfer service with a congestion-control mechanism: put simply, when the network between the two sides is congested, TCP throttles the sending process. This is a technique with both advantages and disadvantages, discussed later. UDP is connectionless and has no congestion control, so it is faster than TCP, but of course reliability is not guaranteed. After choosing a protocol, the application must also identify the target host (by its IP address) and the target process on that host; the latter is solved by port numbers, i.e., different kinds of network programs listen on different ports.

------------- HTTP -------------

An application-layer protocol specifies how packets are exchanged between applications. The most famous is HTTP (HyperText Transfer Protocol). HTTP uses TCP as its underlying transport-layer protocol. Because its servers store no client state, HTTP is a stateless protocol. In addition, HTTP uses persistent connections by default.
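Backing up to the socket as the API between application and transport layer, here is a minimal sketch: a TCP echo pair on localhost. The structure (a throwaway server thread, port 0 to let the OS choose) is purely illustrative:

```python
# Sketch: the socket as the application/transport-layer API.
# A minimal TCP echo pair on localhost; all names are illustrative.
import socket
import threading

def echo_server(sock: socket.socket) -> None:
    conn, _ = sock.accept()      # the OS transport layer completes the handshake
    with conn:
        data = conn.recv(1024)   # data arrives from the transport layer
        conn.sendall(data)       # hand the reply back through the socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=echo_server, args=(srv,), daemon=True).start()

cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
cli.connect(("127.0.0.1", port)) # TCP is connection-oriented: handshake first
cli.sendall(b"hello")            # the application hands its message to the socket
reply = cli.recv(1024)
cli.close()
srv.close()
print(reply)                     # b'hello'
```

Everything below the `sendall`/`recv` calls (segmentation, acknowledgment, retransmission) is handled by the transport layer, which is exactly the division of labor the notes describe.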
Since each request/response runs over TCP, the question is whether each request uses a new TCP connection or one TCP connection carries all requests; the two approaches are called non-persistent and persistent connections respectively. Because non-persistent connections put heavy pressure on the server, persistent connections are usually the default. When a connection has been idle for some time, the server closes the TCP connection.

------------ Cookies -----------

Cookies help servers identify users. The cookie technology has four components: a cookie header line in the HTTP request and response messages, a cookie file kept on the client system, and a back-end database at the web site. When a user first visits a server, the response message contains a Set-cookie header whose value is usually a unique identifier chosen by the server; the browser saves this field, together with the server's host name and other information, locally. On each later visit to the site, the browser looks up the site's identifier among its cookies and sends it in the request message. The server can then recognize the user by this unique identifier and act on the user's browsing history at the site.

---------- Web Cache ----------

A web cache is also called a proxy server. When a browser requests a file, it first opens a TCP connection to the web cache. If the file exists on the cache server, it is returned immediately; otherwise the cache opens a TCP connection to the origin server, fetches and stores the file, and then sends it back to the client. A web cache can greatly reduce response time for the client, reduce traffic on the link between the internal network and the Internet, and save bandwidth. But it brings a new problem: the files on the web cache may be out of date.
HTTP therefore introduces the conditional GET: when the web cache stores a file, it records the file's last-modified time. Each time a client requests the file, the web cache sends the origin server a request carrying an If-Modified-Since header; if the file has not changed, the cached copy is returned directly to the user, otherwise the file is downloaded again.

------------- FTP -----------------

FTP uses two parallel TCP connections: a control connection and a data connection.

------------ Mail System ------------

The mail system consists of three parts: user agents, mail servers, and mail transfer protocols. SMTP is the principal application-layer protocol for mail. User A's user agent sends a message to the mail server holding A's mailbox; server A establishes a TCP connection to server B and transfers the mail over SMTP; B's server then delivers the message to B's user agent. If server B is down, the message stays in server A's message queue. Comparing SMTP and HTTP: first, HTTP is a pull protocol, mainly used to fetch information, while SMTP is a push protocol, mainly used to send information. Second, SMTP requires every message, including the body, to be in 7-bit ASCII; HTTP has no such restriction. Third, for multimedia, HTTP encapsulates each object in its own response message, while SMTP places all objects in one message. Because SMTP requires all content to be ASCII-encoded, Multipurpose Internet Mail Extensions (MIME) must be used to add support for images, video, or non-ASCII characters: Content-Type and Content-Transfer-Encoding headers are added to the message. After receiving a message, an SMTP receiver adds a Received header line to it recording the receipt.
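Returning to the conditional GET described above, here is a sketch of the raw request a web cache might send. The path, host name, and date are made-up examples:

```python
# Sketch: the conditional GET a web cache sends to the origin server.
# The URL and timestamp below are illustrative, not from the notes.

def conditional_get(path: str, host: str, last_modified: str) -> str:
    """Build an HTTP/1.1 conditional GET request as raw text."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"If-Modified-Since: {last_modified}\r\n"
        "\r\n"
    )

req = conditional_get("/fruit.gif", "www.example.com",
                      "Wed, 09 Sep 2015 09:23:24 GMT")
print(req)
# If the file is unchanged, the origin server answers "304 Not Modified"
# with an empty body, and the cache serves its stored copy.
```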
Next, consider how a message gets from the recipient's mail server to the user agent on the recipient's host, i.e., the mail access protocols: the popular POP3 and IMAP, plus HTTP. POP3 begins when the user agent opens a TCP connection to port 110 on the mail server. Phase 1, authorization: the user agent sends the user name and password in plain text. Phase 2, transactions: the user agent retrieves messages from the server and can also mark some messages as deleted. Phase 3, update: the server deletes the messages marked as deleted.

-------------- DNS --------------

Simply put, DNS resolves a host name (domain name) supplied by the user into an IP address. When a host needs to resolve a URL, it sends a DNS query: the DNS client on the host takes the request and sends it to a DNS server, and the server returns the resolved IP address. Other services DNS provides include host aliasing, mail server aliasing, and load distribution. DNS runs over UDP. DNS servers are organized hierarchically and obtain IP addresses through recursive and iterative queries. DNS caching works as follows: when a DNS server receives a DNS reply, it keeps the mapping in its cache for a period of time, which effectively reduces the number of recursive queries.

------------- P2P -------------

---------------- C. Transport Layer ----------------

The transport layer takes information from the application layer and hands it to the network layer; it is a transitional protocol layer. When writing a network application, the developer must specify whether the transport-layer protocol is TCP or UDP. From another angle, a transport-layer protocol extends host-to-host interaction into process-to-process interaction on those hosts (since the layers below only deliver data host to host); this is the transport layer's multiplexing and demultiplexing.
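Returning to DNS for a moment, this is what resolution looks like from an application's point of view. The stub resolver behind `getaddrinfo` issues the lookup (for real domain names, normally a UDP query to a DNS server) and hands back IP addresses; `localhost` is used here only so the sketch needs no network:

```python
# Sketch: host-name resolution as an application sees it.
import socket

def resolve(hostname: str) -> list[str]:
    """Return the distinct IP addresses a host name resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})  # sockaddr[0] is the address

# "localhost" typically resolves locally (hosts file) rather than via DNS,
# but the application-side call is identical for real domain names.
print(resolve("localhost"))
```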
Gathering data from the sockets and passing the resulting segments down to the network layer is called multiplexing; delivering an arriving segment to the correct socket is called demultiplexing. Both are the transport layer's job. To make this work, sockets must be uniquely identified; in fact each segment has special fields specifying the socket it should be delivered to, namely the source port and destination port. A port is a number between 0 and 65535; numbers 0-1023 are called well-known port numbers and are reserved, so applications should use port numbers above that range. A UDP socket is identified by the destination IP address and destination port number; a TCP socket additionally requires the source port and source IP address.

-------------- UDP --------------

UDP's advantages include: the application has finer control over what data is sent and when, no connection needs to be established, no connection state is kept, and the header overhead is small. A UDP segment's header has four 16-bit fields: source port, destination port, length, and checksum; the application data follows. The checksum is UDP's error-detection mechanism: the sender takes the segment's 16-bit words, computes their one's-complement sum, and inverts it to get the checksum. The receiver adds up all the words plus the checksum; if the result is 1111111111111111, no error is detected, otherwise an error has occurred.

------------ Reliable Data Transfer ------------

Discussing a so-called reliable data transfer protocol is fairly involved, so we first restrict ourselves to one-way data transfer. In general, the sender uses the protocol to pass data to a lower layer (for example, transport layer to network layer), the lower layer carries it, and the receiver then extracts the data through the same protocol.
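Before building up rdt, the UDP checksum described above can be sketched concretely. The three words below are arbitrary example values, and real UDP also covers a pseudo-header and the data, not just header fields:

```python
# Sketch: the 16-bit one's-complement checksum used by UDP,
# over three illustrative 16-bit words.

def ones_complement_sum(words: list[int]) -> int:
    """Sum 16-bit words, wrapping any carry back into the low bits."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)  # wraparound carry
    return total

def udp_checksum(words: list[int]) -> int:
    """The checksum is the bitwise inverse of the one's-complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF

words = [0x0110, 0x0E4F, 0x0011]          # example field values
csum = udp_checksum(words)
# Receiver-side check: summing all words plus the checksum must give all 1s.
check = ones_complement_sum(words + [csum])
print(hex(csum), hex(check))               # check is 0xffff when no error is detected
```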
We call this protocol rdt (reliable data transfer) and build it up from simple to complex:

1. If the underlying channel is completely reliable, the rdt sender just takes data from the layer above, builds a packet containing the data, and sends it into the channel; the receiver takes a packet from the layer below, extracts the data, and passes it up.

2. If the underlying channel can have bit errors, think of two people talking: the listener says "OK" on hearing a clear sentence and asks for a repeat on hearing a garbled one. This involves three mechanisms. Error detection: there must be a way for the receiver to determine that a packet has a bit error. Receiver feedback: the protocol must let the receiver report back, using positive acknowledgments (ACK) and negative acknowledgments (NAK). Retransmission: the sender resends a packet for which no positive confirmation arrives. Protocols based on retransmission are called automatic repeat request (ARQ) protocols. If the sender waits for the receiver's feedback after each packet and will not send the next packet until it arrives, the protocol is a stop-and-wait protocol. Its problem: if an ACK or NAK is itself corrupted, e.g. the receiver sends an ACK that reaches the sender looking like a NAK, the sender retransmits the packet and the receiver treats the retransmission as a new packet, so one packet is accepted twice. The usual fix is a packet sequence number: by comparing sequence numbers, the receiver can tell a new packet from a retransmission. Because a stop-and-wait sender handles one packet at a time, the sequence numbers 0 and 1 suffice to distinguish old packets from new.

3. The underlying channel can have both bit errors and packet loss.
This is the most common situation: not only data packets but also ACKs and NAKs can be lost. The sender simply retransmits, and here the countdown timer is introduced: if the timer expires, no feedback arrived in time and the sender resends the packet. We now essentially have a reliable protocol, known as the alternating-bit protocol (because the sequence numbers are just 0 and 1). Its remaining problem is the inefficiency of stop-and-wait: in the book's example, the sender's utilization under stop-and-wait is only 0.027%. If the sender did not have to wait, it could send continuously; protocols based on this idea are called pipelined protocols. To implement pipelining, the sequence number range must grow and both sides must add buffering. The common approaches are Go-Back-N (GBN) and Selective Repeat (SR). In GBN, define the sequence number of the oldest unacknowledged packet as the base, and the next sequence number to be used as nextseqnum. The whole sequence number space then divides into four parts: sent and acknowledged; sent but not yet acknowledged; not yet sent but usable; and not usable. The reason part of the space is unusable is to limit the sender's traffic: the number of sent-but-unacknowledged plus usable-but-unsent sequence numbers is the window length N, and GBN is accordingly also called a sliding-window protocol. The sender must respond to three events: a call from above (the sender first checks whether the window is full; if not, it sends a packet and updates its state); receipt of an ACK (GBN uses cumulative acknowledgment: an ACK for a packet confirms that all packets up to and including it arrived correctly, so the sender advances the base past the acknowledged packets); and a timeout (if no ACK arrives in time, the sender resends every packet that has been sent but not yet acknowledged).
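The GBN sender's bookkeeping described above can be sketched as a small class. This is deliberately simplified: no real timer, channel, or payloads, just the window arithmetic:

```python
# Sketch of Go-Back-N sender-side state: base, nextseqnum, and window N.
# Simplified model; methods only update bookkeeping.

class GBNSender:
    def __init__(self, window: int):
        self.N = window          # window length
        self.base = 0            # oldest unacknowledged sequence number
        self.nextseqnum = 0      # next sequence number to use

    def can_send(self) -> bool:
        """Call from above: refuse new data when the window is full."""
        return self.nextseqnum < self.base + self.N

    def send(self) -> int:
        assert self.can_send()
        seq = self.nextseqnum
        self.nextseqnum += 1     # seq is now "sent but not yet acknowledged"
        return seq

    def on_ack(self, acknum: int) -> None:
        """Cumulative ACK: everything up to and including acknum is confirmed."""
        self.base = max(self.base, acknum + 1)

    def on_timeout(self) -> list[int]:
        """Resend every packet that is sent but not yet acknowledged."""
        return list(range(self.base, self.nextseqnum))

s = GBNSender(window=4)
sent = [s.send() for _ in range(4)]   # fills the window: [0, 1, 2, 3]
s.on_ack(1)                           # cumulative: confirms packets 0 and 1
print(s.can_send(), s.on_timeout())   # window has room again; 2 and 3 still pending
```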
SR avoids GBN's wasteful duplicate retransmissions, but it needs a timer for each packet. For both sender and receiver, the window slides forward once the packet at the base is successfully acknowledged or received.

------------- TCP -------------

1. TCP is connection-oriented and provides a full-duplex service: a TCP connection is bidirectional and point-to-point. Establishing a TCP connection requires a three-way handshake. The sender passes data through a socket into TCP's send buffer; TCP takes chunks of data from the buffer, each at most the maximum segment size (MSS), adds a TCP header to form a segment, and sends it toward the receiver's TCP receive buffer.

2. The TCP header is normally 20 bytes, containing the source port (16 bits), destination port (16), sequence number (32), acknowledgment number (32), header length (4), flag field (6 bits: ACK, URG, PSH, RST, SYN, FIN), receive window (16), checksum (16), urgent data pointer (16), and a variable-length options field. The acknowledgment number is the sequence number of the next byte the receiver expects from the sender, and it acknowledges everything before it, i.e. TCP uses cumulative acknowledgment: if bytes 1-100 and 150-200 have been received, the receiver puts 101 in the acknowledgment number. ACK marks the segment as carrying a valid acknowledgment; URG marks urgent data, paired with the urgent data pointer; PSH asks the receiver to pass the data to the upper layer immediately (rarely useful in practice). The remaining three flags, RST, SYN, and FIN, are used for connection setup and teardown. The receive window states how many bytes the receiver is willing to accept and is used for flow control of the sender. The header length must be declared because of the variable-length options field; the options field is usually empty.

3. TCP has many details, including timeout estimation and the implementation of reliable transfer.
TCP uses the countdown timer from the reliable-transfer discussion to decide whether a segment was delivered. The timeout value must be at least greater than the round-trip time (RTT), so TCP estimates the RTT: it samples one segment at a time to get a SampleRTT (the time from handing the segment to the lower layer until its ACK arrives) and maintains a running average, EstimatedRTT, updated with each new SampleRTT:

EstimatedRTT = 0.875 * EstimatedRTT + 0.125 * SampleRTT

DevRTT measures how much SampleRTT deviates from the mean:

DevRTT = (1 - b) * DevRTT + b * |SampleRTT - EstimatedRTT|

with a recommended b of 0.25. The deviation is updated with every SampleRTT and reflects how much the RTT is currently fluctuating. The timeout is then:

TimeoutInterval = EstimatedRTT + 4 * DevRTT

Reliable transfer has further details. For example, a timer per segment would be costly, so the solution is a single timer associated with the current base; once the base segment is acknowledged, the timer is restarted for the new base. In addition, most TCP implementations double the timeout: after a segment times out, the timeout interval doubles on each retransmission of that segment. TCP also has a fast retransmit mechanism based on duplicate ACKs. If the receiver has received segments 1 and 2 in order, its next expected number is 3; if it then receives 5, there is a gap in the data stream (3 and 4 are missing), so it repeatedly sends duplicate ACKs indicating that the expected sequence number is still 3. When the sender receives three duplicate ACKs, it retransmits the missing segment immediately, before the timer expires.

4. TCP provides a flow-control service to applications. Because the sender can transmit new segments without waiting for each segment's ACK, the receiver may be flooded with too many segments, overflowing its buffer.
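The RTT-estimation formulas above can be checked with a short numeric sketch. The starting values and SampleRTT measurements are assumptions chosen for illustration:

```python
# Sketch: TCP's RTT estimator, using the formulas quoted in the notes
# (alpha = 0.125, beta = 0.25 are the recommended values).

def update_rtt(est_rtt: float, dev_rtt: float, sample_rtt: float,
               alpha: float = 0.125, beta: float = 0.25):
    est_rtt = (1 - alpha) * est_rtt + alpha * sample_rtt
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - est_rtt)
    timeout = est_rtt + 4 * dev_rtt
    return est_rtt, dev_rtt, timeout

est, dev = 0.100, 0.010               # assumed starting values, in seconds
for sample in (0.120, 0.090, 0.300):  # illustrative SampleRTT measurements
    est, dev, timeout = update_rtt(est, dev, sample)
    print(f"EstimatedRTT={est:.4f} DevRTT={dev:.4f} Timeout={timeout:.4f}")
```

Note how the last, wildly different sample (0.300 s) inflates DevRTT and therefore the timeout far more than it moves the average: that is exactly the safety margin the 4 * DevRTT term is for.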
To solve this, as in the earlier analysis, a window length N limits how many unacknowledged segments the sender may have outstanding. Separately, the sender may also need to be throttled because the IP network itself is congested; that is congestion control.

5. Establishing a TCP connection: the client sends a special segment containing no application-layer data, but with the SYN bit set to 1 and a chosen initial sequence number, client_isn, in the sequence number field. On receiving it, the server allocates the buffers and variables for the connection and sends a granting segment: SYN set to 1, acknowledgment number client_isn + 1, and its own initial sequence number, server_isn, in the sequence number field. On receiving this segment, the client also allocates buffers and variables and sends a third segment with SYN set to 0 and acknowledgment field server_isn + 1, whose purpose is confirmation. This process is called the three-way handshake. To terminate the connection, the client sends a segment with FIN set to 1, the server answers with its own FIN segment, and after the client's final acknowledgment both sides release their resources.

6. Without congestion control, the problems are these: as the packet arrival rate approaches the link capacity, packets experience huge delays, because large numbers of packets flood into a router while the rate at which packets can leave is limited by the link capacity. When the queue buffer overflows, the router drops packets, so the sender can only retransmit, creating a vicious circle. Or, because queuing takes too long, the sender retransmits a packet that has not actually been dropped inside the network, so the receiver gets two identical packets, and the duplicates add still more load to the network.
Finally, if there is more than one router on the path, then when a packet is eventually dropped, the forwarding capacity that every upstream router spent on it is wasted. Congestion control comes in two flavors: end-to-end, and network-assisted, in which the network layer gives the transport layer explicit help with congestion. TCP uses end-to-end congestion control, because the IP protocol beneath it provides no congestion signaling. As noted above, TCP limits the sender's traffic by controlling the value of a congestion window, and it infers network congestion from a timeout or from the receipt of three duplicate ACKs. If congestion is inferred, the window is reduced; otherwise TCP probes for idle bandwidth by enlarging the window. The specific window-adjustment algorithm is the TCP congestion-control algorithm. Roughly: with no congestion, each RTT of successful acknowledgments grows the window by one MSS (maximum segment size), so more segments are in flight; when congestion occurs, the window is cut in half. This is called additive-increase, multiplicative-decrease (AIMD). First, because the window's initial value is generally just one MSS, the rate would stay very low for a long time even though plenty of bandwidth might be available; to find the maximum available bandwidth quickly, the window value doubles every RTT during the startup phase until a packet is lost, at which point this special phase ends. This is called slow start. Second, when a timeout occurs, TCP sets the window to one MSS and slow-starts up to half of the window value it had before the loss. But when three duplicate ACKs are received, the window is simply halved without going back to slow start; this behavior is called fast recovery.
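The window rules above can be sketched as a tiny simulation, one step per RTT, with the window measured in MSS units. This is a simplified model of the described behavior (slow start, AIMD, timeout, fast recovery), not a full TCP Reno implementation, and the event sequence is invented:

```python
# Sketch: congestion-window evolution under the rules in the notes.
# cwnd and ssthresh are in MSS units; one call per RTT "round".

def next_cwnd(cwnd: float, ssthresh: float, event: str):
    if event == "timeout":
        return 1.0, cwnd / 2          # back to slow start; remember half as threshold
    if event == "3dupacks":
        return cwnd / 2, cwnd / 2     # fast recovery: halve, skip slow start
    # event == "ack": a round of successful acknowledgments
    if cwnd < ssthresh:
        return cwnd * 2, ssthresh     # slow start: double every RTT
    return cwnd + 1, ssthresh         # congestion avoidance: +1 MSS per RTT

cwnd, ssthresh = 1.0, 8.0             # assumed initial values
trace = []
for event in ["ack"] * 5 + ["3dupacks", "ack", "timeout", "ack"]:
    cwnd, ssthresh = next_cwnd(cwnd, ssthresh, event)
    trace.append(cwnd)
print(trace)
```

Running this shows the characteristic sawtooth: exponential growth while below the threshold, linear growth above it, a halving on three duplicate ACKs, and a collapse to one MSS on a timeout.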
-------------------- Glossary of Terms ---------------------

host / end system
communication link
packet switch
router
link-layer switch
Internet Service Provider (ISP)
Internet standard
LAN
protocol
circuit switching
message
packet
buffer (cache)
queuing delay
packet loss
segment
