Related articles: Programmer Self-Cultivation - Operating System
Almost all computer programs involve network communication. Therefore, understanding basic computer network knowledge is very important for every programmer.
This article introduces basic networking concepts and links to a series of high-quality articles you can refer back to at any time. Working through it should give you a well-rounded understanding of computer networks.
Before reading this article, we recommend that you read the following two articles to get a rough idea of how computer networks work.
Introduction to Internet protocols (I)
Introduction to Internet protocols (II)
Next, we will introduce some basic network knowledge.
OSI reference model
The first thing to cover is the seven-layer OSI reference model. Feeling a little dizzy already? If so, go back and read the two articles recommended at the beginning!
Layer 7: Application Layer
The application layer interfaces directly with application software and presents information to the user. Common protocols include HTTP, HTTPS, FTP, Telnet, SSH, SMTP, and POP3.
Layer 6: Presentation Layer
The presentation layer converts between the syntax and internal encodings used by different clients, so that each system can interpret the data correctly. It also handles compression, decompression, encryption, and decryption.
Layer 5: Session Layer
The session layer establishes how the two parties communicate, and creates and tears down sessions (the dialogue between the two parties).
Layer 4: Transport Layer
The transport layer controls the flow of data and detects and handles errors to keep communication running smoothly. The sender's transport layer adds sequence numbers to the segments so that the receiver can reassemble them into useful data or files.
Layer 3: Network Layer
The network layer determines how data travels from the sender to the receiver. It chooses the best route from node X to node Y by weighing network congestion, quality of service, transmission priority, and the cost of each route. Routers work at this layer, and networks become interconnected as routers continually receive and forward data.
Layer 2: Data Link Layer
The data link layer manages the raw bits from layer 1 and delivers data to the next hop free of transmission errors. It frames the data, marking where each frame starts and ends. It also deals with damaged, lost, and duplicated data so that the higher layers are unaffected, which means it performs error detection, retransmission, or correction, and it decides when a device may transmit. Devices at this layer: bridges, network adapters, and switches.
Layer 1: Physical Layer
The physical layer defines the specifications for all electrical and physical devices. Specifically, it defines the relationship between a device and the physical medium, including pin layouts, voltages, cable specifications, hubs, repeaters, NICs, host adapters (such as those used in SANs), and the design of other devices. The physical layer transmits the raw bit stream, and it is designed so that when the sender transmits a binary "1", the receiver receives a binary "1" rather than a binary "0". It therefore has to specify how many pins a device has and which voltage on which pin represents a binary "1" or "0", along with questions such as how many microseconds a bit lasts, whether signals can travel in both directions at the same time, how the initial connection is established, and how it is torn down.
To better understand the difference between the physical layer and the data link layer, think of the physical layer as mainly concerned with the interaction between a single device and the transmission medium, while the data link layer is more concerned with the interaction among multiple devices (at least two) that share the same medium. The physical layer tells one device how to send a signal onto the medium and tells another device how to receive it (in most cases it does not tell a device how to gain access to the medium). Some obsolete physical layer standards, such as RS-232, do use physical wires to control access to the medium.
The main functions and services of the physical layer are:
- Establishing and terminating a device's connection to the medium.
- Taking part in the process by which communication resources are shared effectively among multiple users, for example contention resolution and flow control.
- Modulation, that is, conversion between the digital data inside the device and the corresponding signals actually transmitted over the channel. These signals travel over physical cables (such as copper wire and optical fiber) or over radio channels.
Five-layer TCP/IP model
Compared with the seven-layer OSI model, the five-layer TCP/IP model is more commonly used. It merges the application, presentation, and session layers of the OSI model into a single application layer, leaving five layers: application, transport, network, data link, and physical.
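The merging of layers described above can be written down as a small lookup table. This is only an illustrative sketch; the dictionary name `OSI_TO_TCPIP` and the helper `tcpip_layers` are made up for this example.

```python
# Hypothetical lookup table: each OSI layer (top to bottom) mapped to the
# layer of the five-layer TCP/IP model that absorbs it.
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Network",
    "Data Link": "Data Link",
    "Physical": "Physical",
}


def tcpip_layers():
    """Return the TCP/IP layers from top to bottom, without duplicates."""
    layers = []
    for layer in OSI_TO_TCPIP.values():
        if layer not in layers:
            layers.append(layer)
    return layers


print(tcpip_layers())
```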
Three-way handshake and four-way wave for TCP connections
Three-way handshake:
First handshake: the client sends a SYN packet (SYN = x) to the server and enters the SYN_SENT state, waiting for the server's acknowledgment.
Second handshake: on receiving the SYN packet, the server must acknowledge the client's SYN (ACK = x + 1) and also send its own SYN (SYN = y), that is, a SYN + ACK packet. The server then enters the SYN_RCVD state.
Third handshake: the client receives the server's SYN + ACK packet and sends an ACK packet (ACK = y + 1) back to the server. Once it has been sent, the client and the server enter the ESTABLISHED state and the three-way handshake is complete.
The packets exchanged during the handshake carry no data. Only after the three-way handshake completes do the client and the server start transferring data. Ideally, once a TCP connection is established it stays open until one of the two sides closes it.
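The sequence and acknowledgment numbers in the three steps above can be traced with a few lines of arithmetic. This is a sketch only: it models the numbers, not real packets, and the function name `three_way_handshake` and the tuple layout are assumptions made for illustration. The 32-bit wraparound reflects the size of the TCP sequence number field.

```python
import random

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    x = random.randrange(SEQ_MODULUS) if client_isn is None else client_isn
    y = random.randrange(SEQ_MODULUS) if server_isn is None else server_isn
    return [
        # 1. Client sends SYN with its initial sequence number x (SYN_SENT).
        ("client", "SYN", x, None),
        # 2. Server acknowledges x + 1 and sends its own SYN with y (SYN_RCVD).
        ("server", "SYN+ACK", y, (x + 1) % SEQ_MODULUS),
        # 3. Client acknowledges y + 1; both sides are now ESTABLISHED.
        ("client", "ACK", (x + 1) % SEQ_MODULUS, (y + 1) % SEQ_MODULUS),
    ]


for sender, flags, seq, ack in three_way_handshake(client_isn=100, server_isn=300):
    print(f"{sender:6} {flags:7} seq={seq} ack={ack}")
```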
Just as establishing a connection takes a "three-way handshake", closing a TCP connection takes a "four-way wave".
First wave: the active closer sends a FIN to the passive closer to shut down data transmission from the active side to the passive side. In effect the active closer says: "I will not send you any more data" (data sent before the FIN will still be retransmitted if its ACK has not arrived). The active closer can still receive data at this point.
Second wave: after receiving the FIN, the passive closer sends back an ACK whose acknowledgment number is the received sequence number + 1 (like a SYN, a FIN consumes one sequence number).
Third wave: the passive closer sends a FIN to shut down data transmission from the passive side to the active side. In effect it tells the active closer: "I have finished sending my data and will not send you any more."
Fourth wave: after receiving the FIN, the active closer sends an ACK to the passive closer, with an acknowledgment number equal to the received sequence number + 1. This completes the four-way wave.
TCP state transition diagram
TCP/IP is a family of protocols in computer networking, and it is the centerpiece of network programming. Understanding the TCP state transition diagram is an important part of understanding the protocol.
In the diagram, every transition from one state to another is labeled with the condition that triggers it.
The states in the diagram are:
1. CLOSED: the starting point. A connection enters this state when it times out or is closed.
2. LISTEN: the state of the server while it waits for connections. The server reaches it by calling the socket, bind, and listen functions. This is called a passive open (the application waits for clients to connect).
3. SYN_SENT: the client initiates a connection by sending a SYN to the server. If the server cannot be reached, the client goes straight back to CLOSED.
4. SYN_RCVD: the counterpart of 3. The server receives the client's SYN and moves from LISTEN to SYN_RCVD, replying with an ACK and sending its own SYN to the client. There is one other case: if the client receives a SYN from the server while its own SYN is in flight, the client moves from SYN_SENT to SYN_RCVD.
5. ESTABLISHED: the server and the client have completed the three-way handshake and data transfer can begin.
The states above cover what the server and the client go through while a connection is being established. They are fairly simple, and if you are familiar with the three-way handshake they are easy to follow.
Next, let's look at the state transitions when a connection is closed. Closing a connection takes four exchanges between the two sides, plus some cleanup afterwards (the TIME_WAIT state). Note that "active closer" and "passive closer" do not mean client or server; they refer to whichever side sends the close request first:
6. FIN_WAIT_1: the active closer moves here from state 5 (ESTABLISHED). Specifically, it sends a FIN to the other side.
7. FIN_WAIT_2: the active closer enters this state when it receives the ACK of its FIN. In this state it can no longer send data, but it can still receive data from the other side.
8. CLOSE_WAIT: the passive closer enters this state after receiving the FIN. Specifically, it receives the FIN and sends an ACK.
9. LAST_ACK: the passive closer issues its own close request and moves here from state 8 (CLOSE_WAIT). Specifically, it sends a FIN to the other side, and it enters CLOSED when the ACK arrives.
10. CLOSING: when both sides initiate a close at the same time, each moves here from FIN_WAIT_1. Specifically, it receives a FIN and responds with an ACK.
11. TIME_WAIT: the trickiest state. The state diagram shows that three states can lead to it. Let's go through them one by one:
A. From FIN_WAIT_2: when the two sides do not send their FINs at the same time, the active closer has already had its own close request acknowledged and enters TIME_WAIT when it receives the passive closer's FIN.
B. From CLOSING: both sides initiated a close at the same time and both sent a FIN. Each received the other's FIN and acknowledged it (entering CLOSING), and enters TIME_WAIT when the ACK of its own FIN arrives.
C. From FIN_WAIT_1: the FIN (sent by the other side) and the ACK (the response to its own FIN) are received together. The difference from B is that here the ACK of its own FIN arrives no later than the other side's FIN, whereas in B the FIN arrives first. This case is the least likely.
TIME_WAIT is the hardest part of the four-way close to understand. It exists for two reasons:
1. To terminate the full-duplex TCP connection reliably.
2. To let old duplicate segments expire in the network.
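The state list above can be condensed into a transition table. The sketch below is a simplified model, not a TCP implementation: the event names (`recv_syn`, `recv_fin_ack`, `timeout`, and so on) are invented labels for the conditions described in the text, and the `timeout` out of TIME_WAIT stands for the standard 2MSL timer, which the article does not spell out.

```python
# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open"): "SYN_SENT",       # client sends SYN
    ("LISTEN", "recv_syn"): "SYN_RCVD",          # server replies SYN + ACK
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED", # client replies ACK
    ("SYN_SENT", "recv_syn"): "SYN_RCVD",        # simultaneous open
    ("SYN_SENT", "timeout"): "CLOSED",           # server unreachable
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",      # active closer sends FIN
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",   # passive closer sends ACK
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",       # simultaneous close
    ("FIN_WAIT_1", "recv_fin_ack"): "TIME_WAIT", # FIN and ACK arrive together
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",          # after the 2MSL wait
    ("CLOSE_WAIT", "close"): "LAST_ACK",         # passive closer sends FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",
}


def run(events, state="CLOSED"):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]
    return state


# A client that connects and then closes first (the active closer):
print(run(["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin"]))
```

Collecting every entry whose target is TIME_WAIT gives exactly the three source states discussed under item 11: FIN_WAIT_2, CLOSING, and FIN_WAIT_1.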
Concept and function of the MAC address
A MAC address (Media Access Control address), also called a physical address, identifies the location of a network device. In the OSI model, the layer 3 network layer is responsible for IP addresses, while the layer 2 data link layer is responsible for MAC addresses. A host has an IP address, and each network interface has a MAC address of its own.
The purpose and working principle of ARP
The basic function of the Address Resolution Protocol (ARP) is to look up the MAC address of a target device from its IP address, so that communication can proceed. It is an indispensable network layer protocol in IPv4, but it is not used in IPv6, where it is replaced by the Neighbor Discovery Protocol (NDP).
Every computer or router with the TCP/IP protocol stack installed keeps an ARP cache table, in which IP addresses are paired with MAC addresses, as shown in the following table.
| Host name | IP address | MAC address |
|---|---|---|
| A | 192.168.38.10 | 00-aa-00-62-d2-02 |
| B | 192.168.38.11 | 00-bb-00-62-c2-02 |
| C | 192.168.38.12 | 00-cc-00-62-c2-02 |
| D | 192.168.38.13 | 00-dd-00-62-c2-02 |
| E | 192.168.38.14 | 00-ee-00-62-c2-02 |
| ... | ... | ... |
Take host A (192.168.38.10) sending data to host B (192.168.38.11) as an example. Before sending, host A looks for the target IP address in its ARP cache table. If it finds an entry, it knows the target MAC address (00-bb-00-62-c2-02) and can write it straight into the frame and send it. If the IP address is not in the ARP cache table, host A sends a broadcast (an ARP request) on the network with the destination MAC address FF-FF-FF-FF-FF-FF, which asks every host on the same network segment: "What is the MAC address of 192.168.38.11?" The other hosts on the network do not respond to the ARP request. Only host B, on receiving the frame, replies to host A (an ARP response): "The MAC address of 192.168.38.11 is 00-bb-00-62-c2-02."
Host A now knows host B's MAC address and can send data to it. It also updates its ARP cache table, so the next time it sends to host B it can find the address directly in the cache. The ARP cache table uses an aging mechanism: an entry that goes unused for a period of time is deleted. This keeps the table short and speeds up lookups.
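The lookup, broadcast-on-miss, and aging behavior described above can be sketched as a small cache class. Everything here is illustrative: the class name `ArpCache`, the 60-second default lifetime, and the `send_arp_request` callback (which stands in for the real broadcast and reply) are assumptions, not how any particular operating system implements ARP.

```python
import time


class ArpCache:
    """A toy ARP cache: IP -> MAC entries that expire when left unused."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds   # assumed lifetime of an unused entry
        self.clock = clock       # injectable so the aging can be tested
        self._entries = {}       # ip -> (mac, time of last use)

    def learn(self, ip, mac):
        self._entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        now = self.clock()
        if now - last_used > self.ttl:
            del self._entries[ip]        # aged out: drop the stale entry
            return None
        self._entries[ip] = (mac, now)   # using an entry keeps it fresh
        return mac


def resolve(cache, ip, send_arp_request):
    """Return the MAC for ip, asking the network only on a cache miss.

    send_arp_request is a hypothetical callable that broadcasts an ARP
    request and returns the MAC from the reply (or None if nobody answers).
    """
    mac = cache.lookup(ip)
    if mac is None:
        mac = send_arp_request(ip)
        if mac is not None:
            cache.learn(ip, mac)
    return mac
```

With a fake network such as `lambda ip: "00-bb-00-62-c2-02"`, the first call to `resolve` triggers the request and later calls are answered from the cache until the entry ages out.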
Understand the concepts of switches, routers, and gateways, and what each is used for
1) Switch
In a computer network, the switch was introduced to overcome the weaknesses of shared-medium operation. A switch has a high-bandwidth backplane bus and an internal switching fabric, and all of its ports attach to the backplane bus. When the control circuitry receives a packet, the processing port looks up the address table in memory to determine which port the NIC with the destination MAC address (the NIC's hardware address) is attached to, and the packet is then passed quickly to that port through the switching fabric. If the destination MAC address is not in the table, the switch floods the packet to all ports; when the destination replies, the switch "learns" the new address and adds it to its address table.
A switch works at layer 2 of the OSI reference model, the data link layer. Once a port is connected, the switch learns the MAC address behind it from the source addresses of the frames arriving on that port and stores it in its MAC address table. From then on, frames addressed to that MAC address are sent only to the corresponding port rather than to all ports. A switch can therefore split up collision domains, but it cannot split up broadcast domains.
Switches are used mostly for layer 2 switching, which is why they are also called "layer 2 switches".
Switches are classified as Layer 2, Layer 3, Layer 4, and Layer 7 switches, which operate at layers 2, 3, 4, and 7 of the seven-layer OSI model respectively.
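The address-table behavior of a layer 2 switch can be sketched in a few lines. This toy model assumes the switch learns from the source address of every incoming frame and floods when the destination is unknown; the class name `LearningSwitch` and the `receive` method are invented for this example.

```python
class LearningSwitch:
    """A toy layer 2 switch that learns which port each MAC address is on."""

    def __init__(self, num_ports):
        self.ports = list(range(num_ports))
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Handle one frame and return the list of ports it is sent out on."""
        # Learn: the sender is reachable through the port the frame came in on.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            # Destination is on the same port as the sender: nothing to forward.
            return []
        # Known destination: forward to exactly one port.
        return [out_port]


switch = LearningSwitch(num_ports=4)
print(switch.receive(0, "00-aa-00-62-d2-02", "00-bb-00-62-c2-02"))  # B unknown
print(switch.receive(1, "00-bb-00-62-c2-02", "00-aa-00-62-d2-02"))  # A is known
print(switch.receive(0, "00-aa-00-62-d2-02", "00-bb-00-62-c2-02"))  # B is known
```

The first frame is flooded because B has not been seen yet; once B answers, both addresses are in the table and later frames go to a single port.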
2) Router
A router is a computer network device that provides two important mechanisms: routing and forwarding. Deciding the path a packet takes from its source to its destination is called routing; moving a packet from a router's input port to the appropriate output port (which happens inside the router) is called forwarding. Routing works at layer 3 of the OSI model, the network layer, for example with the Internet Protocol.
One role of a router is to connect different networks; the other is to choose the path over which information travels. The difference between a router and a switch is that a router operates at layer 3 of the OSI model, while a switch operates at layer 2.
3) Gateway
As the name suggests, a gateway is a device that connects two networks. It differs from a router (for historical reasons, much of the TCP/IP literature used to call the routers operating at the network layer gateways). Today many local networks reach the Internet through a router, so "gateway" usually refers to the IP address of that router; this usage is common in home and small business networks, where the gateway connects the LAN to the Internet. A gateway can also be a device that converts one protocol into another, such as a voice gateway.
In traditional TCP/IP terminology there are only two kinds of network devices: gateways and hosts. A gateway can forward packets between networks, but a host cannot. In a host (also called an end system), a packet passes through all four layers of the TCP/IP stack; in a gateway (also called an intermediate system), a packet only needs to go up to the internet layer, where the path is determined and the packet is forwarded. At that time there was no distinction between gateways and routers.
In modern networking terminology, gateways and routers are defined differently. A gateway moves data between different protocols, while a router moves data between different networks and is equivalent to an IP gateway in the traditional sense.
A gateway is a device that connects two networks. A voice gateway, for example, can connect the PSTN and Ethernet, which is what VoIP amounts to: the gateway converts the analog signals from ordinary phones into digital signals, wraps them in a protocol, and transmits them. At the receiving end, another gateway restores the analog phone signal so that it can be heard on the phone.
A gateway on an Ethernet network forwards packets only at layer 3, the same as a router. The difference is that a gateway has no routing table and can forward only according to preconfigured network segments. The most important job of a gateway is port mapping: from the outside, the users in a subnet appear only as different ports on the gateway's external IP address, which protects the users in the subnet.
A first look at the routing table
A routing table, or routing information base (RIB), is a spreadsheet-like table or database stored in a router or a networked computer. It stores the paths to particular network addresses (and in some cases the metrics of those paths). The routing table contains topology information about the surrounding network. Routing tables are built mainly to support routing protocols and static route selection.
A routing table works much like using a map to deliver a package. Whenever a node on the network needs to send data to another node, it must first know where to send it. If the device is not directly connected to the destination node, it has to find another way to get the packet there. In a LAN, a node usually does not work out the route itself; it hands the IP packet to a gateway. Getting packets to the right address is a complex job, and the gateway needs to keep track of path information for the packets it sends. The routing table stores exactly this information. Like a map, it is a database that records paths and serves the nodes that need them.
[Figure: an example routing table]
Routing table fields:
- Destination: the destination network segment.
- Mask: the subnet mask.
- Interface: the IP address of the router's outgoing interface toward the destination.
- Gateway: the IP address of the next-hop router's incoming interface. A router defines the link to the next-hop router through the interface and the gateway. In general, the interface and the gateway are in the same network segment.
- Metric: the hop count, which indicates the quality of this route entry. If several entries reach the same destination, the router generally uses the one with the smaller metric.
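The fields above are enough to sketch how a route is chosen for a destination address. The example below uses Python's standard `ipaddress` module. The `Route` class, the `lookup` function, and all addresses are made up for illustration, and the rule of preferring the most specific (longest) matching prefix before comparing metrics is standard IP routing practice rather than something stated in this article.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    destination: str  # destination network segment
    mask: str         # subnet mask
    interface: str    # IP address of the outgoing interface
    gateway: str      # IP address of the next-hop router
    metric: int       # hop count; smaller is better

    @property
    def network(self):
        return ipaddress.ip_network(f"{self.destination}/{self.mask}")


def lookup(routes, dst_ip):
    """Pick the route for dst_ip: longest matching prefix, then lowest metric."""
    addr = ipaddress.ip_address(dst_ip)
    candidates = [r for r in routes if addr in r.network]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r.network.prefixlen, r.metric))


# Hypothetical table for a host at 192.168.1.10.
ROUTES = [
    Route("0.0.0.0", "0.0.0.0", "192.168.1.10", "192.168.1.1", 10),      # default
    Route("192.168.1.0", "255.255.255.0", "192.168.1.10", "192.168.1.10", 1),
    Route("10.0.0.0", "255.0.0.0", "192.168.1.10", "192.168.1.254", 5),
    Route("10.0.0.0", "255.0.0.0", "192.168.1.10", "192.168.1.253", 2),
]

print(lookup(ROUTES, "10.1.2.3").gateway)
```

Two entries reach 10.0.0.0/8 here, so the one with the smaller metric wins, while an address outside every listed segment falls through to the default route.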
MTU
The maximum transmission unit (MTU) is the largest packet size (in bytes) that can pass over a given layer of a communication protocol. The MTU is usually tied to the communication interface (network interface card, serial port, and so on).
The Internet Protocol allows IP fragmentation, so that a datagram can be split into fragments small enough to cross a link whose MTU is smaller than the original datagram. Fragmentation happens at the network layer (layer 3 of the OSI model). Layer 4 is the transport layer, one of the most important layers in the OSI model; there, transmission is governed by the window rather than by the MTU. Transport protocols also perform flow control, setting an appropriate sending rate based on how fast the receiver can accept data.
In addition, the transport layer splits long data into pieces no larger than the network can handle. For example, Ethernet cannot carry packets larger than 1500 bytes. The sender's transport layer divides the data into smaller segments and assigns each one a sequence number, so that when the data reaches the receiver's transport layer it can be reassembled in the correct order, a process called sequencing. The segment size is derived from the MTU of the network interface on the link.
The Ethernet MTU is 1500 bytes.
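To make the numbers concrete, here is a small sketch of how a payload would be cut into IPv4 fragments for a given MTU. It assumes a 20-byte IPv4 header without options, and it uses the IPv4 rule that every fragment except the last carries a multiple of 8 data bytes. The function name `fragment_sizes` is invented for this example.

```python
def fragment_sizes(payload_len, mtu=1500, ip_header_len=20):
    """Return (offset, data_length) pairs for each IPv4 fragment of a payload.

    Assumes a fixed ip_header_len-byte header on every fragment. All fragments
    except the last must carry a multiple of 8 bytes, because the fragment
    offset field counts in units of 8 bytes.
    """
    max_data = (mtu - ip_header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        fragments.append((offset, size))
        offset += size
    return fragments


# A 4000-byte payload over Ethernet (MTU 1500) needs three fragments.
print(fragment_sizes(4000))  # [(0, 1480), (1480, 1480), (2960, 1040)]
```

The same arithmetic explains the usual TCP segment size on Ethernet: 1500 bytes minus a 20-byte IP header and a 20-byte TCP header leaves 1460 bytes of data per segment, again assuming no header options.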
RIP, OSPF, and BGP basics
The Routing Information Protocol (RIP) is one of the most widely used interior gateway protocols (IGPs). An IGP is a routing protocol used within an internal network (in rare cases it is also used on networks connected to the Internet). It lets routers adapt dynamically to changes in network connectivity by continually exchanging information, including which networks each router can reach and how far away those networks are. RIP serves the network layer, although its messages are carried over UDP.
Open Shortest Path First (OSPF) is an implementation of a link-state routing protocol. It is the most widely used IGP in large and medium-sized networks, and it operates within a single autonomous system (AS). It uses the well-known Dijkstra algorithm to compute the shortest-path tree, with "cost" as the routing metric. The link-state database (LSDB) holds the current network topology and is identical on all routers in the same area.
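The shortest-path computation that OSPF relies on can be illustrated with a compact version of Dijkstra's algorithm. This is a generic textbook sketch, not OSPF itself: the router names and link costs are invented, and the graph dictionary plays the role of the link-state database.

```python
import heapq


def shortest_path_costs(graph, source):
    """Dijkstra's algorithm: lowest total cost from source to every reachable node.

    graph maps each node to a dict of {neighbor: link cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist


# Hypothetical topology: four routers with symmetric link costs.
TOPOLOGY = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 2, "R4": 1},
    "R3": {"R1": 1, "R2": 2, "R4": 5},
    "R4": {"R2": 1, "R3": 5},
}

print(shortest_path_costs(TOPOLOGY, "R1"))
```

From R1, the direct link to R2 costs 10, but going through R3 costs only 3, which is the kind of choice the cost metric is meant to capture.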
The Border Gateway Protocol (BGP) is a routing protocol that runs between autonomous systems. It is the core decentralized, autonomous routing protocol of the Internet.
BGP is the only protocol designed to handle a network the size of the Internet, and the only one that properly handles multiple connections between unrelated routing domains. BGP builds on the experience gained with EGP. The main function of a BGP system is to exchange network reachability information with other BGP systems. This reachability information includes the list of autonomous systems (AS) that a route traverses. The information is enough to construct the topology of AS interconnections, to eliminate routing loops, and to apply policy decisions at the AS level.
DNS
DNS (Domain Name System) is a distributed database that maps domain names to IP addresses on the Internet. It lets people reach the Internet more conveniently, without having to remember the numeric IP address strings that machines read directly. The process of obtaining the IP address that corresponds to a host name is called domain name resolution (or host name resolution). DNS runs mainly over UDP and uses port 53.
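To show what a resolution request looks like on the wire, the sketch below builds the bytes of a minimal DNS query for an A record using only `struct`. It follows the standard DNS message layout (a 12-byte header followed by one question), but it is an illustration only: the function name `build_dns_query` and the fixed query ID are assumptions, and no packet is actually sent.

```python
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build the raw bytes of a DNS query asking for the A record of hostname."""
    # Header: ID, flags (only "recursion desired" set), 1 question,
    # 0 answers, 0 authority records, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # The name is encoded as length-prefixed labels, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"

    # QTYPE = 1 (A record), QCLASS = 1 (IN, the Internet class).
    question = qname + struct.pack("!HH", 1, 1)
    return header + question


packet = build_dns_query("example.com")
print(len(packet), packet.hex())
```

In a real program these bytes would be sent in a UDP datagram to port 53 of a DNS server, or, more simply, the lookup would be left to `socket.getaddrinfo`.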
Differences between TCP, UDP, and HTTP
TCP/IP is a protocol suite made up of many protocols, and TCP, UDP, and HTTP are all members of it. The suite is called TCP/IP because TCP and IP are its two most important protocols, so it is named after them.
1) The TCP/IP protocol suite can be divided into three layers: the network layer, the transport layer, and the application layer. The network layer has IP, ICMP, ARP, RARP, and BOOTP. The transport layer has TCP and UDP. The application layer has FTP, HTTP, Telnet, SMTP, DNS, and others. HTTP, for its part, is the protocol that transfers hypertext from a web server to a local browser.
2) HTTP is based on a request/response model. The client first establishes a TCP connection with the server and sends a request containing the request method, the URI, the protocol version, and a MIME-like message. The server responds with a status line containing the protocol version and a success or error code, followed by a MIME-like message. HTTP/1.0 opens a new TCP connection for every request/response pair, so a page containing HTML and images needs many short-lived TCP connections. Each TCP connection requires a three-way handshake to set up, and TCP also needs additional round-trip times (RTTs) to ramp up to an appropriate transmission rate. Every connection pays this fixed overhead, which carries no useful data and only ensures that the connection is reliable. HTTP/1.1 therefore introduced persistent connections: it establishes a TCP connection once and reuses it for a series of request/response messages, which reduces both the number of connection setups and the per-connection overhead.
3) Although HTTP is a protocol in its own right, it ultimately runs on top of TCP. Some people are now working on HTTP over TCP + UDP.
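The effect of an HTTP/1.1 persistent connection can be observed with Python's standard library alone. The sketch below starts a throwaway local server, sends several requests through one `http.client.HTTPConnection`, and counts how many TCP connections the server saw. The handler, the counter `connections_opened`, and the function `fetch_over_one_connection` are all invented for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

connections_opened = 0  # how many TCP connections the demo server accepted


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def setup(self):
        # setup() runs once per accepted TCP connection, not once per request.
        global connections_opened
        connections_opened += 1
        super().setup()

    def do_GET(self):
        body = f"you asked for {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))  # needed for reuse
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """GET each path over a single client connection to a local test server."""
    global connections_opened
    connections_opened = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    statuses = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before reusing the connection
            statuses.append(response.status)
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return statuses, connections_opened


if __name__ == "__main__":
    print(fetch_over_one_connection(["/index.html", "/logo.png", "/style.css"]))
```

Three requests are answered over a single TCP connection, so the handshake cost is paid once. Under HTTP/1.0 behavior each request would have needed its own connection.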
What happens after you enter a URL in the browser
For details, see: Introduction to Internet protocols (II)
The sections above cover some of the terms and concepts of computer networking. Of course, this is only the beginning.
If you want to go deeper, I can say with confidence that the following series by Vamei is the best resource for learning how computer networks actually work. It distills the essentials of "TCP/IP Illustrated, Volume 1", the book widely regarded as the bible of the field.
TCP/IP protocol details (1) Postmaster and post office (network protocol overview)
TCP/IP protocol details (2) Small speakers start broadcasting (Ethernet and WiFi protocols)
TCP/IP protocol details (3) IP race (IP, ARP, RIP and BGP protocols)
TCP/IP protocol details (4) Address depletion crisis (IPv4 and IPv6 addresses)
TCP/IP protocol details (5) I try my best (IP protocol details)
TCP/IP protocol details (6) Swiss Army Knife (ICMP protocol)
TCP/IP protocol details (7) Slave (UDP protocol)
TCP/IP protocol details (8) Do not give up (TCP and stream communication)
TCP/IP protocol details (9) Love sound transmitter (TCP connection)
TCP/IP protocol details (10) Devil details (TCP sliding window management)
TCP/IP protocol details (11) Nirvana (TCP retransmission)
TCP/IP protocol details (12) The world for public (TCP congestion control)
TCP/IP protocol details (13) 9527 (DNS protocol)
TCP/IP protocol details (14) Reverse attack (CIDR and NAT)
TCP/IP protocol details (15) Mr., do you want to order? (HTTP overview)
Once you have worked through the articles above, you will have a decent grasp of the principles. What comes next is practice, so try implementing the following network programs.
1. Implement a simple server/client model.
2. Implement a server that serves multiple clients at the same time using multiple processes or threads (a blocking network program).
3. Implement an event-driven server (for example with Linux epoll), that is, an asynchronous non-blocking network program.
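As a starting point for exercises 1 and 2, here is a minimal sketch of a blocking echo server that handles each client in its own thread, together with a matching client. It is one possible shape for the exercise, not a reference solution: the function names `start_echo_server` and `echo_once` are invented, and error handling is kept to a bare minimum.

```python
import socket
import threading


def handle_client(conn):
    """Echo everything one client sends until it closes its side."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:      # empty read: the client has sent its FIN
                break
            conn.sendall(data)


def start_echo_server(host="127.0.0.1", port=0):
    """Start a thread-per-client echo server and return its listening socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))  # port 0 asks the OS for any free port
    listener.listen()

    def accept_loop():
        while True:
            try:
                conn, _addr = listener.accept()  # blocks until a client connects
            except OSError:
                break                            # listener was closed
            threading.Thread(target=handle_client, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener


def echo_once(address, message):
    """Connect, send message, and return whatever the server echoes back."""
    with socket.create_connection(address) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)  # done sending; we can still receive
        chunks = []
        while True:
            data = sock.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    server = start_echo_server()
    print(echo_once(server.getsockname(), b"hello, network"))
    server.close()
```

The `shutdown(SHUT_WR)` call in the client is the half-close from the four-way wave discussed earlier: the client stops sending but keeps reading until the server closes its side. For exercise 3, the thread-per-client loop would be replaced by a single loop built on a readiness API such as the standard `selectors` module.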
If you have done all of the above, keep going! With nginx sweeping the globe, you are probably curious about how it works. Below is a roundup of several excellent articles for learning nginx.
Nginx learning resources roundup
Finally, there are two good articles related to computer networks that should not be missed:
Things about TCP (Part 1)
Things about TCP (Part 2)