Ext.: http://kb.cnblogs.com/page/211867/
Almost every computer program involves network communication, so every programmer needs a basic understanding of computer networks.
This article introduces basic networking concepts and links to some high-quality articles for further reference. Working through it should give you a well-rounded overview of how computer networks operate.
Before reading on, it is recommended to read the following two articles for a general understanding of how computer networks work.
Introduction to Internet Protocol (i)
Introduction to Internet Protocol (ii)
Next, we introduce some basic network knowledge.
OSI Reference Model
Does the OSI seven-layer reference model make you a bit dizzy? If so, first read the two articles recommended at the beginning of this article.
Layer 7: Application Layer
The application layer provides the interface through which applications communicate over the network and present results to the user. Common protocols at this layer include HTTP, HTTPS, FTP, TELNET, SSH, SMTP, POP3, and so on.
Layer 6: Presentation Layer
The presentation layer converts data between the syntaxes and encodings used by different clients so that the receiving system can interpret it correctly. It can also provide compression and decompression, and encryption and decryption.
Layer 5: Session Layer
The session layer defines how the two parties communicate, and it creates and tears down sessions (the ongoing communication between the two parties).
Layer 4: Transport Layer
The transport layer performs flow control, error detection, and error handling to keep communication running smoothly. The sender's transport layer splits data into segments labeled with sequence numbers so that the receiver can reassemble them into usable data or files.
Layer 3: Network Layer
The network layer determines how the sender's data reaches the receiver. It chooses the best path from node X to node Y by considering network congestion, quality of service, sending priority, and the cost of each route. The familiar router works at this layer, and networks become interconnected through its continual receiving and forwarding of data.
Layer 2: Data Link Layer
The data link layer manages the raw bit data of layer 1 and delivers it correctly over a link free of transmission errors. It also creates and recognizes the markers for where data starts and ends. In addition, it handles damaged, lost, or duplicated data so that the higher layers are unaffected; to do so it performs error detection, retransmission, or correction, and decides when a device may transmit. Typical devices: bridges and switches.
Layer 1: Physical Layer
The physical layer defines the specifications for all electronic and physical devices. In particular, it defines the relationship between a device and the physical medium, including the design of pins, voltages, cable specifications, hubs, repeaters, network cards, host adapters (such as those used in a SAN), and other devices. Because the physical layer transmits a raw bit stream, its design goal is to ensure that when one side sends a binary "1", the other side receives a binary "1" and not a binary "0". It is therefore necessary to define how many pins a device has, which signal represents a binary "1" or a binary "0", how many microseconds a bit lasts, whether transmission can take place in both directions at once, how the initial connection is established, and how it is finally terminated.
To better understand the difference between the physical layer and the data link layer, think of the physical layer as mainly concerned with the interaction between a single device and the transmission medium, while the data link layer is more concerned with interactions among multiple devices (at least two) that share the same medium. The physical layer's role is to tell a device how to send a signal onto the medium and how another device receives that signal (in most cases it does not tell the device how to connect to the medium). Some older physical layer standards, such as RS-232, do use physical wires to control access to the medium.
The main functions of the physical layer and the services provided are as follows:
- Establishes and terminates a connection between the device and the transmission medium.
- Participates in the process by which communication resources are shared effectively among multiple users, for example contention resolution and flow control.
- Modulates or converts signals so that the digital data in the user's device matches the signal actually transmitted over the channel. These signals can travel over physical cables, such as copper wire and optical fiber, or over wireless channels.
The five-layer TCP/IP model
Compared with the OSI seven-layer model, the five-layer TCP/IP model is used more often. It merges the application, presentation, and session layers of the OSI model into a single application layer, leaving five layers: application, transport, network, data link, and physical.
The TCP three-way handshake and four-way wave
Three-way handshake:
First handshake: The client sends a SYN packet (seq=x) to the server and enters the SYN_SENT state, waiting for the server to confirm.
Second handshake: The server receives the SYN packet and must acknowledge the client's SYN (ack=x+1) while also sending a SYN of its own (seq=y), that is, a SYN+ACK packet. The server then enters the SYN_RCVD state.
Third handshake: The client receives the server's SYN+ACK packet and sends an acknowledgment packet ACK (ack=y+1) to the server. Once this packet is sent, the client and the server enter the ESTABLISHED state and the three-way handshake is complete.
The packets exchanged during the handshake carry no data; the client and server begin transmitting data only after the three-way handshake completes. Ideally, once a TCP connection is established, it is maintained until either side actively closes it.
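The sequence and acknowledgment numbers in the three handshakes can be traced with a short Python sketch. The `Segment` class and the `three_way_handshake` function are hypothetical names used only for illustration, the initial sequence numbers are passed in by hand, and wraparound at 2^32 is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str           # "SYN", "SYN+ACK" or "ACK"
    seq: int             # sequence number carried by this segment
    ack: Optional[int]   # acknowledgment number, None when the ACK flag is not set


def three_way_handshake(x, y):
    """Return the three segments exchanged for client ISN x and server ISN y."""
    # First handshake: client -> server, client enters SYN_SENT.
    syn = Segment("SYN", seq=x, ack=None)
    # Second handshake: server -> client, acknowledges x and sends its own SYN.
    syn_ack = Segment("SYN+ACK", seq=y, ack=syn.seq + 1)
    # Third handshake: client -> server, acknowledges y; both sides ESTABLISHED.
    ack = Segment("ACK", seq=x + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(x=100, y=300):
        print(segment)
```

Each SYN consumes one sequence number, which is why the acknowledgment numbers are x+1 and y+1 even though no data has been sent.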
Just as establishing a connection takes a "three-way handshake", tearing down a TCP connection takes a "four-way wave".
First wave: The active closer sends a FIN to close data transfer from itself to the passive closer. In other words, it tells the passive closer: I will not send you any more data. (Data sent before the FIN that has not yet been acknowledged will still be retransmitted.) The active closer can still receive data at this point.
Second wave: The passive closer receives the FIN and sends back an ACK whose acknowledgment number is the received sequence number + 1 (like a SYN, a FIN consumes one sequence number).
Third wave: The passive closer sends a FIN to close data transfer from itself to the active closer. In other words, it tells the active closer: I have finished sending my data and will not send you any more.
Fourth wave: The active closer receives the FIN and sends an ACK to the passive closer, with the acknowledgment number set to the received sequence number + 1. This completes the four-way wave.
The centerpiece: the TCP/IP state transition diagram
TCP/IP is a protocol family in computer networking and a centerpiece of network programming. Understanding the TCP/IP state transition diagram is important for understanding how the protocol works.
The diagram describes each transition from one state to another together with the condition that triggers it.
The states are described in detail below:
1. CLOSED: The starting point. A connection enters this state on timeout or when it is closed.
2. LISTEN: The state in which the server waits for incoming connections. The server enters it by calling the socket, bind, and listen functions. This is known as a passive open by the application (waiting for clients to connect).
3. SYN_SENT: The client initiates a connection and sends a SYN to the server. If the server cannot be reached, the client goes straight to the CLOSED state.
4. SYN_RCVD: The counterpart of state 3. The server accepts the client's SYN request and moves from LISTEN to SYN_RCVD, responding with an ACK and sending a SYN of its own to the client. In the other case, the client receives a SYN from the server while it is itself sending a SYN, and moves from SYN_SENT to SYN_RCVD.
5. ESTABLISHED: The state both server and client enter after completing the three-way handshake, indicating that data can now be transferred.
These are the state transitions that the server and the client go through while a connection is being established. They are relatively simple; if you are familiar with the three-way handshake, they are easy to follow.
Next, let's look at the state transitions when a connection is closed. Closing takes four exchanges between the two sides and also involves some cleanup (the TIME_WAIT state). Note that the "active closer" and "passive closer" here do not refer specifically to the server or the client; they are defined relative to whichever side initiates the close first:
6. FIN_WAIT_1: The active closer enters this state from state 5. The specific action is to send a FIN to the peer.
7. FIN_WAIT_2: The active closer enters this state when it receives the peer's ACK of its FIN. At this point it can no longer send data to the peer, but it can still receive data from it.
8. CLOSE_WAIT: The passive closer enters this state after receiving a FIN. The specific action is to receive the FIN and send an ACK.
9. LAST_ACK: The passive closer initiates its own close and enters this state from state 8. The specific action is to send a FIN to the peer; when the ACK arrives, it enters the CLOSED state.
10. CLOSING: When both sides initiate a close at the same time, this state is entered from FIN_WAIT_1. The specific action is to receive a FIN and respond with an ACK.
11. TIME_WAIT: Now for the most tangled state. As the state diagram shows, three states can transition into it. Let's analyze them one by one:
A. Entered from FIN_WAIT_2: The two sides did not initiate FIN at the same time. The active closer has completed its own close request and then receives the passive closer's FIN.
B. Entered from CLOSING: Both sides initiated the close and both sent a FIN. Each received the other's FIN and replied with an ACK, which put it in the CLOSING state, and it moves on once the ACK of its own FIN arrives.
C. Entered from FIN_WAIT_1: The FIN (initiated by the peer) and the ACK (the response to the FIN this side initiated) are received at the same time. This differs from B in that the ACK of this side's own FIN arrives no later than the peer's FIN, whereas in B the peer's FIN arrives first. The probability of this case is very small.
Among the states of the four-way close, TIME_WAIT is the hardest to understand. There are two reasons for TIME_WAIT:
1. To implement the termination of TCP's full-duplex connection reliably.
2. To allow old duplicate segments to expire in the network.
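The eleven states above can be written down as a lookup table. The following Python sketch is only an illustration: the event labels (such as `recv_fin` or `timeout_2msl`) are informal names invented here, and the 2MSL timer that ends TIME_WAIT is standard TCP behaviour that this article does not spell out.

```python
# (current state, event) -> next state.
# Event labels are informal names for this sketch, not identifiers from any API.
TRANSITIONS = {
    ("CLOSED", "passive_open"): "LISTEN",
    ("CLOSED", "active_open_send_syn"): "SYN_SENT",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_SENT", "recv_syn"): "SYN_RCVD",          # simultaneous open
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close_send_fin"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",         # simultaneous close
    ("FIN_WAIT_1", "recv_fin_and_ack"): "TIME_WAIT",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("CLOSE_WAIT", "close_send_fin"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an invalid transition
    return state


if __name__ == "__main__":
    active_close = ["close_send_fin", "recv_ack", "recv_fin", "timeout_2msl"]
    print(run("ESTABLISHED", active_close))
```

Walking the table by hand for the active closer (FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT, CLOSED) and the passive closer (CLOSE_WAIT, LAST_ACK, CLOSED) reproduces the four-way wave.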
The concept and function of the MAC address
The MAC address (Media Access Control address), also called the physical address, identifies a network device on the network. In the OSI model, layer 3 (the network layer) is responsible for IP addresses and layer 2 (the data link layer) is responsible for MAC addresses. A host has an IP address, and each network interface has a MAC address of its own.
The purpose of ARP and how it works
The Address Resolution Protocol (ARP) looks up the MAC address of a target device from its IP address so that communication can proceed. It is an essential network layer protocol in IPv4, but it is no longer used in IPv6, where it is replaced by the Neighbor Discovery Protocol (NDP).
There is an ARP cache table in each computer or router that has the TCP/IP protocol installed, and the IP address in the table corresponds to the MAC address, as shown in the following table.
Host name | IP address | MAC address
A | 192.168.38.10 | 00-aa-00-62-d2-02
B | 192.168.38.11 | 00-bb-00-62-c2-02
C | 192.168.38.12 | 00-cc-00-62-c2-02
D | 192.168.38.13 | 00-dd-00-62-c2-02
E | 192.168.38.14 | 00-ee-00-62-c2-02
... | ... | ...
Take host A (192.168.38.10) sending data to host B (192.168.38.11) as an example. Before sending, host A looks up the destination IP address in its own ARP cache table. If it finds an entry, it already knows the target MAC address (00-BB-00-62-C2-02) and simply writes it into the frame and sends it. If no entry is found, host A broadcasts an ARP request on the network with the target MAC address "FF-FF-FF-FF-FF-FF", which asks every host on the same network segment: "What is the MAC address of 192.168.38.11?" Other hosts on the network do not respond to the query; only host B, on receiving the frame, sends host A an ARP response: "The MAC address of 192.168.38.11 is 00-BB-00-62-C2-02."
In this way host A learns host B's MAC address and can send data to host B. It also updates its ARP cache table, so the next time it sends to host B it can look the address up directly. The ARP cache table uses an aging mechanism: a row that has not been used for a certain period is deleted, which keeps the table short and lookups fast.
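The lookup, broadcast, and aging behaviour described above can be sketched in Python. This is a simplified model rather than a real ARP implementation: `ArpCache`, `resolve`, and the `broadcast_request` callback are hypothetical names, and the 120-second `max_age` is an arbitrary assumption.

```python
import time


class ArpCache:
    """IP -> MAC table whose rows expire when unused for max_age seconds."""

    def __init__(self, max_age=120.0, clock=time.monotonic):
        self.max_age = max_age      # assumed aging period, not a standard value
        self.clock = clock          # injectable so the aging can be tested
        self._entries = {}          # ip -> (mac, time the row was last used)

    def learn(self, ip, mac):
        self._entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        now = self.clock()
        if now - last_used > self.max_age:
            del self._entries[ip]   # the row aged out
            return None
        self._entries[ip] = (mac, now)  # using a row refreshes it
        return mac


def resolve(cache, ip, broadcast_request):
    """Return the MAC for ip, broadcasting a request only on a cache miss."""
    mac = cache.lookup(ip)
    if mac is None:
        mac = broadcast_request(ip)  # stands in for the ARP request/response
        if mac is not None:
            cache.learn(ip, mac)
    return mac


if __name__ == "__main__":
    hosts = {"192.168.38.11": "00-bb-00-62-c2-02"}
    cache = ArpCache()
    print(resolve(cache, "192.168.38.11", hosts.get))  # miss, then learned
    print(resolve(cache, "192.168.38.11", hosts.get))  # served from the cache
```

Passing the clock in as a parameter is a design choice made only so that the expiry logic can be exercised without waiting for real time to pass.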
Switches, routers, and gateways: concepts and purposes
1) Switch
In a computer network, the switch was introduced to address the weakness of the shared-medium mode of operation. A switch has a high-bandwidth backplane bus and an internal switching matrix, and all of its ports are attached to the backplane bus. When the control circuit receives a frame, the processing port looks up the address table in memory to determine which port the NIC with the destination MAC address (the hardware address of the network card) is attached to, and the switching matrix quickly delivers the frame to that port. If the destination MAC address is not in the table, the switch floods the frame to all ports; after the receiving port responds, the switch "learns" the new address and adds it to its internal address table.
The switch works at layer 2 of the OSI reference model, the data link layer. The CPU inside the switch learns the MAC addresses reachable through each port from the source addresses of the frames arriving on it and stores them in a MAC address table. From then on, frames destined for a given MAC address are sent only to the corresponding port rather than to all ports. A switch therefore divides collision domains, but it does not divide broadcast domains.
Switches are widely used for layer 2 switching and are commonly called "layer 2 switches".
Switch types include layer 2, layer 3, layer 4, and layer 7 switches, which work at the second, third, fourth, and seventh layers of the OSI seven-layer model respectively, hence the names.
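The address-table lookup, flooding, and learning of a layer 2 switch can be modeled in a few lines of Python. This is a minimal sketch that assumes the switch records the source MAC address of every frame it receives; `LearningSwitch` and its method names are hypothetical, and real switches also age out entries and handle VLANs, which is omitted here.

```python
class LearningSwitch:
    """Toy layer 2 switch: learns source addresses, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        # Learn: the sender is reachable through the port the frame arrived on.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the incoming one.
            return [port for port in self.ports if port != in_port]
        if out_port == in_port:
            # Destination is on the same port as the sender: nothing to forward.
            return []
        return [out_port]


if __name__ == "__main__":
    switch = LearningSwitch(ports=[1, 2, 3])
    print(switch.receive(1, "00-aa-00-62-d2-02", "00-bb-00-62-c2-02"))  # flooded
    print(switch.receive(2, "00-bb-00-62-c2-02", "00-aa-00-62-d2-02"))  # one port
    print(switch.receive(1, "00-aa-00-62-d2-02", "00-bb-00-62-c2-02"))  # one port
```

After the first reply, both addresses are in the table and traffic between the two hosts no longer reaches the third port.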
2) Router
A router is a network device that provides two important mechanisms: routing and forwarding. Deciding the path that packets take from source to destination (the path between hosts) is called routing; moving a packet from a router's input to the appropriate output (inside the router) is called forwarding. Routing works at layer 3 of the OSI model, the network layer, for example with the Internet Protocol.
One function of a router is to connect different networks; the other is to choose the route along which information travels. A router is a layer 3 device in the OSI model, while a switch is a layer 2 device (specifically, a layer 2 switch).
3) Gateway
A gateway, as the name implies, is a device that connects two networks, as distinct from a router. (For historical reasons, much of the TCP/IP literature has used the term gateway for a network layer router. Today many local area networks reach other networks through a router, so "the gateway" now usually means the router's IP address.) Gateways are common in home and small business networks, where they connect the LAN to the Internet. The term also often refers to devices that convert one protocol into another, such as a voice gateway.
In traditional TCP/IP terminology, there are only two kinds of network devices: gateways and hosts. A gateway can forward packets between networks, but a host cannot. In a host (also called an end system), a packet passes through all four TCP/IP layers, whereas in a gateway (also called an intermediate system) it only needs to reach the internet layer, where the path is decided and the packet is forwarded. At that time, there was no difference between a gateway and a router.
In modern network terminology, gateways and routers are defined differently. A gateway moves data between different protocols, while a router moves data between different networks and is equivalent to the traditional IP gateway.
A gateway is a device that connects two networks. A voice gateway, for example, connects the PSTN and Ethernet, which amounts to VoIP: the analog signal from a telephone is converted to a digital signal at the gateway, wrapped in the appropriate protocol, and transmitted. At the receiving end, a gateway restores the analog telephone signal so that it can finally be heard on the telephone.
A gateway in an Ethernet network can only forward packets at layer 3 and above, the same as routing. The difference is that the gateway has no routing table and can only forward between predefined network segments. The most important feature of a gateway is port mapping: to users outside, the hosts in the subnet appear only as different ports on the external IP address, which in effect protects the users inside the subnet.
A first look at routing tables
The routing table, or Routing Information Base (RIB), is a table (file) or database-like structure stored on a router or networked computer. It stores the paths to specific network addresses (in some cases together with the route metrics of those paths) and contains topology information about the surrounding network. The main purpose of a routing table is to support routing protocols and static routing.
A routing table works much like using a map to deliver a parcel. Whenever a node needs to send data to another node on the network, it must know where to send it. If the device cannot connect directly to the destination node, it must find another way to deliver the packet. Within a local area network, a node usually does not work out the path itself; it hands the IP packet to the gateway. Delivering a packet to the correct address is a complex task, and the gateway needs to keep track of the paths along which packets are sent. The routing table stores this path information, like a map: a database that records paths and serves the nodes that need them.
An example routing table is shown:
Routing table fields:
Destination: The destination network segment.
Mask: The subnet mask.
Interface: The IP address of the router's outgoing interface used to reach the destination.
Gateway: The IP address of the next-hop router's incoming interface. A router defines a link to the next router through the interface and gateway; typically the interface and gateway are on the same network segment.
Metric: The hop count, which records the quality of the route. In general, if there is more than one route to the same destination, the router chooses the route with the smaller metric.
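To make the fields concrete, here is a hedged Python sketch of a route lookup using the standard `ipaddress` module. The `Route` class, the `select_route` function, and all addresses are invented for illustration. The sketch also assumes longest-prefix matching (the most specific destination wins before the metric is compared), which is common router behaviour but is not stated in this article.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    destination: str  # destination network segment
    mask: str         # subnet mask
    gateway: str      # next-hop router address
    interface: str    # local outgoing interface address
    metric: int       # lower is better

    @property
    def network(self):
        return ipaddress.ip_network(f"{self.destination}/{self.mask}")


def select_route(routes, dst_ip):
    """Pick the matching route with the longest prefix, then the smallest metric."""
    address = ipaddress.ip_address(dst_ip)
    candidates = [route for route in routes if address in route.network]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r.network.prefixlen, -r.metric))


if __name__ == "__main__":
    table = [
        Route("0.0.0.0", "0.0.0.0", "192.168.1.1", "192.168.1.10", 20),
        Route("192.168.1.0", "255.255.255.0", "192.168.1.10", "192.168.1.10", 10),
        Route("10.0.0.0", "255.0.0.0", "192.168.1.2", "192.168.1.10", 5),
        Route("10.0.0.0", "255.0.0.0", "192.168.1.3", "192.168.1.10", 1),
    ]
    for target in ("10.1.2.3", "192.168.1.77", "8.8.8.8"):
        print(target, "->", select_route(table, target).gateway)
```

The two routes to 10.0.0.0 show the metric rule from the field list: both match, and the one with the smaller metric is chosen.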
MTU
The maximum transmission unit (MTU) is the largest packet size (in bytes) that can pass through a given layer of a communication protocol. The MTU is usually associated with a communication interface (network interface card, serial port, and so on).
The Internet Protocol allows IP fragmentation, so that a packet can be split into fragments small enough to pass over links whose MTU is smaller than the packet's original size. Fragmentation takes place at the network layer (layer 3 of the OSI model). Layer 4, the transport layer, is the most important layer in the OSI model; there, transmission is governed by the window rather than by the MTU. The transport protocol performs flow control, setting an appropriate sending rate based on how quickly the receiver can accept data.
In addition, the transport layer splits long data into pieces no larger than the network can handle. For example, Ethernet cannot carry packets larger than 1500 bytes. The sender's transport layer splits the data into smaller pieces and assigns each a sequence number so that the receiver's transport layer can reassemble them in the correct order; this process is called sequencing. The size it uses is derived from the MTU of the network interface through which the packets are sent onto the link.
The Ethernet MTU is 1500 bytes.
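As a worked example of IP fragmentation, the Python sketch below splits an IPv4 payload for a given MTU. It assumes a 20-byte IPv4 header without options and relies on the IPv4 rule that every fragment except the last carries a multiple of 8 bytes; the `fragment` function is a hypothetical helper, not part of any library.

```python
def fragment(payload_len, mtu=1500, ip_header=20):
    """Split an IPv4 payload into (byte_offset, length, more_fragments) tuples."""
    # Data per fragment: what fits after the header, rounded down to a multiple
    # of 8 because the IPv4 fragment offset field counts 8-byte units.
    max_data = (mtu - ip_header) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")

    fragments = []
    offset = 0
    while offset < payload_len:
        length = min(max_data, payload_len - offset)
        more_fragments = offset + length < payload_len
        fragments.append((offset, length, more_fragments))
        offset += length
    return fragments


if __name__ == "__main__":
    for offset, length, more in fragment(4000):
        print(f"offset={offset:5d} length={length:5d} more_fragments={more}")
```

With the Ethernet MTU of 1500 bytes, each fragment carries at most 1480 bytes of data, so a 4000-byte payload becomes fragments of 1480, 1480, and 1040 bytes.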
Getting to know RIP, OSPF, and BGP
The Routing Information Protocol (RIP) is one of the most widely used interior gateway protocols (IGPs). An IGP is a routing protocol used within an internal network (and, in a few cases, a network connected to the Internet). It lets routers adapt dynamically to changes in network connectivity by continually exchanging information, including which networks each router can reach and how far away those networks are. RIP supports network layer routing, although its messages are carried over UDP.
Open Shortest Path First (OSPF) is an implementation of a link-state routing protocol. It is the most widely used IGP in large and medium-sized networks and operates within a single autonomous system. It uses the well-known Dijkstra algorithm to compute the shortest path tree and uses "cost" as its route metric. The link state database (LSDB) holds the current network topology and is identical on all routers in the same area.
The Border Gateway Protocol (BGP) is a routing protocol that runs between autonomous systems; it is the core decentralized routing protocol of the Internet.
BGP is the only protocol designed to handle networks the size of the Internet, and the only one that can properly handle multiple connections between unrelated routing domains. BGP builds on the experience of EGP. The main function of a BGP system is to exchange network reachability information with other BGP systems. This information includes the list of autonomous systems (ASes) that a route traverses. It is sufficient to construct a graph of AS connectivity, from which routing loops can be eliminated and policy decisions can be enforced at the AS level.
DNS
DNS (Domain Name System) is a distributed database on the Internet that maps domain names and IP addresses to each other. It makes the Internet easier to use, since users do not have to remember the numeric IP strings that machines read directly. Obtaining the IP address that corresponds to a host name is called domain name resolution (or host name resolution). The DNS protocol runs on top of UDP and uses port 53.
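To show what a DNS lookup over UDP port 53 looks like on the wire, the Python sketch below builds a minimal query for an A record by hand. The function names, the query ID, and the `server` parameter are assumptions made for illustration; in everyday code one would normally call `socket.getaddrinfo` and let the operating system's resolver do this work.

```python
import socket
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Build a DNS query message asking for the A record of hostname."""
    # Header: ID, flags (0x0100 = standard query, recursion desired),
    # then the question/answer/authority/additional counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A (1), QCLASS=IN (1)
    return header + question


def send_query(query, server, timeout=2.0):
    """Send the query to a resolver over UDP port 53 and return the raw reply.

    `server` is the IP address of a DNS resolver you are allowed to use; it is
    a parameter here because this sketch does not assume any particular one.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, 53))
        response, _ = sock.recvfrom(512)
    return response


if __name__ == "__main__":
    print(build_dns_query("example.com").hex())
```

Only the query construction runs in the example; parsing the reply is left out to keep the sketch short.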
Differences and relationships between TCP, UDP, and HTTP
TCP/IP is a protocol suite that contains many protocols; TCP, UDP, and HTTP are just members of it. The suite is called TCP/IP because TCP and IP are two of its most important protocols.
1) The TCP/IP protocol suite can be broadly divided into three layers: the network layer, the transport layer, and the application layer.
The network layer contains IP, ICMP, ARP, RARP, and BOOTP.
The transport layer contains TCP and UDP.
The application layer contains FTP, HTTP, TELNET, SMTP, DNS, and other protocols.
HTTP, for its part, is the protocol used to transfer hypertext from a web server to a local browser.
2) HTTP is built on a request/response model. The client first establishes a TCP connection to the server and sends a request containing the request method, URI, protocol version, and an associated MIME-style message. The server replies with a status line containing the protocol version of the message and a success or failure code, followed by an associated MIME-style message.
HTTP/1.0 opens a new TCP connection for every HTTP request/response, so a page containing HTML and images needs several short-lived TCP connections. Establishing each TCP connection requires a three-way handshake.
In addition, TCP needs extra round-trip times (RTTs) to reach a suitable transfer speed. Every new connection pays this recurring cost, which carries no useful data and only serves to make the connection reliable. HTTP/1.1 therefore introduced persistent connections: it sends a series of request/response messages over a single TCP connection, reducing the number of connections established and the associated overhead.
3) Although HTTP is a protocol in its own right, it is ultimately built on TCP. Some people are currently studying HTTP over a hybrid of TCP and UDP.
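The request and status line described in point 2 are plain text, which the Python sketch below builds and parses by hand. `build_request` and `parse_status_line` are hypothetical helpers written for this illustration; a real program would normally use a library such as `http.client`, and the parser assumes the status line includes a reason phrase.

```python
def build_request(method, path, host, keep_alive=True):
    """Build a minimal HTTP/1.1 request: request line, headers, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request method, URI, protocol version
        f"Host: {host}",
        # HTTP/1.1 connections are persistent unless one side asks to close.
        f"Connection: {'keep-alive' if keep_alive else 'close'}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Return (protocol version, status code, reason phrase) from a response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_request("GET", "/index.html", "example.com").decode("ascii"))
    print(parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))
```

With a persistent connection, several such requests can be written to the same TCP socket one after another, which avoids a new three-way handshake per resource.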
What happens when you enter a URL in the browser
See: Introduction to Internet Protocol (ii)
The above covers some terminology and concepts of computer networking. Of course, this is only the beginning.
If you want to learn more, I can responsibly tell you that the links below are the best resources for understanding in depth how computer networks work. They distill the essence of the classic book "TCP/IP Illustrated, Volume 1" and were written by Vamei.
TCP-IP Protocol Details (1) Postman and Post Office (network protocol overview)
TCP-IP Protocol Details (2) Small Speakers Start Broadcasting (Ethernet and WiFi protocols)
TCP-IP Protocol Details (3) IP Relay (IP, ARP, RIP, and BGP protocols)
TCP-IP Protocol Details (4) Address Exhaustion Crisis (IPv4 and IPv6 addresses)
TCP-IP Protocol Details (5) I Do My Best (IP protocol explained)
TCP-IP Protocol Details (6) Swiss Army Knife (ICMP protocol)
TCP-IP Protocol Details (7) Puppet (UDP protocol)
TCP-IP Protocol Details (8) Do Not Discard (TCP protocol and stream communication)
TCP-IP Protocol Details (9) Love of the Megaphone (TCP connection)
TCP-IP Protocol Details (10) Devil's Detail (TCP sliding window management)
TCP-IP Protocol Details (11) Nirvana (TCP retransmission)
TCP-IP Protocol Details (12) Ephah for (TCP congestion control)
TCP-IP Protocol Details (13) 9527 (DNS protocol)
TCP-IP Protocol Details (14) Reverse Attack (CIDR and NAT)
TCP-IP Protocol Details (15) Sir, Would You Like to Order? (HTTP protocol overview)
Once you understand the articles above, you have covered most of the theory. What comes next is hands-on practice. Let's implement the following network programs.
1. Implement a simple question-and-answer (request/response) server/client model.
2. Implement a multi-process or multi-threaded server that serves multiple clients at the same time (a blocking network program).
3. Implement an event-driven server (for example with Linux epoll) (an asynchronous, non-blocking network program).
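As a starting point for the first two exercises, here is a minimal sketch of a threaded echo server and a client using Python's standard `socketserver` and `socket` modules. It is one possible shape for the exercise, not a reference solution: `EchoHandler`, `start_server`, and `ask` are names chosen for this example, and the "protocol" is simply that the server sends back whatever it receives.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Blocking handler: echo every chunk back until the client stops sending."""

    def handle(self):
        while True:
            data = self.request.recv(1024)
            if not data:  # empty read: the client closed its sending side
                break
            self.request.sendall(data)


def start_server(host="127.0.0.1", port=0):
    """Start a one-thread-per-client echo server in the background."""
    # Port 0 asks the operating system to pick any free port.
    server = socketserver.ThreadingTCPServer((host, port), EchoHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(address, message):
    """Connect, send one message, and return everything the server replies."""
    with socket.create_connection(address) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)  # signal that the question is complete
        chunks = []
        while True:
            chunk = sock.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    server = start_server()
    print(ask(server.server_address, b"hello"))
    server.shutdown()
    server.server_close()
```

For the third exercise, the same handler logic could be rewritten around Python's `selectors` module, which uses epoll on Linux, so that a single thread serves all connections without blocking.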
If you have done all of that, there is more good stuff to come. With Nginx sweeping the world, everyone is curious about how it works. Here is a summary of some very good articles for learning the essentials of Nginx:
Nginx Learning Resources Summary
Finally, two good articles on computer networking that should not be missed:
Things about TCP (Part 1)
Things about TCP (Part 2)
Self-cultivation of programmers (2)--Computer network