I. Preface
Networks are developing very quickly, and more and more people are learning about them. Anyone with a little networking knowledge knows that the TCP/IP protocol is the foundation of the network and the language of the Internet; it is fair to say that without TCP/IP there would be no Internet today. Many of the people I know who work in networking started out with a crimping tool and a cable tester. If you only want to play online, knowing ping and a few other commands is enough. If you want to go further in networking, whether on the attacking side or the defending side, you must understand the TCP/IP protocol thoroughly.
Anyone who has studied TCP/IP has probably felt that the protocol is too abstract: with no concrete data to look at, it is soon forgotten. This article introduces an intuitive way to learn TCP/IP with a protocol analysis tool, so that you can see the actual data transmission process as you learn.
To make it easier for beginners to understand, this article will build a simple network environment that does not contain subnets.
II. Test Environment
1. Network Environment
See Figure 1.
Figure 1
For ease of expression, machine 208 is the computer with the address 192.168.113.208 and machine 1 is the computer with the address 192.168.113.1.
2. Operating System
Both machines run Windows 2000. Machine 1 acts as the server and has the FTP service installed.
3. Protocol Analysis Tools
Common tools on Windows include Sniffer Pro, NetXray, Iris, and the Network Monitor that ships with Windows 2000. This article uses Iris as the protocol analysis tool.
Install the Iris software on the client, machine 208.
III. Test Process
1. Test example: download a file from machine 1 to machine 208 through FTP.
2. Iris settings.
Because Iris listens to the whole network, it will capture many unrelated packets if there are other machines on the network, which makes learning harder. To see the transmission in the example clearly, set Iris to capture only the packets exchanged between machine 208 and machine 1. The procedure is as follows:
1) Press Ctrl + B to open the address table and fill in the IP addresses of the two machines. To make the captured packets easier to read, do not fill in the host name (Name). Close the window when you are done.
Figure 2
2) Press Ctrl + E to open the filter settings. Select the "IP" address item, then in the right column drag the addresses from the address book down to the lower pane as shown in Figure 3, and click OK. Iris now captures only the packets exchanged between the two computers.
Figure 3
3. Packet capture
Click the start button in the Iris toolbar. Enter ftp://192.168.113.1 in the browser, find the file to download, right-click it, and choose "Copy to Folder" from the pop-up menu to start the download. Then click the stop button in the Iris toolbar to stop capturing. Figure 4 shows the entire FTP session. Next we analyze this process in detail.
Figure 4
Note: To capture the ARP packets, first run arp -d on Windows 2000 to clear the ARP cache.
IV. Process Analysis
1. Basic principles of TCP/IP
Although this article focuses on analyzing TCP/IP through an example, explaining the process below requires a brief description of the basic principles of TCP/IP.
A. The network is layered, and each layer is responsible for different communication functions.
TCP/IP is generally regarded as a four-layer protocol system. The TCP/IP protocol family is a group of different protocols combined together. Although the family is usually called TCP/IP, TCP and IP are only two of its protocols, as shown in Table 1. Each layer is responsible for different functions:
Table 1
The concept of layering is simple, but it matters a great deal in practice. A good grasp of it is a great help in network configuration and troubleshooting. For example, routing is the business of IP at the network layer, finding a MAC address is the business of ARP at the link layer, and the commonly used ping command is carried out by ICMP.
Figure 5 shows the relationship between protocols at different layers. Understanding the relationship between them is very important for the following protocol analysis.
Figure 5
B. Data is sent from the top down and encapsulated layer by layer. Data is received from the bottom up and decapsulated layer by layer.
When an application transmits data using TCP, the data is sent into the protocol stack and passes through each layer in turn until it is sent onto the network as a stream of bits. Each layer adds some header information (and sometimes trailer information) to the data it receives, as shown in Figure 6. The unit of data that TCP passes to IP is called a TCP segment. The unit of data that IP passes to the network interface layer is called an IP datagram. The bit stream transmitted over Ethernet is called a frame.
Figure 6 shows data being sent from the top down and encapsulated layer by layer; on the receiving side it travels from the bottom up and is decapsulated layer by layer.
Figure 6
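As a rough illustration of the layering in Figure 6, the following Python sketch wraps an application payload in placeholder headers and then strips them again. The header contents are dummy bytes and the function names are invented for this example; only the sizes (the usual minimum header lengths, with no options and no Ethernet trailer counted) are meaningful.

```python
# Placeholder header sizes in bytes (minimum sizes, no options, no trailer).
ETH_HDR, IP_HDR, TCP_HDR = 14, 20, 20


def encapsulate(payload: bytes) -> bytes:
    """Send side: each layer prepends its own header on the way down."""
    segment = b"T" * TCP_HDR + payload    # TCP segment
    datagram = b"I" * IP_HDR + segment    # IP datagram
    frame = b"E" * ETH_HDR + datagram     # Ethernet frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receive side: each layer strips its own header on the way up."""
    datagram = frame[ETH_HDR:]
    segment = datagram[IP_HDR:]
    return segment[TCP_HDR:]


payload = b"x" * 1460
frame = encapsulate(payload)
print(len(frame))                      # 1460 + 20 + 20 + 14 bytes
print(decapsulate(frame) == payload)   # the payload survives the round trip
```

With a 1460-byte payload the frame comes to 1514 bytes, the frame size that appears later in the data transmission capture.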
C. Logically, communication is completed at the same level.
The vertical layered structure describes how data is actually processed. Each layer has interfaces to its adjacent layers. To communicate, the two systems must pass data, commands, addresses, and other information between the layers. The logical flow of communication differs from the real flow of data: although the data physically passes vertically through every layer, each layer logically communicates directly with the corresponding layer on the remote system.
As shown in figure 7, communication is actually performed in the vertical direction, but logically the communication is performed at the same level.
Figure 7
2. Process description
To make the protocol analysis easier, we first describe the data transmission steps in the example above, as shown in Figure 8:
1) The FTP client asks TCP to establish a connection to the server's IP address.
2) TCP sends a connection request segment to the remote host, that is, it sends an IP datagram to that IP address.
3) If the target host is on the local network, the IP datagram can be sent to it directly. If the target host is on a remote network, IP uses its routing function to determine the next-hop router on the local network and forwards the IP datagram to it. In both cases, the IP datagram is sent to a host or router located on the local network.
4) In this example the network is an Ethernet, so the sending host must translate the 32-bit IP address into a 48-bit Ethernet address, also known as a MAC address. This is a globally unique hardware address written into the NIC at the factory. Translating an IP address into the corresponding MAC address is done by the ARP protocol.
5) As shown by the dotted line in Figure 8, ARP sends an Ethernet frame called an ARP request to every host on the Ethernet. This is called a broadcast. The ARP request frame contains the target host's IP address and means: "If you are the owner of this IP address, please reply with your hardware address."
6) When the ARP layer of the target host receives the broadcast, it recognizes that the sender is asking for its IP address and sends an ARP response containing its IP address and the corresponding hardware address.
7) Once the ARP response is received, the IP datagram that triggered the ARP request/response exchange can be transmitted.
8) The IP datagram is sent to the target host.
Figure 8
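The decision in step 3), and its effect on which address ARP has to resolve in steps 4) to 6), can be sketched in Python as follows. This is only an illustration: the 24-bit prefix length and the gateway address 192.168.113.254 are assumptions made for the example (the test network in this article has no router), and `arp_target` is a made-up helper name.

```python
import ipaddress


def arp_target(src_ip: str, dst_ip: str, prefix_len: int, gateway: str) -> str:
    """Return the IP address whose MAC address the sender must resolve with ARP."""
    local_net = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst_ip) in local_net:
        return dst_ip    # same network: deliver directly to the target host
    return gateway       # remote network: hand the datagram to the next-hop router


# Machine 208 talking to machine 1 (same network in this article's setup).
print(arp_target("192.168.113.208", "192.168.113.1", 24, "192.168.113.254"))
# A hypothetical remote destination, which would go through the assumed gateway.
print(arp_target("192.168.113.208", "10.0.0.5", 24, "192.168.113.254"))
```

In the first case ARP asks for machine 1 itself, which is what the capture in this article shows; in the second it would ask for the router instead.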
3. Instance analysis
The following analyzes the TCP/IP workflow based on the packets captured by Iris. To explain the data transfer process more clearly, we capture four groups of data from different stages of the transfer: finding the server, establishing the connection, transmitting data, and terminating the connection. Each group is examined in the following three steps:
Show the data packets
Interpret the packets
Analyze the header information layer by layer
Group 1: finding the server
1) The figure below shows the data in rows 1 and 2.
Figure 9
2) interpret data packets
These two rows of data show the request to find the server and the server's response.
In row 1, the MAC address of the source host is 00:50:FC:22:C7:BE and the MAC address of the target host is FF:FF:FF:FF:FF:FF. The address is written in hexadecimal; F is 1111 in binary, so this address is all 1s, which makes it the broadcast address. Broadcasting means sending a message to every device on the network: every Ethernet interface on the cable must receive and process the frame. This row corresponds to step 5): ARP sends an Ethernet frame called an ARP request to every host on the Ethernet. Every NIC on the network receives the message: "Who owns the IP address 192.168.113.1? Please tell me your hardware address."
Row 2 corresponds to step 6). Every machine on the same Ethernet receives the broadcast, but under normal conditions every host other than machine 1 ignores it. When the ARP layer of machine 1 receives the broadcast, it recognizes that the sender is asking for its IP address and sends an ARP response announcing its own IP and MAC addresses. Row 2 clearly shows machine 1's answer: its own MAC address, 00:50:FC:22:C7:BE.
These two rows show a question-and-answer exchange at the data link layer. It is like looking for someone named James in a crowded classroom by shouting "James" from the door: everyone hears it, which is the broadcast. James answers and nobody else does, so contact with James has been made.
3) header information analysis
As shown in the left column, the 1st packet contains two headers: Ethernet and ARP.
Figure 10
Table 2 lists the Ethernet header fields. The number in brackets is the number of bytes the field occupies. The first two fields of the Ethernet header are the Ethernet destination address and source address. A destination address of all 1s is the special broadcast address, and every Ethernet interface on the cable must receive broadcast frames. The two-byte Ethernet frame type indicates the type of the data that follows. For an ARP request or response, the value of this field is 0806.
As row 2 shows, although the ARP request is broadcast, the destination address of the ARP response is that of the requesting host (00 50 FC 22 C7 BE). ARP responses are sent directly to the host that made the request.
Table 2
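To make the Ethernet header layout concrete, here is a small Python sketch that unpacks the 14 header bytes described above. The sample bytes are assembled from the values quoted in the text for row 1 (broadcast destination, source 00:50:FC:22:C7:BE, type 0806); the function and key names are made up for this illustration.

```python
import struct


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated hexadecimal MAC address."""
    return ":".join(f"{b:02X}" for b in raw)


def parse_ethernet(frame: bytes) -> dict:
    """Split the 14-byte Ethernet header into destination, source and type."""
    dst, src, eth_type = struct.unpack("!6s6sH", frame[:14])
    payload_names = {0x0806: "ARP", 0x0800: "IP"}
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "type": f"{eth_type:04X}",
        "payload": payload_names.get(eth_type, "other"),
        "broadcast": dst == b"\xff" * 6,   # all 1s means broadcast
    }


# Header of the ARP request in row 1, rebuilt from the values in the text.
header = bytes.fromhex("FFFFFFFFFFFF" "0050FC22C7BE" "0806")
print(parse_ethernet(header))
```

Changing the last two bytes to 0800 would make the same function report an IP payload, which is the frame type seen from row 3 onwards.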
Table 3 lists the ARP header fields. The hardware type field indicates the type of hardware address; a value of 1 means an Ethernet address. The protocol type field indicates the type of protocol address being mapped; a value of 0800 means an IP address. This is the same value as the type field of an Ethernet frame that carries an IP datagram. The next two 1-byte fields give the lengths, in bytes, of the hardware address and the protocol address. For an ARP request or response for an IP address on Ethernet, their values are 6 and 4 respectively. OP is the operation field: 1 means ARP request, 2 means ARP response, 3 means RARP request, and 4 means RARP response. In row 2 the value is 2, a response. The next four fields are the sender's hardware address, the sender's IP address, the target hardware address, and the target IP address. Note that some information is duplicated: the sender's hardware address appears both in the Ethernet frame header and in the ARP request itself. In an ARP request, every field except the target hardware address is filled in.
As Table 3 suggests, when a system receives an ARP request addressed to it, it fills in its own hardware address, swaps the two sender addresses with the two target addresses, sets the operation field to 2, and sends the result back.
Table 3
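The request/response rule just described can be sketched in Python with the `struct` module. The field order follows the description of Table 3. The MAC address used for machine 1 below (02:00:00:00:00:01) is a placeholder invented for the example, and the function names are hypothetical.

```python
import socket
import struct

# hardware type, protocol type, hw length, protocol length, op,
# sender hw addr, sender IP, target hw addr, target IP  (28 bytes in total)
ARP_FMT = "!HHBBH6s4s6s4s"


def build_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """ARP request: everything is filled in except the target hardware address."""
    return struct.pack(ARP_FMT, 1, 0x0800, 6, 4, 1,
                       sender_mac, socket.inet_aton(sender_ip),
                       b"\x00" * 6, socket.inet_aton(target_ip))


def build_reply(request: bytes, my_mac: bytes) -> bytes:
    """Fill in our hardware address, swap sender and target, set op to 2."""
    htype, ptype, hlen, plen, _op, sha, spa, _tha, tpa = struct.unpack(ARP_FMT, request)
    return struct.pack(ARP_FMT, htype, ptype, hlen, plen, 2, my_mac, tpa, sha, spa)


mac_208 = bytes.fromhex("0050FC22C7BE")   # source address quoted for row 1
mac_1 = bytes.fromhex("020000000001")     # placeholder, not taken from the capture

request = build_request(mac_208, "192.168.113.208", "192.168.113.1")
reply = build_reply(request, mac_1)
print(len(request), request.hex())
print(len(reply), reply.hex())
```

Both packets are 28 bytes long, and the reply's target fields hold the address pair that the request carried as its sender fields.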
Group 2: establishing the connection
1) The figure below shows the data in rows 3 to 5.
Figure 11
2) interpret data packets
These three rows of data show the two machines establishing a connection.
The core of these three rows is the three-way handshake of the TCP protocol. TCP segments are carried by the IP protocol. However, IP only sends data out; it cannot guarantee that an IP datagram reaches its destination. Reliable delivery is the job of TCP: when the receiver gets a message from the sender, it sends back a short acknowledgment meaning "I have received your message." This can be seen in the third group of data. TCP is a connection-oriented protocol: before either side sends data to the other, the two must establish a connection, and establishing the connection is the three-way handshake.
This process is like my going to James to borrow some books. Step 1: I say, "Hello, I am Danzi." Step 2: James says, "Hello, I am James." Step 3: I say, "I would like to borrow some books from you." Through this exchange each side confirms the identity of the other, and contact is established.
Next we will analyze the three-way handshake process in this example.
(1) The requester, machine 208, sends an initial sequence number (SEQ) 987694419 to machine 1.
(2) After receiving it, the server, machine 1, adds 1 to 987694419 to form the acknowledgment (ACK), and generates its own random initial sequence number (SEQ) 1773195208. Both values are sent back to the requester, machine 208, in the same segment, meaning: "Message received; my data stream will start from 1773195208."
(3) After receiving this, machine 208 sets its acknowledgment number to the server's initial sequence number (SEQ) 1773195208 plus 1, that is 1773195209, and sends it as the acknowledgment.
These three steps complete the three-way handshake, and the two sides have established a channel for data transmission.
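The sequence and acknowledgment numbers in the three steps can be reproduced with a few lines of Python. This is a sketch of the arithmetic only, not of a TCP implementation; the function name and dictionary keys are made up, and the sequence number shown for the third segment relies on the standard TCP rule that a SYN consumes one sequence number.

```python
SEQ_MOD = 2 ** 32   # sequence numbers are 32 bits wide and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the SEQ/ACK values carried by the three handshake segments."""
    return [
        {"from": "208", "flags": "SYN", "seq": client_isn, "ack": 0},
        {"from": "1", "flags": "SYN,ACK", "seq": server_isn,
         "ack": (client_isn + 1) % SEQ_MOD},
        {"from": "208", "flags": "ACK", "seq": (client_isn + 1) % SEQ_MOD,
         "ack": (server_isn + 1) % SEQ_MOD},
    ]


for segment in three_way_handshake(987694419, 1773195208):
    print(segment)
```

With the initial sequence numbers from the capture, the second segment acknowledges 987694420 and the third acknowledges 1773195209.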
The analysis of the TCP header below shows how the related fields in the TCP header change during the handshake.
3) header information analysis
As shown in Figure 12, the 3rd packet contains three headers: Ethernet, IP, and TCP.
Compared with the earlier packets, the ARP header is gone and IP and TCP headers have appeared. ARP takes no part in the rest of the process: on a LAN, its job is to find the wanted computer among the many networked computers, and once that is done its work is finished.
The Ethernet header differs from that of packets 1 and 2 in that the frame type is 0800, indicating that the frame carries IP.
Figure 12
IP protocol header information
IP is the core protocol of the TCP/IP protocol family. As Figure 5 shows, all TCP, UDP, ICMP, and IGMP data is transmitted in IP datagrams. To use a metaphor, IP is like a delivery truck that carries one load after another to the destination, and its main cargo is handed to it by TCP or UDP. Note that IP provides an unreliable, connectionless datagram delivery service. That is, IP offers only best-effort delivery and does not guarantee that an IP datagram reaches its destination. Reading this, you may worry whether your e-mail will reach your friend. There is no need to: as mentioned above, making sure the data arrives intact is the job of TCP.
Table 4 shows the header of the IP protocol.
Table 4
IP datagram format and the fields in the header
In Figure 12, the part from 45 00 to 71 01 is the IP header. These numbers are in hexadecimal, and each digit occupies 4 bits. For example, 4 is 0100 in binary.
4-bit version: the current protocol version number. The value is 4, so the version is 4, which is why this IP is also called IPv4.
4-bit header length: the length of the header in units of 32 bits (4 bytes). The value is 5, so the IP header is 20 bytes long.
8-bit type of service (TOS): 00. This field consists of a 3-bit precedence subfield (now ignored), a 4-bit TOS subfield, and 1 unused bit (currently 0). The 4 TOS bits stand for minimum delay, maximum throughput, maximum reliability, and minimum cost. At most one of these four bits may be set. In this example all of them are 0, which indicates general service.
16-bit total length (in bytes): the length of the entire IP datagram in bytes. The value is 00 30, which is 48 in decimal: 48 bytes = 20-byte IP header + 28-byte TCP header. This datagram carries only control information, and no real data has been transferred yet, so the total length seen here is just the length of the headers.
16-bit identification: uniquely identifies each datagram sent by the host. Normally the value increases by 1 for each datagram sent: it is 21 in row 3, 22 in row 5, and 23 in row 7. The flags field and the fragment offset field are involved in fragmentation, which this article does not discuss.
TTL (time-to-live): sets the maximum number of routers a datagram may pass through, which limits the datagram's lifetime. The initial TTL is set by the source host, and each router that handles the datagram decrements it by 1. From the TTL value you can make a rough guess at the server's operating system and the number of routers on the path. In this example the value is 80, which is 128 in decimal. The TTL on Windows is generally 128 and on UNIX generally 255, so in this example the two machines are on the same network segment and the operating system is Windows.
8-bit protocol: the protocol type; 6 indicates that the transport layer protocol is TCP.
16-bit header checksum: computed over the IP header only. The sender computes the 16-bit one's complement sum of the header and stores the complement of the result in this field. On receiving an IP datagram, the receiver also computes the 16-bit one's complement sum of the header. Because the receiver's computation includes the checksum stored by the sender, the result should be all 1s if the header was not corrupted in transit. If the result is not all 1s (a checksum error), IP discards the received datagram without generating an error message; the upper layer discovers the lost data and retransmits it.
32-bit source IP address and 32-bit destination IP address: these are really the core of the IP protocol, but many articles already cover them, and this article uses a simple network with no routing, so only a brief introduction is given here; see other articles for more detail. A 32-bit IP address consists of a network ID and a host ID. The source IP address is C0 A8 71 D0, which is 192.168.113.208 in decimal. The destination IP address is C0 A8 71 01, which is 192.168.113.1 in decimal. The network address is 192.168.113 and the host addresses are 208 and 1 respectively. Because the network addresses are the same, the two machines are on the same network segment and data can be transmitted between them directly.
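The field layout and the checksum rule above can be tried out with the following Python sketch. The full hex dump of the header is not reproduced in the text, so the header is rebuilt from the values that are quoted (45, 00, 00 30, TTL 80, protocol 6, and the two addresses). The identification value 0x0021 and the zero flags/fragment-offset field are assumptions made for the example, and the checksum is computed here rather than copied from the capture. All function names are hypothetical.

```python
import socket
import struct

IP_FMT = "!BBHHHBBH4s4s"   # 20-byte IPv4 header without options


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data (length must be even here)."""
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def build_header() -> bytes:
    fields = [0x45, 0x00, 0x0030,
              0x0021, 0x0000,                # assumed identification and flags
              0x80, 0x06, 0,                 # TTL, protocol, checksum placeholder
              socket.inet_aton("192.168.113.208"),
              socket.inet_aton("192.168.113.1")]
    partial = struct.pack(IP_FMT, *fields)
    fields[7] = ~ones_complement_sum(partial) & 0xFFFF   # sender stores the complement
    return struct.pack(IP_FMT, *fields)


def parse_header(raw: bytes) -> dict:
    ver_ihl, tos, total, ident, frag, ttl, proto, csum, src, dst = struct.unpack(IP_FMT, raw[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,  # counted in 32-bit words
        "tos": tos,
        "total_len": total,
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


header = build_header()
print(header.hex(" "))
print(parse_header(header))
# Receiver-side check: summing the whole header, checksum included, gives all 1s.
print(hex(ones_complement_sum(header)))
```

Flipping any bit of `header` before the final call changes the sum, which is how the receiver detects a corrupted header.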
TCP protocol header information
Table 5 shows the TCP header information.
Table 5
TCP header
TCP header information in row 3: 04 28 00 15 3A DF 05 53 00 00 00 00 70 02 40 00 9A 8D 00 00 02 04 05 B4 01 01 04 02
Port: we often say that FTP uses port 21, HTTP uses port 80, and telnet uses port 23. The ports referred to here are TCP or UDP ports. A port is like a door at each end of a channel: when two machines communicate, the doors must be open. The source port and destination port each occupy 16 bits, and 2 to the 16th power is 65536, which is the number of "doors" through which each computer can reach other computers. The port number of each service on the server side is generally fixed. In this example the destination port is 00 15, which is 21 in decimal, the default FTP port. Note that this is the FTP control port; data is transferred over another port, as the analysis of the third group shows. When the client contacts the server, it opens a port above 1024 at random. In this example that port is 04 28, which is 1064 in decimal. A Trojan on your computer will also open a service port, so watching ports is very important: you should know both the normal services your machine provides and any abnormal connections. On Windows, the command for viewing ports is netstat.
32-bit sequence number (SEQ): as the analysis of the three-way handshake showed, when one side wants to contact the other, it sends an initial sequence number, meaning "Shall we establish a connection?" After receiving it, the server sends back its own independent sequence number, meaning "Message received; my data stream starts from this number." This shows that a TCP connection is fully bidirectional: both sides can send data at the same time. The two data streams are independent, so every TCP connection has two sequence numbers, one for each direction.
32-bit acknowledgment number (ACK): in the handshake phase, the acknowledgment is the sender's sequence number plus 1. In the data transfer phase, it is the sender's sequence number plus the size of the data sent, which shows that the data really has been received. This can be seen in the analysis of the third group.
4-bit header length: this field occupies 4 bits and is counted in units of 32 bits (4 bytes). In this example the value is 7, so the TCP header is 28 bytes long: the normal 20 bytes plus 8 bytes of options. The TCP header can be at most 60 bytes long (binary 1111 is 15 in decimal, and 15 * 4 bytes = 60 bytes).
6 flag bits:
URG: urgent pointer. Tells the receiving TCP module that the urgent pointer field points to urgent data.
ACK: when set to 1, the acknowledgment number is valid. When set to 0, the segment carries no acknowledgment and the acknowledgment number is ignored.
PSH: when set to 1, asks the receiver to pass the segment to the application as soon as it arrives, instead of waiting until the buffer is full.
RST: when set to 1, the connection is reset. Receiving a segment with RST set usually means that some error has occurred.
SYN: when set to 1, it is used to initiate a connection.
FIN: when set to 1, the sender has finished sending and has no more data to send; it is used to release the connection.
The three parts of Figure 13 show the TCP header information of rows 3 to 5, which are the three-way handshake. Let's see how the flag bits change during the handshake.
Figure 13-1: the requester, machine 208, sends the initial sequence number (SEQ) 987694419 to machine 1, with the SYN flag set to 1.
Figure 13-2: after receiving it, the server, machine 1, sends the acknowledgment (ACK) and its own random initial sequence number (SEQ) 1773195208 back to the requester, machine 208. Because the segment carries both an acknowledgment and an initial sequence number, both ACK and SYN are set to 1.
Figure 13-3: after receiving the segment from machine 1, the requester, machine 208, replies to machine 1 with the ACK flag set to 1 and all other flags 0. Note that SYN is now 0: SYN marks the initiation of a connection, and that was already done in the previous two steps.
16-bit window size: TCP flow control is provided by each end of the connection through the window size it advertises. The window size is a number of bytes, counted from the value in the acknowledgment number field, which is the next byte the receiver expects. The window size is a 16-bit field, so the window can be at most 65535 bytes.
16-bit checksum: covers the entire TCP segment, that is, the TCP header and the TCP data. This field is mandatory: it must be computed and stored by the sender and verified by the receiver.
16-bit urgent pointer: valid only when the URG flag is set to 1. The urgent pointer is a positive offset that is added to the value in the sequence number field to give the sequence number of the last byte of urgent data.
Options: Figures 13-1 and 13-2 contain 8 bytes of options; Figure 13-3 has none. The most common option is the maximum segment size (MSS). Each end usually specifies this option in the first segment it sends during the handshake; it states the longest segment that end can receive. Figure 13-1 shows that machine 208 accepts at most 1460 bytes. 1460 is the usual value on Ethernet, and the analysis of the third group shows that the data is transferred in 1460-byte pieces.
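The 28 header bytes quoted for row 3 are enough to check every field described above. The Python sketch below decodes them with the `struct` module. The function names and dictionary keys are invented for this illustration, and only the option kinds needed here are handled (kind 0 ends the list, kind 1 is a one-byte no-operation, kind 2 is the MSS); any other option is returned undecoded.

```python
import struct

RAW = bytes.fromhex(
    "04 28 00 15 3A DF 05 53 00 00 00 00 70 02 40 00"
    " 9A 8D 00 00 02 04 05 B4 01 01 04 02"
)

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]   # bit 0 up to bit 5


def parse_options(data: bytes) -> list:
    """Walk the option bytes and return (kind, value) pairs."""
    options, i = [], 0
    while i < len(data):
        kind = data[i]
        if kind == 0:                 # end of option list
            break
        if kind == 1:                 # no-operation, a single padding byte
            options.append((1, b""))
            i += 1
            continue
        if i + 1 >= len(data) or data[i + 1] < 2:
            break                     # malformed option, stop parsing
        length = data[i + 1]
        options.append((kind, data[i + 2:i + length]))
        i += length
    return options


def parse_tcp(raw: bytes) -> dict:
    sport, dport, seq, ack, off_flags, window, csum, urg = struct.unpack("!HHIIHHHH", raw[:20])
    header_len = (off_flags >> 12) * 4            # counted in 32-bit words
    options = parse_options(raw[20:header_len])
    mss = next((int.from_bytes(v, "big") for k, v in options if k == 2), None)
    return {
        "src_port": sport, "dst_port": dport,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": [n for bit, n in enumerate(FLAG_NAMES) if off_flags & (1 << bit)],
        "window": window, "checksum": f"{csum:04X}", "urgent": urg,
        "options": options, "mss": mss,
    }


for key, value in parse_tcp(RAW).items():
    print(f"{key:10} {value}")
```

The decoded values agree with the text: source port 1064, destination port 21, sequence number 987694419, a 28-byte header, only SYN set, and an MSS of 1460.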
Handshake summary
The three-way handshake has been discussed in several separate places above, which may seem a little scattered, so here is a summary.
Group 3: data transmission
1) The figure below shows the data in rows 57 to 60.
Figure 14
2) interpret data packets
These four rows of data show data being sent and acknowledged during the transfer.
As mentioned above, TCP provides a connection-oriented, reliable byte stream service. When the receiver gets a message from the sender, it sends back an acknowledgment indicating that the message was received. During the transfer, TCP splits the data into blocks of the size best suited for sending; on Ethernet, TCP data is generally split into 1460-byte blocks. In other words, the sender sends the data piece by piece, and the receiver reassembles the pieces after receiving them.
Row 57 shows machine 1 sending 1514 bytes to machine 208. As noted earlier, data is sent with a protocol header added at each layer: 1514 bytes = 14-byte Ethernet header + 20-byte IP header + 20-byte TCP header + 1460 bytes of data.
The acknowledgment (ACK) shown in row 58 is 1781514222, which is the SEQ of row 57, 1781512762, plus the 1460 bytes of data transmitted. Machine 208 sends this acknowledgment to machine 1 to indicate that the data has been received.
Rows 59 and 60 show the transfer continuing in the same way.
This process is like my borrowing some books from James. I say, "I have borrowed a few of your books," and he says, "Got it."
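The arithmetic in rows 57 and 58 can be checked with a short Python sketch. The helper names are made up, and the 4000-byte stream in the second call is a hypothetical size chosen only to show a shorter final segment; it is not taken from the capture.

```python
SEQ_MOD = 2 ** 32   # sequence numbers are 32 bits wide and wrap around


def frame_length(data_len: int, eth_hdr: int = 14, ip_hdr: int = 20, tcp_hdr: int = 20) -> int:
    """Bytes on the wire for one segment (headers without options, no trailer)."""
    return eth_hdr + ip_hdr + tcp_hdr + data_len


def segment_stream(first_seq: int, total_bytes: int, mss: int = 1460):
    """Yield (seq, length, expected_ack) for each segment of a byte stream."""
    seq, remaining = first_seq, total_bytes
    while remaining > 0:
        length = min(mss, remaining)
        expected_ack = (seq + length) % SEQ_MOD   # ACK = SEQ + bytes received
        yield seq, length, expected_ack
        seq = expected_ack
        remaining -= length


print(frame_length(1460))
print(list(segment_stream(1781512762, 1460)))
print(list(segment_stream(1781512762, 4000)))
```

The first two lines of output reproduce the 1514-byte frame of row 57 and the acknowledgment 1781514222 of row 58.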
3) header information
Figure 15-1 and figure 15-2 show the header information of rows 57 and 58, respectively. For details, refer to the second group.
Group 4: terminating the connection
1) The figure below shows the data in rows 93 to 96.
Figure 16
2) interpret data packets
Rows 93 to 96 show the two machines closing the connection.
Establishing a connection takes three handshakes, but terminating one takes four. This is because a TCP connection is full duplex (data can flow in both directions at the same time), so each direction must be closed separately. The four handshakes amount to each side closing its own direction.
In this example, after the file is downloaded, the browser is closed and the connection to the server is terminated. Rows 93 to 96 in Figure 16 show the connection being terminated in four handshakes.
Row 93 and Figure 17-1 show that after the browser is closed, machine 208 sets FIN to 1 and sends it, together with sequence number (SEQ) 987695574, to machine 1 to request termination of the connection.
Row 94 and Figure 17-2 show that after receiving the FIN, machine 1 sends back an acknowledgment whose acknowledgment number is the received sequence number plus 1, which closes the transfer in this direction.
Row 95 and Figure 17-3 show that machine 1 sets FIN to 1 and sends it, together with sequence number (SEQ) 1773196056, to machine 208 to request termination of the connection.
Row 96 and Figure 17-4 show that after receiving the FIN from machine 1, machine 208 sends back an acknowledgment whose acknowledgment number is the received sequence number plus 1. At this point the TCP connection is completely closed.
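The four closing segments can be laid out in a small Python sketch that computes the two acknowledgment numbers from the two FIN sequence numbers given above. Only the values the text mentions are tracked (real FIN segments carry further fields), and the function name and keys are invented for the example.

```python
SEQ_MOD = 2 ** 32   # sequence numbers are 32 bits wide and wrap around


def four_way_close(fin_seq_208: int, fin_seq_1: int) -> list:
    """Return the four closing segments; each FIN is acknowledged with SEQ + 1."""
    return [
        {"row": 93, "direction": "208 -> 1", "flag": "FIN", "seq": fin_seq_208},
        {"row": 94, "direction": "1 -> 208", "flag": "ACK",
         "ack": (fin_seq_208 + 1) % SEQ_MOD},
        {"row": 95, "direction": "1 -> 208", "flag": "FIN", "seq": fin_seq_1},
        {"row": 96, "direction": "208 -> 1", "flag": "ACK",
         "ack": (fin_seq_1 + 1) % SEQ_MOD},
    ]


for segment in four_way_close(987695574, 1773196056):
    print(segment)
```

Rows 94 and 96 therefore carry the acknowledgment numbers 987695575 and 1773196057 respectively.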
3) header information
VI. Scan Example
Next, let's look at a ping example. The most common way to test whether a computer is connected to the network is the ping command. If pinging a computer produces the screen shown in Figure 18, it is reachable. If it produces the screen shown in Figure 19, it is not, and there are two possible reasons: first, the computer does not exist or its network cable is not connected; second, the computer has a firewall installed that is set to block ping. How can we tell these two situations apart? Use Iris to trace each case.
Figure 18
Figure 19
Figure 20 shows a successful ping.
Figure 21 shows a ping to a computer that does not exist; the capture shows that the ARP request gets no response.
Figure 22 shows a failed ping to a computer with a firewall installed; the capture shows that the ARP request gets a response but the ICMP request does not.
The analysis shows that although the last two cases look the same on the surface, they are fundamentally different. The header information also clearly shows that ping uses the ICMP protocol: the communication is completed at the third layer, and the fourth-layer TCP protocol is not used.
Figure 20
Figure 21
Figure 22
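The reasoning behind Figures 20 to 22 amounts to a small decision rule, sketched here in Python. The two inputs stand for what you observe in the capture (whether the ARP request and the ICMP echo request were answered); the function name and the wording of the results are made up, and the rule assumes the target is on the same network segment, as in this article's setup.

```python
def diagnose_ping(arp_answered: bool, icmp_answered: bool) -> str:
    """Classify a ping attempt on the local segment from what the capture shows."""
    if not arp_answered:
        # Nobody claimed the IP address at the link layer (Figure 21).
        return "host absent or cable not connected"
    if not icmp_answered:
        # The host exists but stays silent at the ICMP level (Figure 22).
        return "host present, ping blocked (for example by a firewall)"
    return "host reachable"                      # Figure 20


print(diagnose_ping(True, True))
print(diagnose_ping(False, False))
print(diagnose_ping(True, False))
```

The point of the example is that the ping command alone reports the last two cases identically, while the capture separates them.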
VII. Postscript
This article is not a tutorial, and many topics are not covered, such as TCP retransmission, IP fragmentation, and routing. It only proposes a way of learning, in the hope of prompting better contributions from others. The TCP/IP protocol family is very complex, and understanding it thoroughly still takes real effort. Finally, an exercise for interested readers: telnet to three machines, one that is up with port 23 open, one that is up with port 23 closed, and one that does not exist. Capture each attempt with the method described here and compare the three results. This is in fact how a TCP scan determines whether another machine is online.
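For readers who want to try the exercise from a script rather than with the telnet command, the following Python sketch makes a single TCP connection attempt and sorts the outcome into the three cases. It is a minimal illustration, not a scanner: `probe` and `classify` are hypothetical names, the 3-second timeout is an arbitrary choice, and the exact error returned for a missing host varies by operating system, so everything other than success and "connection refused" is lumped together. Only probe machines you are allowed to test.

```python
import errno
import socket

# Error codes that mean the host answered but nothing listens on the port.
REFUSED = {errno.ECONNREFUSED, getattr(errno, "WSAECONNREFUSED", errno.ECONNREFUSED)}


def classify(result) -> str:
    """Map the outcome of a connect attempt to one of the three cases."""
    if result == 0:
        return "port open"               # the three-way handshake completed
    if result in REFUSED:
        return "host up, port closed"    # the host rejected the connection
    return "no answer"                   # timeout, unreachable, or other failure


def probe(host: str, port: int = 23, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify what happened."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            return classify(sock.connect_ex((host, port)))
        except OSError:                  # for example, the name cannot be resolved
            return classify(None)


# Example call against the server used in this article:
#     print(probe("192.168.113.1"))
```

Running a capture while calling `probe` shows the packets behind each result, which is the comparison the exercise asks for.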