I. Preface
People who have studied TCP/IP often feel that the subject is too abstract: without concrete data to look at, what they read is soon forgotten. This article introduces a more intuitive way to learn TCP/IP, using a protocol analysis tool so that the actual transmission of data can be observed during study.
To make things easier for beginners, this article builds the simplest possible network environment, with no subnets.
II. Test environment
1. Network environment
The network environment is shown in Figure 1.
Figure 1
For convenience, "machine 208" below refers to the computer with address 192.168.113.208, and "machine 1" refers to the computer with address 192.168.113.1.
2. Operating system
Both machines run Windows 2000. Machine 1 acts as the server and has the FTP service installed.
3. Protocol Analysis Tools
Common tools in the Windows environment include Sniffer Pro, NetXray, Iris, and the Network Monitor that ships with Windows 2000. This article uses Iris as the protocol analysis tool.
Install Iris on the client, machine 208.
III. Test procedure
1. Test case: download a file from machine 1 to machine 208 via FTP.
2. Iris settings.
Because Iris monitors the network, it will capture many unrelated packets if other machines are present, which makes study inconvenient. To see the transfer in the test case clearly, first set Iris to capture only the packets exchanged between machine 208 and machine 1. The setup is as follows:
1) Press Ctrl+B to open the address book and enter the IP addresses of the two machines. To keep the captured packets easy to read, leave the host name (Name) blank. Close the window when done.
Figure 2
2) Press Ctrl+E to open the filter settings. Select "IP address" in the left column, then drag the two addresses from the address book on the right into the area below and click OK. Iris will now capture only the packets between these two computers.
Figure 3
3. Capturing packets
Press the start button on the Iris toolbar. In the browser, enter FTP://192.168.113.1, locate the file to download, right-click it, and choose "Copy to Folder" from the pop-up menu to start the download. When the download finishes, press the stop button on the Iris toolbar to stop capturing. Figure 4 shows the entire FTP session, which is analyzed in detail below.
Figure 4
Note: to be able to capture the ARP packets, first run arp -d on Windows 2000 to clear the ARP cache.
IV. Process analysis
1. Basic principles of TCP/IP
Although this article focuses on analyzing TCP/IP through a concrete example, understanding the basic principles of TCP/IP is important for the analysis that follows.
A. The network is layered, and each layer is responsible for a different communication function.
TCP/IP is generally regarded as a four-layer protocol system. The TCP/IP protocol family is a set of many different protocols; although the family is commonly called TCP/IP, TCP and IP are only two of them, as shown in Table 1. Each layer is responsible for a different function:
Table 1
The concept of layering is simple, but it matters a great deal in practice: a clear understanding of the network layers is a great help when setting up networks and troubleshooting. For example, routing is configured at the network layer (IP), finding a MAC address is the job of ARP at the link layer, and the common ping command is implemented with ICMP.
Figure 5 shows the relationships among the protocols of each layer. Understanding these relationships is important for the protocol analysis that follows.
Figure 5
B. Data is sent from the top down, with a header added at each layer; data is received from the bottom up, with the headers removed layer by layer.
When an application transmits data using TCP, the data is passed down the protocol stack, through each layer in turn, until it is sent onto the network as a stream of bits. Each layer adds some header information (and sometimes trailer information) to the data it receives, as shown in Figure 6. The unit of data that TCP passes to IP is called a TCP segment. The unit of data that IP passes to the network interface layer is called an IP datagram. The stream of bits transmitted over Ethernet is called a frame.
As Figure 6 shows, data being sent is encapsulated from the top down, layer by layer; data being received is decapsulated from the bottom up, layer by layer.
Figure 6
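The layering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the header contents are zero-filled placeholders rather than real protocol fields, the sizes are the minimum header sizes without options, and the function names are hypothetical.

```python
# Illustrative sketch of encapsulation: each layer prepends its own header.
# Header bytes are zero-filled placeholders, not real protocol fields.
ETH_HEADER_LEN = 14   # destination MAC (6) + source MAC (6) + frame type (2)
IP_HEADER_LEN = 20    # IPv4 header without options
TCP_HEADER_LEN = 20   # TCP header without options


def encapsulate(app_data: bytes) -> bytes:
    """Wrap application data top-down: TCP segment, IP datagram, Ethernet frame."""
    tcp_segment = bytes(TCP_HEADER_LEN) + app_data
    ip_datagram = bytes(IP_HEADER_LEN) + tcp_segment
    return bytes(ETH_HEADER_LEN) + ip_datagram


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers bottom-up to recover the application data."""
    ip_datagram = frame[ETH_HEADER_LEN:]
    tcp_segment = ip_datagram[IP_HEADER_LEN:]
    return tcp_segment[TCP_HEADER_LEN:]


payload = b"x" * 1460
frame = encapsulate(payload)
print(len(frame))                      # 14 + 20 + 20 + 1460
print(decapsulate(frame) == payload)   # the receiver gets the original data back
```

The frame length printed here is the sum of the three header sizes and the payload, which is the same arithmetic used later when the data transfer rows are examined.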
C. Logical communication takes place between peer layers
The vertical layered structure is today's generally accepted model of how data is processed. Each layer has an interface to its adjacent layers. To communicate, the two systems must pass data, instructions, addresses, and other information between layers, and the logical flow of communication differs from the actual flow of data. Although the data actually flows vertically through all the layers, each layer logically communicates directly with the corresponding layer on the remote computer.
As Figure 7 shows, the actual traffic flows vertically, but logically the communication takes place between peer layers.
Figure 7
2. Process Description
To make the protocol analysis easier, we first describe the transmission steps for the test case above, as shown in Figure 8:
1) The FTP client asks TCP to establish a connection to the server's IP address.
2) TCP sends a connection request segment to the remote host, that is, it sends an IP datagram to the IP address above.
3) If the destination host is on the local network, the IP datagram can be sent to it directly. If the destination host is on a remote network, the IP routing function determines the address of the next-hop router on the local network and has it forward the IP datagram. In both cases, the IP datagram is sent to a host or router on the local network.
4) This example uses Ethernet, so the sending host must translate the 32-bit IP address into a 48-bit Ethernet address, also known as the MAC address, a hardware address written into the network card at the factory and intended to be globally unique. Translating an IP address into the corresponding MAC address is done by the ARP protocol.
5) ARP sends an Ethernet frame called an ARP request to every host on the Ethernet; this is known as a broadcast. The ARP request contains the IP address of the destination host and means: "If you are the owner of this IP address, please reply with your hardware address."
6) The ARP layer of the destination host receives this broadcast, recognizes that the sender is asking about its own IP address, and sends an ARP reply containing the IP address and the corresponding hardware address.
7) Once the ARP reply is received, the IP datagram that triggered the ARP request/reply exchange can be sent.
8) The IP datagram is sent to the destination host.
Figure 8
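The decision in step 3, direct delivery versus delivery through a router, can be sketched as follows. This is a minimal illustration, not how any particular IP stack is implemented. It assumes a 24-bit network mask (the article only states that the network address is 192.168.113), and the gateway address 192.168.113.254 is hypothetical, since the test network has no router.

```python
import ipaddress

# Assumption: a 24-bit mask (255.255.255.0) for the 192.168.113 network.
LOCAL_NET = ipaddress.ip_network("192.168.113.0/24")


def next_hop(destination: str, gateway: str) -> str:
    """Return the IP address whose MAC address ARP must resolve (step 3)."""
    if ipaddress.ip_address(destination) in LOCAL_NET:
        return destination   # same network: deliver directly to the host
    return gateway           # other network: hand the datagram to the router


# 192.168.113.254 is a hypothetical router address used only for illustration.
print(next_hop("192.168.113.1", "192.168.113.254"))
print(next_hop("10.0.0.5", "192.168.113.254"))
```

In the test case the destination is on the local network, so the ARP request in step 5 asks for the MAC address of machine 1 itself rather than that of a router.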
3. Example Analysis
To make the data transfer clear, we captured four groups of packets from different stages of the transfer: locating the server, establishing the connection, transferring data, and terminating the connection. The working of TCP/IP is analyzed from these packets captured by Iris. Each group is explained in three steps:
Show the packets
Explain the packets
Analyze the header information of the packets layer by layer
Group 1: Locating the server
1) Rows 1 and 2 of the capture are shown
Figure 9
2) Interpreting the packet
These two rows show the request to find the server and the server's answer.
In row 1, the MAC address of the source host is 00:50:fc:22:c7:be and the destination MAC address is FF:FF:FF:FF:FF:FF. The address is written in hexadecimal; F in binary is 1111, and the all-ones address is the broadcast address. A broadcast sends information to every device on the local network: every Ethernet interface on the cable receives the frame and processes it. This row corresponds to step 5), in which ARP sends an Ethernet frame called an ARP request to every host on the Ethernet. Every network card on the network receives the message: "Whoever owns the IP address 192.168.113.1, please tell me your hardware address."
Row 2 corresponds to step 6). Every machine on the same Ethernet receives this message, but under normal conditions every host other than machine 1 ignores it. The ARP layer of machine 1 receives the broadcast, recognizes that the sender is asking about its own IP address, and sends an ARP reply giving its IP address and MAC address. Row 2 clearly shows machine 1 replying with its MAC address, 00:90:27:f6:54:53.
These two rows show how a one-to-one conversation is set up at the data link layer. It is like looking for someone called Zhang San in a room full of people: I shout "Zhang San!" at the door and everyone hears it, which is the broadcast. Zhang San answers while the others stay silent, and so contact with Zhang San is made.
3) Header analysis
As the left column of Figure 10 shows, packet 1 contains two headers: Ethernet and ARP.
Figure 10
Table 2 below shows the Ethernet header, with the number of bytes in parentheses. The first two fields of the Ethernet header are the destination and source Ethernet addresses. Here the destination is the special all-ones broadcast address, so every Ethernet interface on the cable receives the frame. The two-byte frame type indicates the type of data that follows; for an ARP request or reply, its value is 0806.
Row 2 shows that although the ARP request is broadcast, the destination address of the ARP reply is machine 208 (the address ending in c7:be). The ARP reply is sent directly to the requesting host.
Table 2
Table 3 below shows the ARP header. The hardware type field gives the type of hardware address; its value is 1, meaning Ethernet. The protocol type field gives the type of protocol address being mapped; its value is 0800, meaning IP, the same value used in the type field of an Ethernet frame that carries an IP datagram. The next two 1-byte fields, hardware address length and protocol address length, give the lengths of the two kinds of address in bytes; for ARP requests and replies for IP addresses on Ethernet they are 6 and 4. Op is the operation field: 1 is an ARP request, 2 an ARP reply, 3 a RARP request, and 4 a RARP reply. In row 2 the value is 2, a reply. The next four fields are the sender's hardware address, the sender's IP address, the target hardware address, and the target IP address. Note that some information is duplicated: the sender's hardware address appears both in the Ethernet frame header and in the ARP request. In an ARP request, every field is filled in except the target hardware address.
Row 2 of Table 3 is the reply. When a system receives an ARP request addressed to it, it fills in its own hardware address, swaps the two sender addresses with the two target addresses, sets the operation field to 2, and sends the packet back.
Table 3
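To make the field layout in Tables 2 and 3 concrete, the following sketch builds an ARP request like the one in row 1 from the addresses given in the text and then parses it back. It is an illustration only: the function names are hypothetical, the frame is built in memory rather than captured, and the padding and CRC that a real Ethernet frame carries are left out.

```python
import socket
import struct


def mac_to_bytes(mac: str) -> bytes:
    """Convert '00:50:fc:22:c7:be' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw: bytes) -> str:
    """Convert 6 raw bytes back to colon-separated hexadecimal."""
    return ":".join(f"{b:02x}" for b in raw)


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request (no padding, no CRC)."""
    ethernet = (
        mac_to_bytes("ff:ff:ff:ff:ff:ff")   # destination: broadcast
        + mac_to_bytes(sender_mac)          # source
        + struct.pack("!H", 0x0806)         # frame type: ARP
    )
    # hardware type 1, protocol type 0800, address lengths 6 and 4, op 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += mac_to_bytes(sender_mac) + socket.inet_aton(sender_ip)
    arp += bytes(6) + socket.inet_aton(target_ip)   # target hardware address not yet known
    return ethernet + arp


def parse_arp_frame(frame: bytes) -> dict:
    """Split the frame into the Ethernet and ARP fields of Tables 2 and 3."""
    dst, src, frame_type = struct.unpack("!6s6sH", frame[:14])
    hw_type, proto_type, hw_len, proto_len, op = struct.unpack("!HHBBH", frame[14:22])
    s_mac, s_ip, t_mac, t_ip = struct.unpack("!6s4s6s4s", frame[22:42])
    return {
        "eth_dst": bytes_to_mac(dst),
        "eth_src": bytes_to_mac(src),
        "frame_type": f"{frame_type:04x}",
        "hw_type": hw_type,
        "proto_type": f"{proto_type:04x}",
        "hw_len": hw_len,
        "proto_len": proto_len,
        "op": op,
        "sender_mac": bytes_to_mac(s_mac),
        "sender_ip": socket.inet_ntoa(s_ip),
        "target_mac": bytes_to_mac(t_mac),
        "target_ip": socket.inet_ntoa(t_ip),
    }


request = build_arp_request("00:50:fc:22:c7:be", "192.168.113.208", "192.168.113.1")
for name, value in parse_arp_frame(request).items():
    print(f"{name:11} {value}")
```

The parsed output lists the sender's hardware address twice, once in the Ethernet header and once in the ARP header, which is the duplication noted above.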
Group 2: Establishing the connection
1) Rows 3-5 of the capture are shown
Figure 11
2) Interpreting the packet
These three rows show the two machines establishing a connection.
The core of these three rows is TCP's three-way handshake. TCP segments are carried by IP, but IP only sends the data out and does not guarantee that an IP datagram reaches its destination; reliable delivery is the job of TCP. When the receiver gets a message from the sender, it sends back an acknowledgment meaning "I have received your message." The third group of data shows this process. TCP is a connection-oriented protocol: before either side can send data to the other, a connection must be established between them. The process of establishing the connection is the three-way handshake.
It is like going to Zhang San to borrow some books. Step 1: I greet him and say who I am. Step 2: Zhang San says, "Hello, I am Zhang San." Step 3: I say, "I would like to borrow a few books from you." By confirming each other's identity through question and answer, we establish a connection.
The three handshake steps in this example are as follows.
1) The requesting side, machine 208, sends an initial sequence number (SEQ) of 987694419 to machine 1.
2) After receiving this sequence number, the server, machine 1, uses it plus 1, that is 987694420, as its acknowledgment (ACK) and randomly generates its own initial sequence number (SEQ), 1773195208. Both values are sent back to machine 208, meaning: "Message received; my data stream will start from 1773195208."
3) After receiving this, machine 208 sets its acknowledgment number to the server's initial sequence number (SEQ) 1773195208 plus 1, that is 1773195209, and sends it as the reply.
These three steps complete the three-way handshake. A channel now exists between the two sides and data can be transferred.
The analysis of the TCP header below shows how the relevant header fields change during the handshake.
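The sequence and acknowledgment arithmetic of the three steps can be checked with a small sketch. It only models the numbers and flags; no packets are sent, and the function name and the dictionary layout are hypothetical. Sequence numbers are 32-bit values, so the additions wrap around at 2 to the power 32.

```python
CLIENT_ISN = 987694419    # initial sequence number sent by machine 208
SERVER_ISN = 1773195208   # initial sequence number generated by machine 1
SEQ_MODULUS = 2 ** 32     # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as simple dictionaries."""
    syn = {"from": "208", "flags": "SYN", "seq": client_isn, "ack": None}
    syn_ack = {
        "from": "1",
        "flags": "SYN,ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MODULUS,   # acknowledges the client's SYN
    }
    ack = {
        "from": "208",
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_MODULUS,   # the SYN consumed one sequence number
        "ack": (server_isn + 1) % SEQ_MODULUS,   # acknowledges the server's SYN
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(CLIENT_ISN, SERVER_ISN):
    print(segment)
```

The second segment acknowledges 987694420 and the third acknowledges 1773195209, matching steps 2) and 3) above.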
3) Header analysis
As Figure 12 shows, packet 3 contains three headers: Ethernet, IP, and TCP.
Compared with the first group, the ARP header is gone and IP and TCP headers have appeared. ARP takes no part in the rest of the process either: on a LAN, ARP's job is to find the wanted computer among the many on the network, and once it is found that job is done.
The Ethernet header differs from rows 1 and 2 in that the frame type is 0800, indicating that the frame carries IP.
Figure 12
IP header
IP is the core protocol of the TCP/IP family. As Figure 5 shows, all TCP, UDP, ICMP, and IGMP data is transmitted as IP datagrams. A common analogy is that IP is like a truck that carries cargo to its destination, and the cargo is mainly what TCP or UDP hands to it. Note that IP provides an unreliable, connectionless datagram delivery service: it makes a best effort but does not guarantee that an IP datagram reaches its destination. Reading this, you may worry that your e-mail will not reach your friend. There is no need to: as mentioned above, making sure data arrives correctly is TCP's job, which is explained in detail later.
The IP header is shown in Table 4.
Table 4: IP datagram format and header fields
In Figure 12, the bytes from 45 00 through 71 01 are the IP header. The numbers are in hexadecimal, and each hexadecimal digit occupies 4 bits; for example, 4 in binary is 0100.
4-bit version: the protocol version number. The value 4 means version 4, which is why IP is sometimes called IPv4.
4-bit header length: the length of the header in units of 32 bits (4 bytes). The value 5 means the IP header is 20 bytes long.
8-bit type of service (TOS): 00. This field consists of a 3-bit precedence subfield (now ignored), a 4-bit TOS subfield, and 1 unused bit (which must be 0). The four TOS bits stand for minimum delay, maximum throughput, maximum reliability, and minimum cost; at most one of them may be set to 1. Here all are 0, meaning normal service.
16-bit total length (bytes): the length of the entire IP datagram in bytes. The value is 00 30, or 48 in decimal: 48 bytes = 20-byte IP header + 28-byte TCP header. This datagram carries only control information and no real data yet, so the total length seen here is just the length of the headers.
16-bit identification: uniquely identifies each datagram sent by a host. It is normally incremented by 1 for each datagram sent: it is 30 21 in row 3, 30 22 in row 5, and 30 23 in row 7. The flags field and fragment offset field are used for fragmentation and are not discussed in this article.
8-bit time to live (TTL): sets the maximum number of routers a datagram may pass through, and so limits the datagram's lifetime. The initial value is set by the source host, and every router that handles the datagram decreases it by 1. From the TTL value one can often guess the sender's operating system and how many routers the datagram has passed through. Here the value is 80 (hexadecimal), or 128 in decimal. Windows typically uses an initial TTL of 128 and UNIX systems typically 255, so this value suggests that the two machines are on the same network segment and that the operating system is Windows.
8-bit protocol: the protocol carried by the datagram; 6 means the transport layer protocol is TCP.
16-bit header checksum: when an IP datagram is received, the receiver computes the ones' complement sum of all the 16-bit words in the header. Because this calculation includes the checksum stored in the header by the sender, the result is all 1s if the header was not corrupted in transit. If the result is not all 1s, the checksum is wrong and IP discards the datagram. No error message is generated; it is left to the upper layers to notice the missing data and retransmit.
32-bit source IP address and 32-bit destination IP address: these are really the heart of the IP protocol, but there is already a great deal of material on the subject, and the network built for this article is simple and involves no routing, so only a brief introduction is given here; please refer to other articles for more. A 32-bit IP address consists of a network ID and a host ID. In this example the source IP address is C0 A8 71 D0, or 192.168.113.208 in decimal, and the destination IP address is C0 A8 71 01, or 192.168.113.1. The network address is 192.168.113 and the host addresses are 208 and 1. Because the network addresses are the same, the two machines are on one network segment and data can be delivered directly.
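The fields described above can be decoded with Python's struct module. The sketch below rebuilds a 20-byte header from the values quoted in the text and then parses it. Two things are assumptions rather than data from the capture: the flags and fragment offset word (0x4000, "don't fragment", which this article does not discuss) and the checksum, which is computed here instead of being copied from Figure 12. The function names are hypothetical.

```python
import socket
import struct

IP_HEADER_FORMAT = "!BBHHHBBH4s4s"   # 20 bytes, network byte order


def ones_complement_sum(data: bytes) -> int:
    """Add all 16-bit words, folding any carry back into the low 16 bits."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def build_example_header() -> bytes:
    """Rebuild the row-3 IP header from the field values quoted in the text."""
    without_checksum = struct.pack(
        IP_HEADER_FORMAT,
        0x45,        # version 4, header length 5 words
        0x00,        # type of service
        0x0030,      # total length: 48 bytes
        0x3021,      # identification
        0x4000,      # flags and fragment offset (ASSUMED, not from the capture)
        0x80,        # TTL 128
        6,           # protocol: TCP
        0,           # checksum placeholder
        socket.inet_aton("192.168.113.208"),
        socket.inet_aton("192.168.113.1"),
    )
    checksum = ~ones_complement_sum(without_checksum) & 0xFFFF
    return without_checksum[:10] + struct.pack("!H", checksum) + without_checksum[12:]


def parse_ip_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack(IP_HEADER_FORMAT, header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # the field counts 32-bit words
        "tos": tos,
        "total_len": total_len,
        "id": f"{ident:04x}",
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


header = build_example_header()
parsed = parse_ip_header(header)
print(parsed)
# The receiver's check: summing every word, checksum included, gives all 1s.
print(hex(ones_complement_sum(header)))
```

Changing any single byte of the header makes the final sum differ from ffff, which is how the receiver detects a damaged header.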
TCP header
The TCP header is shown in Table 5.
Table 5 TCP Packet Header
The third line of TCP header information is: 3 A DF----9A 8D B4 01 01 04 02
Port numbers: it is often said that FTP uses port 21, HTTP port 80, Telnet port 23, and so on. These are TCP or UDP ports. A port is like a door at each end of a channel: the door must be open for two machines to communicate. The source port and destination port each occupy 16 bits, and 2 to the power 16 is 65536, which is the number of "doors" a computer has for connecting to other computers. A service generally uses a fixed port number. The destination port in this example is 00 15, or 21 in decimal, the default FTP port. Note that this is the FTP control port; data is transferred over a different port, as the third group shows. When a client contacts a server it opens an arbitrary port greater than 1024; here it is 04 28, or 1064 in decimal. A trojan on your computer would also open a service port, so observing ports is important: it reveals not only the normal services a machine provides but also abnormal connections. On Windows, the netstat command can be used to inspect ports.
32-bit sequence number: abbreviated SEQ. As the handshake analysis above shows, when one side contacts the other it sends an initial sequence number, meaning "Shall we establish contact?" After receiving it, the server sends the requester a sequence number of its own, meaning "Message received; my data stream will start from this number." This shows that a TCP connection is fully bidirectional: data can flow in both directions at the same time. The two streams are independent, so every TCP connection has two sequence numbers, one for each direction.
32-bit acknowledgment number: abbreviated ACK. During the handshake, the acknowledgment is the other side's sequence number plus 1. During data transfer, it is the other side's sequence number plus the size of the data received, showing that the data really was received. This is seen in the analysis of the third group.
4-bit header length: this field occupies 4 bits and its unit is 32 bits (4 bytes). Here it is 7, so the TCP header is 28 bytes long: the normal 20 bytes plus 8 bytes of options. A TCP header can be at most 60 bytes long (binary 1111 is 15 in decimal, and 15 * 4 bytes = 60 bytes).
6 flag bits:
URG: the urgent pointer is valid; tells the receiving TCP module that the urgent pointer field points to urgent data.
ACK: when set to 1, the acknowledgment number is valid; when 0, the segment carries no acknowledgment and the acknowledgment number is ignored.
PSH: when set to 1, the receiver should pass the data to the application as soon as it is received, without waiting for the buffer to fill.
RST: when set to 1, the connection is reset. Receiving a segment with RST set usually means some error has occurred.
SYN: when set to 1, used to initiate a connection.
FIN: when set to 1, the sender has finished sending. Used to release the connection; it indicates that the sender has no more data to send.
The three panels of Figure 13 show the TCP headers of rows 3-5, the three rows of the handshake. Let us see how the flag bits change during the handshake.
Figure 13-1: the requesting side, machine 208, sends its initial sequence number (SEQ) 987694419 to machine 1. The SYN flag is set to 1.
Figure 13-2: after receiving this sequence number, the server, machine 1, sends back an acknowledgment (ACK) and its own randomly generated initial sequence number (SEQ) 1773195208 to machine 208. Because the segment carries both an acknowledgment and an initial sequence number, the ACK and SYN flags are both set to 1.
Figure 13-3: after receiving machine 1's segment, machine 208 sends a reply back to machine 1. The ACK flag is set to 1 and all other flags are 0. Note that SYN is now 0: SYN marks the initiation of a connection, and the two preceding steps have already done that.
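The header length and the six flag bits share one 16-bit word in the TCP header, and decoding it takes only a few shifts and masks. The sketch below is illustrative: the three example words are derived from the field values described in the text (header length 7 or 5, and the flags named for each figure), not copied from the capture, and the function name is hypothetical.

```python
# Flag masks within the low 6 bits of the 16-bit "header length + flags" word.
FLAG_BITS = [
    ("URG", 0x20),
    ("ACK", 0x10),
    ("PSH", 0x08),
    ("RST", 0x04),
    ("SYN", 0x02),
    ("FIN", 0x01),
]


def decode_offset_flags(word: int) -> tuple:
    """Return (header length in bytes, list of flags that are set)."""
    header_len = (word >> 12) * 4          # top 4 bits count 32-bit words
    flags = [name for name, mask in FLAG_BITS if word & mask]
    return header_len, flags


# Example words reconstructed from the text (assumed, not taken from Figure 13).
print(decode_offset_flags(0x7002))   # step 1: 28-byte header, SYN
print(decode_offset_flags(0x7012))   # step 2: 28-byte header, ACK and SYN
print(decode_offset_flags(0x5010))   # step 3: 20-byte header, ACK only

# The port numbers quoted above, converted from hexadecimal.
print(int("0428", 16), int("0015", 16))
```

The third word has header length 5 because, as noted in the discussion of options, the final handshake segment carries no options.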
16-bit window size: TCP flow control is provided by the window size that each end of the connection advertises. The window size is a number of bytes, counted from the byte indicated by the acknowledgment number field, that the receiver is prepared to accept. The window size is a 16-bit field, so the maximum window is 65535 bytes.
16-bit checksum: covers the entire TCP segment, that is the TCP header and the TCP data. This field is mandatory: it must be computed and stored by the sender and verified by the receiver.
16-bit urgent pointer: valid only when the URG flag is set to 1. The urgent pointer is a positive offset that is added to the value in the sequence number field to give the sequence number of the last byte of urgent data.
Options: Figures 13-1 and 13-2 have 8 bytes of options; Figure 13-3 has none. The most common option is the maximum segment size, or MSS. Each side usually announces it in the first segment it sends during the handshake. It states the largest segment this side can receive. Figure 13-1 shows that machine 208 can accept segments of at most 1460 bytes. 1460 is the usual value on Ethernet (the 1500-byte Ethernet payload minus a 20-byte IP header and a 20-byte TCP header), and the third group of data shows the file being transferred in 1460-byte pieces.
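TCP options are encoded as a kind byte, a length byte, and a value, with the single-byte NOP used as padding. The sketch below parses an 8-byte option block. The bytes used are the standard encoding of an MSS of 1460 followed by two NOPs and a "SACK permitted" option; they are assumed here for illustration and are not copied verbatim from Figure 13-1. The function name is hypothetical.

```python
def parse_tcp_options(raw: bytes) -> list:
    """Walk a TCP option block and return (name, value) pairs."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:                      # end of option list
            break
        if kind == 1:                      # NOP: one byte of padding
            options.append(("NOP", None))
            i += 1
            continue
        if i + 1 >= len(raw):              # truncated option: stop parsing
            break
        length = raw[i + 1]                # length covers kind and length bytes too
        if length < 2 or i + length > len(raw):
            break                          # malformed option: stop parsing
        value = raw[i + 2:i + length]
        if kind == 2 and length == 4:
            options.append(("MSS", int.from_bytes(value, "big")))
        elif kind == 4 and length == 2:
            options.append(("SACK permitted", None))
        else:
            options.append((f"kind {kind}", value))
        i += length
    return options


# Assumed example bytes: MSS 1460 (02 04 05 b4), NOP, NOP, SACK permitted (04 02).
example_options = bytes.fromhex("020405b401010402")
print(parse_tcp_options(example_options))
```

The value 05 b4 in hexadecimal is 1460 in decimal, the MSS discussed above.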
Handshake summary
The three handshake steps were described separately above and may seem a little scattered, so here is a summary.
Group 3: Data transfer
1) Rows 57-60 of the capture are shown
Figure 14
2) Interpreting the packet
These four rows show one round of sending and acknowledging during the data transfer.
As mentioned earlier, TCP provides a connection-oriented, reliable byte-stream service. When the receiver gets data from the sender, it sends back an acknowledgment to show that the data arrived. TCP splits the data being transferred into blocks of the size it considers best for sending; on Ethernet this is typically 1460 bytes. In other words, the sender sends the data piece by piece, and the receiver reassembles the pieces.
Row 57 shows machine 1 sending a 1514-byte frame to machine 208. Recall that each layer adds its own header when data is sent: 1514 bytes = 14-byte Ethernet header + 20-byte IP header + 20-byte TCP header + 1460 bytes of data.
Row 58 shows an acknowledgment (ACK) of 1781514222. This is the sequence number (SEQ) of row 57, 1781512762, plus the 1460 bytes of data transferred. Machine 208 sends this acknowledgment to machine 1 to show that the data has been received.
Rows 59 and 60 show the transfer continuing in the same way.
This is like borrowing books from Zhang San: each time he hands me a few, I tell him how many I have received so far, and he answers, "Understood."
3) Header information
Figures 15-1 and 15-2 show the headers of rows 57 and 58 respectively; refer to the explanation given for the second group.
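The acknowledgment arithmetic in rows 57 and 58, and the splitting of data into MSS-sized pieces, can be sketched as follows. The function names are hypothetical, and the 4000-byte buffer is an arbitrary stand-in for file contents, not the file used in the test.

```python
MSS = 1460   # maximum segment size announced during the handshake


def expected_ack(seq: int, payload_len: int) -> int:
    """Acknowledgment number for a segment: its SEQ plus the bytes it carried."""
    return (seq + payload_len) % 2 ** 32   # sequence numbers are 32-bit


def split_into_segments(data: bytes, mss: int = MSS) -> list:
    """Cut a byte stream into pieces of at most mss bytes, as the sender does."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


# Row 57 carries SEQ 1781512762 and 1460 bytes; row 58 acknowledges the sum.
print(expected_ack(1781512762, 1460))

# An arbitrary 4000-byte buffer standing in for file contents.
pieces = split_into_segments(bytes(4000))
print([len(piece) for piece in pieces])
print(b"".join(pieces) == bytes(4000))   # the receiver reassembles the pieces
```

Only the last piece is shorter than the MSS, which is why most data frames in a file transfer have the same 1514-byte size.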
Group 4: Terminating the connection
1) Rows 93-96 of the capture are shown
Figure 16
2) Interpreting the packet
Rows 93-96 show the two machines closing their connection.
Establishing a connection takes a three-way handshake, but terminating one takes four steps. This is because a TCP connection is full-duplex (data can flow in both directions at the same time), so each direction must be closed separately. The four-step exchange is simply the two sides closing their directions one at a time.
When the download is finished, closing the browser terminates the connection to the server. Rows 93-96 in Figure 16 show the four steps of terminating the connection.
Row 93 and Figure 17-1 show that after the browser is closed, machine 208 sends a segment with FIN set to 1 and sequence number (SEQ) 987695574 to machine 1, requesting termination of the connection.
Row 94 and Figure 17-2 show that machine 1 receives the FIN, sends back an acknowledgment whose acknowledgment number is the received sequence number plus 1, and so ends transmission in that direction.
Row 95 and Figure 17-3 show machine 1 sending a segment with FIN set to 1 and sequence number (SEQ) 1773196056 to machine 208, requesting termination of the connection.
Row 96 and Figure 17-4 show that machine 208 receives the FIN and sends back an acknowledgment whose acknowledgment number is the received sequence number plus 1. At this point the TCP connection is completely closed.
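The four steps can be modeled as two independent half-closes. The sketch below only tracks which directions are still open and what acknowledgment number each FIN receives; the function name and the direction labels are hypothetical, and no packets are sent.

```python
def close_connection(client_fin_seq: int, server_fin_seq: int) -> list:
    """Model the four-step close: each direction is shut down by a FIN and its ACK."""
    open_directions = {"208->1", "1->208"}
    trace = []
    for sender, receiver, fin_seq in (
        ("208", "1", client_fin_seq),   # rows 93 and 94
        ("1", "208", server_fin_seq),   # rows 95 and 96
    ):
        trace.append((sender, "FIN", fin_seq))
        # The FIN consumes one sequence number, so the ACK is SEQ + 1.
        trace.append((receiver, "ACK", (fin_seq + 1) % 2 ** 32))
        open_directions.discard(f"{sender}->{receiver}")
    assert not open_directions          # both directions are now closed
    return trace


for step in close_connection(987695574, 1773196056):
    print(step)
```

After the first two entries only the direction from machine 1 to machine 208 remains open, which is the half-closed state between rows 94 and 95.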
3) header information
VI. Scan examples
Here is another example, using ping. The most common command for testing whether a computer is reachable is ping. When you ping a computer, the output in Figure 18 means it is reachable and the output in Figure 19 means it is not. There are two possible reasons for failure: either the computer does not exist or its network cable is unplugged, or the computer has a firewall installed that is set to block ping. How can these two situations be told apart? Below, Iris is again used to trace each case.
Figure 18
Figure 19
Figure 20 shows the case where the ping succeeds.
Figure 21 shows the case where the ping fails because the computer does not exist. The capture shows that the ARP request gets no reply.
Figure 22 shows the case where the computer exists but has a firewall installed. The capture shows that the ARP request is answered but the ICMP request is not.
The analysis shows that although the last two cases look the same on the surface, underneath they are exactly opposite. The header information also makes it clear that ping is carried out with the ICMP protocol: the communication is completed at the third layer and does not use the fourth-layer TCP protocol.
Figure 20
Figure 21
Figure 22
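The reasoning behind Figures 20 to 22 amounts to a small decision table. The sketch below encodes it as a function whose two inputs are what is observed in the capture; it assumes, as in this article, that the target is on the same Ethernet segment as the capturing machine, and the function name is hypothetical.

```python
def diagnose_ping(arp_replied: bool, icmp_replied: bool) -> str:
    """Interpret a ping capture on the local segment (Figures 20 to 22)."""
    if not arp_replied:
        # Nobody answered the ARP broadcast, so no ICMP request could even be sent.
        return "host absent or not connected"
    if not icmp_replied:
        # The host gave its MAC address but ignored the ICMP echo request.
        return "host present but ICMP blocked, for example by a firewall"
    return "host reachable"


print(diagnose_ping(arp_replied=True, icmp_replied=True))    # Figure 20
print(diagnose_ping(arp_replied=False, icmp_replied=False))  # Figure 21
print(diagnose_ping(arp_replied=True, icmp_replied=False))   # Figure 22
```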
VII. Postscript
This article is not a tutorial, and many topics are not covered, such as TCP retransmission, IP fragmentation, and routing. It only offers one approach to learning, in the hope that it will be useful. The TCP/IP protocol family is complex, but once understood it is not hard to learn. Finally, a question for interested readers: telnet to three machines, one that is up with port 23 open, one that is reachable on the network but has port 23 closed, and one that does not exist. Use the method described here to compare the three cases. This is in fact one way of using a TCP scan to determine whether another machine is online.
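For readers who want to try the closing exercise without telnet, the three cases can be told apart with an ordinary TCP connection attempt. The sketch below is one possible way to do it with Python's standard socket module, not a full scanner; the function name probe and the 2-second timeout are arbitrary choices. Note that a firewall that silently drops packets looks the same as an absent host from this vantage point, which is exactly why the packet capture is the more informative tool. The demonstration uses the loopback interface so that it runs without the article's two machines.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a TCP port as 'open', 'closed', or 'no response'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return "open"           # the SYN was answered with SYN+ACK
        except ConnectionRefusedError:
            return "closed"         # the SYN was answered with RST: the host is up
        except OSError:
            return "no response"    # timeout or unreachable: the host may be absent


# Demonstration on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))     # port 0 lets the operating system pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(probe("127.0.0.1", demo_port))   # a listening port
listener.close()
print(probe("127.0.0.1", demo_port))   # the same port after the listener is gone
```

Against the article's test network, the equivalent call would be probe("192.168.113.1", 23), with the capture in Iris showing which of the three exchanges actually took place.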