ARP troubleshooting tips for network failures

Source: Internet
Author: User
I ran into this problem as well, and found that ARP was the reason the network card kept dropping off the network.
Recently my workplace ran into a very strange problem. A brand-name P4 computer with a built-in Intel network card had been working well: Internet browsing and intranet communication were both normal. Then one day its Internet browsing became intermittent. Pinging an Internet address alternated between succeeding and failing, yet pinging the intranet showed no problem and intranet communication was perfectly normal. Only communication with the Internet was affected, which was very confusing. The IP address of this computer is 192.168.24.55, and the IP address of the firewall is 192.168.24.7.

Fault Analysis: Check the physical link

All computers at my workplace reach the Internet through a NetScreen NS25 firewall. If the firewall were at fault, the other computers would be affected too, but their Internet access was normal, with no intermittent outages. Judging from this computer's ping behavior, the problem should lie in the lower three layers, and intermittent connectivity looks like a typical physical-layer symptom, so I started by checking the link.

The computer is connected to a port on a Cisco Layer 3 switch, and the firewall is connected to the same switch. Routing is enabled on the switch, and its configuration was certainly not the problem. I first checked the network cable between the computer and the switch. If this cable were faulty, the computer's intranet communication would suffer as well, and a cable test confirmed that it was fine. The patch cable from the firewall to the switch should also be fine, because the other computers had no problems. So the link was not the cause. Could the network card be faulty? No, because it communicated normally with the intranet. That ruled out the physical layer.

Fault Analysis: Simulate the data communication

Next, the network layer. The computer could reach the Internet and was only losing some packets, so the network layer did not seem to be at fault either. That left the data link layer. But what could be wrong there? After several days of thinking without any lead, I finally went through the network communication process step by step to see where the problem might be.

Suppose the computer has a packet to send to the Internet. It first checks whether the destination address is on the same network as its own address, and if not, it sends the packet to the default gateway. Here the destination is an Internet address, which is certainly not on the local network, so the packet goes to the default gateway, the Cisco Layer 3 switch at 192.168.24.10. The computer (192.168.24.55) then looks in its local ARP table for the MAC address corresponding to 192.168.24.10. If there is no matching entry, it sends an ARP request to every device on the network to obtain the MAC address of 192.168.24.10. Because the ARP request is sent as a broadcast, every device on the network receives it and passes it up for inspection.
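
The first decision described above, whether to resolve the destination itself or the default gateway, can be sketched as follows. This is only an illustration: the /24 prefix length, the function name next_hop, and the Internet address 203.0.113.9 (a documentation address) are assumptions that do not come from the original network.

```python
import ipaddress


def next_hop(src_ip, dst_ip, prefix_len, default_gateway):
    """Return the IP address whose MAC address the sender must resolve via ARP."""
    # Build the sender's local network from its own address and prefix length.
    local_net = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst_ip) in local_net:
        # Same network: deliver directly, so ARP for the destination itself.
        return dst_ip
    # Different network: hand the packet to the default gateway.
    return default_gateway


# Addresses from the article; the /24 prefix is an assumption.
print(next_hop("192.168.24.55", "192.168.24.7", 24, "192.168.24.10"))
print(next_hop("192.168.24.55", "203.0.113.9", 24, "192.168.24.10"))
```

With these inputs, the firewall address is resolved directly because it is on the local network, while the Internet address leads to an ARP lookup for the gateway 192.168.24.10.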

When the Cisco Layer 3 switch receives the ARP request, it checks whether its own IP address matches the target IP address in the request. If they match, the switch sends an ARP reply containing its MAC address back to the source, the computer at 192.168.24.55. On receiving the ARP reply, the computer writes the switch's IP address (192.168.24.10) and MAC address into its ARP table, puts the switch's MAC address into the frame as the destination MAC address, and sends the packet to the switch. The switch then checks whether the destination IP is on a local segment. If it is not, the switch searches its routing table for a route to the destination IP, and if there is none, it forwards the packet along the default route. In this case the switch's default route points to the firewall at 192.168.24.7, so the switch sends an ARP broadcast to obtain the firewall's MAC address. After the firewall replies, the switch puts the firewall's MAC address into the frame as the destination MAC address and sends the packet to the firewall. The firewall repeats the same process and sends the packet on toward its destination on the Internet. All of these steps worked normally. Tracing the route with the tracert command also looked normal, and the ARP tables of the computer and the switch both contained the appropriate entries. So where exactly was the problem? The analysis had to continue.
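
The lookup-then-broadcast behavior described above can be modeled with a small sketch. It is a simplified illustration, not how any real network stack is implemented: the Host class, the resolve function, and the MAC addresses (taken from the documentation range) are all hypothetical.

```python
class Host:
    """A device with one IP address, one MAC address, and an ARP cache."""

    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.arp_cache = {}  # maps IP address -> MAC address

    def handle_arp_request(self, target_ip):
        # A host answers only when the requested IP address is its own.
        return self.mac if target_ip == self.ip else None


def resolve(sender, target_ip, segment):
    """Return (mac, used_broadcast) for target_ip as seen from sender."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip], False
    # No cached entry: "broadcast" the request to every other host.
    for host in segment:
        if host is sender:
            continue
        mac = host.handle_arp_request(target_ip)
        if mac is not None:
            sender.arp_cache[target_ip] = mac  # remember the answer
            return mac, True
    return None, True  # nobody answered


# IP addresses from the article; the MAC addresses are made up.
pc = Host("192.168.24.55", "00:00:5e:00:53:55")
switch = Host("192.168.24.10", "00:00:5e:00:53:10")
firewall = Host("192.168.24.7", "00:00:5e:00:53:07")
segment = [pc, switch, firewall]

print(resolve(pc, "192.168.24.10", segment))  # first lookup needs a broadcast
print(resolve(pc, "192.168.24.10", segment))  # second lookup hits the cache
```

The same resolve step happens again when the switch looks up the firewall, and once more when the firewall looks up its own next hop.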

Fault Analysis: Check the ARP table

Once the packet reaches its destination on the Internet, the response packet travels back to the computer by repeating the same procedure in reverse. The response first arrives at the firewall, which looks in its ARP table for the MAC address corresponding to the destination IP address. If there is no entry, the firewall sends an ARP request to obtain the computer's MAC address, writes the computer's IP address and MAC address into its ARP table, and then encapsulates the packet and sends it to the computer. All of this seemed normal too, so why the intermittent connectivity? Since the computer behaved normally within the intranet, the Layer 3 switch should be fine. Only Internet access was affected, so I finally decided to start by inspecting the firewall.

I connected to the firewall with Telnet and checked its configuration: all normal. I checked the ports: everything was fine. I checked the routing table: no problem either. I was at a loss about where to look next. Then I suddenly remembered that, to prevent intranet users from using stolen IP addresses to access the Internet, I had configured IP-to-MAC address bindings on the firewall. Yes, check the ARP table! I entered the get arp command, which displayed a long list of ARP entries. To my surprise, nearly all of them were static IP-to-MAC bindings. Only one was dynamic: the entry for the IP address and MAC address of the firewall's next hop. There was no entry at all for 192.168.24.55. Could the ARP table be the problem? It looked like a glimmer of hope.

So I decided to clear a few of the statically bound ARP entries first. I used the unset arp command to remove 6 static entries and then pinged an Internet address from the computer. No packets were lost! Had the problem that had bothered me for days been solved? I could hardly believe it, so I asked my colleagues to test on this computer: logging in to QQ, browsing the Web, sending and receiving mail. Everything worked, with none of the earlier intermittent outages. I then connected to the firewall with Telnet again and ran the get arp command, and the ARP entry for 192.168.24.55 was plainly there. The problem really was solved. Now it was time to sit down and think about the cause.

Fault tracing

The NetScreen NS25 firewall supports at most 128 ARP entries. Without static bindings, ARP entries are continually refreshed and removed automatically when they time out, so the table does not fill up. A static binding, however, is never purged and permanently occupies an ARP entry, leaving less and less room for dynamic entries until the table is completely full, which is the situation I ran into. Some readers will ask: if the table was full, this computer should have been cut off completely, so why was the connection only intermittent? I counted the ARP entries. There were exactly 127 static bindings, and the one remaining slot was occupied by the firewall's next-hop address. Note that this entry is dynamic. When it timed out it was removed, the computer's entry took the slot, and the computer's traffic got through. But other computers were constantly accessing the Internet, so as soon as the entry for 192.168.24.55 timed out, the slot was taken by the firewall's next-hop address again and the computer's connection dropped. In fact, every computer at my workplace experienced intermittent Internet access during this period, but because the next-hop address held the ARP entry most of the time, the interruptions were short enough for everyone to tolerate and nobody noticed. For 192.168.24.55, on the other hand, the same long hold by the next-hop address kept its entry out of the ARP table and caused timeouts, so its outages lasted longer and showed up as an intermittent connection.
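
The contention for the last free slot can be illustrated with a toy model of a fixed-size ARP table. This is a sketch under stated assumptions, not a description of how the firewall actually manages its table: the ArpTable class, the timeout of 3 ticks, the alternating order in which the two addresses ask for a slot, and the placeholder host names are all invented for the example. Only the capacity of 128 and the counts of 127 and 121 static entries come from the account above.

```python
class ArpTable:
    """Fixed-size table: static entries never expire, dynamic ones age out."""

    def __init__(self, capacity, ttl):
        self.capacity = capacity
        self.ttl = ttl
        self.static = set()
        self.dynamic = {}  # maps IP address -> ticks left before expiry

    def used(self):
        return len(self.static) + len(self.dynamic)

    def bind_static(self, ip):
        if self.used() >= self.capacity:
            raise RuntimeError("ARP table is full")
        self.static.add(ip)

    def learn(self, ip):
        """Return True if ip is in the table, adding it when a slot is free."""
        if ip in self.static or ip in self.dynamic:
            return True
        if self.used() >= self.capacity:
            return False  # no free slot, the address cannot be resolved
        self.dynamic[ip] = self.ttl
        return True

    def tick(self):
        """Age every dynamic entry by one tick and drop the expired ones."""
        self.dynamic = {ip: left - 1
                        for ip, left in self.dynamic.items() if left > 1}


def simulate(static_count, steps=12, capacity=128, ttl=3):
    """Return one (next_hop_present, pc_present) pair per time step."""
    table = ArpTable(capacity, ttl)
    for i in range(static_count):
        table.bind_static(f"static-host-{i}")  # placeholder names
    history = []
    for step in range(steps):
        # Assumption: the two addresses alternate in who asks first.
        order = ["next-hop", "192.168.24.55"]
        if step % 2:
            order.reverse()
        present = {ip: table.learn(ip) for ip in order}
        history.append((present["next-hop"], present["192.168.24.55"]))
        table.tick()
    return history


print(simulate(127))  # one free slot: the two entries never coexist
print(simulate(121))  # 6 static entries removed: both entries always fit
```

In this model, 127 static entries leave a single slot that the next hop and 192.168.24.55 take turns holding, while 121 static entries leave room for both at every step.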
