Faults are inevitable in network management and O&M, and trying to build a zero-fault network is futile: network faults involve too many random and incidental factors, not to mention human ones. What administrators need to do instead is master troubleshooting skills and accumulate enough experience to develop a keen instinct, so that they can locate and eliminate faults quickly without detours. I have watched many administrators troubleshoot and have read troubleshooting articles by others, and found that many people take detours, knowingly or not. The following are two examples of network troubleshooting. I hope you will find them useful.
Case 1: network faults caused by viruses
Fault Symptom
Shortly after work started in the morning, a user called to say that clients in one subnet could not access the Internet normally, and also reported problems when pinging the DNS server. A remote login to the layer-3 switch to check the port connecting to the user's office building found no exception.
Fault Diagnosis
I suggested that the administrator first check whether a broadcast storm or a network loop had occurred: run the Sniffer software on the user's network segment and watch for abnormal traffic. After two hours of monitoring, the traffic was normal. Strangely, the user reported that the network had worked normally around the noon break, but called again in the afternoon to say that it was abnormal once more. This pointed tentatively to a client machine on the user's side.
I asked the administrator to go through the user's offices one by one. According to the users, disabling and then re-enabling the network card restored the network, but pings failed again about 10 minutes later, until the card was disabled and re-enabled once more. We know that disabling and re-enabling the NIC triggers ARP learning: the machine sends an ARP request asking who the gateway of this network segment is, then obtains the gateway's MAC address. When it needs to reach machines on other network segments, it hands the packets to the gateway. So could a virus on some user's machine be impersonating the real gateway address, so that clients on the LAN send their Internet-bound packets to the impostor instead of the real gateway, causing the fault? I immediately picked a machine and used the arp -a command to check the MAC address recorded for the default gateway. The MAC address was correct while the network was normal, but it changed suddenly whenever the fault occurred.
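The manual check with arp -a can be automated. The following is only a minimal Python sketch, not the procedure used in this case: it parses the text printed by arp -a and compares the MAC address recorded for the gateway with a known-good value. The function names, the example addresses, and the assumption that MAC octets are printed as two hex digits separated by "-" or ":" are all illustrative.

```python
import re
import subprocess

# Matches MACs such as 00-1a-2b-3c-4d-5e (Windows) or 00:1a:2b:3c:4d:5e (Linux).
MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}")


def normalize_mac(mac):
    """Return the MAC in lowercase, colon-separated form."""
    return mac.lower().replace("-", ":")


def gateway_mac(arp_output, gateway_ip):
    """Return the MAC listed for gateway_ip in arp -a output, or None."""
    # The lookarounds stop 192.168.1.1 from matching inside 192.168.1.10.
    ip_re = re.compile(r"(?<![\d.])" + re.escape(gateway_ip) + r"(?![\d.])")
    for line in arp_output.splitlines():
        if ip_re.search(line):
            match = MAC_RE.search(line)
            if match:
                return normalize_mac(match.group(0))
    return None


def is_gateway_spoofed(arp_output, gateway_ip, known_good_mac):
    """True if the ARP cache maps the gateway IP to an unexpected MAC."""
    current = gateway_mac(arp_output, gateway_ip)
    return current is not None and current != normalize_mac(known_good_mac)


def read_arp_table():
    """Run arp -a on the local machine and return its text output."""
    result = subprocess.run(["arp", "-a"], capture_output=True,
                            text=True, check=True)
    return result.stdout
```

Calling is_gateway_spoofed(read_arp_table(), gateway_ip, known_good_mac) every minute or so on one client would flag the moment the gateway entry changes, which is the symptom observed above. The known-good MAC has to be recorded while the network is healthy.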
Troubleshooting
Write down the gateway MAC address displayed while the fault is occurring, then use it to trace the offending machine on the corridor switch. Once that machine's network cable was unplugged, the network returned to normal. Internet access was normal at noon because the user of the infected machine shut it down during the break, so everyone else could get online. After the virus was removed from that computer, it too returned to normal.
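Tracing a MAC address to a switch port usually means searching the switch's MAC address table. The Python sketch below shows one way to do that search on a saved copy of the table. It rests on assumptions not stated in this case: the table is plain text with one entry per line, the port name is the last column, and the switch may print MACs in dotted form (001a.2b3c.4d5e) while the client shows them with hyphens or colons. The function names and sample values are hypothetical.

```python
import re


def mac_digits(mac):
    """Reduce any common MAC notation to its bare lowercase hex digits."""
    return re.sub(r"[^0-9a-f]", "", mac.lower())


def find_port(mac_table_text, target_mac):
    """Return the last column of the row containing target_mac, or None.

    Assumes one table entry per line with the port name in the last column.
    """
    target = mac_digits(target_mac)
    if len(target) != 12:
        return None  # not a complete 48-bit MAC address
    for line in mac_table_text.splitlines():
        fields = line.split()
        for field in fields:
            # Only consider fields made of hex digits and MAC separators.
            if re.fullmatch(r"[0-9A-Fa-f.:-]+", field) and mac_digits(field) == target:
                return fields[-1]
    return None
```

For example, passing the MAC noted from arp -a during the fault together with the saved table text returns the port to inspect, or None if the address was not learned on that switch.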
Troubleshooting Summary
From the analysis of this fault we can draw two conclusions. First, when the network fails, learn as much as possible from the user end; the user's description of the fault is often the best way to grasp its essence. Second, when the network behaves strangely, consider whether a machine on the user end is infected with a virus; the problem does not necessarily lie with the network devices.