Better performance and sharply lower prices have driven the rapid adoption of switches. Network administrators encounter a variety of switch faults at work. How can they find and eliminate faults quickly and accurately? This article briefly introduces common fault types and troubleshooting steps. Switches are used throughout a company's network, and products at almost every level are involved, from low-end to mid-range to high-end. As a result, switches are much more likely to fail than routers or hardware firewalls, which is why we discuss the classification of switch faults and the troubleshooting steps first.
I. Switch fault classification:
Switch faults fall into two categories: hardware faults and software faults. Hardware faults are failures of the power supply, backplane, modules, ports, and other components of the switch. They can be divided into the following categories.
(1) Power supply faults:
An unstable external power supply, aging power lines, or a lightning strike can damage the power supply or stop the fan. A power supply problem often damages other components in the machine as well.
If the POWER indicator on the panel is green, the power supply is normal. If the indicator is off, the switch is not receiving power. Such problems are easy to discover, solve, and prevent.
To prevent this type of fault, first get the external power supply right. Typically, run a dedicated power line so that the switch has an independent supply, and add a voltage regulator to guard against momentary overvoltage or undervoltage. If conditions permit, add a UPS (uninterruptible power supply) to keep the switch powered. Some UPS units include voltage regulation and others do not, so check this when selecting one. Install professional lightning protection in the data center to protect the switch from lightning damage. Many specialist companies now do lightning protection engineering, and they are worth considering when you implement network cabling.
(2) Port faults:
Port faults are the most common hardware fault. Whether the port is a fiber port or a twisted-pair RJ-45 port, take care when plugging and unplugging connectors. If a fiber connector is accidentally soiled, it can contaminate the fiber port so that the port can no longer communicate normally. Many people like to plug and unplug connectors while the switch is powered on. This is acceptable in theory, but it inadvertently raises the port failure rate and can also physically damage the port. An oversized RJ-45 connector (crystal head) can easily damage the port when it is inserted into the switch. In addition, if part of the twisted-pair cable connected to a port runs outdoors and is struck by lightning, the switch port may be damaged, or worse damage may follow.
Usually only one or a few ports are damaged. Therefore, after ruling out a fault in the computer connected to the port, move the cable to another port to determine whether the original port is damaged. For this kind of fault, power off the switch and clean the port with an alcohol-soaked cotton ball. If the port really is damaged, the only option is to replace it.
(3) Module faults:
A switch is made up of many modules, such as the stacking module, the management module (also called the control module), and expansion modules. These modules rarely fail, but when one does, the economic losses can be large. Such failures can occur if modules are inserted or removed carelessly, if the switch is moved, or if the power supply is unstable.
All three kinds of module mentioned above have external interfaces and are easy to identify, and some also show faults through their indicator lights. For example, a stacking module has a flat trapezoidal port, and on some switches the interface resembles a USB port. The management module has a CONSOLE port for connecting to the network management computer. An expansion module that connects to optical fiber has a pair of fiber interfaces.
When troubleshooting such a fault, first make sure that the switch and the module are powered normally, then check that each module is inserted in the correct position, and finally check that the cables connected to the module are sound. When connecting to the management module, also check whether the connection uses the specified rate, whether parity is enabled, and whether flow control is used. When connecting an expansion module, check that the communication modes match, for example full duplex versus half duplex. If the module itself is faulty, there is only one solution: contact the supplier immediately for a replacement.
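The console-connection checks (rate, parity, flow control) amount to comparing two sets of settings field by field. The following Python sketch illustrates that idea only. The class name, field names, and example values are assumptions made for illustration and do not come from any particular switch or terminal program.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ConsoleSettings:
    """Serial settings of one side of a console connection (hypothetical model)."""
    baud_rate: int
    parity: str        # e.g. "N" (none), "E" (even), "O" (odd)
    flow_control: str  # e.g. "none", "xon/xoff", "rts/cts"


def mismatched_fields(expected, actual):
    """Return the names of the settings that differ between the two sides."""
    return [
        f.name
        for f in fields(expected)
        if getattr(expected, f.name) != getattr(actual, f.name)
    ]


# Assumed example values: what the switch manual specifies versus
# what the terminal program on the management computer is set to.
switch_side = ConsoleSettings(baud_rate=9600, parity="N", flow_control="none")
terminal_side = ConsoleSettings(baud_rate=115200, parity="N", flow_control="xon/xoff")

print(mismatched_fields(switch_side, terminal_side))
```

Any field name in the result is a setting to correct before suspecting the management module itself.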
(4) Backplane faults:
Every module of the switch plugs into the backplane. If the environment is damp and moisture short-circuits the circuit board, or if components are damaged by high temperature, lightning, or other factors, the board will stop working properly. For example, poor heat dissipation or a high ambient temperature raises the temperature inside the machine until components burn out.
If the external power supply is normal but the internal modules of the switch cannot work, the backplane may be faulty. In this case even an electrical maintenance engineer may be unable to help, and the only remedy is to replace the backplane.
(5) Cable faults:
In theory, these faults do not belong to the switch itself. In practice, however, cable faults often make the switch system or a port behave abnormally, so they are also classed as switch hardware faults. Examples include connectors that are not seated tightly, wires arranged in the wrong order when the cable was made, the wrong cable type used where a straight-through cable is required, the two fibers of a fiber pair connected the wrong way round, and incorrect connections that create a network loop.
As the hardware faults above show, a poor data center environment easily leads to hardware faults of every kind. When building a data center, therefore, first take care of lightning protection and grounding, the power supply, room temperature, room humidity, protection against electromagnetic interference, and anti-static measures, so that the network equipment has a good environment in which to run.
II. General troubleshooting steps for switch faults:
Switch faults are diverse, and different faults show different symptoms. When analyzing a fault, use the observed symptoms and apply the troubleshooting methods flexibly (such as exclusion, comparison, and replacement) to locate the fault and eliminate it promptly.
(1) Exclusion method:
When we face a fault and analyze the problem, we are in fact already using the exclusion method to determine where the fault lies. Based on the observed symptoms, this method lists as many possible causes as it can and then analyzes and rules them out one by one. Work from simple to complex to improve efficiency. The method can cope with all kinds of faults, but it requires maintenance personnel to think logically and to know switches thoroughly.
(2) Comparison method:
The comparison method takes a working switch of the same model as a reference and compares it with the faulty switch to find the fault point. The method is simple and effective, especially for system configuration faults, where a simple comparison is enough to reveal the configuration differences. However, a switch with the same model and configuration is not always easy to find.
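For configuration faults, the comparison can be done mechanically once both configurations have been exported as text. The Python sketch below uses the standard library's difflib for this. The configuration lines are invented placeholders rather than the syntax of any real switch, and the function name is hypothetical.

```python
import difflib


def config_differences(reference, faulty):
    """Return only the lines that differ between two configuration texts.

    Lines prefixed with "- " exist only in the reference (working) switch,
    lines prefixed with "+ " exist only in the faulty switch.
    """
    ref_lines = [line.strip() for line in reference.splitlines() if line.strip()]
    bad_lines = [line.strip() for line in faulty.splitlines() if line.strip()]
    return [
        entry
        for entry in difflib.ndiff(ref_lines, bad_lines)
        if entry.startswith(("- ", "+ "))
    ]


# Made-up configuration text, for illustration only.
reference_config = """
interface port1
 vlan 10
 speed auto
"""
faulty_config = """
interface port1
 vlan 20
 speed auto
"""

for entry in config_differences(reference_config, faulty_config):
    print(entry)
```

Here the only difference reported is the VLAN assignment, which is the kind of discrepancy the comparison method is meant to surface.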
(3) Replacement method:
This is the method we use most often, and it is also common in computer maintenance. Replacement means swapping a known-good switch component for the suspect part in order to find the fault point. It is mainly used to diagnose hardware faults. Note that the replacement part must come from a switch of the same brand and model.
To keep troubleshooting orderly, we can also analyze faults according to the following principles.
1. From far to near
Port module -> horizontal cable -> jumper -> switch. Check these one by one to rule out a remote fault first.
2. From outside to inside
If the switch has a fault, first read the external indicators, and then use what they show to check whether the internal components are faulty. For example, if the POWER LED is green, the power supply is normal. If it is off, the switch has no power. If the LINK LED is yellow, the connection is working at 10 Mb/s. Green means 100 Mb/s, off means there is no connection, and flashing means the administrator has manually disabled the port. The RDP LED indicates the redundant power supply, and the MGMT LED indicates the management module. Whether or not the fault can be located from the outside, you must log on to the switch to determine the specific fault and take the corresponding measures.
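Reading the indicators is essentially a table lookup. The sketch below encodes one reading of the LED meanings in this article (POWER green or off; LINK yellow, green, off, or flashing) as a Python dictionary. Treat the table as an assumption: LED colors and meanings vary by vendor and model, so the real reference is always the manual of the switch in question.

```python
# Assumed LED meanings, based on one reading of this article.
# Real switches differ; consult the manual for the actual model.
LED_MEANINGS = {
    ("POWER", "green"): "power supply normal",
    ("POWER", "off"): "no power",
    ("LINK", "yellow"): "link up at 10 Mb/s",
    ("LINK", "green"): "link up at 100 Mb/s",
    ("LINK", "off"): "no link",
    ("LINK", "flashing"): "port manually disabled by the administrator",
}


def interpret_led(name, state):
    """Look up what an LED state means, ignoring letter case."""
    return LED_MEANINGS.get(
        (name.upper(), state.lower()),
        "unknown: consult the switch manual",
    )


print(interpret_led("link", "Yellow"))
print(interpret_led("MGMT", "red"))
```

An unknown combination falls through to a reminder to check the manual rather than a guess.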
3. From soft to hard
When a fault occurs, nobody should reach for a screwdriver and open up the switch first. Always begin the inspection with the system configuration or system software. If the problem cannot be solved in software, the hardware is at fault. For example, if a port does not work, first check whether the user's port is missing from the corresponding VLAN, whether another administrator has disabled the port, or whether some other configuration issue is to blame. Once system and configuration causes have been ruled out, you can suspect that the real problem is a hardware fault.
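The port example can be written as a short decision procedure: rule out each configuration cause in turn, and suspect hardware only when none applies. This is a minimal Python sketch of that order of checks. The PortState fields are hypothetical and would have to be filled in from whatever the switch's management interface actually reports.

```python
from dataclasses import dataclass


@dataclass
class PortState:
    """Configuration facts about one port (hypothetical fields)."""
    admin_enabled: bool  # False if an administrator has disabled the port
    vlan: int            # VLAN the port is currently assigned to
    expected_vlan: int   # VLAN the user's port should be in


def diagnose_port(port):
    """Check configuration causes first; fall back to suspecting hardware."""
    if not port.admin_enabled:
        return "configuration: port is administratively disabled"
    if port.vlan != port.expected_vlan:
        return (
            f"configuration: port is in VLAN {port.vlan}, "
            f"expected VLAN {port.expected_vlan}"
        )
    return "configuration looks correct: suspect a hardware fault"


print(diagnose_port(PortState(admin_enabled=True, vlan=20, expected_vlan=10)))
```

The hardware verdict is deliberately the last branch, mirroring the soft-to-hard principle.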
4. From easy to difficult
When analyzing a complex fault, start with simple operations or configuration checks. This speeds up troubleshooting and improves efficiency.
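One way to picture the easy-to-difficult principle is to give every candidate check a rough effort score and run the checks in ascending order until one fails. The Python sketch below does exactly that. The check descriptions, effort numbers, and results are invented for illustration.

```python
def first_failing_check(checks):
    """Run checks from least to most effort and return the first that fails.

    `checks` is a list of (description, effort, check_fn) tuples, where
    check_fn() returns True if that item is healthy. Returns None when
    every check passes.
    """
    for description, _effort, check in sorted(checks, key=lambda c: c[1]):
        if not check():
            return description
    return None


# Invented example: the lambdas stand in for real inspections.
candidate_checks = [
    ("swap the module", 5, lambda: True),
    ("cable seated firmly", 1, lambda: True),
    ("port VLAN configuration", 2, lambda: False),
]

print(first_failing_check(candidate_checks))
```

Because the list is sorted by effort, the costly module swap is never attempted once the cheaper configuration check has already found the problem.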
III. Summary:
Because switch faults are so varied, there are no fixed troubleshooting steps, although some faults point clearly to their cause and can be identified at a glance. You can therefore only analyze each problem as the actual situation requires. Of course, every kind of fault is difficult for a new network administrator. To become a master of switch troubleshooting, you must accumulate experience in daily work. Each time you solve a problem, carefully review its root cause and the solution. In this way you can keep improving and better fulfill the important responsibilities of network management.