Fast switch troubleshooting
Network administrators deal with switches constantly. Switches are used throughout enterprise networks, in products ranging from low-end to high-end, so in practice switch faults occur more often than router or hardware firewall faults. How can faults be located and eliminated quickly and accurately? Switch faults fall into two categories: hardware faults and software faults. The following sections describe how to handle each.
Common switch hardware faults
Switch hardware faults mainly involve the power supply, backplane, modules, ports, and other components.
1. Power Supply Fault
Unstable external power, aging power lines, or lightning strikes can damage the power supply or stop the fan. Power problems often damage other components in the switch as well.
If the Power indicator on the front panel is green, the power supply is normal. If the indicator is off, the switch is not receiving power. Such problems are easy to discover and solve, and are also the easiest to prevent.
To guard against such faults, first secure the external power supply. Generally, run a dedicated power line to give the switch an independent supply, and add a voltage regulator to protect against transient overvoltage and undervoltage. If conditions permit, add a UPS (uninterruptible power supply) to keep the switch powered. Some UPS units include voltage regulation and others do not, so check this when choosing one. Second, install professional lightning protection in the equipment room to keep lightning from damaging the switch. Many professional companies now specialize in lightning protection engineering, and their services are worth considering when the network cabling is installed.
2. Port faults
This is the most common hardware fault. Whether the port is a fiber port or a twisted-pair RJ-45 port, take care when plugging and unplugging connectors. If a fiber connector is accidentally soiled, it can contaminate the fiber port and prevent normal communication. Many people like to plug and unplug connectors while the switch is powered on. In theory this is acceptable, but it raises the incidence of port failures. A port can also be physically damaged by carelessness when the switch is moved. If the RJ-45 connector (crystal head) you purchased is oversized, it can easily damage the port when it is inserted. In addition, if a section of the twisted-pair cable connected to a port runs outdoors, a lightning strike on the cable can damage the port or cause even more unexpected damage.
Generally, a port fault means that one or a few ports are damaged. After ruling out a fault in the computer connected to the port, move the connection to a different port to determine whether the original port is damaged. For a suspect port, power off the switch and clean the port with an alcohol-soaked cotton ball. If the port is actually damaged, the only option is to replace it.
3. Module faults
A switch is composed of many modules, such as stacking modules, management modules (also called control modules), and expansion modules. These modules rarely fail, but when one does, the economic loss can be huge. Such faults can occur if modules are inserted or removed carelessly, if the switch is knocked while being moved, or if the power supply is unstable.
The three kinds of modules mentioned above all have external interfaces and are easy to identify, and some also report faults through indicators on the module. For example, a stacking module has a flat trapezoidal port, or on some switches an interface that resembles USB. The management module has a Console port for connecting to the network management computer. An expansion module intended for fiber connections has an optical fiber interface.
To troubleshoot such a fault, first make sure that the switch and the module are powered normally, then check that each module is seated in the correct position, and finally check that the cables connected to the module are in good condition. When connecting to the management module, also check that the connection uses the specified rate, parity setting, and flow control setting. When connecting an expansion module, check that the communication modes match, for example full duplex versus half duplex. If the module itself is faulty, there is only one solution: contact the supplier immediately for a replacement.
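The console checks above (rate, parity, flow control) can be sketched as a comparison between the settings the manual specifies and the settings the terminal is actually using. This is only an illustration: the class and function names are hypothetical, and the values 9600 baud, no parity, and no flow control are assumed examples, not taken from any particular switch.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SerialSettings:
    """Console parameters named in the text: rate, parity, flow control."""
    baud_rate: int
    parity: str        # "none", "even", or "odd"
    flow_control: str  # "none", "hardware", or "software"


def mismatched_settings(required: SerialSettings, actual: SerialSettings) -> list[str]:
    """Return the names of the settings that differ from the required values."""
    return [
        f.name
        for f in fields(SerialSettings)
        if getattr(required, f.name) != getattr(actual, f.name)
    ]


# Assumed example values; take the real ones from the switch manual.
required = SerialSettings(baud_rate=9600, parity="none", flow_control="none")
terminal = SerialSettings(baud_rate=115200, parity="none", flow_control="hardware")
print(mismatched_settings(required, terminal))
```

An empty list means the terminal matches the required settings, so the console connection problem lies elsewhere, for example in the cable.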
4. Backplane fault
Every module of the switch plugs into the backplane. If the environment is damp and moisture short-circuits the circuit board, or if components are damaged by high temperature, lightning, or similar causes, the board will not work properly. For example, poor heat dissipation or a high ambient temperature raises the temperature inside the switch until components burn out.
If the external power supply is normal but the internal modules of the switch cannot work, the backplane may be faulty. In this case even an experienced repair engineer may not be able to fix it. The only remedy is to replace the backplane.
5. Cable faults
Strictly speaking, these are not faults of the switch itself. In practice, however, cable faults often cause the switch system or a port to behave abnormally, so they are classified here with switch hardware faults. Examples include connectors that are not seated tightly, wires arranged in the wrong or a non-standard order when the cable was made, a crossover cable used where a straight-through cable is required (or the reverse), the two fibers of a fiber pair swapped, and incorrect connections that create network loops.
As the hardware faults above show, a poor equipment room environment easily leads to all kinds of hardware faults. When building an equipment room, therefore, complete the lightning protection and grounding, the power supply, and the facilities for electromagnetic interference protection, static protection, and temperature and humidity control, so that network equipment has a good environment in which to operate.
Switch software fault
A switch software fault is a fault in the switch's system software or its configuration.
1. System Error
A switch is a combination of hardware and software. Inside the switch is a rewritable read-only memory (flash) that stores the system software the switch needs. Like Windows or Linux, this software may contain flaws left over from its design. Under the right conditions, these flaws can cause symptoms such as the switch running at full load, packet loss, or corrupted packets. For this reason, switch systems provide methods such as Web and TFTP for downloading and updating the system. Errors can, of course, also occur during a system upgrade.
To deal with such problems, make a habit of checking the device manufacturer's website regularly. When a new system version or patch is released, apply it promptly.
2. Improper configuration
Beginners may not be familiar with switches, and configuration differs from one switch to another, so administrators often make configuration errors. Examples include a network outage caused by incorrect VLAN assignment, ports disabled by mistake, and mismatched mode settings between the switch and a network card (NIC). Such faults can be hard to find and take some experience to diagnose. If you are not sure whether your configuration is correct, restore the factory default configuration and then configure the switch step by step. It is best to read the manual before configuring, a habit every network administrator should develop.
Each switch comes with a detailed installation manual and user manual that explain every module. Because most switch manuals are written in English, users who are not comfortable with English can ask the supplier's engineers for help with specific configuration.
3. Lost Password
This has probably happened to every administrator. If you forget the password, you can restore or reset the system password by following a set procedure. On some switches this is simple: just press a button on the switch. Others require a series of operations.
This situation arises only when someone forgets the password or when a switch failure causes data loss.
4. External factors
As a result of a virus or hacker attack, a host may send a large number of packets that do not conform to encapsulation rules to the port it is connected to. The switch processor then becomes too busy to forward packets in time, and buffer overflow causes packet loss. Another scenario is a broadcast storm, which consumes not only a large amount of network bandwidth but also a large amount of CPU time. If broadcast packets occupy the network for a long time, normal point-to-point communication cannot proceed, and the network becomes slow or paralyzed.
A faulty network card or port can trigger a broadcast storm. Switches split collision domains but cannot split broadcast domains (unless VLANs are configured), so once broadcast packets account for 30% of total traffic, network transmission efficiency drops significantly.
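The 30% figure above can be turned into a simple check on packet counters. The sketch below assumes you already have a broadcast packet count and a total packet count for the same interval (for example from port statistics); the function names are hypothetical, and how the counters are collected is outside its scope.

```python
BROADCAST_THRESHOLD = 0.30  # the 30% share quoted in the text


def broadcast_share(broadcast_packets: int, total_packets: int) -> float:
    """Fraction of all packets in the interval that were broadcasts."""
    if total_packets <= 0:
        return 0.0  # no traffic observed, so no broadcast share to report
    return broadcast_packets / total_packets


def is_broadcast_heavy(broadcast_packets: int, total_packets: int,
                       threshold: float = BROADCAST_THRESHOLD) -> bool:
    """True when the broadcast share reaches the threshold."""
    return broadcast_share(broadcast_packets, total_packets) >= threshold


# Example counters for one sampling interval (made-up numbers).
print(is_broadcast_heavy(3500, 10000))  # 35% of traffic is broadcast
print(is_broadcast_heavy(500, 10000))   # 5% of traffic is broadcast
```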
Software faults are harder to find than hardware faults, although once found they may not take long to fix. It is best to keep a log as part of your daily work. When a fault occurs, record the symptoms, the analysis, and the solution, and summarize the fault by category so that you build up your own experience. For example, a configuration mistake may have no visible effect at first, or may simply go unnoticed, and only show itself several days later. With a log, you can quickly connect the fault to the configuration changes made in the preceding days. Without one, this link is easy to overlook: people assume the problem lies elsewhere and find the real cause only after many detours. Keeping logs and maintenance records is therefore essential.
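A minimal sketch of such a log follows. It assumes each entry records a date, a kind ("config-change" or "fault"), and a note; these field names, the sample entries, and the seven-day window are illustrative choices, not part of the original text.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class LogEntry:
    day: date
    kind: str  # "config-change" or "fault"
    note: str


def recent_config_changes(log: list[LogEntry], fault_day: date,
                          window_days: int = 7) -> list[LogEntry]:
    """Configuration changes made within window_days before the fault."""
    start = fault_day - timedelta(days=window_days)
    return [
        entry for entry in log
        if entry.kind == "config-change" and start <= entry.day <= fault_day
    ]


# Made-up log entries for illustration.
log = [
    LogEntry(date(2024, 3, 1), "config-change", "firmware updated"),
    LogEntry(date(2024, 3, 18), "config-change", "moved port 12 to VLAN 30"),
    LogEntry(date(2024, 3, 21), "fault", "users on port 12 lost connectivity"),
]
for entry in recent_config_changes(log, fault_day=date(2024, 3, 21)):
    print(entry.day, entry.note)
```

Listing the changes that precede a fault gives the first candidates to examine when a delayed configuration problem surfaces.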
Troubleshooting steps for switch faults
Switch faults are diverse, and different faults show different symptoms. When analyzing a fault, apply the troubleshooting methods flexibly so that you can locate the fault and eliminate it promptly.
1. Troubleshooting principles
To keep troubleshooting orderly, analyze faults according to the following principles.
1) From far to near
A switch fault is usually discovered from a connected computer, so checking normally starts at the client. Follow the route "client computer > port module > horizontal cable > jumper > switch" and check each item in turn, ruling out the points farthest from the switch first.
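The far-to-near order can be sketched as a loop over the route quoted above that stops at the first segment whose check fails. The check functions are placeholders supplied by the caller; how each segment is actually tested (a ping, a cable tester, a visual inspection) is an assumption left outside this sketch.

```python
from typing import Callable, Optional

# Route from the text, ordered from the client toward the switch.
CHECK_ORDER = ["client computer", "port module", "horizontal cable", "jumper", "switch"]


def first_faulty_segment(checks: dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the checks far-to-near; return the first segment that fails, if any."""
    for segment in CHECK_ORDER:
        check = checks.get(segment)
        if check is not None and not check():
            return segment
    return None


# Placeholder checks: each returns True when the segment is healthy.
checks = {
    "client computer": lambda: True,
    "port module": lambda: True,
    "horizontal cable": lambda: False,  # pretend the cable test failed
    "jumper": lambda: True,
    "switch": lambda: True,
}
print(first_faulty_segment(checks))
```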
2) From outside to inside
If the switch has a fault, start with the external indicators and use what they show to infer whether internal components are faulty. For example, a green Power LED means the power supply is normal, and an unlit Power LED means there is no power. A yellow Link LED indicates that the connection is running at 10 Mb/s, and a green one indicates 100 Mb/s. If the Link LED is off, there is no connection; if it is blinking, the port has been manually disabled by the administrator. The Rdp LED indicates the redundant power supply, and the Mgmt LED indicates the management module. LED colors and meanings differ between models, so confirm them in the switch manual. Whether or not the fault can be located from the outside, you must log in to the switch to determine the specific fault and take the appropriate measures.
3) From soft to hard
When a fault occurs, do not start by reaching for a screwdriver and taking the switch apart. Check the system configuration and system software first. Only if the problem cannot be solved in software should you conclude that the hardware is at fault. For example, if a port does not work, first check whether the port is in the wrong VLAN, whether another administrator has disabled it, or whether some other configuration issue is responsible. Once system and configuration causes have been ruled out, the real problem is probably a hardware fault.
4) First easy, then difficult
When analyzing a complex fault, start with the simplest operations and configuration checks. This speeds up troubleshooting and improves efficiency.
2. Select an appropriate method
1) Exclusion
When analyzing the symptoms of a fault, you can use the exclusion method to determine where the fault lies. Based on the observed symptoms, list every possible cause you can think of, then examine and rule out the causes one by one, working from simple to complex to improve efficiency. This method can be applied to all kinds of faults, but it requires maintenance staff with strong logical thinking and a thorough knowledge of switches.
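The exclusion method can be sketched as follows: list the candidate causes, order them from simple to complex, and test each until one is confirmed. The effort scores and test functions are hypothetical placeholders for whatever check confirms or rules out each cause.

```python
from typing import Callable, NamedTuple, Optional


class Candidate(NamedTuple):
    cause: str
    effort: int                   # lower means simpler to check
    is_cause: Callable[[], bool]  # returns True when this cause is confirmed


def find_cause_by_exclusion(
    candidates: list[Candidate],
) -> tuple[Optional[str], list[str]]:
    """Check candidates from simple to complex; return (confirmed cause, excluded causes)."""
    excluded = []
    for candidate in sorted(candidates, key=lambda c: c.effort):
        if candidate.is_cause():
            return candidate.cause, excluded
        excluded.append(candidate.cause)
    return None, excluded


# Made-up candidates for a port that does not work.
candidates = [
    Candidate("backplane damaged", 3, lambda: False),
    Candidate("port disabled by an administrator", 1, lambda: False),
    Candidate("port assigned to the wrong VLAN", 2, lambda: True),
]
cause, excluded = find_cause_by_exclusion(candidates)
print(cause, excluded)
```

If no candidate is confirmed, the returned cause is None and the list of candidates needs to be extended.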
2) Comparison
The comparison method uses a working switch of the same model as a reference and compares it with the faulty switch to find the fault. The method is simple and effective, especially for system configuration faults, where a simple comparison reveals the configuration differences. The drawback is that a switch of the same model with the same configuration is not always easy to find.
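For configuration faults, the comparison can be done on exported configuration text. The sketch below uses Python's standard difflib module to list lines that differ between a reference configuration and the suspect one. The configuration lines are made-up placeholders, not the syntax of any real switch, and the function name is hypothetical.

```python
import difflib


def config_differences(reference: str, suspect: str) -> list[str]:
    """Return lines only in the reference ('- ') or only in the suspect ('+ ')."""
    diff = difflib.ndiff(reference.splitlines(), suspect.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Placeholder configuration text for two switches of the same model.
reference = "port 1 vlan 10\nport 1 duplex full\nport 2 vlan 20"
suspect = "port 1 vlan 10\nport 1 duplex half\nport 2 vlan 20"

for line in config_differences(reference, suspect):
    print(line)
```

Lines prefixed with "- " exist only on the working switch and lines prefixed with "+ " only on the faulty one, which narrows the search to the settings that differ.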
3) Replacement
This is the most commonly used method in network maintenance. Replacement means swapping a suspect part for a known-good switch component in order to locate the fault. It is mainly used to diagnose hardware faults. Note that the replacement part must come from a switch of the same brand and model.