FAQ for Network loop in switch

Source: Internet
Author: User
Tags switches network troubleshooting port number

An improper port connection between switches on an Ethernet network can create a network loop. If the switches involved do not have the Spanning Tree Protocol (STP) enabled, the loop causes frames to be forwarded endlessly, creating a broadcast storm that brings the network down.





One day, the performance monitoring platform of our campus network reported a problem with one VLAN: the connection between a building's access switch and the campus network had been interrupted. We checked the aggregation (convergence) switch in the network center and measured a very large amount of inbound traffic on its 100BASE-FX port but very little outbound traffic, which looked abnormal. The aggregation switch itself, however, seemed to be performing normally and did not appear to be the problem. We therefore mirrored the abnormal port on the aggregation switch and captured packets with the protocol analysis tool Sniffer, which at peak caught more than 100,000 packets per second. A simple analysis of these packets revealed some common features.





At that time we were anxious to restore the network as soon as possible and did not delve into the characteristics of these packets. At first glance the traffic looked like a SYN flood attack of unknown origin, which we guessed was caused by a new network virus, so we immediately disabled the port on the aggregation switch so as to avoid the ...





Troubleshooting





To be able to test connectivity on site, in the network center we took the multimode fiber pigtail leading to the building's access switch and connected it, through a media converter and a twisted-pair cable, to a PC that simulated the gateway of the problem VLAN. We then went to the site and found the building's network administrator, hoping he could help us find and isolate the hosts infected with the unknown virus as quickly as possible. According to him, the network had been normal the day before. However, a department in the building had been adjusting its network at that time, and today the network was found not to work; he did not know whether the two were related. We thought that adjusting the network should have nothing to do with a virus infection.

In the building's main wiring closet, we unplugged the cables from the access switch, connected a laptop, and tested against the test host in the network center, confirming that the link itself was fine. We then plugged the cables back in half at a time: if the test still passed we continued with the remaining cables, otherwise we switched to the other half, gradually reducing the number of suspect cables. We ended up with one cable that caused the problem: whenever it was plugged in, the building network lost its connection to the simulated gateway. The building's network administrator identified it as the cable to the department that had made the adjustment the day before. He also said that two cables had previously been run to that department, so there should be another one, and he personally found it on the switch. With either one of the two cables plugged in on its own, the network was fine, but as soon as both were plugged in, the problem appeared. Could plugging two cables into one switch at the same time really activate a network virus's SYN flood attack?

At this point we felt that the phenomenon looked more like a network loop. We went to that department and found three unmanaged switches, all daisy-chained together, with two of them connected to the access switch via the two cables, which formed the loop. Clearly the people doing the work did not understand the network topology, the building's network administrator was away at the time, and they connected the wrong cable on their own initiative, causing the incident. Once the cause was found the fix was simple: unplugging one of the two cables restored connectivity. After some setbacks the network returned to normal, but we kept wondering: what had interfered with our judgment?
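
The halving procedure used on the cables is essentially a binary search. A minimal sketch in Python, assuming a hypothetical `link_ok` callback that reports whether the test host is reachable with a given set of cables plugged in (the port names and the looped pair are made up for illustration):

```python
from typing import Callable, List, Set


def find_culprit(cables: List[str], link_ok: Callable[[Set[str]], bool]) -> str:
    """Return one cable whose insertion breaks the link.

    Assumes the link works with no cables plugged in and fails with all of them.
    """
    plugged: Set[str] = set()      # cables known to be safe to leave plugged in
    candidates = list(cables)      # cables that still contain the culprit
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if link_ok(plugged | set(half)):
            # Still fine: keep this half plugged in and look at the rest.
            plugged |= set(half)
            candidates = candidates[len(half):]
        else:
            # The problem appeared: the culprit is within this half.
            candidates = half
    return candidates[0]


# Hypothetical example: 24 cables, where port7 and port19 together form a loop.
cables = [f"port{i}" for i in range(1, 25)]
loop_pair = {"port7", "port19"}


def link_ok(plugged: Set[str]) -> bool:
    # The link fails only when both cables of the looped pair are plugged in.
    return not loop_pair <= plugged


print(find_culprit(cables, link_ok))  # port19
```

Each test halves the set of suspects, so 24 cables need at most 5 tests instead of up to 24. As in the incident, the search reports only one cable of the looped pair; the other still has to be traced separately.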





Fault Analysis





This was a typical network loop fault, yet although we captured a great many packets with the protocol analysis tool Sniffer, our first analysis did not reveal the problem. Clearly, the first glimpse of a large number of SYN packets misled us into assuming a SYN flood attack. Afterwards we reviewed the troubleshooting process and carefully analyzed the captured packets in order to explain the five common characteristics of the packets mentioned earlier, so that we can respond correctly and promptly to similar problems in the future.

First, the first four features. The aggregation switch is a network layer device, and the network layer interface of the building's VLAN is configured on it. To enforce network management policy, every IP address, whether registered or not, has been bound to a MAC address. A TCP connection is established with a three-way handshake. The SYN segment that initiates the connection here is 28 bytes long; adding the 14-byte Ethernet frame header and the 20-byte IP header gives the 62-byte frame length captured by Sniffer (which excludes the 4-byte FCS error detection field). The unicast frame that happened to be destined for the VLAN at the time carried a TCP request packet from the external network. Following the Ethernet bridge forwarding mechanism, once the frame passes the CRC check, the aggregation switch, because of the static ARP configuration, replaces the source MAC address of the unicast frame with its own MAC address and the destination MAC address with the one given by the binding parameters, recalculates the CRC value, updates the FCS field, and then forwards the re-encapsulated frame to the building's access switch.
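
The frame-length arithmetic and the re-encapsulation step can be sketched in Python. This is only an illustration under stated assumptions: the MAC addresses are placeholders from the documentation range, the IP header and TCP segment are zero-filled stand-ins of the right length, and `zlib.crc32` is used because it computes the same CRC-32 as the Ethernet FCS. A real network layer device also decrements the TTL and updates the IP header checksum, which is omitted here.

```python
import struct
import zlib


def mac(text: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


def with_fcs(frame: bytes) -> bytes:
    """Append the 4-byte FCS (CRC-32, least significant byte first)."""
    return frame + struct.pack("<I", zlib.crc32(frame) & 0xFFFFFFFF)


def rewrite_macs(frame: bytes, new_dst: bytes, new_src: bytes) -> bytes:
    """Replace destination and source MAC; the rest of the frame is kept."""
    return new_dst + new_src + frame[12:]


ETHERTYPE_IPV4 = b"\x08\x00"
ip_and_tcp = bytes(20 + 28)  # placeholder: 20-byte IP header + 28-byte SYN segment

# Hypothetical addresses: frame as it arrives at the aggregation switch.
incoming = mac("00:00:5e:00:53:01") + mac("00:00:5e:00:53:02") + ETHERTYPE_IPV4 + ip_and_tcp

# Frame as it leaves toward the building: source becomes the switch's own MAC,
# destination becomes the MAC bound to the target IP address.
outgoing = rewrite_macs(
    incoming,
    new_dst=mac("00:00:5e:00:53:aa"),  # bound host MAC (assumed)
    new_src=mac("00:00:5e:00:53:ff"),  # aggregation switch MAC (assumed)
)

print(len(outgoing), len(with_fcs(outgoing)))  # 62 66
```

The 62 bytes match the length reported by the capture tool, and the 66 bytes are the same frame with its FCS included.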





Now look at the last feature. A bridge is a store-and-forward device that connects similar LANs. A bridge monitors every data frame arriving on all of its ports and uses its bridge table as the basis for forwarding. The bridge table is a list of "MAC address-port number" entries, each giving the port through which that MAC address is reached; it is refreshed using the source MAC address of each received frame and the port on which the frame arrived. A bridge uses the table as follows: when it receives a data frame on a port, it refreshes the bridge table and then looks up the frame's destination MAC address in the table. If the address is found, the frame is forwarded out of the corresponding port (if that port is the same as the receiving port, the frame is discarded).





If it is not found, the frame is forwarded out of every port other than the receiving port, that is, it is flooded like a broadcast. Assume that throughout the forwarding process bridges A, B, C, and D never find the destination MAC address of the frame in their bridge tables, so no bridge knows which port to forward the frame from. When bridge A receives a unicast frame from the upstream network on its uplink port, it floods the frame. Bridges B and C flood it in turn when they receive it. Bridge D receives the frame from both B and C and passes the copies on through C and B respectively, back to bridge A, so bridge A now receives two copies of the unicast frame. During this circular forwarding, bridge A keeps receiving the same frame on different ports (the uplink port is not involved at this stage), and because the receiving port keeps changing, the "source MAC-port number" entry in its bridge table keeps changing as well. Since we have assumed that the bridge tables do not contain the frame's destination MAC address, bridge A, on receiving the two copies, can only flood each of them again out of every port other than the receiving port, so the frame is also forwarded to the uplink port.
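
The learning and forwarding rules described above can be sketched as a small Python class. This is an illustrative model only: the port names and the two MAC placeholders are assumptions, and ageing of table entries is left out.

```python
from typing import Dict, List


class Bridge:
    """Minimal model of a learning bridge (no ageing, no STP)."""

    def __init__(self, name: str, ports: List[str]) -> None:
        self.name = name
        self.ports = ports
        self.table: Dict[str, str] = {}  # MAC address -> port number

    def receive(self, in_port: str, src: str, dst: str) -> List[str]:
        """Learn the source, then return the ports the frame is sent out of."""
        self.table[src] = in_port          # refresh the bridge table
        out_port = self.table.get(dst)
        if out_port is None:               # unknown destination: flood
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:            # destination is on the receiving side
            return []                      # discard
        return [out_port]


# Bridge A from the text: one uplink and one port toward each of B and C.
a = Bridge("A", ["uplink", "to_B", "to_C"])
GATEWAY, HOST = "gateway-mac", "host-mac"  # placeholder addresses

print(a.receive("uplink", GATEWAY, HOST))  # ['to_B', 'to_C']
# The two copies come back around the loop on the other ports:
print(a.receive("to_C", GATEWAY, HOST))    # ['uplink', 'to_B']
print(a.table[GATEWAY])                    # to_C
print(a.receive("to_B", GATEWAY, HOST))    # ['uplink', 'to_C']
print(a.table[GATEWAY])                    # to_B
```

The table entry for the source address moves from port to port as the copies return, and every returning copy is also sent out of the uplink, which is the behavior the paragraph describes.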





For each such unicast frame, bridge A repeats the process described above. In theory, after one round of flooding it receives 2^1 frames, after two rounds 2^2 frames, ..., and after the Nth round 2^N frames. In short, with bridge A forwarding in this way a broadcast storm forms very quickly, and this single unicast frame eventually consumes the bandwidth of the 100BASE-X port. Although during this period many frames on the uplink port collide with each other and become incomplete, so that Sniffer cannot capture them, the number of repetitions of this unicast frame is still very large. We re-examined the captured packets and found that almost all of them carried duplicate flags, which we had not noticed at the time. For 64-byte frames, the line-rate forwarding of a 100BASE-FX port on an Ethernet switch is up to about 148,800 pps. In this network loop state, it is entirely possible for Sniffer to capture more than 100,000 66-byte packets per second.
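
As a rough check of the figures in this paragraph, the theoretical frame rate of a 100 Mbit/s port can be computed from the frame size. The sketch below assumes the standard Ethernet per-frame overhead of an 8-byte preamble (including the start frame delimiter) and a 12-byte inter-frame gap, which the text does not state explicitly.

```python
PREAMBLE_SFD_BYTES = 8     # preamble + start frame delimiter (assumed overhead)
INTERFRAME_GAP_BYTES = 12  # minimum idle time between frames, in byte times


def line_rate_pps(link_bps: int, frame_bytes: int) -> float:
    """Maximum frames per second; frame_bytes includes the 4-byte FCS."""
    bits_per_frame = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_per_frame


print(round(line_rate_pps(100_000_000, 64)))  # 148810
print(round(line_rate_pps(100_000_000, 66)))  # 145349
```

A capture rate of more than 100,000 packets per second is therefore within what a saturated 100 Mbit/s port can deliver for 66-byte frames.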





To summarize: because the destination MAC address of the packet was not in the bridge tables of the four switches, after the aggregation switch in the upstream network sent a TCP request packet to the building, it kept receiving copies of that packet forwarded back by the building's access switch, in very large numbers (forming the heavy traffic). It did not, however, send the received copies back to the Internet. Network applications are based on a request/response model, and end-to-end communication is possible only when both the send and receive channels are clear. Once one channel is blocked, the application terminates because it cannot communicate. After the application ends, the requester generally does not automatically resend the request for it. As a result, in the loop state one channel carries heavy traffic while the other carries almost none. Because VLANs isolate broadcast domains, this heavy traffic does not cross the network layer, so it puts no significant pressure on the aggregation switch. In fact, because this kind of loop is a data link layer fault that involves only the source and destination MAC addresses, a frame carrying any type of higher-layer packet can cause the broadcast storm. In other words, Sniffer could have captured any kind of packet.
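
One way to tell a loop from a genuine SYN flood in a capture is to measure how many frames are exact copies of each other: a looped frame is repeated byte for byte, whereas flood traffic normally varies from packet to packet. A minimal Python sketch, assuming the frames are already available as raw `bytes` (the sample data below is synthetic, not taken from the incident):

```python
from collections import Counter
from typing import Iterable


def duplicate_ratio(frames: Iterable[bytes]) -> float:
    """Fraction of frames that are byte-identical copies of another frame."""
    counts = Counter(frames)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - len(counts)) / total


# Synthetic captures: one 62-byte frame repeated 1000 times (loop-like),
# and 1000 frames that all differ in their first 4 bytes (flood-like).
looped = [bytes(62)] * 1000
varied = [i.to_bytes(4, "big") + bytes(58) for i in range(1000)]

print(duplicate_ratio(looped))  # 0.999
print(duplicate_ratio(varied))  # 0.0
```

A ratio close to 1 points to the same frame circulating, which is a data link layer symptom rather than an attack.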





Fault Prevention





The access layer of the campus network is the user-facing network interface. It contains many uncontrollable factors and the situation is complex, so it should be managed by a dedicated person, and the equipment itself should also provide reliability guarantees. The building's access switch is a managed switch with STP support; the other switches are unmanaged and have no STP support. Had STP been configured on the access switch in the first place, this incident would have been entirely avoidable, but for some reason it had not been, and we could only make amends after the fact. Moreover, even with STP enabled on the access switch, a loop that forms in the downstream network for some reason can still generate a broadcast storm that affects the VLAN in the upstream network, so the access switch should also have broadcast suppression enabled so that the impact stays confined to the local scope. The same requirements apply to the switches in the downstream network; it is only a matter of cost. In short, technology and experience matter in network troubleshooting, but maintaining standard network connections day to day and implementing basic preventive measures matter more.
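
The idea behind broadcast suppression can be sketched as a per-port rate limit. The Python model below is a simplification under stated assumptions: real switches implement this in hardware with vendor-specific thresholds, and the class name, the one-second window, and the 1000 frames per second limit are all hypothetical.

```python
class StormControl:
    """Fixed-window limiter for flooded frames on one port (illustrative)."""

    def __init__(self, limit_pps: int) -> None:
        self.limit_pps = limit_pps
        self.window = -1  # index of the current one-second window
        self.count = 0    # flooded frames seen in that window

    def allow(self, timestamp: float) -> bool:
        """Return True if a flooded frame arriving at timestamp may pass."""
        window = int(timestamp)
        if window != self.window:  # a new second starts: reset the counter
            self.window = window
            self.count = 0
        self.count += 1
        return self.count <= self.limit_pps


port = StormControl(limit_pps=1000)

# Second 0: a storm of 100,000 flooded frames; only the first 1000 pass.
passed_storm = sum(port.allow(i / 100_000) for i in range(100_000))
# Second 1: 500 ordinary broadcasts; all of them pass.
passed_normal = sum(port.allow(1.0 + i / 1000) for i in range(500))

print(passed_storm, passed_normal)  # 1000 500
```

With such a limit on the access switch, a loop in the downstream network still floods its own segment, but only a bounded number of frames per second reaches the upstream VLAN.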

