Good Network Management and Design Help Solve Network Problems

Solving network problems is not easy. Packet loss, oversubscription, security patches, and software version control are the stuff of network engineers' nightmares. However, many IT professionals find that good network design and management can alleviate these problems.

Patrick Miller, architecture and desktop services manager at Apex Tool Group LLC, remembers a problem that came up often years ago when he was troubleshooting Token Ring networks, and that is still common in many enterprise networks. "I ran into a situation where, every night, the UPS (uninterruptible power supply) outside our factory would lose power. No one could explain why," Miller said, "so I went out with a sniffer and a laptop, ran ping and traceroute, and finally checked the cabling."

In the end, Miller found that the power plug of a controlled access unit was being pulled: a cleaner unplugged it every night so that she could run the vacuum cleaner, which took the entire network down. "Strange things like this happen from time to time," he said. "Sometimes you deploy a $10,000 device to try to solve the problem, and sometimes you only need to trace a cable. Packet loss is a completely different matter. Packet loss is one of the strangest problems to chase. Sometimes you cannot find a solution."

Solutions to network problems are hard to find, and network engineers want to reduce the time they spend looking for them. Unfortunately, many companies still have a long way to go. MyITassessment.com is a software-as-a-service infrastructure assessment vendor that helps large systems integrators evaluate customer networks. The company has compiled telling statistics from its scans of more than 2,000 enterprise networks:

1. Packet loss occurs on Layer 3 devices in 63% of enterprise networks.

2. In 35% of networks, oversubscribed switches cause performance problems.

3. 44% of enterprise switches and routers have unpatched security vulnerabilities.

4. More than 75% of enterprises run different versions of IOS on devices of the same product series.

5. In 54% of networks, switches and routers are no longer supported by their vendors.

These problems persist, and many network professionals are actively looking for ways to deal with them.

Solve packet loss and oversubscription problems

Engineers may never completely eliminate packet loss in their networks, but rigorous monitoring and better network design can help ease the problem. Forrest Schroth, network manager at Randstad, a global human resources company, oversees 300 sites on the company's Multiprotocol Label Switching (MPLS) cloud. He closely monitors four metrics to prevent packet loss.

"I usually look for jitter and malformed packets. These can be a carrier problem, or they can be caused by a bad interface card on our side. I want to make sure utilization does not cross certain thresholds, because that can increase latency," Schroth says. "When I come to work in the morning, I have a chart that shows me the status of all sites, and which sites have the most errors, the most jitter, and the highest utilization. We do traffic shaping. When errors occur, we call all the carriers that interface with us, and the carriers that sit between our provider's edge and our customer edge, to find the source of the errors. This is usually the day-to-day work of an engineer."
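The morning report Schroth describes can be approximated with a few lines of code. The following is only a minimal sketch: the site names, the sample numbers, and the threshold values are all invented for illustration, and a real report would pull its data from a monitoring system rather than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    name: str
    jitter_ms: float
    errors: int              # interface errors seen in the reporting period
    utilization_pct: float


# Hypothetical thresholds; real values depend on the links and applications.
THRESHOLDS = {"jitter_ms": 30.0, "errors": 100, "utilization_pct": 80.0}


def worst_sites(sites, metric, top_n=3):
    """Return the top_n sites with the highest value of the given metric."""
    return sorted(sites, key=lambda s: getattr(s, metric), reverse=True)[:top_n]


def violations(site):
    """List the metrics on which a site exceeds its threshold."""
    return [m for m, limit in THRESHOLDS.items() if getattr(site, m) > limit]


# Invented sample data for three sites.
sites = [
    SiteStats("Atlanta", 12.0, 4, 55.0),
    SiteStats("Boston", 41.5, 230, 62.0),
    SiteStats("Chicago", 8.0, 15, 91.0),
]

for metric in THRESHOLDS:
    print(f"highest {metric}: {worst_sites(sites, metric, top_n=1)[0].name}")

for site in sites:
    flagged = violations(site)
    if flagged:
        print(f"{site.name} exceeds thresholds for: {', '.join(flagged)}")
```

Ranking and threshold checks are kept separate on purpose: the ranking shows where to look first, while the thresholds decide whether anything needs attention at all.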

Rich Siedzik, director of computer and telecommunications services at Bryant University, said that tracing packet loss on the LAN is difficult. "For us, it usually goes like this: you start to see service degradation or user complaints, you discover packet loss, and then you start to trace the problem. That is very difficult because there are so many network segments and so many different paths," Siedzik says. "It is almost impossible to run a detection tool on every path. So when we check the segments, we give priority to some of them, such as the segments from the core to the distribution layer. By the time you reach the access layer, there is only a small amount of monitoring, because there are so many more points to monitor."

In many cases, packet loss is caused by bad cables or ports. Sometimes it is caused by poor design. The biggest design mistake network engineers make is focusing on bandwidth rather than on the ability of switches to process packets. Randstad's Schroth said: "Just because it is a gigabit interface does not mean it will take all of that traffic. I am more interested in the rate at which a device can accept traffic, that is, packets received per second. I have seen many people move to 10 Gigabit, which is fine, but you need to make sure the device can run at line rate."
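To make the packets-per-second point concrete, the short calculation below estimates how many frames per second a device must handle to keep an Ethernet link full. It is a sketch that assumes standard Ethernet framing, where each frame carries 20 extra bytes on the wire (preamble, start-of-frame delimiter, and inter-frame gap); the function name is made up for this example.

```python
# Per-frame overhead on the wire for standard Ethernet:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Packets per second needed to fill a link with frames of a given size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


for label, bps in (("1 Gbps", 1e9), ("10 Gbps", 1e10)):
    for size in (64, 1518):
        pps = line_rate_pps(bps, size)
        print(f"{label}, {size}-byte frames: {pps:,.0f} pps")
```

With minimum-size 64-byte frames, a 1 Gbps link carries about 1.49 million packets per second and a 10 Gbps link about 14.9 million. A device that cannot forward at that rate will drop packets long before the nominal bandwidth is used up.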

Jeremy Littlejohn, CEO and chief analyst of myITassessment.com, agreed. Too many engineers throw bandwidth at a problem rather than digging into its root cause. "Somehow, bandwidth has become the primary indicator of everything, and that is not a good thing," he said. "Engineers should focus on packet loss and check whether it is caused by a lack of bandwidth or by something else."

Oversubscription of switches and routers is another troublesome source of network bottlenecks. Sometimes it results from poor management of a single device. Some enterprises do not track the backplane capacity of their modular switches and routers and install more bandwidth on the line cards than the backplane can carry.

"Eight ports may all share one oversubscribed ASIC (application-specific integrated circuit) on the backplane. When we add all the virtual machines, this becomes an invisible killer," Littlejohn said. "When we plug them into these ports, we think we have 8 Gbps, but we actually only have 1 Gbps."
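The arithmetic behind Littlejohn's example is simple enough to write down. The sketch below computes the ratio for the case in the quote, eight gigabit ports sharing 1 Gbps of capacity (the over-configuration, or oversubscription, he describes); the function and variable names are invented for illustration.

```python
def oversubscription_ratio(port_speeds_gbps, shared_capacity_gbps):
    """Ratio of total front-panel bandwidth to the capacity behind the ports."""
    return sum(port_speeds_gbps) / shared_capacity_gbps


ports = [1.0] * 8          # eight 1 Gbps ports, as in the quote above
shared_gbps = 1.0          # capacity shared by the whole port group

ratio = oversubscription_ratio(ports, shared_gbps)
per_port_worst_case = shared_gbps / len(ports)

print(f"oversubscription {ratio:.0f}:1")
print(f"{per_port_worst_case * 1000:.0f} Mbps per port if all ports are busy")
```

The same calculation applies one level up, comparing the total line-card bandwidth in a modular chassis against its backplane capacity.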

Even if you carefully avoid oversubscribing devices, bottlenecks will persist. "I may increase the bandwidth here, but that only means this link is no longer the constraint. The bottleneck is still there somewhere," Schroth said. "The question is: is it the application? The different WAN connections? The switch port link? There will always be a slowest point somewhere." Given this inevitability, network professionals need to predict where the next bottleneck will appear and make sure that point is monitored. "Any connection will have a slowest link. It is critical to pay close attention to that slowest link."
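The idea that a path is only as fast as its slowest link, and that upgrading that link simply moves the bottleneck, can be shown with a toy model. The hop names and capacities below are invented for illustration and do not describe any network mentioned in this article.

```python
# Hypothetical end-to-end path; capacities in Mbps are illustrative only.
path = [
    ("access switch port", 1000),
    ("distribution uplink", 10000),
    ("WAN circuit", 100),
    ("data center uplink", 10000),
]


def bottleneck(hops):
    """Return the (name, capacity) of the slowest hop on the path."""
    return min(hops, key=lambda hop: hop[1])


name, mbps = bottleneck(path)
print(f"bottleneck: {name} at {mbps} Mbps")

# Upgrading the slowest hop does not remove the bottleneck; it moves it.
upgraded = [(n, 10000 if n == name else c) for n, c in path]
new_name, new_mbps = bottleneck(upgraded)
print(f"after upgrade: {new_name} at {new_mbps} Mbps")
```

In this model the WAN circuit is the first constraint, and once it is upgraded the access port becomes the next one, which is the point to start monitoring.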

Track OS versions, security patches, and switch and router lifespans

The other challenges identified by myITassessment.com are operating system versions, security patches, and device lifespan. These are really all asset management issues, and they affect an enterprise's ability to expand and automate its network. "If enterprises do not hold their assets to certain standards, it can affect their ability to support expansion," Littlejohn said. "When enterprises attempt to automate operations, a single difference, such as a different OS version, can keep the automation from running effectively."

Siedzik says tools help Bryant University keep better track of these issues, including Cisco Network Collector (CNC), an asset tracking tool typically used by Cisco value-added resellers. "It shows us all the code levels, where the vulnerabilities are, what we need to fix, and what we need to upgrade," he said. "It is all presented in report form. That used to take us a lot of effort."

Before installing CNC, network administrators at Bryant University often found out that a switch had reached the end of its service life only when they called the Cisco technical support center. Siedzik said: "That was the old way. We cannot operate like that."

The university also uses CNC to standardize the IOS versions on its devices. "In the past, we had a lot of different code versions," Siedzik said. "We have a lot of stacks on our campus. Sometimes we would find one IOS version in one stack and a different one in another. If we wanted to upgrade a specific piece of code, that put us at risk. We did not know whether it would affect the upstream or downstream links. Now we have two major code updates each year."
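The kind of version drift Siedzik describes is straightforward to detect once an inventory exists. The following sketch groups devices by product series and reports any series running more than one OS version. The hostnames, series labels, and version strings are placeholders, and the code does not reflect how Cisco Network Collector works internally.

```python
from collections import defaultdict

# Hypothetical inventory rows: (hostname, product series, OS version).
inventory = [
    ("bldg-a-sw1", "3750", "version-A"),
    ("bldg-a-sw2", "3750", "version-A"),
    ("bldg-b-sw1", "3750", "version-B"),
    ("edge-rtr1", "2800", "version-C"),
]


def versions_by_series(rows):
    """Map each product series to {version: [hostnames running it]}."""
    groups = defaultdict(lambda: defaultdict(list))
    for host, series, version in rows:
        groups[series][version].append(host)
    return groups


def inconsistent_series(rows):
    """Return only the series that run more than one OS version."""
    return {
        series: dict(versions)
        for series, versions in versions_by_series(rows).items()
        if len(versions) > 1
    }


for series, versions in inconsistent_series(inventory).items():
    print(f"series {series} runs {len(versions)} different versions:")
    for version, hosts in versions.items():
        print(f"  {version}: {', '.join(hosts)}")
```

A report like this, run before each upgrade window, shows which stacks need to be brought to the common version first.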

Six years ago, Randstad installed a large greenfield network, which made it easy for Schroth to build strict version control into the network design using Cisco Prime management software. "I know if someone has changed a configuration or an operating system," he said. "We check the inventory reports often to make sure all devices are running the correct version. I do the same with hardware refreshes. We switched from Cisco 2600 series routers to Cisco 2800 series routers. When we replaced the devices, we chose an operating system that could run across the enterprise and templated it. When we do this again, we will select new software and try to find a release that is compatible and stable within the template. That way, based on the size and function of the office, we have a set of templates that determine which operating system and hardware should be used."

Good management does not necessarily mean you can standardize on a single code base, because switch and router software still has its own quirks. "There are many requirements in our network, so we know that, depending on the router model, a certain version of IOS will suit a certain requirement," said Miller of Apex Tool Group. "If I have another router that needs to run BGP (Border Gateway Protocol), I know this IOS version may have a vulnerability, so I need to use a later version. You should not use the very latest version of IOS, because new versions often have vulnerabilities that people do not know about yet." Miller uses detailed documentation to track the different code versions on his devices, but he wishes Cisco would provide better tools to help with this work, especially given the complexity of managing different OS versions for different applications.

Miller said engineers always have to balance risk against stability. Although some network engineers would like to run a single code version with the latest security patches installed to make their lives easier, that wish is unlikely to come true in a production environment. "What matters most to the business?" he said. "Is it whether my switch has a vulnerability? Or is stability the most important thing? That depends on where the switch sits."