Server Load Balancer Troubleshooting Guide (5)



Troubleshooting methodology
 
Troubleshooting methods
Put simply, troubleshooting has three steps: describe the problem, analyze the problem, and solve the problem. Of the three, analyzing the problem takes the most time, describing the problem is the step most easily neglected, and solving the problem is relatively simple.
 
Problem description
When a problem occurs, the first thing to do is collect information as quickly as possible to identify the problem and its impact. Sometimes we also need tools to run comparison tests that confirm the problem.
Generally, we can narrow down the problem by answering the following questions:
· Has the affected application been in use for some time, or is it newly released?

· Was the network or the application changed shortly before the problem occurred?

· What type of application is faulty? What is the corresponding VIP address? If possible, determine the functional modules that may be causing the problem.

· Determine the frequency of the problem. Does it occur every time, frequently, or only occasionally?

· Determine the scope of impact. Do all users have the same problem, or only a specific group of users?
 
Analyze problems
By analyzing the answers to the questions above, we can form a preliminary judgment and roughly locate the problem. We then hypothesize the possible causes and verify each hypothesis by collecting more information or by running simple tests with tools. When the test results match the hypothesis, we have probably found the cause of the problem.
This method is very simple, but it requires engineers to have a deep understanding of the relevant networking knowledge. Note that I use the word "understanding" here, because only when you really understand the basic principles can you analyze problems easily.
Here, I would like to share a "simple" training method that can help us better understand these basic networking principles.
Imagine a network with only three elements: a client PC, a switch, and a Web server. If the client accesses the server through a browser, can you picture in full detail how the data is processed on the client, the switch, and the server?
 
Simplified Version:
The client sends an HTTP request, which is forwarded to the Web server through the switch. The Web server then responds to the request.
 
This description may be the simplest, but it is of little use to a network engineer. We can refine it:
 
Simple version:
· The client first establishes a TCP connection with the server
 
· The client sends an HTTP request over the TCP connection
 
· After receiving the request, the server returns the response content to the client
 
 
This is a little more interesting. At least we now know that HTTP requests are carried over TCP. Can we refine it further? Of course.
 
TCP interaction version:
· The client first sends a TCP SYN packet to the server to request a TCP connection.

· The server returns a SYN+ACK packet to the client.

· The client sends an ACK packet to the server, and the TCP three-way handshake is complete.

· The client sends an HTTP request to the server.

· The server returns an ACK packet to the client to confirm that the request has been received.

· The server returns the response content to the client.

· The client returns an ACK packet to the server to confirm that the response content has been received.

· After the server has sent the response content, it sends a FIN (FIN1) to terminate the TCP connection.

· After the client receives FIN1, it acknowledges FIN1 with an ACK and sends its own FIN (FIN2) to close the TCP connection.

· After the server receives FIN2, it acknowledges FIN2 with an ACK, and the TCP connection is closed.
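The exchange above can be observed from application code. The following is a minimal sketch, not a definitive implementation: it starts a throwaway HTTP server on the loopback address and talks to it over a raw TCP socket. The names `HelloHandler` and `fetch_once` are made up for this illustration. The operating system performs the SYN, SYN+ACK, ACK and FIN exchanges itself, so the comments only mark where they happen.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once():
    """Run one request/response cycle and return the raw response bytes."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # create_connection() returns once the kernel has completed the
        # three-way handshake (SYN, SYN+ACK, ACK).
        with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
            request = f"GET / HTTP/1.0\r\nHost: 127.0.0.1:{port}\r\n\r\n"
            sock.sendall(request.encode("ascii"))  # the HTTP request
            chunks = []
            while True:
                data = sock.recv(4096)  # response content; ACKs are automatic
                if not data:  # empty read: the server has sent its FIN
                    break
                chunks.append(data)
        # Leaving the with-block closes the socket, which sends the client FIN.
        return b"".join(chunks)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once().decode("ascii"))
```

Running a packet capture tool on the loopback interface while this script executes should show the individual packets listed above.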
 
 
In this version, we have clearly analyzed the TCP interaction. Can we refine it further? Yes, of course. This time, we will add some analysis of the layer-2 interaction.
 
ARP interaction version:
Imagine that the client PC has just booted and is configured with a static IP address. How do the PC, the switch, and the server interact when you enter the server address in your browser?
· Using its own IP address and subnet mask, the PC determines whether the IP address of the server it wants to reach belongs to the same network segment as its own IP address.

· If it belongs to the same network segment, the PC looks in its local ARP cache for the MAC address corresponding to the server IP address.

· If no entry is found, the PC sends an ARP request as a broadcast through its NIC. The source MAC is its own, and the destination MAC is the broadcast address FF:FF:FF:FF:FF:FF.

· After receiving the ARP broadcast, the server sends a unicast ARP reply to the PC, telling the PC that it is the server the PC is looking for.

· After the PC receives the ARP reply, it stores the mapping between the IP address and the MAC address in its ARP cache.

· The PC then encapsulates the TCP SYN in a data frame whose destination MAC address is the MAC address of the server, and sends the frame to the switch.

· Whenever the switch receives a data frame, it records the mapping between the source MAC and the ingress port in its MAC address table. The next time it receives a frame for that MAC, it can look up the MAC-to-port mapping and forward the frame quickly.

· The switch forwards the TCP SYN data frame to the port to which the server is connected.

· After the server receives the TCP SYN, the three-way handshake, request, response, and connection close proceed as described above.
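Two of the decisions in the steps above, the PC's same-segment check and the switch's MAC learning, can be modelled in a few lines. This is a simplified sketch under stated assumptions: `same_segment` and `LearningSwitch` are names invented for this illustration, the addresses are examples, and a real switch also handles entry aging, VLANs, and frames whose destination sits on the ingress port.

```python
import ipaddress

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def same_segment(own_ip, netmask, target_ip):
    """Return True if target_ip is in the network defined by own_ip/netmask."""
    network = ipaddress.ip_network(f"{own_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(target_ip) in network


class LearningSwitch:
    """Toy model of a layer-2 switch that learns source MAC addresses."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC and return the list of egress ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if dst_mac == BROADCAST_MAC or out_port is None:
            # Broadcast or unknown destination: flood all other ports.
            return [p for p in self.ports if p != in_port]
        return [out_port]


if __name__ == "__main__":
    pc_mac, server_mac = "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"
    print(same_segment("192.168.1.10", "255.255.255.0", "192.168.1.20"))
    switch = LearningSwitch(ports=[1, 2, 3])
    print(switch.receive(1, pc_mac, BROADCAST_MAC))  # ARP request is flooded
    print(switch.receive(2, server_mac, pc_mac))     # ARP reply goes to the PC port
    print(switch.receive(1, pc_mac, server_mac))     # TCP SYN goes to the server port
```

The model shows why the first frame is flooded while later frames are forwarded to a single port: the table is filled only from source addresses the switch has already seen.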
 
 
Of course, this example has only three simple elements. In a complex network environment, you can likewise imagine how data leaves the PC and is processed by each device along the path. The more carefully you think it through, the more easily you will find the cause of a problem.
 
When a problem occurs, however, we need to find the cause in the shortest possible time. You can therefore simplify these interaction processes, analyze only the steps you suspect, work out how the devices involved process the traffic, and design small experiments to verify your hypothesis.
 
Solve the problem
Once the cause has been found, solving the problem is relatively simple. Sometimes, because of the current environment and conditions, we cannot solve the problem completely and have to adopt a temporary workaround instead. This has to be decided case by case.
 
Common troubleshooting principles
· Bottom-up troubleshooting
 
The bottom-up troubleshooting method follows the layered design of network protocols. It starts from the physical layer and works upward layer by layer to check for possible causes. For example, if a user reports that the server cannot be accessed, first check the cable between the server and the switch and the port status indicators at the physical layer, then check the link status on the switch and the server, and then check the ARP table, the routing table, and so on. In this way, the problem is examined layer by layer until it is located.
· Divide-and-conquer troubleshooting method
 
In some large networks, the bottom-up method may be too laborious. In that case, you can use ping to determine at which level the problem is likely to be. If the ping succeeds, there is at least no problem at or below the network layer, and the problem probably lies in an upper layer. If the ping fails, the problem probably lies at or below the network layer.
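The ping-based split described above can be scripted. The sketch below is one possible arrangement and rests on assumptions: the helper names `ping_ok`, `tcp_reachable`, and `suspect_area` are invented here, the TCP probe is an extra step that applies the same idea one layer higher, and a failed ping is only a hint because ICMP is often filtered by firewalls.

```python
import platform
import socket
import subprocess


def ping_ok(host, timeout_s=5):
    """Send one echo request with the system ping command."""
    count_flag = "-n" if platform.system().lower() == "windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # ping unavailable or hung: treat as a failed probe
    return result.returncode == 0


def tcp_reachable(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def suspect_area(ping_succeeded, tcp_succeeded):
    """Map the two probe results to the area worth investigating first."""
    if not ping_succeeded:
        return "network layer or below"
    if not tcp_succeeded:
        return "transport layer"
    return "application layer"


if __name__ == "__main__":
    host, port = "127.0.0.1", 80  # example target; replace with the VIP and port
    print(suspect_area(ping_ok(host), tcp_reachable(host, port)))
```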
· Data stream tracing method
 
For server load balancer devices, most problems are likely to come from the network layer. Using a packet capture tool to trace how data streams are processed is therefore also a very effective method for troubleshooting and analysis.
· Configuration comparison method
 
Sometimes, if you suspect that the configuration has changed, comparing it with the original backup configuration may reveal some clues.
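A comparison like this is easy to automate once both configurations are available as text. The following sketch uses Python's standard `difflib` module. The function name `changed_lines` and the sample configuration lines are hypothetical and do not follow the syntax of any particular load balancer.

```python
import difflib


def changed_lines(backup_text, current_text):
    """Return only the removed ('- ') and added ('+ ') lines between two configs."""
    diff = difflib.ndiff(backup_text.splitlines(), current_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Hypothetical configuration text, for illustration only.
    backup = "vip 10.0.0.10:80\nmember 10.0.1.11:80\nmember 10.0.1.12:80"
    current = "vip 10.0.0.10:80\nmember 10.0.1.11:80\nmember 10.0.1.13:80"
    for line in changed_lines(backup, current):
        print(line)
```

Lines prefixed with "- " exist only in the backup, and lines prefixed with "+ " exist only in the current configuration.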
· Module replacement method
 
Module replacement is often used for suspected hardware faults. For example, by replacing an optical module, we can determine whether the problem is caused by a damaged optical module.
 
In the following sections, I will use this methodology, the common troubleshooting principles, and some cases to introduce common server load balancer problems and how to troubleshoot them.
 
 
E.S.
This article is from the "ADC technology blog".
