When I am curious about a server's performance, the first thing I reach for is the "top" command. Top is not the best tool, and it does not give a long-term view, but it provides a good point-in-time snapshot of the server and tries to answer the question "What is happening right now?". Unfortunately, if you do not have a solid understanding of what the different display fields mean, the output of top is easily misunderstood.
I will not walk through the entire man page for top. When you have the time and the inclination, it will always be there waiting for you. What I want to do is point out a few key things I look at to get a quick overview of the system, and hopefully show how you can do the same. Top is my first stop in troubleshooting, but it is rarely my only one.
The first thing I look at in top is the load average, which is displayed in the upper right corner of the screen. The load average is calculated from several sampled statistics, but in general it can be thought of as the amount of work demanding CPU time. If your machine has a single-core CPU, a load average of 1 means the machine is fully loaded, with just enough capacity to complete its tasks within the sampling period. Similarly, a load average of 2 means the single-core CPU is overloaded, and two cores would be needed to complete the same work within the same sampling period. Now that machines ship with 8, 16, and 32 cores, I have to read the load average relative to the number of cores. When I need to check, I press the number "1" inside top, which lists every CPU core individually, so I can get a quick count to compare against the load.
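The comparison of load against core count can be sketched in a few lines of Python. This is only an illustration, assuming a Unix-like system where `os.getloadavg()` is available; the helper name `load_per_core` is my own and is not something top provides.

```python
import os


def load_per_core(load_average, core_count):
    """Return the load average divided by the number of CPU cores.

    A result near 1.0 suggests the machine is roughly fully loaded;
    well above 1.0 suggests more work is queued than the cores can handle.
    """
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    return load_average / core_count


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5, and 15 minute load averages.
    averages = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() may return None
    for label, value in zip(("1 min", "5 min", "15 min"), averages):
        ratio = load_per_core(value, cores)
        print(f"{label}: load {value:.2f} on {cores} cores = {ratio:.2f} per core")
```

A load of 8 on a 16-core machine works out to 0.5 per core, which is a very different situation from a load of 8 on a single core.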
The second item I check is the 9th column of the process list below the header, labeled "%CPU". The man page's explanation of this column is vague:
The task's share of the elapsed CPU time since the last screen update, expressed as a percentage of total CPU time. In a true SMP (multi-processor) environment, if 'Irix mode' is Off, top will operate in 'Solaris mode', where a task's CPU usage is divided by the total number of CPUs. You toggle 'Irix/Solaris' modes with the 'I' (uppercase letter I) interactive command.
Not clear at all, is it? The main thing to remember here is that if a single process is consuming a lot of CPU, for whatever reason, it will most likely show up in the first row of top's process list with a high %CPU number.
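The difference between the two modes comes down to one division. The sketch below is my own illustration of the arithmetic described in the man page text, not code from top itself, and the function name `irix_to_solaris` is hypothetical.

```python
def irix_to_solaris(irix_percent, cpu_count):
    """Convert an Irix-mode %CPU value to its Solaris-mode equivalent.

    In Irix mode a task can exceed 100% when it keeps more than one CPU busy.
    Solaris mode divides that figure by the total number of CPUs.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return irix_percent / cpu_count


if __name__ == "__main__":
    # A task keeping two cores busy on a four-core machine:
    # 200% in Irix mode, 50% in Solaris mode.
    print(irix_to_solaris(200.0, 4))
```

Knowing which mode top is in matters when a process shows, say, 25%: on a four-core machine in Solaris mode that is one core completely saturated.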
The next area I look at is the "Cpu(s):" line, in the middle of the header information. In particular, I am interested in %us, %sy, %id, and %wa, which are the percentages of CPU time spent on user processes, on system (kernel) processes, idle, and waiting for I/O to complete. The %wa figure in particular should be close to 0, and it deserves close attention when it rises above 5%.
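On Linux these percentages can be derived from the aggregate "cpu" line of /proc/stat by taking two samples and comparing the counters. The following is a rough sketch under that assumption; it is not how top is actually implemented, the function names are my own, and the 5% threshold is simply the rule of thumb from the paragraph above.

```python
import os
import time

# Order of the first eight counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected a line starting with 'cpu'")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percentages(before_line, after_line):
    """Return us, sy, id, and wa percentages between two samples."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    delta = {name: after.get(name, 0) - before.get(name, 0) for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {
        "us": 100.0 * delta["user"] / total,
        "sy": 100.0 * delta["system"] / total,
        "id": 100.0 * delta["idle"] / total,
        "wa": 100.0 * delta["iowait"] / total,
    }


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which holds the aggregate counters."""
    with open(path) as handle:
        return handle.readline()


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    first = read_cpu_line()
    time.sleep(1)  # sampling interval; top's default refresh is a few seconds
    second = read_cpu_line()
    result = cpu_percentages(first, second)
    print(", ".join(f"{value:.1f}%{name}" for name, value in result.items()))
    if result["wa"] > 5.0:
        print("I/O wait is above 5%, worth a closer look")
```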
Finally, I look at the system uptime, which is displayed in the upper left corner. If I have questions about a server and it turns out the server was restarted recently, the answer may be right here: perhaps a daemon did not start after the reboot.
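The same uptime figure is available on Linux from /proc/uptime, whose first field is the number of seconds since boot. Here is a small sketch that formats it; the helper name `format_uptime` and the output layout are my own choices and only approximate what top prints.

```python
import os


def format_uptime(proc_uptime_text):
    """Format the first field of /proc/uptime as days, hours, and minutes."""
    seconds = int(float(proc_uptime_text.split()[0]))
    days, remainder = divmod(seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes = remainder // 60
    return f"{days} days, {hours:02d}:{minutes:02d}"


if __name__ == "__main__" and os.path.exists("/proc/uptime"):
    with open("/proc/uptime") as handle:
        print("up", format_uptime(handle.read()))
```

An uptime of only a few minutes or hours on a server that should have been running for weeks is a strong hint about where to look next.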
These checks take only a few seconds. If I am just observing, I may let top run for several minutes and watch the processes, CPU, and load, but usually I am in and out of top quickly. Top is one of those great system administrator tools that gives you an overview of system health and lets you quickly diagnose potential problems.