I. uptime
The Uptime command displays how long the server has been running, how many login users there are, and the overall server performance evaluation (load average ). The load average value records the load of the last 1 minute, 5 minutes, and 15 minutes respectively. The load average is not a percentage, but the number of processes waiting for execution in the queue. If the process requires that the CPU time be blocked (meaning that the CPU does not have time to process it), the load average value will increase. On the other hand, if each process can immediately obtain the CPU access time, this value will be reduced.
The optimal value of load average under UP kernel is 1, which means that each process can be processed by the CPU immediately. Of course, there is no problem in lower level, which only indicates that part of the resources are wasted. However, this value varies between different systems. For example, for a single-CPU workstation, loading average of 1 or 2 is acceptable,In a multi-CPU system, this value should be divided by the number of physical CPUs. If the number of CPUs is 4 and the load average is 8 or 10, the result is more than 2 points.
You can use uptime to determine whether a performance problem occurs on the server or on the network. For example, if the Running Performance of a network application is not satisfactory, run uptime to check whether the system load is high. If not, this problem may occur on your network.
Ii. top
The Top command displays the actual CPU usage. By default, it displays information about the CPU usage tasks on the server and refreshes every 5 seconds. You can classify them in multiple ways, including PID, time, and memory usage.
The following describes the output values:
Reference PID: process ID
USER; USER Name of the process owner
PRI: Process Priority
NI: nice
SIZE: The amount of memory occupied by the process (code + Data + stack)
RSS; number of physical memory used by the Process
SHARE; the amount of memory shared by the process and other processes
STAT: Process status: S = sleep, R = running, T = stopped, D = stopped, Z = botnet
% CPU: Shared CPU usage
% MEM; shared physical memory
TIME: the CPU usage TIME of the process.
COMMAND: COMMAND Line for starting a task (including parameters)
Process Priority and nice level
The process priority is a parameter that determines the priority of processes executed by the CPU. The kernel will adjust this value as needed. Nice value is a restriction on priority. The process priority value cannot be lower than the nice value. (The lower the nice value, the higher the priority)
The process priority cannot be changed manually. Only by changing the nice value can the process priority be adjusted indirectly. If a process runs too slowly, you can assign more CPU resources to it by specifying a lower nice value. Of course, this means that some other processes will be allocated less CPU resources and run slower. Linux supports the range of nice values from 19 (low priority) to-20 (high priority). The default value is 0. If you want to change the nice value of a process to a negative value (high priority), you must use the su command to log on to the root user. Below are some command examples for adjusting nice values,
Start the program xyz with nice value-5
# Nice-n-5 xyz
Change the nice value of a running program
# Renice level pid
Change the nice value of the process with pid 2500 to 10.
# Renice 10 2500
Botnets
When a process ends, it usually takes some time to complete all the tasks (such as closing open files) before it ends. In a very short period of time, the status of this process is zombie. After the process completes all the close tasks, it will submit the information to the parent process. In some cases, a zombie process cannot close itself, and the process is in the z (zombie) state ).You cannot use the kill command to kill a zombie process because it is marked as "dead ". If you cannot get rid of a zombie process, you can kill its parent process and the zombie process disappears. However, if the parent process is an init process, you cannot kill the init process because init is an important system process. In this case, you can only restart the server once to get rid of the zombie process. Why does the application become frozen?
Iii. iostat
Iostat is part of the sysstat package. Iostat displays the average CPU time after the system is started (similar to uptime). It also displays the usage of the disk subsystem. iostat can be used to monitor CPU usage and disk usage.
CPU utilization is divided into four parts:
Reference % user: CPU usage of user level (Application)
% Nice: CPU usage of the user level with nice priority added
% Sys: CPU usage of system level (kernel)
% Idle: idle CPU resources
Disk usage includes the following parts:
Reference Device: block Device name
Tps: the number of I/O requests transmitted by the device per second ). Multiple independent I/O requests can be combined into one transmission operation because one transmission operation can have different capacities.
Blk_read/s, Blk_wrtn/s: The number of blocks read and written by the device per second. The block size may be different.
Blk_read and Blk_wrtn: Total number of Block devices read and written since the system was started.
Block Size
The block size may be different. The block size is generally 1024, 2048, or 4048 bytes. It can be obtained through tune2fs or dumpe2fs:
Reference [root @ rfgz ~] # Tune2fs-l/dev/hda1 | grep 'block size'
Block size: 4096
[Root @ rfgz ~] # Dumpe2fs-h/dev/hda1 | grep 'block size'
Dumpe2fs 1.35 (28-Feb-2004)
Block size: 4096
Iv. Vmstat
The Vmstat command monitors processes, memory, page I/O blocks, CPU, and other information. vmstat can display the average value or sample value of the detection results, the sampling mode provides monitoring results at different frequencies within a sampling period.
Note: In the sampling mode, you must consider possible errors in data collection. Setting the sampling frequency to a lower value can minimize the impact of errors.
The following describes the meaning of each column.
Reference · process (procs)
R: Number of processes waiting for running time
B: Processes in non-disruptive sleep state
W: process that is switched out but can still run. The value is calculated.
· Memoryswpd: Number of virtual memory
Free: Number of idle memory
Buff: The amount of memory used as a buffer
· Swap
Si: Number of switches from hard disk
So: Number of switches to the hard disk
· IO
Bi: number of blocks output to a block Device
Bo: number of blocks received from a device
· System
In: number of interruptions per second, including clock
Cs: Number of context switches occurring per second
· Cpu (percentage of cpu running time)
Us: Non-kernel code running time (user time, including nice time)
Sy: kernel code running time (System Time)
Id: idle time. In kernel versions earlier than Linux 2.5.41, this value includes the I/O wait time;
Wa: The time for waiting for I/O operations. In Linux versions earlier than 2.5.41, this value is 0.
The Vmstat command provides a large number of additional parameters. The following lists several useful parameters:
Reference · m: displays the memory usage of the kernel.
· A: displays the memory page information, including active and inactive memory pages.
· N: Display header lines. This parameter is useful when sampling mode is used and command results are output to a file. For example, root # vmstat-n 2 10 displays 10 output results at a frequency of 2 seconds.
· When-p {partition} is used, vmstat provides statistics on the I/O results
V. ps and pstree
Ps and pstree commands are the most common basic commands for system analysis. ps commands provide a list of running processes. The number of processes listed depends on the parameters attached to the command. For example, the ps-A command lists all processes and their corresponding process IDS (PIDS). The process PID is required before using other tools, such as pmap or renice.
On the system running the java application, the output of the ps-A command is easily beyond the display range of the screen, which makes it difficult to obtain complete information about all processes. In this case, the pstree command can be used to display all process information in a tree structure and integrate sub-process information. The Pstree command is very useful for analyzing the source of the process.
6. Numastat
With the continuous development of the NUMA architecture, such as eServer xSeries 445 and its subsequent products eServer xSeries 460, The NUMA architecture has become the mainstream of enterprise-level data centers. However, the NUMA architecture faces new challenges in terms of performance tuning. For example, the memory allocation problem is not of interest to the NUMA system before, while the Numastat command provides a tool to monitor the NUMA architecture. The Numastat command compares the local memory usage with the remote memory usage and the memory usage of each node. The Numa_miss column displays the local memory that failed to be allocated, and the numa_foreign column displays the allocated remote memory (slow access) information. Excessive calls to the remote memory will increase the system latency and affect the performance of the entire system. Enabling processes running on a node to access the local memory will greatly improve the system performance.
※The system I use does not support the NUMA architecture. This figure shows the original document.
VII. sar
The sar program is also part of the sysstat installation package. The sar command is used to collect, report, and save system information. The Sar command consists of three applications: sar, which uses and displays data; sa1 and sa2, which are used to collect and store data. By default, the system will add automatic collection and analysis operations to the crontab:
Reference [root @ rfgz ~] # Cat/etc/cron. d/sysstat
# Run system activity accounting tool every 10 minutes
*/10 * root/usr/lib/sa/sa1 1 1
# Generate a daily summary of process accounting at 23:53
53 23 *** root/usr/lib/sa/sa2-
The data generated by the sar command is saved in the/var/log/sa/directory. The data is saved by time and the corresponding performance data can be queried by time.
You can also use sar to get a real-time execution result under the command line. The collected data can include CPU utilization, Memory Page, network I/O, and so on. The following command indicates that sar is executed five times at an interval of 3 seconds:
8. free
The free command displays the usage of all the system memory, including idle memory, used memory, and swap memory. The Free command also displays cache and buffer information used by the kernel.
When using the free command, remember the linux memory structure and virtual memory management methods, such as the limit on the number of idle memory and the use of swap space does not indicate a memory bottleneck.
Useful parameters of the Free command:
Reference ·-B,-k,-m, and-g are displayed by bytes, kilobytes, megabytes, and gigabytes respectively.
·-L difference between low and high memory
·-C {count} shows the number of free outputs
9. Pmap
The pmap command shows the amount of memory used by one or more processes. You can use this tool to determine which process on the server occupies too much memory, leading to memory bottlenecks.
10. Strace
Strace intercepts and records the system call information of a process, and also includes the command signal received by the process. This is a useful diagnostic and debugging tool. The system administrator can solve program problems through strace.
Command Format. You must specify the ID of the process to be monitored. This is mostly used by developers.
Strace-p <pid>
11. ulimit
You can use ulimit to control the use of system resources. See previous logs: Use ulimit and proc to adjust system parameters
12. Mpstat
The mpstat command is also part of the sysstat package. The Mpstat command is used to monitor the status of each available CPU in a multi-CPU system. The Mpstat command can display the running status of each CPU or all CPUs, and monitor sampling results at a certain frequency using parameters like the vmstat command.
XIII. Appendix
This article captures and modifies the IBM Redbook Tuning Red Hat Enterprise Linux on IBM eServer xSeries Servers.