When a Linux system runs into a problem, we need not only to read the system logs but also to use a range of performance monitoring tools to determine which component (memory, CPU, hard disk, ...) is at fault. In Linux, runtime parameters are exposed through the virtual directory /proc. In other words, the values reported by the performance monitoring tools are ultimately derived from this directory, and when system tuning is involved, we can modify the relevant parameters under /proc. Of course, some parameters cannot be changed. Let's take a look at these common performance monitoring tools.
Tool | Function description
uptime | Average system load
dmesg | Hardware/system information
top | Process status
iostat | Average CPU and disk usage
vmstat | System running status
sar | Real-time collection of system usage status
KDE System Guard | Graphical monitoring tool
free | Memory usage
traffic-vis | Network monitoring (only available in SuSE)
pmap | Process memory usage
strace | Tracing a program's execution
ulimit | System resource usage limits
mpstat | Multi-processor usage
1. uptime
The uptime command shows how long the server has been running and how many users are logged on, and gives a quick view of the server load.
The uptime output includes the load average over the last 1, 5, and 15 minutes. The value is the number of processes waiting to be processed by the CPU. If the CPU cannot keep up with these processes, the load average rises; otherwise it falls.
The optimal load average is 1, which means that each process is handled immediately and no CPU cycles are lost. On a single-CPU machine, 1 or 2 is acceptable; on a multi-CPU machine, the load average may be between 8 and 10.
You can also use uptime to help narrow down a network performance problem. For example, if a network application performs poorly, run uptime to check whether the server load is high. If it is not, the problem is more likely in the network.
Sample uptime output:
Am up, 1 user, load average: 0.00, 0.00, 0.00
You can also view the /proc/loadavg and /proc/uptime files. Note that you cannot edit these files; use a command such as cat to view them, for example:
liyawei:~ # cat /proc/loadavg
0.00 0.00 0.00 1/55 5505
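The five fields of /proc/loadavg can be split into named values, as in the minimal Python sketch below. The function name parse_loadavg and the dictionary keys are illustrative choices, not part of any tool. The fourth field is the number of currently runnable scheduling entities over the total, and the fifth is the most recently assigned PID.

```python
def parse_loadavg(text):
    """Split one /proc/loadavg line into named fields."""
    one, five, fifteen, procs, last_pid = text.split()
    runnable, total = procs.split("/")
    return {
        "load_1min": float(one),
        "load_5min": float(five),
        "load_15min": float(fifteen),
        "runnable": int(runnable),   # entities currently runnable
        "total": int(total),         # entities that exist on the system
        "last_pid": int(last_pid),   # PID of the most recently created process
    }


# The line shown in the example above.
SAMPLE = "0.00 0.00 0.00 1/55 5505"
print(parse_loadavg(SAMPLE))
```

On a live Linux host, the same function can be fed the result of open("/proc/loadavg").read().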
2. dmesg
The dmesg command displays kernel messages. It is an effective way to diagnose hardware faults or problems with newly added hardware.
In addition, you can use dmesg to determine which hardware is installed in your server. Every time the system restarts, it checks all hardware and records the information. Run the /bin/dmesg command to view the record.
Sample dmesg output:
ReiserFS: hda6: checking transaction log (hda6)
ReiserFS: hda6: Using r5 hash to sort names
Adding 1044184k swap on /dev/hda5.  Priority:-1 extents:1 across:1044184k
parport_pc: VIA 686A/8231 detected
parport_pc: probing current configuration
parport_pc: Current parallel port base: 0x378
parport0: PC-style at 0x378 (0x778), irq 7, using FIFO [PCSPP,TRISTATE,COMPAT,ECP]
parport_pc: VIA parallel port: io=0x378, irq=7
lp0: using parport0 (interrupt-driven).
e100: Intel(R) PRO/100 Network Driver, 3.5.10-k2-NAPI
e100: Copyright(c) 1999-2005 Intel Corporation
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 17 (level, low) -> IRQ 169
e100: eth0: e100_probe: addr 0xd8042000, irq 169, MAC addr 00:02:55:1E:35:91
usbcore: registered new driver usbfs
usbcore: registered new driver hub
hdc: ATAPI 48X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
USB Universal Host Controller Interface driver v2.3
3. top
The top command displays processor activity. By default, it lists the tasks that use the most CPU and refreshes every few seconds (5 seconds here; the default interval depends on the top version and can be changed with the -d option).
The process priority value determines the order in which the CPU handles processes. The Linux kernel adjusts this value as needed. The nice value acts as a limit on the priority: the priority cannot drop below the nice value (the lower the nice value, the more favored the priority). You cannot change a process's priority directly, but you can influence it indirectly by adjusting its nice level, although this is not always possible. If a process is running abnormally slowly, you can lower its nice level to give it more CPU time.
Linux supports nice levels from 19 (lowest priority) to -20 (highest priority). The default value is 0.
Run the /bin/ps command to view the current processes.
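As a small illustration of adjusting the nice level from inside a program, the sketch below uses Python's standard os.nice call. The helper name lower_priority is hypothetical. Raising the nice value (which lowers priority) needs no special privileges; a negative increment normally requires root.

```python
import os


def lower_priority(increment):
    """Raise this process's nice value by `increment`; return (before, after)."""
    before = os.nice(0)          # an increment of 0 only reports the current value
    after = os.nice(increment)   # the kernel caps the result at 19
    return before, after


if __name__ == "__main__":
    before, after = lower_priority(5)
    print(f"nice value changed from {before} to {after}")
```

The change applies to the calling process and is inherited by any child processes it starts afterwards.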
4. iostat
iostat ships with Red Hat Enterprise Linux. It is also part of the sysstat package, which can be downloaded from http://perso.wanadoo.fr/sebastien.godard/
Run without arguments, iostat reports average CPU usage in a way similar to uptime. It also reports on the activity of the server's disk subsystem. The report contains two parts: CPU usage and disk usage.
Sample iostat output:
avg-cpu: %user %nice %system %iowait %steal %idle
0.16 0.01 0.03 0.10 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
hda 0.31 4.65 4.12 327796 290832
avg-cpu: %user %nice %system %iowait %steal %idle
1.00 0.00 0.00 0.00 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
hda 0.00 0.00 0.00 0 0
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 0.00 0.00
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
hda 0.00 0.00 0.00 0 0
The CPU usage report includes the following items:
%user: the percentage of CPU time used at the user level (applications).
%nice: the percentage of CPU time used at the user level with nice priority.
%system: the percentage of CPU time used at the system level (kernel).
%idle: the percentage of time the CPU was idle.
The disk usage report is divided into the following parts:
Device: the name of the block device.
tps: the number of transfers (I/O requests issued to the device) per second. Multiple I/O requests can be combined into a single transfer, so the number of bytes per transfer varies.
Blk_read/s, Blk_wrtn/s: the number of blocks read from and written to the device per second. On 2.4 and later kernels, iostat counts 512-byte sectors as blocks. The file system block size is a separate value, for example 1024, 2048, or 4096 bytes, depending on the partition size. For example, run the following command to obtain the file system block size of the device /dev/sda1:
dumpe2fs -h /dev/sda1 | grep -F "Block size"
Output:
dumpe2fs 1.34 (25-Jul-2003)
Block size: 1024
Blk_read, Blk_wrtn: the total number of blocks read and written since the system was started.
You can also view the files /proc/stat, /proc/partitions, and /proc/diskstats.
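The counters that iostat reports come from files such as /proc/diskstats. The sketch below parses one whole-disk line from that file and converts the sector counters to bytes, assuming the 2.6-style field layout (major, minor, device name, then the read and write counters) and 512-byte sectors. The function names and the sample line are made up for illustration.

```python
SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


def parse_diskstats_line(line):
    """Extract the read/write counters for one device from a /proc/diskstats line."""
    fields = line.split()
    return {
        "device": fields[2],
        "reads_completed": int(fields[3]),
        "sectors_read": int(fields[5]),
        "writes_completed": int(fields[7]),
        "sectors_written": int(fields[9]),
    }


def bytes_transferred(stats):
    """Return (bytes_read, bytes_written) since boot."""
    return (stats["sectors_read"] * SECTOR_BYTES,
            stats["sectors_written"] * SECTOR_BYTES)


# Made-up sample line for illustration only.
SAMPLE = "   3    0 hda 1000 200 327796 500 800 100 290832 400 0 600 900"
stats = parse_diskstats_line(SAMPLE)
print(stats["device"], bytes_transferred(stats))
```

Because these counters are cumulative, a rate like Blk_read/s is obtained by sampling twice and dividing the difference by the interval.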
5. vmstat
vmstat reports on the activity of processes, memory, paging, block I/O, traps, and CPU.
procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 513072 52324 0 0 2 162404 32 0 0 261 0
0 0 0 513072 52324 0 0 0 162404 43 0 0 271 0 0
0 0 0 513072 52324 0 0 0 162404 27 0 0 255 0 0
0 0 0 513072 52324 162404 0 0 28 275 51 0 0 97 3 0
0 0 0 513072 52324 0 0 0 162404 21 0 0 255 0 0
Meaning of each output column:
Procs
- r: the number of processes waiting for run time.
- b: the number of processes in uninterruptible sleep.
Memory
- swpd: the amount of virtual memory used (KB).
- free: the amount of idle memory (KB).
- buff: the amount of memory used as buffers (KB).
- cache: the amount of memory used as cache (KB).
Swap
- si: the amount of memory swapped in from disk (KB/s).
- so: the amount of memory swapped out to disk (KB/s).
IO
- bi: blocks received from a block device (blocks/s).
- bo: blocks sent to a block device (blocks/s).
System
- in: the number of interrupts per second, including the clock.
- cs: the number of context switches per second.
CPU (these are percentages of total CPU time)
- us: time spent running non-kernel code (user time, including nice time).
- sy: time spent running kernel code (system time).
- id: time spent idle. Prior to Linux 2.5.41, this included I/O-wait time.
- wa: time spent waiting for I/O. Prior to Linux 2.5.41, this appeared as zero.
- st: time stolen from a virtual machine.
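To work with vmstat rows in a script, each row can be paired with the column names listed above. The sketch below assumes the 17-column layout of the header shown in this section; the function name and the sample row are illustrative. A row with the wrong number of values is rejected rather than silently misaligned.

```python
VMSTAT_COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
                  "bi", "bo", "in", "cs", "us", "sy", "id", "wa", "st"]


def parse_vmstat_row(row, columns=VMSTAT_COLUMNS):
    """Map one vmstat data row onto its column names."""
    values = [int(v) for v in row.split()]
    if len(values) != len(columns):
        raise ValueError(
            f"expected {len(columns)} values, got {len(values)}: {row!r}")
    return dict(zip(columns, values))


# Illustrative, well-formed row (values chosen for the example).
SAMPLE = "0 0 0 513072 52324 162404 0 0 0 28 275 51 0 0 97 3 0"
sample = parse_vmstat_row(SAMPLE)
print("free KB:", sample["free"], "idle %:", sample["id"], "iowait %:", sample["wa"])
```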
6. sar
sar ships with Red Hat Enterprise Linux AS and is part of the sysstat toolset, which can be downloaded from: http://perso.wanadoo.fr/sebastien.godard/
sar collects, reports, or saves system activity information. It consists of three applications: sar, which displays the data, and sa1 and sa2, which collect and store it.
With sa1 and sa2, the system can be configured to capture information automatically and log it for later analysis. To do so, add cron entries for sa1 and sa2 to /etc/crontab.
From the collected data you can obtain detailed CPU usage (%user, %nice, %system, %idle), memory paging, network I/O, process activity, block device activity, and interrupts per second.
You can also run sar on the command line to produce near-real-time reports:
liyawei:~ # sar -u 3 10
Linux 2.6.16.21-0.8-default (liyawei) 05/31/07
10:17:16 CPU %user %nice %system %iowait %idle
10:17:19 all 0.00 0.00 0.00 0.00 100.00
10:17:22 all 0.00 0.00 0.00 0.33 99.67
10:17:25 all 0.00 0.00 0.00 0.00 100.00
10:17:28 all 0.00 0.00 0.00 0.00 100.00
10:17:31 all 0.00 0.00 0.00 0.00 100.00
10:17:34 all 0.00 0.00 0.00 0.00 100.00
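A report like the one above is easy to summarize in a script. The following sketch averages the %idle column over the "all" rows, assuming the five-metric layout shown here (time, CPU, %user, %nice, %system, %iowait, %idle). The function name average_idle is illustrative.

```python
def average_idle(sar_lines):
    """Average the %idle column over the 'all' rows of `sar -u` output."""
    idle = []
    for line in sar_lines:
        fields = line.split()
        # Data rows look like: "10:17:19 all 0.00 0.00 0.00 0.00 100.00"
        if len(fields) == 7 and fields[1] == "all":
            idle.append(float(fields[-1]))
    if not idle:
        raise ValueError("no 'all' rows found")
    return sum(idle) / len(idle)


# The six data rows from the report above.
SAMPLE = """\
10:17:19 all 0.00 0.00 0.00 0.00 100.00
10:17:22 all 0.00 0.00 0.00 0.33 99.67
10:17:25 all 0.00 0.00 0.00 0.00 100.00
10:17:28 all 0.00 0.00 0.00 0.00 100.00
10:17:31 all 0.00 0.00 0.00 0.00 100.00
10:17:34 all 0.00 0.00 0.00 0.00 100.00
""".splitlines()

print(f"average idle: {average_idle(SAMPLE):.2f}%")
```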
7. KDE System Guard
KDE System Guard (ksysguard) is the KDE graphical task management and performance monitoring tool. It has a client/server architecture that lets it monitor both local and remote hosts.
8. free
The /bin/free command displays the amount of free and used memory, including swap. It also reports the buffers and cache used by the kernel.
total used free shared buffers cached
Mem: 776492 263480 513012 0 52332 162504
-/+ buffers/cache: 48644 727848
Swap: 1044184 0 1044184
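The "-/+ buffers/cache" line is derived from the "Mem:" line: buffers and cache are subtracted from used memory and added to free memory, because the kernel can reclaim them when applications need the space. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def buffers_cache_adjusted(used, free, buffers, cached):
    """Return (used, free) with buffers and cache counted as available memory."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Values from the "Mem:" line above, in KB.
used, free = buffers_cache_adjusted(used=263480, free=513012,
                                    buffers=52332, cached=162504)
print(f"-/+ buffers/cache: {used} {free}")
```

The result matches the second line of the free output, which is the better indicator of how much memory applications are really using.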
9. traffic-vis
traffic-vis is a suite of tools that determines which hosts have been communicating on an IP network, which hosts they have been communicating with, and how much data was transferred. Reports can be produced in plain text, HTML, or GIF format. Note: traffic-vis is only available on SuSE Linux Enterprise Server. Run the following command to collect data on eth0:
traffic-collector -i eth0 -s /root/output_traffic-collector
You can use the killall command to control the process. To write a report to disk, run the following command:
killall -9 traffic-collector
To stop collecting data, run the following command: killall -9 traffic-collector. Do not forget to run this last command; otherwise memory consumption will degrade performance. The output can be sorted by packets, bytes, or TCP connections, based on the totals or on the number of incoming/outgoing packets.
For example, to sort by the number of packets sent and received by each host, run the following command:
traffic-sort -i output_traffic-collector -o output_traffic-sort -Hp
To generate an HTML report showing the bytes transferred, the packets recorded, all TCP connection requests, and information about each server in the network, run the following command:
traffic-tohtml -i output_traffic-sort -o output_traffic-tohtml.html
To generate a report in GIF format (600x600), run the following command:
traffic-togif -i output_traffic-sort -o output_traffic-togif.gif -x 600 -y 600
GIF reports make it easy to spot network broadcasts and to see which hosts use the IPX/SPX protocol in a TCP network so that they can be isolated; remember that IPX is a broadcast-based protocol. To pinpoint problems such as a faulty NIC or duplicate IP addresses, special tools are needed, for example Ethereal, which ships with SuSE Linux Enterprise Server.
Tip: with pipes, a report can be generated in a single command. For example, to generate an HTML report, run the following command:
cat output_traffic-collector | traffic-sort -Hp | traffic-tohtml -o output_traffic-tohtml.html
To generate a GIF file, run the following command:
cat output_traffic-collector | traffic-sort -Hp | traffic-togif -o output_traffic-togif.gif -x 600 -y 600
10. pmap
pmap reports the memory usage of one or more processes. Use it to determine which process on the host is causing a memory bottleneck through excessive memory use.
pmap <pid>
liyawei:~ # pmap 1
1: init
START SIZE RSS DIRTY PERM MAPPING
08048000 484K 244K 0K r-xp /sbin/init
080c1000 4K 4K 4K rw-p /sbin/init
080c2000 144K 24K 24K rw-p [heap]
bfb5b000 84K 12K 12K rw-p [stack]
ffffe000 4K 0K 0K ---p [vdso]
Total: 720K 284K 40K
232K writable-private, 488K readonly-private, and 0K shared
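The totals at the bottom of the pmap report can be recomputed from the mapping rows, which is a handy check when post-processing the output. The sketch below assumes rows in the START SIZE RSS DIRTY PERM MAPPING layout of this example, with sizes written like 484K. The function names are illustrative.

```python
def parse_kb(token):
    """Convert a size token such as '484K' to an integer number of KB."""
    return int(token.rstrip("Kk"))


def pmap_totals(rows):
    """Sum SIZE, RSS and DIRTY, plus the size of writable private mappings."""
    size = rss = dirty = writable_private = 0
    for row in rows:
        fields = row.split()
        row_size = parse_kb(fields[1])
        perm = fields[4]
        size += row_size
        rss += parse_kb(fields[2])
        dirty += parse_kb(fields[3])
        if "w" in perm and perm.endswith("p"):
            writable_private += row_size
    return {"size": size, "rss": rss, "dirty": dirty,
            "writable_private": writable_private}


# Mapping rows for PID 1 from the example above.
ROWS = [
    "08048000 484K 244K 0K r-xp /sbin/init",
    "080c1000 4K 4K 4K rw-p /sbin/init",
    "080c2000 144K 24K 24K rw-p [heap]",
    "bfb5b000 84K 12K 12K rw-p [stack]",
    "ffffe000 4K 0K 0K ---p [vdso]",
]
print(pmap_totals(ROWS))
```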
11. strace
strace intercepts and records the system calls made by a process and the signals the process receives. It is a very effective diagnostic, instructional, and debugging tool, and system administrators can use it to troubleshoot program problems.
To trace a running process, specify its process ID (PID), for example:
strace -p <pid>
# strace -p 2582
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(7, "/"///"///////"///////////////"///////// ///////////////"..., 16384) = 321
write(3, "} H/331q/37/275 $/271/T/311 M/304 $/317 ~) R9/330oj/304/257/327 "..., 360) = 360
select(8, [3 4 7], [3], NULL, NULL) = 2 (in [7], out [3])
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(7, "/"///"///////"///////////////"///////// ///////////////"..., 16384) = 323
write(3, "/204/303/27 $/35/206 // 306vl/370/5 R/200/226/2/320 ^/253/253"..., 360) = 360
select(8, [3 4 7], [3], NULL, NULL) = 2 (in [7], out [3])
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(7, "/"///"///////"///////////////"///////// ///////////////"..., 16384) = 323
write(3, "/243/207/204/277 CW/0162/2 ju =/205/'l/352? 0j/256i/376/32 "..., 360) = 360
select(8, [3 4 7], [3], NULL, NULL) = 2 (in [7], out [3])
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(7, "/"///"///////"///////////////"///////// ///////////////"..., 16384) = 320
write(3, "6/270 s/3I/310/334/301/253! YS/324/'/234%/356/305/26/233 "..., 360) = 360
select(8, [3 4 7], [3], NULL, NULL) = 2 (in [7], out [3])
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
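A long trace is easier to read once the calls are tallied. The sketch below counts system calls by name, assuming lines in strace's usual name(arguments) = result form. The function name count_syscalls and the shortened sample lines are illustrative. strace can also produce a similar summary itself with its -c option.

```python
import re
from collections import Counter

# A system call line starts with the call name followed by "(".
SYSCALL_RE = re.compile(r"^(\w+)\(")


def count_syscalls(lines):
    """Count how many times each system call appears in strace output."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Shortened sample lines in the same shape as the trace above.
SAMPLE = [
    'rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0',
    'read(7, "..."..., 16384) = 321',
    'write(3, "..."..., 360) = 360',
    'select(8, [3 4 7], [3], NULL, NULL) = 2 (in [7], out [3])',
    'rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0',
]

for name, n in count_syscalls(SAMPLE).most_common():
    print(f"{n:3d} {name}")
```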
12. ulimit
ulimit is a bash shell built-in that controls the resources available to the shell and to the processes it starts.
liyawei:~ # ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
pending signals (-i) 6143
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 6143
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The -H and -S options specify the hard and soft limits for a resource. If the soft limit is exceeded, the system administrator receives a warning. The hard limit is the maximum value that can be reached before the user gets an error saying that the limit, for example on file handles, has been exceeded.
For example, to set a hard limit on file handles: ulimit -Hn 4096.
For example, to set a soft limit on file handles: ulimit -Sn 1024.
Run the following commands to view the hard and soft values:
ulimit -Hn
ulimit -Sn
For example, to limit the oracle user, add the following lines to /etc/security/limits.conf:
oracle soft nofile 4096
oracle hard nofile 10240
For Red Hat Enterprise Linux AS, make sure the file /etc/pam.d/system-auth contains the following line:
session required /lib/security/$ISA/pam_limits.so
For SuSE Linux Enterprise Server, make sure the files /etc/pam.d/login and /etc/pam.d/sshd contain the following line:
session required pam_limits.so
This line is what makes the limits take effect.
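The same soft and hard limits can be inspected from a program. The sketch below uses Python's standard resource module to read the open-file limit (the value behind ulimit -n) and to change the soft limit for the current process. The helper names are illustrative, and the change affects only the calling process and its children, not limits.conf.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open files for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def set_soft_nofile(target):
    """Set the soft open-file limit, never exceeding the hard limit."""
    soft, hard = nofile_limits()
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)   # an unprivileged process cannot exceed the hard limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return nofile_limits()


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"open files: soft={soft} hard={hard}")
```

A process may lower its soft limit or raise it up to the hard limit; raising the hard limit itself normally requires root.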
13. mpstat
mpstat is part of the sysstat toolset, available from http://perso.wanadoo.fr/sebastien.godard/
mpstat reports the activity of each CPU on a multi-processor host as well as the CPU status of the host as a whole.
For example, the following command reports processor activity every two seconds, three times.
mpstat 2 3
liyawei:~ # mpstat 2 3
Linux 2.6.16.21-0.8-default (liyawei) 05/31/07
10:23:03 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:23:05 all 0.50 0.00 0.00 1.99 0.00 0.00 0.00 97.51 271.64
10:23:07 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 261.00
10:23:09 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 261.50
Average: all 0.17 0.00 0.00 0.67 0.00 0.00 0.00 99.17 264.73
The following command displays the activity of every processor on a multi-CPU host once per second, three times.
mpstat -P ALL 1 3
liyawei:~ # mpstat -P ALL 1 10
Linux 2.6.16.21-0.8-default (liyawei) 05/31/07
10:23:31 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
10:23:32 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 273.00
10:23:32 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 272.00
10:23:33 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 254.00
10:23:33 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 254.00
10:23:34 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 271.00
10:23:34 0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 271.00
10:23:35 all 0.00 0.00 0.00 1.98 0.00 0.00 0.00 98.02 254.46
10:23:35 0 0.00 0.00 0.00 1.98 0.00 0.00 0.00 98.02 254.46
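Per-CPU figures of this kind are derived from the cpuN lines in /proc/stat. The sketch below parses one such line and computes the idle percentage, assuming the field order user, nice, system, idle, iowait, irq, softirq, steal. The function names and the sample line are made up for illustration. Note that /proc/stat counters are cumulative since boot, whereas mpstat reports the change over each interval.

```python
CPU_FIELDS = ["user", "nice", "system", "idle",
              "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Split a 'cpuN ...' line from /proc/stat into (name, tick counters)."""
    name, *values = line.split()
    # zip() ignores any extra trailing fields and tolerates older, shorter lines.
    ticks = dict(zip(CPU_FIELDS, (int(v) for v in values)))
    return name, ticks


def idle_percent(ticks):
    """Share of total CPU time spent idle, as a percentage."""
    total = sum(ticks.values())
    return 100.0 * ticks["idle"] / total if total else 0.0


# Made-up sample line for illustration only.
SAMPLE = "cpu0 50 0 50 900 0 0 0 0"
name, ticks = parse_cpu_line(SAMPLE)
print(f"{name}: {idle_percent(ticks):.2f}% idle")
```

To reproduce an interval figure like mpstat's, read the file twice and apply the same calculation to the difference between the two sets of counters.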