Common Linux tools for performance monitoring and for assisting development and debugging

Source: Internet
Author: User
Tags curl memory usage printable characters server memory

Linux provides many excellent tools for analyzing server performance metrics and for assisting with development and debugging. This article lists only basic commands, most of which are included in a typical Linux environment and need no separate installation. For more detailed tools, see https://github.com/brendangregg/perf-tools

A, CPU and process related
Common tools: uptime, ps, top, mpstat, pidstat, etc.
uptime: Shows how long the system has been running, the load averages, and so on.
ps: Shows the percentage of CPU resources each process consumes; ps -eLf shows thread information.
top/htop/atop: Display information close to that of ps, but show CPU consumption live and refresh at an interval specified by the user; top -H -p PID (PID is the main thread ID) shows the state of all threads in a multithreaded program.
mpstat: Shows averaged statistics for all CPUs, or statistics for a specified CPU.
pidstat: Shows per-process statistics such as state and time consumed.
The following commands show page fault information:
ps -o majflt,minflt -C <program_name>
ps -o majflt,minflt -p <pid>

Here majflt stands for major fault and minflt for minor fault. The two numbers are the counts of page faults that have occurred since the process started. The difference is that a major fault requires disk I/O: either the page has to be loaded from disk into physical memory, or physical memory is short and some pages must first be evicted to disk. A minor fault is resolved without touching the disk.
For example, here is the output for mysqld:
mysql@tlog_590_591:~> ps -o majflt,minflt -C mysqld
 MAJFLT   MINFLT
 144856 15296294

If a process spends too much CPU time in kernel mode, one possible cause is a high number of page faults per unit time, which can be checked with the commands above.
If majflt is too large, there is probably not enough memory.
If minflt is too large, the process is probably allocating and freeing large chunks of memory (over 128 KB) frequently, which malloc serves with mmap. In that case the threshold can be raised with mallopt(M_MMAP_THRESHOLD, <SIZE>), or the program can implement its own memory pool.
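
The same counters that ps reports are also exposed in /proc/<pid>/stat. The following is a minimal Python sketch, assuming the field layout documented in proc(5), where minflt is field 10 and majflt is field 12. The helper name parse_fault_counts and the sample line are made up for illustration.

```python
def parse_fault_counts(stat_line):
    """Return (majflt, minflt) from the text of /proc/<pid>/stat."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ")", so split on the LAST closing parenthesis.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N sits at index N - 3.
    minflt = int(rest[10 - 3])
    majflt = int(rest[12 - 3])
    return majflt, minflt


# Hypothetical sample line, cut short after the fault counters.
sample = "4242 (my sqld) S 1 4242 4242 0 -1 4194560 15296294 0 144856 0 100 50"
majflt, minflt = parse_fault_counts(sample)
print(majflt, minflt)
```

On a live system the line could be read from /proc/<pid>/stat directly; sampling it twice and subtracting the counters gives page faults per unit time, which is the quantity the paragraph above is concerned with.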

B, memory related
Common tools: free, vmstat
free: Shows total, used, and free memory, plus swap usage. Swap devices are used when the system does not have enough physical memory to satisfy all requests; a swap device can be a file or a disk partition. Be careful, though: using swap is very expensive. When no physical memory is available the system swaps frequently, and if the swap device and the programs' data sit on the same file system this causes severe I/O problems that can slow down or even crash the whole system. Heavy swap usage in particular indicates that the server does not have enough memory.
The cache in Linux memory (the cached column in free output) cannot always be released as free space, and even when it can be released, doing so is not free of cost for the system. The main points to remember:
1. Releasing cache that backs files raises I/O, which is the price paid for the cache speeding up file access.
2. Files stored in tmpfs occupy cache space; this cache is not released automatically unless the files are deleted.
3. Shared memory obtained with shmget occupies cache space; it is not released automatically unless the segment is removed with ipcrm or with shmctl and IPC_RMID.
4. Memory mapped with mmap and the MAP_SHARED flag occupies cache space; it is not released automatically unless the process munmaps it.
5. In fact, shmget and mmap shared memory are both implemented through tmpfs in the kernel, and tmpfs storage is backed by the cache.

vmstat: Monitors virtual memory usage, free memory, buffers, cache, and other indicators. Whether the server is swapping can be checked with vmstat 1.
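
When reading vmstat 1 output, the si and so columns (memory swapped in and out per second) show whether swapping is happening. The sketch below, in Python, pulls those two columns out of captured vmstat text. The function name swap_activity and the sample numbers are invented for illustration, and it assumes a single two-line header at the top of the capture.

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs, one per sample row of vmstat output."""
    lines = [ln for ln in vmstat_output.splitlines() if ln.strip()]
    # The second header line holds the column names (r, b, swpd, ..., si, so, ...).
    header = lines[1].split()
    si_col, so_col = header.index("si"), header.index("so")
    samples = []
    for row in lines[2:]:
        fields = row.split()
        samples.append((int(fields[si_col]), int(fields[so_col])))
    return samples


# Hypothetical capture: the first sample is quiet, the second is swapping.
captured = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  10240 204800    0    0     5    12  100  200  3  1 96  0  0
 2  1  51200  20480   1024  40960  120  340   900  1500  800 1600 20 10 30 40  0
"""
samples = swap_activity(captured)
is_swapping = any(si > 0 or so > 0 for si, so in samples)
print(samples, is_swapping)
```

Sustained nonzero values in either column are the signal to look for; a single isolated spike is usually less of a concern.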
C, disk I/O related
Common tools: iostat, fio, swapon
iostat: Reports the number of blocks read and written per second, the total blocks read and written, and so on, giving a general picture of disk read/write performance.
fio: A powerful I/O stress-testing tool that can simulate sequential and random disk reads and writes. Its biggest strengths are that it is simple to use and supports a very large number of file operations, covering nearly every usage pattern.
swapon: Displays swap device usage, if swap devices are enabled.
A typical troubleshooting sequence:
1. badblocks: check whether the disk has failed.
2. time dd if=/dev/zero of=/test.dbf count=100000 bs=10k oflag=direct: measure I/O speed.
3. iostat -x -d -k 1: view I/O read/write performance and check whether %util is too high.
4. iotop -o: find the processes with high I/O on the machine.
5. Mount the directory /usr/local/stat/log/, which the CGI writes to frequently, as tmpfs and see whether I/O drops: mount -t tmpfs -o size=20m tmpfs /tmpfs; mount -o bind /tmpfs/ /usr/local/stat/log/ (Note: a tmpfs mount keeps its data in memory, so I/O is much faster; the speedup is especially noticeable for frequent operations on small files.)
6. strace -p: trace the process with high I/O to see what it is doing or what errors occur.

An example: the CGI reports data through logagent, which first requires reading the logagent configuration file /usr/local/stat/bin/msglog.conf. This batch of machines did not have logagent installed, so reading the configuration file failed. After each failure, the CGI overwrote the file /usr/local/stat/log/logapi_syserr.bin in trunc mode, and did so frequently. Even though each write truncates the file and writes only a little data, a disk sector is typically 512 bytes and the Linux page cache works in 4 KB pages; with non-direct I/O, data is written to the page cache first. As a result, each I/O operation modifies at least 512 bytes (typically 4 KB). Multiply that by the number of writes per second made by the CGI process, and even when only 4 bytes are written each time, tens of KB per second of I/O is normal.
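
The arithmetic in this example can be sketched in a few lines of Python. This is a rough estimate only: it assumes every write dirties at least one whole block and ignores any coalescing the kernel may do before flushing. The function name and the rate of 10 writes per second are hypothetical.

```python
import math


def estimated_io_bytes_per_sec(payload_bytes, writes_per_sec, block_bytes=4096):
    """Estimate I/O volume when every write touches at least one full block."""
    # However small the payload, at least one whole block (page) is modified.
    blocks_per_write = max(1, math.ceil(payload_bytes / block_bytes))
    return blocks_per_write * block_bytes * writes_per_sec


# 4-byte writes, 10 times per second (assumed rate), 4 KB pages.
page_estimate = estimated_io_bytes_per_sec(4, 10)
# The same workload counted in 512-byte sectors instead.
sector_estimate = estimated_io_bytes_per_sec(4, 10, block_bytes=512)
print(page_estimate / 1024, "KB/s with 4 KB pages")
print(sector_estimate / 1024, "KB/s with 512-byte sectors")
```

With 4 KB pages, 40 bytes of real payload per second turn into roughly 40 KB per second of I/O, which matches the "tens of KB" observed.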

D, network I/O related
Common tools: netstat, tcpdump, route, iptraf, netperf, nicstat, ping/traceroute
netstat: A very useful tool for monitoring TCP/IP networks; it can display the routing table, the actual network connections, and the status of each network interface.
tcpdump: Monitors TCP/IP connections and reads packet headers directly at the data link layer. You can specify which packets to capture and what to display; -w xx.pcap writes the capture to a file that can be opened in Wireshark and filtered with Wireshark syntax. For local traffic, remember -i lo.
route: Sets static routes for interfaces configured with ifconfig, and displays and modifies entries in the local IP routing table.
iptraf: Shows the throughput of the local network and the transfer rate.
netperf: Simulates a server and a client exchanging traffic to test network throughput.
iperf: Similar to netperf; simulates a server and a client to test maximum TCP and UDP bandwidth, and reports throughput, jitter, packet loss rate, and maximum segment and maximum transmission unit sizes.
nicstat: Monitors network interface state such as throughput, with output similar to iostat.
ping/traceroute: Commonly used to check whether the network is reachable.
Network bandwidth usage can be computed from the /proc/net/dev file.
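
Computing bandwidth from /proc/net/dev comes down to taking two snapshots of the byte counters and dividing the difference by the elapsed time. A minimal Python sketch follows, assuming the usual layout of that file (two header lines, then one line per interface with received bytes in the first column and transmitted bytes in the ninth). The function names and the sample counters are invented for illustration.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Column 0 is received bytes, column 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates(before, after, seconds):
    """Bytes per second (rx, tx) per interface between two snapshots."""
    return {
        name: ((after[name][0] - rx) / seconds, (after[name][1] - tx) / seconds)
        for name, (rx, tx) in before.items()
        if name in after
    }


HEADER = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast"
    "|bytes    packets errs drop fifo colls carrier compressed\n"
)
# Two hypothetical snapshots taken 2 seconds apart.
first = HEADER + "  eth0:  500000 400 0 0 0 0 0 0  200000 300 0 0 0 0 0 0\n"
second = HEADER + "  eth0: 1500000 900 0 0 0 0 0 0  700000 800 0 0 0 0 0 0\n"
result = rates(parse_net_dev(first), parse_net_dev(second), seconds=2)
print(result)
```

On a real machine the two snapshots would come from reading /proc/net/dev twice with a sleep in between.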
E, development and testing related
Common tools: readelf, hexdump/xxd, od, objdump, nm, telnet/nc

readelf: Displays ELF files (object files, executables, shared libraries) in readable form

hexdump/xxd: Print file contents in hexadecimal

od: Print file contents in a selectable format (octal by default)

objdump: Disassemble machine instructions

nm: List the symbols in object files

strings: Print the strings of printable characters in files; commonly combined with grep to find strings in binary files

telnet/nc: Clients for testing network connections
wget/curl: Clients that simulate HTTP requests, with support for proxies, cookies, and so on. With wget, if the link is enclosed in quotation marks, as in wget 'http://baidu.com' -O tmp.down, the URL must not be written with escaped slashes (http:\/\/baidu.com). With curl, the characters [] and {} normally need to be escaped; otherwise the error "[globbing] illegal character in range specification at pos xx" appears. Alternatively, pass -g/--globoff. From the curl manual: "This option switches off the "URL globbing parser". When you set this option, you can specify URLs that contain the letters {}[] without having them being interpreted by curl itself. Note that these letters are not normal legal URL contents but they should be encoded according to the URI standard."
-k, --insecure: Allow connections to SSL sites without certs (H)

ab (Apache Bench)/wrk/tc (traffic control)/APC: HTTP request stress testing.
valgrind/AddressSanitizer: Memory leak detection; both slow the program down, but ASan performs better than Valgrind.

F, tracing and debugging related
Common tools: strace, ltrace, dtrace/ftrace, blktrace
strace: Traces the system calls of a running process, including time taken, errors, and arguments passed. strace -tt -T -p PID (PID can be a thread ID)
pstack: Actually a symbolic link to gstack, which is itself a shell script wrapped around GDB. Its core is feeding the interactive command thread apply all bt to GDB, which prints the stacks of all threads; the GDB output is then piped through sed for substitution and filtering.
ltrace: Traces the library calls of a running process, including time taken, errors, and arguments passed.
dtrace/ftrace: Combine the capabilities of the two tools above. DTrace is a tracing tool that runs at the system level, which means you can trace all processes into and out of the kernel rather than selecting a single process to trace.
blktrace: Block I/O event tracer
pt-pmp: A poor man's profiler, inspired by http://poormansprofiler.org. It can create and summarize full stack traces of processes on Linux. Summaries of stack traces can be an invaluable tool for diagnosing what a process is waiting for.
pstack {pid of mysqld} > pid.info
pt-pmp pid.info
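
The idea behind summarizing stack traces is simple: collapse each thread's stack into one line of function names, then count identical lines. The Python sketch below illustrates that idea on GDB-style output. It is a simplified illustration, not pt-pmp's own code; the function name summarize_stacks and the sample dump are made up.

```python
import re
from collections import Counter

# Matches frames such as "#0  0x00007f... in poll () from ..." or "#1  main () ...".
FRAME = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([^\s(]+)")


def summarize_stacks(dump):
    """Return [(collapsed_stack, thread_count), ...], most common first."""
    stacks = Counter()
    current = []
    for line in dump.splitlines():
        if line.startswith("Thread "):
            # A new thread begins: record the stack collected so far.
            if current:
                stacks[",".join(current)] += 1
            current = []
        else:
            match = FRAME.match(line)
            if match:
                current.append(match.group(1))
    if current:
        stacks[",".join(current)] += 1
    return stacks.most_common()


# Hypothetical dump: two threads wait on a condition variable, one sits in poll.
dump = """\
Thread 3 (Thread 0x7f1a00000700 (LWP 103)):
#0  0x00007f1a2b3c4d5e in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x0000000000401234 in worker ()
Thread 2 (Thread 0x7f1a00800700 (LWP 102)):
#0  0x00007f1a2b3c4d5e in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x0000000000401234 in worker ()
Thread 1 (Thread 0x7f1a01000740 (LWP 101)):
#0  0x00007f1a2b2a1b2c in poll () from /lib64/libc.so.6
#1  0x0000000000400f00 in main ()
"""
summary = summarize_stacks(dump)
for stack, count in summary:
    print(count, stack)
```

A stack that many threads share floats to the top of the summary, which is what makes this kind of output useful for seeing what a process is waiting for.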

G, performance evaluation related
Common tools: perf top, perf record, gprof
perf record: Shows where a program spends its time at function level. It samples periodically and reports the share of samples that fall in each function. A function that ranks high either runs for a long time per call or is called so often that it is sampled many times; either way, such functions are the ones consuming system resources and are good targets for optimization.
perf top: Mainly used for real-time analysis of how hot each function is for a given performance event. It quickly locates hot functions, including application functions, module functions, and kernel functions, and can even locate hot instructions. The default performance event is CPU cycles. perf top -p PID (PID can be a thread ID)
gprof: Measures the actual time spent in each of a program's functions, giving an accurate view of which functions are time-consuming and therefore which need optimizing.

H, all-in-one integrated tools: sar/collectl, dstat, the information under /proc/PID/xxx (for example, /proc/PID/fd lists the file descriptors opened by the process, as does lsof -p PID), sysctl, and the information under /sys
CPU performance problems fall into just two kinds: 1. the CPU is exhausted; 2. the CPU has idle capacity, but the program is overloaded. In the first, the CPU is fully consumed, no computing resources remain, and program performance cannot improve further. The second is also common: the CPU is not fully used, yet the program is overloaded. This happens mainly because the program is suspended waiting for some event, and there are two causes of such waiting: 1. synchronous blocking; 2. memory swapping (insufficient memory).

The following sar -u example (pictured below) belongs to the second scenario: the CPU has idle capacity, but the program is overloaded. The figure shows plenty of idle time and essentially no iowait, so the program is not blocked on I/O (which also indirectly shows that the blocking is not caused by swapping). The most likely cause of blocking is therefore mutex contention.
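
The reasoning above can be written down as a small decision rule. The Python sketch below is only a rough triage aid for a program already known to be overloaded; the function name and both thresholds are illustrative guesses, not values taken from sar or from the original analysis.

```python
def classify_cpu_state(idle_pct, iowait_pct, idle_floor=10.0, iowait_ceiling=5.0):
    """Rough triage of one sar -u sample; thresholds are illustrative guesses."""
    if idle_pct < idle_floor:
        # Scenario 1: almost no idle time left, the CPU itself is the limit.
        return "cpu-exhausted"
    if iowait_pct > iowait_ceiling:
        # Scenario 2 with noticeable iowait: blocked on disk I/O, possibly swap.
        return "waiting-on-io-or-swap"
    # Scenario 2 without iowait: look at locks and other synchronous blocking.
    return "suspect-synchronous-blocking"


# Hypothetical sample resembling the case described above: idle CPU, no iowait.
verdict = classify_cpu_state(idle_pct=85.0, iowait_pct=0.2)
print(verdict)
```

A verdict of suspect-synchronous-blocking is a pointer for where to look next (for example with pstack or strace), not a diagnosis in itself.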




References:
https://github.com/brendangregg/perf-tools
http://crtags.blogspot.com/2012/04/dtrace-ftrace-ltrace-strace-so-many-to.html
https://danielmiessler.com/study/tcpdump/
http://ufsdump.org/papers/oscon2009-linux-monitoring.pdf


