A practical introduction to the common Linux IO monitoring commands and how to use them.
1. System-level IO monitoring: iostat
iostat -xdm 1    # personal habit
%util: how busy the disk is. 100% means the disk is busy the whole time and 0% means it is idle. Note, however, that a busy disk does not necessarily mean high disk (bandwidth) utilization.
avgrq-sz: the average size of the IO requests submitted to the driver layer, generally not less than 4K and not greater than max(readahead_kb, max_sectors_kb).
It can be used to judge the current IO pattern: broadly speaking, especially when the disk is busy, a larger value suggests sequential IO and a smaller value suggests random IO.
svctm: the service time of one IO request. For a single disk doing completely random reads it is roughly 7 ms, i.e. seek time plus rotational latency.
Note: the relationship between these statistics
=======================================
%util    = (r/s + w/s) * svctm / 1000              # queue length = arrival rate * average service time
avgrq-sz = (rMB/s + wMB/s) * 2048 / (r/s + w/s)    # 2048 = 1M / 512, i.e. the result is in 512-byte sectors
=======================================
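For example, with made-up numbers r/s = 100, w/s = 50, svctm = 5 ms, rMB/s = 10 and wMB/s = 2, the two formulas give:
%util    = (100 + 50) * 5 / 1000 = 0.75, i.e. the disk is about 75% busy
avgrq-sz = (10 + 2) * 2048 / (100 + 50) = 163.84 sectors, i.e. about 82 KB per request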
Summary:
iostat reports statistics taken after the generic block layer has merged requests (rrqm/s, wrqm/s), i.e. the IO actually submitted to the device. It reflects the overall IO state of the system, but has the following two drawbacks:
1. It is far from the business layer: because of readahead, the page cache, the IO scheduler and other factors, the numbers are hard to map back to the read/write calls in your code.
2. It is system-wide and cannot be narrowed down to a process: it can tell you the disk is busy right now, but not who is making it busy or with what.
2. Process-level IO monitoring: iotop and pidstat (RHEL6u series only)
iotop, as the name implies, is the IO version of top.
pidstat, as the name implies, reports per-process (PID) statistics, which naturally include each process's IO state.
Both commands report IO per process, so they can answer the following two questions:
- Which processes in the current system are generating IO, and in what proportion?
- Is a given process mainly reading or writing, and how much data does it read and write?
pidstat has many options; here are just the ones I use out of habit:
pidstat -d 1           # show IO only
pidstat -u -r -d -t 1  # -d: IO information
                       # -r: paging and memory information
                       # -u: CPU usage
                       # -t: report per thread
                       # 1: sample every second
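pidstat can also be restricted to a single process with -p; for example (mysqld here is just a placeholder process name):
pidstat -d -p `pidof mysqld` 1   # IO statistics for one process only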
iotop is simpler still: just run the command.
block_dump and iodump
iotop and pidstat are very convenient, but both rely on the /proc/<pid>/io file for their statistics, which does not exist on older kernels such as RHEL5u2.
On those systems you have to make do with the two poor man's alternatives below:
echo 1 > /proc/sys/vm/block_dump     # enable block_dump; the IO information is then written to dmesg
                                     # source: ll_rw_blk.c:3213
watch -n 1 'dmesg -c | grep -oP "\w+\(\d+\): (WRITE|READ)" | sort | uniq -c'
                                     # keep draining the ring buffer with dmesg -c
echo 0 > /proc/sys/vm/block_dump     # turn it off when not in use
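For reference, the block_dump lines in dmesg typically look roughly like the following (the exact format can differ between kernel versions, and these process names and block numbers are only illustrative):
bash(5224): READ block 4140352 on sda1
kjournald(1617): WRITE block 13342320 on sda2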
Alternatively you can use the ready-made iodump script, see http://code.google.com/p/maatkit/source/browse/trunk/util/iodump?r=5389
iotop.stp
A SystemTap script; one look tells you it is a poor man's copy of the iotop command. It requires SystemTap to be installed and by default prints its output every 5 seconds.
stap iotop.stp    # examples/io/iotop.stp
Summary
Process-level IO monitoring:
- can answer the two questions that system-level IO monitoring cannot;
- is relatively close to the business layer (for example, it can report how much each process reads and writes).
But it still cannot be tied to the read/write calls in the business logic, and the granularity is coarse: it cannot tell you which files the current process is reading or writing, at what offsets, or in what sizes.
3. Business-level IO monitoring: ioprofile
The ioprofile command is essentially lsof + strace; it can be downloaded from http://code.google.com/p/maatkit/
ioprofile can answer the following three questions:
1. Which files does the process read and write, at the business level, at any given moment?
2. How many reads and writes? (number of read/write calls)
3. How much data is read and written? (number of bytes read/written)
Suppose some user action triggers IO in a program, for example: "one page click causes the backend to read files a, b and c".
============================================
./io_event                               # simulates the IO behaviour: reads file a once, file b 500 times, file c 500 times
ioprofile -p `pidof io_event` -c count   # number of read/write calls
ioprofile -p `pidof io_event` -c times   # time spent in read/write
ioprofile -p `pidof io_event` -c sizes   # bytes read and written
Note: ioprofile only supports multithreaded programs; it does not work on single-threaded ones. For business-level IO analysis of a single-threaded program, strace is sufficient.
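For instance, for the single-threaded case you could attach strace directly to the process and let it summarise the calls (io_event is the example program used above):
strace -c -e trace=read,write -p `pidof io_event`   # per-syscall call counts and time for a running process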
Summary:
ioprofile is essentially strace, so you can see the trajectory of the read/write calls and do business-level IO analysis (it is powerless against mmap-based IO).
4. File-level IO monitoring
File-level IO monitoring complements the business-level and process-level analysis above.
File-level IO analysis focuses on individual files and answers the question: which processes are currently reading or writing a given file?
1. lsof, or ls /proc/<pid>/fd
2. inodewatch.stp
lsof tells you which processes currently have a given file open.
lsof ../io    # the io directory is currently held open by two processes, bash and lsof
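An equivalent view, without lsof, is to list the process's fd directory directly (io_event is again the example program from section 3):
ls -l /proc/`pidof io_event`/fd   # the symlinks show which files the process currently has open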
lsof can only answer with static information, and "open" does not necessarily mean "reading": for commands such as cat and echo, the open and the read are over in an instant, so lsof can hardly catch them.
inodewatch.stp can be used to make up for this.
stap inodewatch.stp major minor inode   # major device number, minor device number, inode number of the file
stap inodewatch.stp 0xfd 0x00 523170    # these three values can be obtained with the stat command
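One way to obtain those three values, assuming GNU stat and using /opt/work/io/b.data from section 5 as an example:
stat -c "%D %i" /opt/work/io/b.data   # %D: device number in hex (e.g. fd00 = major 0xfd, minor 0x00), %i: inode number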
5. IO simulator
iotest.py    # see the appendix
Developers can use ioprofile (or strace) to analyse the system's IO path in detail and then make the appropriate optimizations at the program level.
However, changing the program is usually costly, especially when it is not yet certain that the change will help, so it is best to have some way of simulating the new access pattern so it can be verified quickly.
Take our own service as an example. When a query comes in, the system's IO access pattern is as follows:
file a is accessed once;
file b is accessed 500 times, 16 bytes each time, with an average gap of 502K between reads;
file c is accessed 500 times, 200 bytes each time, with an average gap of 4M between reads.
The accesses to b and c are interleaved, i.e.:
1. access b first and read 16 bytes;
2. then access c and read 200 bytes;
3. back to b, skip forward 502K, read another 16 bytes;
4. back to c, skip forward 4M, read another 200 bytes;
5. repeat 500 times.
The captured strace file looks as follows:
A simple and naive idea: instead of reading b and c interleaved, first read b in one batch and then read c in one batch. The adjusted strace file looks as follows:
The adjusted strace file is then used as input to iotest.py, which replays the corresponding IO according to the access pattern recorded in it:
iotest.py -s io.strace -f fmap
fmap is the mapping file: it maps the fds that appear in the strace file (such as 222 and 333) to actual files
===========================
111 = /opt/work/io/a.data
222 = /opt/work/io/b.data
333 = /opt/work/io/c.data
===========================
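One plausible way to record such an io.strace file (iotest.py only replays pread/pread64 calls, see the appendix) is something like:
strace -e trace=pread64 -p `pidof io_event` -o io.strace   # record the pread calls of the running program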
6. Disk defragmentation
Bottom line: as long as disk usage does not stay above 80% for years on end, you basically do not need to worry about fragmentation.
If you are really worried, you can use a defrag script.
7. Other IO-related commands
The blockdev family
=======================================
blockdev --getbsz /dev/sdc1     # show the block size of sdc1
blockdev --getra /dev/sdc1      # show the readahead (readahead_kb) setting of sdc1
blockdev --setra 256 /dev/sdc1  # set the readahead of sdc1; on older kernels setting it through /sys sometimes fails, so blockdev is more reliable
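For comparison, the /sys alternative mentioned above (sdc is just the example device from this section; note that blockdev --setra counts 512-byte sectors while read_ahead_kb counts KB):
cat /sys/block/sdc/queue/read_ahead_kb          # current readahead in KB
echo 128 > /sys/block/sdc/queue/read_ahead_kb   # set it; on older kernels this sometimes does not stick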
Appendix
1. Where the various IO monitoring tools sit in the Linux IO architecture
2. Source of iotest.py
#!/usr/bin/env python
# -*- coding: gbk -*-
import os
import re
import time
import timeit
from ctypes import CDLL, create_string_buffer, c_ulong, c_longlong
from optparse import OptionParser

usage = '%prog -s strace.log -f fileno.map'

_glibc = None
_glibc_pread = None
_c_char_buf = None
_open_file = []


def getlines(filename):
    _lines = []
    with open(filename, 'r') as _f:
        for line in _f:
            if line.strip() != "":
                _lines.append(line.strip())
    return _lines


def parse_cmdline():
    parser = OptionParser(usage)
    parser.add_option("-s", "--strace", dest="strace_filename",
                      help="strace file", metavar="FILE")
    parser.add_option("-f", "--fileno", dest="fileno_filename",
                      help="fileno file", metavar="FILE")
    (options, args) = parser.parse_args()
    if options.strace_filename is None:
        parser.error("strace is not specified.")
    if not os.path.exists(options.strace_filename):
        parser.error("strace file does not exist.")
    if options.fileno_filename is None:
        parser.error("fileno is not specified.")
    if not os.path.exists(options.fileno_filename):
        parser.error("fileno file does not exist.")
    return options.strace_filename, options.fileno_filename


# each action is [type, fno, count, offset]
# e.g. pread(111, "", 4348, 140156928)
def parse_strace(filename):
    lines = getlines(filename)
    action = []
    _regex_str = r'(pread|pread64)[^\d]*(\d+),\s*[^,]*,\s*([\dkKmM*+\-.]*),\s*([\dkKmM*+\-.]*)'
    for i in lines:
        _match = re.match(_regex_str, i)
        if _match is None:
            continue  # skip invalid line
        _type, _fno, _count, _off = _match.group(1), _match.group(2), _match.group(3), _match.group(4)
        # expand k/K and m/M suffixes so the expression can be eval'ed
        _off = _off.replace('k', "* 1024").replace('K', "* 1024").replace('m', "* 1048576").replace('M', "* 1048576")
        _count = _count.replace('k', "* 1024").replace('K', "* 1024").replace('m', "* 1048576").replace('M', "* 1048576")
        action.append([_type, _fno, str(int(eval(_count))), str(int(eval(_off)))])
    return action


def parse_fileno(filename):
    lines = getlines(filename)
    fmap = {}
    for i in lines:
        if i.strip().startswith("#"):
            continue  # comment line
        _split = [j.strip() for j in i.split("=")]
        if len(_split) != 2:
            continue  # invalid line
        fno, fname = _split[0], _split[1]
        fmap[fno] = fname
    return fmap


def simulate_before(strace, fmap):
    global _open_file, _c_char_buf
    rfmap = {}
    for i in fmap.values():
        _f = open(i, "r+b")
        _open_file.append(_f)
        rfmap[i] = str(_f.fileno())  # reverse mapping: file name -> real fd
    to_read = 4 * 1024  # default 4K buffer
    for i in strace:
        i[1] = rfmap[fmap[i[1]]]  # map the fd recorded in the strace file to the real fd
        to_read = max(to_read, int(i[2]))
    _c_char_buf = create_string_buffer(to_read)


def simulate_after():
    global _open_file
    for _f in _open_file:
        _f.close()


def simulate(actions):
    time.sleep(10)  # rest a while as an IO interval
    start = timeit.time.time()
    for act in actions:
        __simulate__(act)
    finish = timeit.time.time()
    return finish - start


def __simulate__(act):
    global _glibc, _glibc_pread, _c_char_buf
    if "pread" in act[0]:
        _fno = int(act[1])
        _buf = _c_char_buf
        _count = c_ulong(int(act[2]))
        _off = c_longlong(int(act[3]))
        _glibc_pread(_fno, _buf, _count, _off)
    else:
        pass


def load_libc():
    global _glibc, _glibc_pread
    _glibc = CDLL("libc.so.6")
    _glibc_pread = _glibc.pread64


if __name__ == "__main__":
    _strace, _fileno = parse_cmdline()   # parse command-line arguments
    load_libc()                          # load the dynamic library
    _action = parse_strace(_strace)      # parse the strace (action) file
    _fmap = parse_fileno(_fileno)        # parse the fd -> file name mapping file
    simulate_before(_action, _fmap)      # preprocessing
    print "%f" % simulate(_action)
    simulate_after()
Original link: http://www.cnblogs.com/quixotic/p/3258730.html