Fio is a great tool for measuring IOPS and for stress testing and validating hardware. It supports 13 different I/O engines, including sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio and so on. FIO website: http://freshmeat.net/projects/fio/

One, FIO installation

wget http://brick.kernel.dk/snaps/fio-2.0.7.tar.gz
yum install libaio-devel
tar -zxvf fio-2.0.7.tar.gz
cd fio-2.0.7
make
make install

Two, test examples

Random read:
fio -filename=/dev/sdb1 -direct=1 -iodepth 1 -thread -rw=randread -ioengine=psync -bs=16k -size=200g -numjobs=10 -runtime=1000 -group_reporting -name=mytest

Description of the options:
filename=/dev/sdb1   the file or device to test, usually a device or a file on the disk you want to measure
direct=1             bypass the machine's own buffering during the test, so the results are more realistic
rw=randwrite         random write I/O test
rw=randrw            mixed random read and write I/O test
bs=16k               block size of 16k per I/O
bsrange=512-2048     use a range of block sizes instead of a fixed size
size=5g              the total test size is 5g, issued as 4k I/Os
numjobs=30           run the test with 30 threads
runtime=1000         run for 1000 seconds; if not set, the test keeps writing the 5g file in 4k blocks until it is finished
ioengine=psync       use the psync I/O engine
rwmixwrite=30        in mixed read/write mode, writes make up 30%
group_reporting      aggregate the per-job results into one report
Other useful options:
lockmem=1g           use only 1g of memory for the test
zero_buffers         initialize the I/O buffers with zeros
nrfiles=8            number of files generated per process
(These options can also be collected into a fio job file; see the sketch after the commands below.)

Sequential read:
fio -filename=/dev/sdb1 -direct=1 -iodepth 1 -thread -rw=read -ioengine=psync -bs=16k -size=200g -numjobs=30 -runtime=1000 -group_reporting -name=mytest
Random write:
fio -filename=/dev/sdb1 -direct=1 -iodepth 1 -thread -rw=randwrite -ioengine=psync -bs=16k -size=200g -numjobs=30 -runtime=1000 -group_reporting -name=mytest
Sequential write:
fio -filename=/dev/sdb1 -direct=1 -iodepth 1 -thread -rw=write -ioengine=psync -bs=16k -size=200g -numjobs=30 -runtime=1000 -group_reporting -name=mytest
Mixed random read and write:
fio -filename=/dev/sdb1 -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -ioengine=psync -bs=16k -size=200g -numjobs=30 -runtime=100 -group_reporting -name=mytest -ioscheduler=noop
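The same test can also be described in a fio job file instead of a long command line. The following is a minimal sketch (my addition, not from the original article) of a job file equivalent to the random read command above; the file name randread.fio and the job name mytest are arbitrary choices, and filename/size/runtime should be adapted to your own environment.

#!/bin/bash
# Write a job file equivalent to the random read command-line example above,
# then run it with fio.
cat > randread.fio <<'EOF'
[global]
ioengine=psync
direct=1
thread
bs=16k
size=200g
runtime=1000
group_reporting

[mytest]
rw=randread
filename=/dev/sdb1
numjobs=10
EOF

fio randread.fio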
Three, an actual test example:

fio -filename=/dev/sdb1 -direct=1 -iodepth 1 -thread -rw=randrw -rwmixread=70 -ioengine=psync -bs=16k -size=200g -numjobs=30 -runtime=100 -group_reporting -name=mytest1

mytest1: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=psync, iodepth=1
...
mytest1: (g=0): rw=randrw, bs=16K-16K/16K-16K, ioengine=psync, iodepth=1
fio 2.0.7
Starting 30 threads
Jobs: 1 (f=1): [________________m_____________] [3.5% done] [6935K/3116K/s] [423/190 iops] [eta 48m:20s]
mytest1: (groupid=0, jobs=30): err= 0: pid=23802
  read : io=1853.4MB, bw=18967KB/s, iops=1185, runt=100058msec
    clat (usec): min=60, max=871116, avg=25227.91, stdev=31653.46
     lat (usec): min=60, max=871117, avg=25228.08, stdev=31653.46
    clat percentiles (msec):
     |  1.00th=[    3],  5.00th=[    5], 10.00th=[    6], 20.00th=[    8],
     | 30.00th=[     ], 40.00th=[     ], 50.00th=[     ], 60.00th=[     ],
     | 70.00th=[     ], 80.00th=[   37], 90.00th=[     ], 95.00th=[     ],
     | 99.00th=[  151], 99.50th=[  202], 99.90th=[  338], 99.95th=[  383],
     | 99.99th=[  523]
    bw (KB/s): min=     , max= 1944, per=3.36%, avg=636.84, stdev=189.15
  write: io=803600KB, bw=8031.4KB/s, iops=501, runt=100058msec
    clat (usec): min=52, max=9302, avg=146.25, stdev=299.17
     lat (usec): min=52, max=9303, avg=147.19, stdev=299.17
    clat percentiles (usec):
     |  1.00th=[     ],  5.00th=[     ], 10.00th=[     ], 20.00th=[     ],
     | 30.00th=[     ], 40.00th=[     ], 50.00th=[     ], 60.00th=[     ],
     | 70.00th=[     ], 80.00th=[     ], 90.00th=[     ], 95.00th=[  370],
     | 99.00th=[ 1688], 99.50th=[ 2128], 99.90th=[ 3088], 99.95th=[ 3696],
     | 99.99th=[ 5216]
    bw (KB/s): min=     , max= 1117, per=3.37%, avg=270.27, stdev=133.27
    lat (usec) : 100=24.32%, 250=3.83%, 500=0.33%, 750=0.28%, 1000=0.27%
    lat (msec) : 2=0.64%, 4=3.08%, 10=20.67%, 20=19.90%, 50=17.91%
    lat (msec) : 100=6.87%, 250=1.70%, 500=0.19%, 750=0.01%, 1000=0.01%
  cpu          : usr=1.70%, sys=2.41%, ctx=5237835, majf=0, minf=6344162
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=118612/w=50225/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=1853.4MB, aggrb=18966KB/s, minb=18966KB/s, maxb=18966KB/s, mint=100058msec, maxt=100058msec
  WRITE: io=803600KB, aggrb=8031KB/s, minb=8031KB/s, maxb=8031KB/s, mint=100058msec, maxt=100058msec

Disk stats (read/write):
  sdb: ios=118610/50224, merge=0/0, ticks=2991317/6860, in_queue=2998169, util=99.77%

The IOPS results are the iops= values in the read and write lines above (shown in red in the original post): 1185 read IOPS and 501 write IOPS.

Analysis of the two major bottlenecks of a disk array: throughput and IOPS

1. Throughput
Throughput depends mainly on the architecture of the array, the bandwidth of its Fibre Channel connections (arrays today are usually Fibre Channel arrays; SCSI or SAS arrays are not discussed here) and the number of hard disks. The internal architecture differs from array to array, and each has its own internal bandwidth (similar to a PC's system bus), but in general the internal bandwidth is well designed and is not the bottleneck. The Fibre Channel links have a much larger impact. In a data warehouse environment, for example, the demand for data traffic is very large: a single 2Gb fibre channel card can sustain at most about 2Gb/8 = 250MB/s of actual throughput, and it takes four such cards to reach roughly 1GB/s, so a data warehouse environment should consider switching to 4Gb fibre channel cards.
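As a quick check of that arithmetic, here is a small shell sketch (my addition, not part of the original article) that applies the same rule of thumb, link rate in Gb divided by 8 giving MB/s per card:

#!/bin/bash
# Rough usable throughput of N Fibre Channel cards, using the article's
# rule of thumb: Gb per card * 1000 / 8 = MB/s per card.
fc_throughput_mb() {
    local gb_per_card=$1 cards=$2
    echo $(( gb_per_card * 1000 / 8 * cards ))
}

echo "1 x 2Gb card : $(fc_throughput_mb 2 1) MB/s"   # 250 MB/s, as in the text
echo "4 x 2Gb cards: $(fc_throughput_mb 2 4) MB/s"   # about 1GB/s
echo "4 x 4Gb cards: $(fc_throughput_mb 4 4) MB/s"   # 2000 MB/s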
Finally, the hard disk limit, which is the most important factor here. Once the bottlenecks above are gone, you have to look at the number of hard disks. The throughput that different types of disk can sustain is roughly:

10K rpm disk: about 10MB/s
15K rpm disk: about 13MB/s
ATA disk: about 8MB/s

So, suppose an array has 120 Fibre Channel disks at 15K rpm; the maximum throughput the disks can support is then 120*13 = 1560MB/s. With 2Gb fibre channel cards it may take 6 of them to carry that, while with 4Gb cards 3 to 4 are enough.

2. IOPS
The main factors that determine IOPS are the array's algorithms, the cache hit ratio, and the number of disks. The algorithms differ from array to array; for example, we recently ran into a case on an HDS USP where, possibly because an LDEV (LUN) has queue or resource limits, the IOPS of a single LDEV could not be pushed up. So before using a storage system you need to understand some of its algorithm rules and limitations.

The cache hit ratio depends on the data distribution, the cache size, the data access pattern and the cache algorithm; discussed in full this becomes complex enough to fill a whole day. I will only stress one point about the read cache hit ratio: if an array has a good read cache hit ratio, it can generally support more IOPS. Why? That is related to the per-disk IOPS limit discussed next.

The hard disk limit: each physical disk can handle only a limited number of IOPS, roughly:

10K rpm disk: about 100 IOPS
15K rpm disk: about 150 IOPS
ATA disk: about 50 IOPS

Similarly, if an array has 120 Fibre Channel disks at 15K rpm, the maximum IOPS it can support is 120*150 = 18000. This is the theoretical limit of the hardware; if it is exceeded, the disks' response times become very slow and they can no longer serve normal business.

On RAID5 versus RAID10: there is no difference in read IOPS, but for the same business write IOPS the resulting IOPS on the disks differ, and what we must evaluate is the IOPS on the disks; if the disk limit is reached, performance certainly cannot go up.

Let's assume a case: the business IOPS is 10000, the read cache hit ratio is 30%, reads are 60% of the IOPS, writes are 40%, and there are 120 disks. Calculate the per-disk IOPS for RAID5 and RAID10 separately (the formulas are also collected in the short sketch after this section).

RAID5:
per-disk IOPS = (10000*(1-0.3)*0.6 + 4*(10000*0.4)) / 120 = (4200 + 16000)/120 = 168
Here 10000*(1-0.3)*0.6 is the read IOPS: the read share is 0.6, and after removing the cache hits only 4200 read IOPS actually reach the disks. 4*(10000*0.4) is the write IOPS: on RAID5 each write actually generates 4 I/Os, so the writes contribute 16000 IOPS.

Taking into account that the 2 read operations of the RAID5 write penalty can also hit the cache, a more accurate calculation is:
per-disk IOPS = (10000*(1-0.3)*0.6 + 2*(10000*0.4)*(1-0.3) + 2*(10000*0.4)) / 120 = (4200 + 5600 + 8000)/120 = 148
That works out to 148 IOPS per disk, basically at the disk's limit.

RAID10:
per-disk IOPS = (10000*(1-0.3)*0.6 + 2*(10000*0.4)) / 120 = (4200 + 8000)/120 = 102
As you can see, because a write on RAID10 generates only 2 I/Os, under the same load and with the same disks each disk carries only 102 IOPS, which is far below the disk's IOPS limit.

In one real case, a very heavy recovery standby workload (mainly writes, and small-I/O writes) used a RAID5 layout and performance turned out to be poor. Analysis showed that at peak times the per-disk IOPS quickly reached 200, causing huge response times. After switching to RAID10, the per-disk IOPS dropped to about 100 and the performance problem went away.
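To make the RAID5/RAID10 arithmetic above easier to reuse, here is a short shell sketch (my addition, not from the original article) that reproduces the three per-disk IOPS estimates with the same example numbers: 10000 business IOPS, 30% read cache hit ratio, 60% reads / 40% writes, 120 disks.

#!/bin/bash
# Per-disk IOPS estimates for RAID5 and RAID10, using the formulas from the
# text above; all inputs are the article's example values.
awk 'BEGIN {
    iops  = 10000   # front-end (business) IOPS
    hit   = 0.3     # read cache hit ratio
    rd    = 0.6     # read share of the business IOPS
    wr    = 0.4     # write share of the business IOPS
    disks = 120     # number of disks in the array

    back_read = iops * (1 - hit) * rd                  # reads that miss the cache

    raid5  = (back_read + 4 * iops * wr) / disks       # 4 back-end I/Os per RAID5 write
    raid5c = (back_read + 2 * iops * wr * (1 - hit) + 2 * iops * wr) / disks
                                                       # the 2 reads of the write penalty can also hit the cache
    raid10 = (back_read + 2 * iops * wr) / disks       # 2 back-end I/Os per RAID10 write

    printf "RAID5              : %.0f IOPS per disk\n", raid5    # 168
    printf "RAID5 (cache-aware): %.0f IOPS per disk\n", raid5c   # 148
    printf "RAID10             : %.0f IOPS per disk\n", raid10   # 102
}'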
Reprint Address: http://blog.itpub.net/26855487/viewspace-754346/
Testing disk IOPS on Linux with fio (reprint)