Symptom: writing a large file manually at the same time as the business writes completely blocks the business writes, and the server load spiked as high as 50.
System environment: CentOS release 5.10
Application environment: Oracle 10.2.0.5 + PHP 5.2.17
Hardware: Dell R720, 3 x 1 TB 7200 rpm disks, hardware RAID 5
Business environment: every five minutes, sqlldr loads the last five minutes of valid data (about 30 MB) into the database.
Cause:
Recently we performed a data migration: the disks changed from 3 x 300 GB 7200 rpm to 3 x 1 TB 7200 rpm, in hardware RAID 5, with everything else migrated one-to-one. After the migration, I/O write efficiency was extremely low, so I set out to analyze the I/O bottleneck.
Analysis process:
What I found is that when a large file is written manually at the same time as the business writes, the business writes are completely blocked, and at the same time the server load spiked as high as 50 (at the peak, CPU wait was about 45% and iowait about 35%). I immediately ran the lock queries in Oracle (select * from v$locked_objects; and select * from v$lock where request <> 0 or block <> 0;) and found no lock blocking, so I then took AWR reports to compare the database before and after the migration. The results were quite surprising (although some of them are reasonable).
AWR information, 2:00-, pure business load with no other operations; the business volume is the same and the Oracle configuration is identical (except for the redo log path):
The 179 database (before migration):
Top 5 Timed Events
Event                          Waits     Time(s)  Avg Wait(ms)  %Total Call Time  Wait Class
CPU time                                 348                    95.6
log file sync                  12,706    12       1             3.3               Commit
log file parallel write        13,531    12       1             3.2               System I/O
enq: TM - contention           4         9        2,362         2.6               Application
control file parallel write    2,102     6        3             1.7               System I/O
The 51 database (after migration) [since only three disks were available and all went into a single RAID 5 array, the redo logs had to be placed on the same RAID 5 array]:
Top 5 Timed Events
Event                          Waits     Time(s)  Avg Wait(ms)  %Total Call Time  Wait Class
Part of the wait time comes from the redo logs and data files sitting together on RAID 5, which causes I/O contention. But it is also clear that the overall I/O efficiency is much worse than before the migration.
So I ran a series of I/O tests, including a dd test at the OS level (dd if=/dev/zero of=/Data/apps/oracle/product/10.2.0/oradata/detail/1Gb.file bs=1024 count=1000000), writing a temporary file on both servers while observing the I/O status with iostat. The results are as follows:
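When dd is not convenient, the same kind of sequential-write test can be approximated in Python. This is only a sketch, under the assumption that writing and fsync-ing a file sequentially is a fair stand-in for the dd run; the path and sizes below are made up for illustration, not taken from the servers above:

```python
import os
import time

def sequential_write_test(path, total_mb=64, block_kb=1024):
    """Write total_mb megabytes in block_kb-sized chunks, fsync, and
    return throughput in MB/s -- a rough stand-in for the dd test."""
    block = b"\x00" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())                  # make sure data actually hits the disk
    elapsed = max(time.time() - start, 1e-6)  # guard against a zero timer delta
    os.remove(path)                           # clean up the test file
    return total_mb / elapsed

# Example (hypothetical path; point it at a file on the array under test):
# print("%.1f MB/s" % sequential_write_test("/Data/apps/oracle/1Gb.file", total_mb=1024))
```

Run it once on each server while iostat is sampling, just as with the dd test, so the throughput figure and the iostat columns cover the same window.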
The 179 server (before migration):
The 51 server (after migration):
Comparing the two iostat outputs, I found that on the 51 server the I/O on the dm-0 logical volume is very high. Since I am not very familiar with LVM and RAID, I asked the company's ops team why the dm-0 I/O is so high, and the answer was that this is normal [I felt quite helpless here]. If it is normal, then the write efficiency of RAID 5 is simply much lower than that of a single disk. I am weak on this topic, so for the moment I took the ops answer at face value.
Because of this low I/O write efficiency (any large manual write blocks the business), the company decided to rebuild the RAID 5 array as RAID 1+0.
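The usual rationale for moving from RAID 5 to RAID 1+0 is the small-write penalty: a random small write on RAID 5 costs four disk operations (read old data, read old parity, write new data, write new parity), while on RAID 1+0 it costs two (one write per mirror side). A back-of-the-envelope sketch; the 75 IOPS per 7200 rpm disk is an assumed figure, not a measurement from these servers:

```python
def raid_write_iops(disks, per_disk_iops, write_penalty):
    """Effective random-write IOPS for an array: the raw IOPS of all
    spindles divided by the number of disk ops each logical write costs."""
    return disks * per_disk_iops / write_penalty

PER_DISK_IOPS = 75  # assumed figure for a 7200 rpm SATA disk

raid5  = raid_write_iops(3, PER_DISK_IOPS, 4)  # RAID 5: read data, read parity, write data, write parity
raid10 = raid_write_iops(4, PER_DISK_IOPS, 2)  # RAID 1+0: write both mirror copies

print("RAID 5 (3 disks): %.0f write IOPS" % raid5)     # -> 56
print("RAID 1+0 (4 disks): %.0f write IOPS" % raid10)  # -> 150
```

Note that RAID 1+0 needs at least four disks, one more than this server had, which is presumably part of why RAID 5 was chosen initially.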
In the end I never got an answer to my question. If any of you have insights, please share them.
Appendix:
========================== iostat ==========================
rrqm/s: number of read requests merged per second, i.e. delta(rmerge)/s
wrqm/s: number of write requests merged per second, i.e. delta(wmerge)/s
r/s: number of read I/O requests completed per second, i.e. delta(rio)/s
w/s: number of write I/O requests completed per second, i.e. delta(wio)/s
rsec/s: number of sectors read per second, i.e. delta(rsect)/s
wsec/s: number of sectors written per second, i.e. delta(wsect)/s
rkB/s: kilobytes read per second; half of rsec/s, since each sector is 512 bytes (calculated)
wkB/s: kilobytes written per second; half of wsec/s (calculated)
avgrq-sz: average size (in sectors) of each I/O request, i.e. delta(rsect + wsect)/delta(rio + wio)
avgqu-sz: average I/O queue length, i.e. delta(aveq)/s/1000 (because aveq is measured in milliseconds)
await: average time (ms) each I/O request takes from issue to completion, i.e. delta(ruse + wuse)/delta(rio + wio)
svctm: average service time (ms) of each I/O request, i.e. delta(use)/delta(rio + wio)
%util: percentage of the time during which the device had I/O in progress, i.e. during which the I/O queue was non-empty: delta(use)/s/1000 (because use is measured in milliseconds)
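The delta formulas above can be made concrete. A minimal sketch that derives the iostat-style fields from the deltas of the kernel's per-device counters between two snapshots; the numbers in the sample are synthetic, for illustration only, not from the servers in question:

```python
def iostat_metrics(d, interval_s):
    """Derive iostat-style fields from counter deltas over interval_s seconds.
    d holds deltas of: rio, wio (completed reads/writes), rsect, wsect
    (sectors transferred), ruse, wuse (ms spent on reads/writes),
    use (ms the device was busy), aveq (ms-weighted queue occupancy)."""
    ios = d["rio"] + d["wio"]
    return {
        "r/s":      d["rio"] / interval_s,
        "w/s":      d["wio"] / interval_s,
        "rkB/s":    d["rsect"] / 2 / interval_s,   # 512-byte sectors -> kB
        "wkB/s":    d["wsect"] / 2 / interval_s,
        "avgrq-sz": (d["rsect"] + d["wsect"]) / ios,
        "avgqu-sz": d["aveq"] / interval_s / 1000,
        "await":    (d["ruse"] + d["wuse"]) / ios,
        "svctm":    d["use"] / ios,
        "%util":    d["use"] / interval_s / 1000 * 100,
    }

# Synthetic counter deltas over a 1-second interval:
sample = {"rio": 10, "wio": 90, "rsect": 160, "wsect": 1440,
          "ruse": 50, "wuse": 950, "use": 800, "aveq": 1000}
m = iostat_metrics(sample, 1)
# -> await 10.0 ms, svctm 8.0 ms, avgrq-sz 16.0 sectors, %util 80.0
```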
If %util is close to 100%, the device is receiving I/O requests as fast as it can process them; the I/O system is fully loaded and this disk is probably a bottleneck.
If CPU idle drops below 70% during I/O, the I/O load is already heavy; reads generally spend a lot of their time waiting.
You can also combine this with vmstat: check b (the number of processes waiting for resources) and wa (the percentage of CPU time spent in I/O wait; above 30% indicates high I/O pressure).
In addition, await should be judged relative to svctm; if the gap between the two is too large, there may be an I/O problem.
avgrq-sz is also worth watching when tuning I/O: it is the average amount of data moved per request. If there are many requests but each one is small, the actual I/O throughput will be low; if each request is large, throughput will be high. You can verify this via avgrq-sz x (r/s + w/s) = rsec/s + wsec/s; in other words, the transfer rate is determined jointly by the request size and the request rate.
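That relation can be checked with a couple of numbers (synthetic, for illustration): at the same request rate, larger requests move proportionally more data.

```python
# Transfer rate = request size x request rate, in sectors per second.
r_s, w_s = 20.0, 180.0   # read/write requests completed per second
avgrq_sz = 16.0          # average request size in sectors (16 x 512 B = 8 kB)

sec_per_s = avgrq_sz * (r_s + w_s)
print(sec_per_s)         # -> 3200.0 sectors/s, i.e. 1600 kB/s

# Doubling the request size at the same IOPS doubles the throughput:
assert 2 * avgrq_sz * (r_s + w_s) == 2 * sec_per_s
```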
Some further explanation of these two fields:
svctm is generally smaller than await (because the wait time of requests queued simultaneously is counted repeatedly in await). svctm is mostly determined by disk performance, though CPU/memory load can also affect it, and too many requests can indirectly lengthen it. await depends on the service time (svctm), the length of the I/O queue, and the pattern in which I/O requests are issued. If svctm is close to await, I/O has almost no queueing delay; if await is much larger than svctm, the I/O queue is too long and the application's response time will suffer. If the response time exceeds the acceptable range, consider a faster disk, adjusting the kernel elevator (I/O scheduler) algorithm, optimizing the application, or upgrading the CPU.
The queue length (avgqu-sz) can also serve as an indicator of system I/O load, but since avgqu-sz is an average over the sampling interval, it cannot reflect instantaneous I/O bursts.