Linux: Buffer I/O error on device dm-4, logical block
Errors such as "Buffer I/O error on device dm-4, logical block 0" appear in the system log of a Linux server (Oracle Linux Server release 5.7), as shown below:
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 1
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 2
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 3
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 1
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 2
Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 3
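To see at a glance which devices and blocks are affected, log lines in the format above can be summarized with a short script. This is only a minimal sketch: the function name `summarize` and the regular expression are my own, and the sample lines are copied from the excerpt above. In practice you would feed it the lines of /var/log/messages instead.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0
PATTERN = re.compile(r"Buffer I/O error on device ([^,\s]+), logical block (\d+)")

def summarize(lines):
    """Return (error count per device, set of distinct blocks per device)."""
    counts = Counter()
    blocks = {}
    for line in lines:
        m = PATTERN.search(line)
        if m is None:
            continue  # not a buffer I/O error line
        device, block = m.group(1), int(m.group(2))
        counts[device] += 1
        blocks.setdefault(device, set()).add(block)
    return counts, blocks

# Sample lines taken from the log excerpt above.
sample = [
    "Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0",
    "Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 1",
    "Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0",
]
counts, blocks = summarize(sample)
for device in counts:
    print(device, counts[device], sorted(blocks[device]))
```

Errors confined to the first few logical blocks of a device are consistent with something probing the start of the device (a partition table or LVM label scan) rather than with application I/O.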
The following English-language article explains the Buffer I/O errors in /var/log/messages:
A server using a LUN presented by a storage array over Fibre Channel may show buffer I/O errors while the server is booting or while commands such as fdisk and vgscan are being run. The access can be a read or a write attempt. These messages are sometimes harmless. When PowerPath is used, these errors are suppressed. With Linux native multipathing, however, there is no automatic provision for filtering these messages.
The errors can occur when using an active/passive storage array, such as the EMC CLARiiON series. These types of SANs contain two storage processors. Each LUN is assigned to only one of the processors at the time of LUN creation, and it can receive I/O only via that processor. The other processor is passive; it acts as a backup, ready to receive I/O if the active controller fails or if all paths to the LUN via the active controller fail.
Paths to the LUN that go via the passive controller are passive paths and will generate I/O errors should I/O be sent over them. At bootup, the kernel's SCSI mid-layer scans all paths to find devices. It therefore scans both active and passive paths and generates buffer I/O errors for the passive ones.
This is normal behavior for Linux native multipath, and the errors do not indicate an array issue. The errors can safely be filtered through the OS logging configuration, or the user can avoid accessing the native devices directly (as opposed to using the /dev/mapper devices). Alternatively, a qualified version of PowerPath may be installed, which will automatically filter these errors.
The official Red Hat article "Why do I see I/O errors on a RHEL system using devices from an active/passive storage array?" covers the same topic:
· Storage arrays in a SAN are generally implemented in a redundant manner, so that the host can access a logical unit (LUN) over any of several different paths. Typically, these arrays operate in one of two modes: active/active or active/passive. With active/active arrays, I/O can be sent over any path to a LUN and it will be handled by the controller. With active/passive arrays, one controller is considered the primary for each LUN, while the other controller stands by and acts as a backup. Some arrays will accept I/O to a LUN on the backup (passive) controller, but this is not optimized (worse performance). Other active/passive arrays, however, will not accept I/O to the backup controller for a LUN, so every command sent to it results in an I/O error.
· In RHEL, a number of commands and utilities can send I/O to various devices, such as LVM, udev, fdisk, etc., not to mention applications such as databases and web servers. If one of them issues I/O to a passive path on an array that does not accept it, an I/O error is written to the logs. The messages are harmless and do not indicate a problem, but they can fill the logs or cause undue concern. Some administrators may therefore want to avoid these errors by preventing applications from accessing passive paths. Typically, filtering in LVM eliminates the majority of these errors. To reduce the number of errors further, avoid commands like 'fdisk -l' that scan all devices. Finally, configuring applications that scan or access multiple devices so that they access only the appropriate active path or multipath logical device (/dev/mapper/mpath*, /dev/emcpower*, /dev/sddlm*, etc.) can reduce the number of errors.
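As an illustration of the LVM filtering mentioned above: LVM reads its device filter from the `filter` setting in the `devices` section of /etc/lvm/lvm.conf, where each entry is a regular expression prefixed with `a` (accept) or `r` (reject). The helper below is only a sketch that builds such a line. The name `build_lvm_filter` is my own, and the two patterns in the usage example are assumptions (multipath devices named mpath*, plus a hypothetical local boot disk /dev/sda) that must be adapted to the actual system.

```python
def build_lvm_filter(accept_patterns):
    """Build an lvm.conf 'filter' line that accepts devices matching the
    given regex patterns and rejects every other device."""
    for pattern in accept_patterns:
        # '|' is used as the delimiter below, so it cannot appear in a pattern.
        if "|" in pattern:
            raise ValueError("pattern must not contain '|': %r" % pattern)
    entries = ['"a|%s|"' % p for p in accept_patterns]
    entries.append('"r|.*|"')  # reject anything not accepted above
    return "filter = [ " + ", ".join(entries) + " ]"

# Assumed patterns: multipath devices and a hypothetical local disk /dev/sda.
line = build_lvm_filter(["^/dev/mapper/mpath.*", "^/dev/sda[0-9]*$"])
print(line)
```

With a filter of this shape, LVM scans only the accepted devices and no longer touches the underlying single-path devices, including the passive ones. Any local disk that holds a volume group the system needs at boot has to stay in the accept list.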
It seems that this error message is harmless and can be ignored; it does not indicate a storage problem. After checking a large number of logs, I found that the error occurs only occasionally, and that when it does occur it coincides with a heavy I/O load (an RMAN backup and PlateSpin replication running at the same time).
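One way to check how often the error occurs, and whether it lines up with backup or replication windows, is to count the errors per hour. The sketch below assumes the classic syslog timestamp format shown in the excerpt earlier (month, day, time, no year). The function name `errors_per_hour` is my own, and the last sample line is a made-up non-matching line included only to show that other messages are skipped.

```python
import re
from collections import Counter

# Captures month, day and hour from lines such as:
#   Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0
STAMP = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+(\d{2}):\d{2}:\d{2}\s.*Buffer I/O error on device"
)

def errors_per_hour(lines):
    """Count buffer I/O error lines per (month, day, hour) bucket."""
    buckets = Counter()
    for line in lines:
        m = STAMP.match(line)
        if m:
            month, day, hour = m.groups()
            buckets[(month, int(day), int(hour))] += 1
    return buckets

sample = [
    "Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 0",
    "Jul 3 02:33:24 localhost kernel: Buffer I/O error on device dm-4, logical block 1",
    "Jul 3 02:33:25 localhost kernel: some other message",  # hypothetical line
]
buckets = errors_per_hour(sample)
for (month, day, hour), n in sorted(buckets.items()):
    print("%s %2d %02d:00  %d errors" % (month, day, hour, n))
```

Comparing the resulting buckets with the schedule of the RMAN backup and the replication job shows whether the errors really cluster around those windows.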
References:
http://blog.csdn.net/kinges/article/details/40425841
https://access.redhat.com/solutions/18746