Thin-provisioned storage volumes cause Oracle ASM disk group failures

Source: Internet
Author: User
Tags system log

A friend's ASM storage ran short of space, so a LUN from another storage array was added as a new ASM disk to the DATA disk group. After about one day of running, the disk group was dismounted and the database crashed. The disk group could not be mounted again afterwards, and several other disk groups that used the same storage could not be mounted either.
Database Exception Log
Sun Oct 23 08:43:59 2016
SUCCESS: diskgroup DATA was dismounted
SUCCESS: diskgroup DATA was dismounted
Sun Oct 23 08:44:00 2016
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lmon_79128.trc:
ORA-00202: control file: '+DATA/orcl/controlfile/current.278.892363163'
ORA-15078: ASM diskgroup was forcibly dismounted
Sun Oct 23 08:44:00 2016
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_79174.trc:
ORA-00345: redo log write error block 15924 count 2
ORA-00312: online log 2 thread 1: '+DATA/orcl/onlinelog/group_2.274.892363167'
ORA-15078: ASM diskgroup was forcibly dismounted
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_79174.trc:
ORA-00202: control file: '+DATA/orcl/controlfile/current.278.892363163'
ORA-15078: ASM diskgroup was forcibly dismounted
Errors in file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_lgwr_79174.trc:
ORA-00204: error in reading (block 1, # blocks 1) of control file
ORA-00202: control file: '+DATA/orcl/controlfile/current.278.892363163'
ORA-15078: ASM diskgroup was forcibly dismounted
Sun Oct 23 08:44:00 2016
LGWR (ospid: 79174): terminating the instance due to error 204
Sun Oct 23 08:44:00 2016
opiodr aborting process unknown ospid (79742) as a result of ORA-1092
Sun Oct 23 08:44:01 2016
ORA-1092 : opitsk aborting process
Sun Oct 23 08:44:01 2016
ORA-1092 : opitsk aborting process
System state dump requested by (instance=1, osid=79174 (LGWR)), summary=[abnormal instance termination].
System State dumped to trace file /oracle/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_diag_79118.trc
Instance terminated by LGWR, pid = 79174
Clearly the database failure was caused by the ASM disk group being dismounted, so the next step is to analyze the ASM log.

ASM Log
Sun Oct 23 07:00:31 2016
Time drift detected. Please check VKTM trace file for more details.
Sun Oct 23 08:43:55 2016
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_8755.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 1048576
WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576
ERROR: failed to copy file +DATA.524, extent 15030
ERROR: ORA-15080 thrown in ARB0 for group number 1
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_8755.trc:
ORA-15080: synchronous I/O operation to a disk failed
Sun Oct 23 08:43:55 2016
NOTE: stopping process ARB0
NOTE: rebalance interrupted for group 1/0xEC689CDD (DATA)
NOTE: ASM did background COD recovery for group 1/0xEC689CDD (DATA)
NOTE: starting rebalance of group 1/0xEC689CDD (DATA) at power 1
Starting background process ARB0
Sun Oct 23 08:43:56 2016
ARB0 started with pid=24, OS id=103554
NOTE: assigning ARB0 to group 1/0xEC689CDD (DATA) with 1 parallel I/O
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_103554.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 1048576
WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576
ERROR: failed to copy file +DATA.256, extent 6570
ERROR: ORA-15080 thrown in ARB0 for group number 1
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_arb0_103554.trc:
ORA-15080: synchronous I/O operation to a disk failed
NOTE: stopping process ARB0
Sun Oct 23 08:43:58 2016
Errors in file /oracle/app/oracle/diag/asm/+asm/+ASM1/trace/+ASM1_dbw0_8521.trc:
ORA-27061: waiting for async I/Os failed
Linux-x86_64 Error: 5: Input/output error
Additional information: -1
Additional information: 4096
WARNING: Write Failed. group:1 disk:3 AU:6789 offset:24576 size:4096
NOTE: cache initiating offline of disk 3 group DATA
NOTE: process _dbw0_+asm1 (8521) initiating offline of disk 3.3915934787 (DATA_0003) with mask 0x7e in group 1
Sun Oct 23 08:43:58 2016
WARNING: Disk 3 (DATA_0003) in group 1 mode 0x7f is now being offlined
WARNING: Disk 3 (DATA_0003) in group 1 in mode 0x7f is now being taken offline on ASM inst 1
NOTE: initiating PST update: grp = 1, dsk = 3/0xe9686c43, mask = 0x6a, op = clear
Gmon Updating disk modes for Group 1 in for PID 8521 Osid
ERROR: Disk 3 cannot be offlined, since diskgroup has external redundancy.
ERROR: too many offline disks in PST (grp 1)
Sun Oct 23 08:43:58 2016
NOTE: cache dismounting (not clean) group 1/0xEC689CDD (DATA)
NOTE: messaging CKPT to quiesce pins Unix process pid: 103577, image: oracle@node1 (B000)
WARNING: Disk 3 (DATA_0003) in group 1 mode 0x7f offline is being aborted
WARNING: Offline of disk 3 (DATA_0003) in group 1 and mode 0x7f failed on ASM inst 1
NOTE: halting all I/Os to diskgroup 1 (DATA)
Sun Oct 23 08:43:59 2016
NOTE: LGWR doing non-clean dismount of group 1 (DATA)
NOTE: LGWR sync ABA=160.10145 last written ABA 160.10145
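The error messages make the cause clear: write failures caused the ASM disk group to dismount. Because the DATA disk group uses external redundancy, ASM could not take the failing disk offline and had to dismount the whole group instead.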
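To see at a glance which ASM disks reported failed writes, the "Write Failed" lines of the ASM log can be tallied with a few lines of Python. This is only a minimal sketch: the regular expression is an assumption based on the lines shown in this excerpt, the function names are made up for illustration, and matching is case-insensitive because the casing of a copied log may differ from the original.

```python
import re
from collections import Counter

# Pattern modeled on the excerpt, e.g.
#   WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576
# Assumption: other ASM versions may format this line differently.
WRITE_FAILED = re.compile(
    r"write failed\.\s*group:(\d+)\s+disk:(\d+)\s+au:(\d+)\s+offset:(\d+)\s+size:(\d+)",
    re.IGNORECASE,
)


def failed_writes(lines):
    """Yield (group, disk, au, offset, size) for every failed-write line."""
    for line in lines:
        match = WRITE_FAILED.search(line)
        if match:
            yield tuple(int(value) for value in match.groups())


def failures_per_disk(lines):
    """Count failed writes per (group number, disk number) pair."""
    return Counter((group, disk) for group, disk, *_ in failed_writes(lines))


# Lines taken from the ASM log excerpt above.
sample = [
    "WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576",
    "ERROR: failed to copy file +DATA.524, extent 15030",
    "WARNING: Write Failed. group:1 disk:2 AU:1222738 offset:0 size:1048576",
    "WARNING: Write Failed. group:1 disk:3 AU:6789 offset:24576 size:4096",
]

for (group, disk), count in sorted(failures_per_disk(sample).items()):
    print(f"group {group} disk {disk}: {count} failed write(s)")
```

In this excerpt the failures hit disks 2 and 3 of group 1, first during the rebalance (ARB0) and then in a DBW0 write.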

System log
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] Sense Key : Data Protect [current]
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] Add. Sense: Space allocation failed write protect
Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] CDB: Write(): 8a e7 00 00 00 07 00 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev sdd, sector 12467058681
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467058681
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh] Sense Key : Data Protect [current]
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh] Add. Sense: Space allocation failed write protect
Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh] CDB: Write(): 8a e7 18
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb] Sense Key : Data Protect [current]
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb] Add. Sense: Space allocation failed write protect
Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb] CDB: Write(): 8a e7 00 00 (+)
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev sdb, sector 12467056640
Oct 23 08:43:55 node1 kernel: f9 00 00 04 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467056640
Oct 23 08:43:55 node1 kernel: 00 00
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev sdh, sector 12467057657
Oct 23 08:43:55 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467057657
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh] Sense Key : Data Protect [current]
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh] Add. Sense: Space allocation failed write protect
Oct 23 08:43:57 node1 kernel: sd 8:0:6:1: [sdh] CDB: Write(): 8a e7 00 00 00 07 00 00
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev sdh, sector 12467058681
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467058681
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj] Sense Key : Data Protect [current]
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj] Add. Sense: Space allocation failed write protect
Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj] CDB: Write(): 8a e7 00 00 (+)
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev sdj, sector 12467056640
Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467056640
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb] Sense Key : Data Protect [current]
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb] Add. Sense: Space allocation failed write protect
Oct 23 08:43:57 node1 kernel: sd 6:0:4:1: [sdb] CDB: Write(): 8a e7 00 00 04 00 00 00
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb] Sense Key : Data Protect [current]
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb] Add. Sense: Space allocation failed write protect
Oct 23 08:43:58 node1 kernel: sd 6:0:4:1: [sdb] CDB: Write: 8a 3b 7e 78 30 00 00 00 08 00 00
Oct 23 10:50:59 node1 init: oracle-ohasd main process (6150) killed by TERM signal
The key message is "critical space allocation error": the write requests failed because space could not be allocated for them on the storage. In other words, the ASM disk group was dismounted because of a space allocation failure.
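The SCSI addresses (host:channel:target:lun) that reported the allocation failure can be pulled out of the system log with a short script. The following is a rough sketch, assuming kernel messages shaped like the ones in this excerpt; the function name and the sample list are illustrative only, and matching is case-insensitive because the casing of a copied log may differ from the original.

```python
import re

# Matches kernel SCSI messages such as:
#   ... kernel: sd 6:0:12:1: [sdd] Add. Sense: Space allocation failed write protect
# Groups: SCSI address (H:C:T:L), device name, rest of the message.
SCSI_LINE = re.compile(
    r"kernel:\s*sd (\d+:\d+:\d+:\d+): \[(\w+)\]\s*(.*)", re.IGNORECASE
)


def space_allocation_paths(lines):
    """Return the sorted SCSI addresses whose sense data reports
    'Space allocation failed'."""
    failing = set()
    for line in lines:
        match = SCSI_LINE.search(line)
        if match and "space allocation failed" in match.group(3).lower():
            failing.add(match.group(1))
    return sorted(failing)


# A few lines modeled on the system log excerpt above.
sample_syslog = [
    "Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] Sense Key : Data Protect [current]",
    "Oct 23 08:43:55 node1 kernel: sd 6:0:12:1: [sdd] Add. Sense: Space allocation failed write protect",
    "Oct 23 08:43:55 node1 kernel: sd 8:0:6:1: [sdh] Add. Sense: Space allocation failed write protect",
    "Oct 23 08:43:55 node1 kernel: sd 6:0:4:1: [sdb] Add. Sense: Space allocation failed write protect",
    "Oct 23 08:43:57 node1 kernel: sd 8:0:12:1: [sdj] Add. Sense: Space allocation failed write protect",
    "Oct 23 08:43:57 node1 kernel: end_request: critical space allocation error, dev dm-3, sector 12467056640",
]

print(space_allocation_paths(sample_syslog))
```

Collecting the H:C:T:L address rather than the sdX name is a deliberate choice here, since sdX names are not guaranteed to stay the same across reboots.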

View multipath Information
[root@node1 ~]# multipath -ll
36000d31003190c000000000000000003 dm-3 COMPELNT,Compellent Vol
size=80T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:9:1  sdd 8:48  active ready running
  `- 8:0:9:1  sdi 8:128 active ready running
Delldisk2 (36000d310031908000000000000000003) dm-4 COMPELNT,Compellent Vol
size=8.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:12:1 sde 8:64  active ready running
  |- 8:0:6:1  sdh 8:112 active ready running
  |- 6:0:4:1  sdb 8:16  active ready running
  `- 8:0:12:1 sdj 8:144 active ready running
Delldisk1 (36000d31003190a000000000000000007) dm-2 COMPELNT,Compellent Vol
size=12T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:1:1  sda 8:0   active ready running
  |- 8:0:2:1  sdf 8:80  active ready running
  |- 6:0:7:1  sdc 8:32  active ready running
  `- 8:0:3:1  sdg 8:96  active ready running
All the errors point to the same LUN, Delldisk2: the four SCSI addresses in the system log (6:0:12:1, 8:0:6:1, 6:0:4:1 and 8:0:12:1) are exactly the four paths of Delldisk2. The physical space behind Delldisk2 was exhausted, so space could not be allocated for new writes, the ASM writes failed, and the database crashed. The root of the problem is that the LUN was presented to the system as 8 TB while the storage could actually provide less than 8 TB, and the OS used it as if the full 8 TB were available. The technical term for this is a thin-provisioned volume ("thin volume"), so keep this in mind when configuring storage. Because only writes fail in this situation while reads still succeed, data is usually not lost.
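Matching the failing SCSI addresses from the system log against the paths listed by multipath can also be scripted. The sketch below assumes multipath output laid out like the listing above (a header line containing the dm-N name, followed by path lines that carry an H:C:T:L address); the function names and the embedded sample are illustrative assumptions, not part of any multipath tooling.

```python
import re

HCTL = re.compile(r"\b(\d+:\d+:\d+:\d+)\b")       # SCSI address on a path line
MAP_HEADER = re.compile(r"^(\S+).*\bdm-\d+\b")    # alias or WWID, then dm-N


def paths_by_map(multipath_output):
    """Map each multipath device name to the set of SCSI addresses of its paths."""
    result = {}
    current = None
    for line in multipath_output.splitlines():
        header = MAP_HEADER.match(line)
        if header:
            current = header.group(1)
            result[current] = set()
            continue
        path = HCTL.search(line)
        if path and current is not None:
            result[current].add(path.group(1))
    return result


def affected_maps(failing_addresses, mapping):
    """Return the names of the multipath devices that own any failing address."""
    failing = set(failing_addresses)
    return sorted(name for name, paths in mapping.items() if paths & failing)


# Two of the devices from the multipath listing above.
SAMPLE_MULTIPATH = """\
Delldisk2 (36000d310031908000000000000000003) dm-4 COMPELNT,Compellent Vol
size=8.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:12:1 sde 8:64  active ready running
  |- 8:0:6:1  sdh 8:112 active ready running
  |- 6:0:4:1  sdb 8:16  active ready running
  `- 8:0:12:1 sdj 8:144 active ready running
Delldisk1 (36000d31003190a000000000000000007) dm-2 COMPELNT,Compellent Vol
size=12T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 6:0:1:1  sda 8:0   active ready running
  |- 8:0:2:1  sdf 8:80  active ready running
  |- 6:0:7:1  sdc 8:32  active ready running
  `- 8:0:3:1  sdg 8:96  active ready running
"""

# SCSI addresses that reported "Space allocation failed" in the system log.
failing = ["6:0:12:1", "8:0:6:1", "6:0:4:1", "8:0:12:1"]

print(affected_maps(failing, paths_by_map(SAMPLE_MULTIPATH)))  # prints ['Delldisk2']
```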
