Oracle Database Diagnosis Case: Redo Log Groups Stuck in ACTIVE Status

Author: eygle
Source: http://www.eygle.com/blog
Date: June 26, 2005
Link: http://www.eygle.com/archives/2005/06/oracleeaoieiear.html


Platform: SunOS 5.8 Generic_108528-23 sun4u sparc SUNW,Ultra-Enterprise
Database: 8.1.5.0.0
Symptom: the database responds slowly and application requests are not being returned.
After logging on to the database, I found that every redo log group except the current one was in ACTIVE status.
oracle:/oracle/oracle8>sqlplus "/ as sysdba"

SQL*Plus: Release 8.1.5.0.0 - Production on Thu Jun 23 18:56:06 2005

(c) Copyright 1999 Oracle Corporation.  All rights reserved.

Connected to:
Oracle8i Enterprise Edition Release 8.1.5.0.0 - Production
With the Partitioning and Java options
PL/SQL Release 8.1.5.0.0 - Production

SQL> select * from v$log;

    GROUP#    THREAD#  SEQUENCE#      BYTES    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIM
---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- ---------
         1          1     520403   31457280          1 NO  ACTIVE              1.3861E+10 23-JUN-05
         2          1     520404   31457280          1 NO  ACTIVE              1.3861E+10 23-JUN-05
         3          1     520405   31457280          1 NO  ACTIVE              1.3861E+10 23-JUN-05
         4          1     520406   31457280          1 NO  CURRENT             1.3861E+10 23-JUN-05
         5          1     520398   31457280          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         6          1     520399   31457280          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         7          1     520400  104857600          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         8          1     520401  104857600          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         9          1     520402  104857600          1 NO  ACTIVE              1.3861E+10 23-JUN-05

9 rows selected.

SQL> /

    GROUP#    THREAD#  SEQUENCE#      BYTES    MEMBERS ARC STATUS           FIRST_CHANGE# FIRST_TIM
---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- ---------
         1          1     520403   31457280          1 NO  ACTIVE              1.3861E+10 23-JUN-05
         2          1     520404   31457280          1 NO  ACTIVE              1.3861E+10 23-JUN-05
         3          1     520405   31457280          1 NO  ACTIVE              1.3861E+10 23-JUN-05
         4          1     520406   31457280          1 NO  CURRENT             1.3861E+10 23-JUN-05
         5          1     520398   31457280          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         6          1     520399   31457280          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         7          1     520400  104857600          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         8          1     520401  104857600          1 NO  ACTIVE              1.3860E+10 23-JUN-05
         9          1     520402  104857600          1 NO  ACTIVE              1.3861E+10 23-JUN-05

9 rows selected.
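When the redo groups back up like this, user sessions are typically stuck on "log file switch" waits. A minimal sketch of how this can be corroborated from another session, assuming the standard v$session_wait and v$system_event views (these statements are illustrative, not the exact ones run in this case):

-- Sessions currently blocked waiting for a log switch / checkpoint to complete
select sid, event, seconds_in_wait
  from v$session_wait
 where event like 'log file switch%';

-- Cumulative waits for the same events since instance startup
select event, total_waits, time_waited
  from v$system_event
 where event like 'log file switch%';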

With almost every log group in ACTIVE status, it is clear that DBWR cannot keep up with the checkpoints triggered by log switches. Next, let's check how busy DBWR is:

SQL> !
oracle:/oracle/oracle8>ps -ef|grep ora_
  oracle  2273     1  0   Mar 31 ?       57:40 ora_smon_hysms02
  oracle  2266     1  0   Mar 31 ?      811:42 ora_dbw0_hysms02
  oracle  2264     1 16   Mar 31 ?    16999:57 ora_pmon_hysms02
  oracle  2268     1  0   Mar 31 ?     1649:07 ora_lgwr_hysms02
  oracle  2279     1  0   Mar 31 ?        8:09 ora_snp1_hysms02
  oracle  2281     1  0   Mar 31 ?        4:22 ora_snp2_hysms02
  oracle  2285     1  0   Mar 31 ?        9:40 ora_snp4_hysms02
  oracle  2271     1  0   Mar 31 ?       15:57 ora_ckpt_hysms02
  oracle  2283     1  0   Mar 31 ?        5:37 ora_snp3_hysms02
  oracle  2277     1  0   Mar 31 ?        5:58 ora_snp0_hysms02
  oracle  2289     1  0   Mar 31 ?        0:00 ora_d000_hysms02
  oracle  2287     1  0   Mar 31 ?        0:00 ora_s000_hysms02
  oracle  2275     1  0   Mar 31 ?        0:04 ora_reco_hysms02
  oracle 21023 21012  0 18:52:59 pts/65    0:00 grep ora_

The DBWR process ID is 2266.
Observe it with the top command:

oracle:/oracle/oracle8>top

last pid: 21145;  load averages:  3.38,  3.45,  3.67               18:53:38
725 processes: 711 sleeping, 1 running, 10 zombie, 3 on cpu
CPU states: 35.2% idle, 40.1% user,  9.4% kernel, 15.4% iowait,  0.0% swap
Memory: 3072M real, 286M free, 3120M swap in use, 1146M swap free

   PID USERNAME THR PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
 11855 smspf      1  59    0 1355M 1321M cpu/0   19:32 16.52% oracle
  2264 oracle     1   0    0 1358M 1316M run    283.3H 16.36% oracle
 11280 oracle     1  13    0 1356M 1321M sleep   79.8H  0.77% oracle
  6957 smspf     15  29   10   63M   14M sleep  107.7H  0.76% java
 17393 smspf      1  30    0 1356M 1322M cpu/1  833:05  0.58% oracle
 29299 smspf      5  58    0 8688K 5088K sleep   18.5H  0.38% fee_ftp_get
 21043 oracle     1  43    0 3264K 2056K cpu/9    0:01  0.31% top
 20919 smspf     17  29   10   63M   17M sleep  247:02  0.29% java
 25124 smspf      1  58    0   16M 4688K sleep    0:35  0.25% smif_status_rec
  8086 smspf      5  23    0   21M   13M sleep   41.1H  0.24% fee_file_in
 16009 root       1  35    0 4920K 3160K sleep    0:03  0.21% sshd2
 25126 smspf      1  58    0 1355M 1321M sleep    0:26  0.20% oracle
  2266 oracle     1  60    0 1357M 1317M sleep  811:42  0.18% oracle
 11628 smspf      7  59    0 3440K 2088K sleep    0:39  0.16% sgip_client_ltz
 26257 smspf     82  59    0  447M  178M sleep  533:04  0.15% java

We notice that process 2266 consumes only 0.18% CPU, so DBWR is clearly not CPU-bound; the bottleneck is most likely I/O.
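Before dropping to the operating system, the same suspicion can be sanity-checked from inside the database: if DBWR is blocked on slow writes rather than short of work, its "db file parallel write" waits should show long average times. A rough sketch, assuming the standard v$system_event view (wait times on this release are reported in centiseconds):

-- Average time DBWR (and LGWR) spend waiting for their batched writes to complete
select event, total_waits, time_waited, average_wait
  from v$system_event
 where event in ('db file parallel write', 'log file parallel write');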
Next, check the I/O status at the operating system level with iostat:

gqgai:/home/gqgai>iostat -xn 3
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
......
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    1.8   38.4   32.4  281.0  0.0  0.7    0.0   16.4   0  29 c0t10d0
    1.8   38.4   32.4  281.0  0.0  0.5    0.0   13.5   0  27 c0t11d0
   24.8   61.3 1432.4  880.1  0.0  0.5    0.0    5.4   0  26 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    9.1   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
........
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.3    8.3    0.3   47.0  0.0  0.1    0.0    9.2   0   8 c0t10d0
    0.0    8.3    0.0   47.0  0.0  0.1    0.0    8.0   0   7 c0t11d0
   11.7   65.3  197.2  522.2  0.0  1.6    0.0   20.5   0 100 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
........
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.3   13.7    2.7   68.2  0.0  0.2    0.0   10.9   0  12 c0t10d0
    0.0   13.7    0.0   68.2  0.0  0.1    0.0    9.6   0  11 c0t11d0
   11.3   65.3   90.7  522.7  0.0  1.5    0.0   19.5   0  99 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
........
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.0    8.0    0.0   42.7  0.0  0.1    0.0    9.3   0   7 c0t10d0
    0.0    8.0    0.0   42.7  0.0  0.1    0.0    9.1   0   7 c0t11d0
   11.0   65.7  978.7  525.3  0.0  1.4    0.0   17.7   0  99 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
........
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t6d0
    0.3   87.7    2.7  433.7  0.0  2.2    0.0   24.9   0  90 c0t10d0
    0.0   88.3    0.0  436.5  0.0  1.8    0.0   19.9   0  81 c0t11d0
   89.0   54.0  725.4  432.0  0.0  2.1    0.0   14.8   0 100 c1t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 hurraysms02:vold(pid238)

We notice that c1t1d0, the main volume holding the database files, stays at 99~100% busy (%b), yet its write throughput is only around 500 KB/s, which is extremely slow.
(%b: percent of time the disk is busy, i.e. has transactions in progress;
kw/s: kilobytes written per second)
From experience, character-mode writes to this T3 disk array can reach roughly 10 MB/s; some TPC-C style figures had also been measured earlier. For details, see: Use Bonnie to test system IO speed.
Under normal circumstances the database's random write throughput is typically around 1~2 MB/s, so the disk was clearly in an abnormal state. It was later confirmed that one disk in the RAID 5 group had failed.
After the failed disk was replaced, the system gradually returned to normal.
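A simple way to confirm that the backlog is draining after the fix is to re-run the v$log check from time to time; as DBWR catches up with the checkpoints, the ACTIVE groups should drop back to INACTIVE. For example:

-- ACTIVE should shrink back towards one group or none as DBWR catches up
select status, count(*)
  from v$log
 group by status;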

Posted by eygle on June 26, 2005
