Oracle will die when the Web end queries some special records

Source: Internet
Author: User

A system has been killed frequently recently. The symptom is that all the CPU resources of the system are exhausted. The system environment is as follows:

Dual CPU and RAID 5
Oracle 8.1.7.0.1
Weblogic 1, 8.12
JAVA JDBC
Oracle and weblogic run on RedHat 7.3 respectively! I don't know what it is!

The character set is ZHS16GBK.

The system will occupy 99% of the CPU at intervals, and the two CPUs are all 0 idle. At the same time, events in v $ session_wait are SQL * Net more data to client or SQL * Net message from client, alert. everything in the log is normal, and no exceptions are found in the StatsPack Report, but the SQL * Net event waits a lot! At first, I thought it was because of a problem with the Intranet switch and the network causes database problems?

Later, the analysis found that when the Web Client queries a table's specific records, the CPU usage rose sharply, but the Web did not find this record, but this record does exist in the Database Table! If only one Web server is exhausted when querying this record, if multiple Web servers are enabled to query this record, for example, if five are enabled, each of the five server processes occupies about 20% of the CPU on average, at the same time, the two CPUs are exhausted and the system goes down! However, it is not normal to query this record in SQLPLUS!

The test for a number is as follows (X is the hidden part ):

Now query the 130XXXXXXXX number on the Web Client, and then the TOP shows:

  1:22pm  up 1 day, 10:15,  4 users,  load average: 0.98, 0.50, 0.33
106 processes: 104 sleeping, 2 running, 0 zombie, 0 stopped
CPU0 states: 15.0% user,  6.0% system,  0.0% nice, 78.4% idle
CPU1 states: 90.3% user,  1.4% system,  0.0% nice,  7.2% idle
Mem:  1031044K av, 1021448K used,    9596K free,       0K shrd,   46060K buff
Swap: 2097112K av,   37424K used, 2059688K free                  860096K cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 5309 oracle    25   0  8832 8832  8252 R    99.0  0.8   0:59 oracle
 3026 oracle    16   0  138M 138M  137M S     2.9 13.7  27:41 oracle
 3028 oracle    16   0  144M 144M  143M S     2.5 14.3  28:59 oracle
 3030 oracle    16   0  129M 129M  129M S     2.1 12.8  23:25 oracle
 5193 oracle    15   0  1092 1092   856 S     2.1  0.1   1:39 top
 5310 oracle    15   0  1092 1092   856 R     2.1  0.1   0:06 top
 1234 oracle    15   0  136M 136M  135M S     0.3 13.5   1:27 oracle
 3020 oracle    15   0 26004  25M 25380 S     0.3  2.5   3:48 oracle
 1199 oracle    15   0 11788  11M 11256 S     0.1  1.1   0:13 oracle
 1219 oracle    15   0  161M 161M  159M S     0.1 16.0   0:47 oracle
    1 root      15   0   476  432   416 S     0.0  0.0   0:05 init
    2 root      15   0     0    0     0 SW    0.0  0.0   0:00 keventd
    3 root      34  19     0    0     0 SWN   0.0  0.0   0:00 ksoftirqd_CPU0
    4 root      34  19     0    0     0 SWN   0.0  0.0   0:00 ksoftirqd_CPU1
    5 root      15   0     0    0     0 SW    0.0  0.0   0:06 kswapd
    6 root      25   0     0    0     0 SW    0.0  0.0   0:00 bdflush
    7 root      15   0     0    0     0 SW    0.0  0.0   0:00 kupdated
    8 root      25   0     0    0     0 SW    0.0  0.0   0:00 mdrecoveryd
   16 root      15   0     0    0     0 SW    0.0  0.0   0:28 kjournald
   95 root      16   0     0    0     0 SW    0.0  0.0   0:00 khubd

At this point:
SQL> select a. sid, status from v $ session a, v $ process B where a. paddr = B. addr and B. spid = 5309;

SID STATUS
------------------
29 ACTIVE

SQL> select sid, event, wait_time, seconds_in_wait, state from v $ session_wait where sid = 29;

       SID EVENT                         WAIT_TIME  SECONDS_IN_WAIT STATE
---------- ----------------------------- ---------- --------------- -------------------
        29 SQL*Net more data to client           -1              56 WAITED SHORT TIME

In SQLPLUS:
SQL> select rowid, tel, name from teluser_tbl where tel = '130xxxxxxxx ';

ROWID TEL NAME
-----------------------------------------------------------
AAAGHgAAFAAAU27AAA 130 xxxxxxxx xxx South China Engineering Co., Ltd.

Apparently, the "Company" is missing"

You can use PL/SQL Developer's "SQL window" to query. "company" is a garbled code:
TEL NAME
-----------------------------------------
130 xxxxxxxx xxx South China Engineering Co., Ltd.

Query in the "command window" of PL/SQL Developer. What is "after" public "? "
 
The problem lies in the NAME field! If you delete the garbled code behind "public", there will be no problem! The system is all right!

I used dbms_system.set_ SQL _trace_in_session to track the session with PID = 5309. No trace file is generated, that is, the SQL statement has been executed and data is being returned to the Web. The Web still cannot receive the data, server process is suspended and occupies a large amount of CPU resources.

I open seven IE Windows to query the same mobile phone number, TOP:

  7:46pm  up 16:40,  4 users,  load average: 7.81, 7.77, 7.62
105 processes: 96 sleeping, 9 running, 0 zombie, 0 stopped
CPU0 states: 96.4% user,  3.1% system,  0.0% nice,  0.0% idle
CPU1 states: 99.0% user,  1.0% system,  0.0% nice,  0.0% idle
Mem:  1031044K av, 1021452K used,    9592K free,       0K shrd,   36812K buff
Swap: 2097112K av,   42588K used, 2054524K free                  881416K cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 3438 oracle    25   0  9592 9592  8988 R    32.7  0.9  21:29 oracle
 3431 oracle    25   0 11900  11M 11300 R    32.3  1.1  21:57 oracle
 3436 oracle    25   0  9136 9136  8552 R    30.9  0.8  20:51 oracle
 3454 oracle    25   0 11916  11M 11260 R    24.5  1.1  14:39 oracle
 3433 oracle    25   0 25028  24M 24376 R    23.5  2.4  16:06 oracle
 3442 oracle    25   0  9092 9092  8508 R    23.5  0.8  15:51 oracle
 3452 oracle    25   0  9056 9056  8472 R    23.5  0.8  14:36 oracle

All the events of these seven processes are: SQL * Net more data to client
The WEB end does not have any current situation, that is, data cannot be transmitted to the WEB !!!

The corresponding SQL is very simple, the execution plan is good, the cost is also very low, I caught the SQL with TOP-PID + v $ session, v $ process, v $ sqlarea, the execution plan is as follows:

Select rowid, t. tel, t. password, t. language, t. isfree, t. status,
T. enddate, t. endprocess, t. province, t. city, t. code, t. name,
T. address, t. postcode, t. companytel, t. familytel, t. fax, t. qq,
T. email, t. regdate, t. unregdate, t. logincount
From teluser_tbl t where t. tel = '133xxxxxxxx'

Elapsed: 00:00:00. 45
Execution Plan
----------------------------------------------------------
0 select statement Optimizer = CHOOSE
1 0 table access (by index rowid) OF 'teluser _ tbl'
2 1 INDEX (unique scan) OF 'sys _ C001736 '(UNIQUE)

 

Statistics
----------------------------------------------------------
0 recursive cballs
0 db block gets
4 consistent gets
0 physical reads
0 redo size
1628 bytes sent via SQL * Net to client
425 bytes encoded ed via SQL * Net from client
2 SQL * Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

Obviously, this is a good execution plan.

Dump the block of this record:

SQL> select file_id from dba_extents where segment_name = 'teluser _ TBL ';

FILE_ID
---------------
5

SQL> set serveroutput on
SQL> declare
2 a number;
3 begin
4 a: = dbms_rowid.rowid_block_number ('aaaghgaafaaau27aaa ');
5 dbms_output.put_line ();
6 end;
7/
85435

The PL/SQL process is successfully completed.

SQL> declare
2 a number;
3 begin
4 a: = dbms_rowid.rowid_row_number ('aaaghgaafaaau27aaa ');
5 dbms_output.put_line ();
6 end;
7/
0

The PL/SQL process is successfully completed.

SQL> alter system dump datafile 5 blocks 85435;

System altered.

The TRACE file is as follows (only row 0 is pasted ):

[Oracle @ sxdb udump] $ less ora_5301.trc
/Opt/ora/admin/sxit/udump/ora_5301.trc
Oracle8i Enterprise Edition Release 8.1.7.0.1-Production
With the Partitioning option
JServer Release 8.1.7.0.1-Production
ORACLE_HOME =/opt/ora/8.1.7
System name: Linux
Node name: sxdb
Release: 2.4.18-3smp
Version: #1 SMP Thu Apr 18 07:27:31 EDT 2002
Machine: i686
Instance name: sxit
Redo thread mounted by this instance: 1
Oracle process number: 45
Unix process pid: 5301, image: oracle @ sxdb (TNS V1-V3)

* ** 13:11:49. 539
* ** Session id: (19.25087) 13:11:49. 538
Start dump data blocks tsn: 4 file #: 5 minblk 85435 maxblk 85435
Buffer tsn: 4 rdba: 0x01414dbb (5/85435)
Scn: 0x0000. 058b81ce seq: 0x01: 0x02 tail: 0x81ce0601
Frmt: 0x02 chkval: 0x0000 type: 0x06 = trans data
 
Block header dump: 0x01414dbb
Object id on Block? Y
Seg/obj: 0x61e0 csc: 0x00. 58b81cd itc: 1:-typ: 1-DATA
Fsl: 0 fnx: 0x0 ver: 0x01
 
Itl Xid Uba Flag Lck Scn/Fsc
0x01 xid: 0x0003. 031.20.aa4c uba: 0x00c003de. 10f4. 48 -- U-1 fsc 0x0000. 058b81ce
 
Data_block_dump
====================
Tsiz: 0x1fb8
Hsiz: 0x58
Pbl: 0x09724644
Bdba: 0x01414dbb
Flag = -----------
Ntab = 1
Nrow = 35
Frre =-1
Fsbo = 0x58
Fsearch = 0x312
Avsp = 0x389
Tosp = 0x389
0xe: pti [0] nrow = 35 offs = 0
0x12: pri [0] offs = 0x1edf
0x14: pri [1] offs = 0x1e13
0x16: pri [2] offs = 0x1d47
0x18: pri [3] offs = 0x1c75
0x1a: pri [4] offs = 0x1ba9
0x1c: pri [5] offs = 0x1add
0x1e: pri [6] offs = 0x1a11
0x20: pri [7] offs = 0x1945
0x22: pri [8] offs = 0x1879
0x24: pri [9] offs = 0x17ad
0x26: pri [10] offs = 0x16e2
0x28: pri [11] offs = 0x1617
0x2a: pri [12] offs = 0x154c
0x2c: pri [13] offs = 0x1481
0x2e: pri [14] offs = 0x13b6
0x30: pri [15] offs = 0x12eb
0x32: pri [16] offs = 0x1220
0x34: pri [17] offs = 0x1155
0x36: pri [18] offs = 0x0000c
0x38: pri [19] offs = 0xfc1
0x3a: pri [20] offs = 0xef6
0x3c: pri [21] offs = 0xe2d
0x3e: pri [22] offs = 0xd62
0x40: pri [23] offs = 0xc99
0x42: pri [24] offs = 0 xbce
0x44: pri [25] offs = 0xb05
0x46: pri [26] offs = 0xa39
0x48: pri [27] offs = 0x96d
0x4a: pri [28] offs = 0x8a5
0x4c: pri [29] offs = 0x7d9
0x4e: pri [30] offs = 0x70d
0x50: pri [31] offs = 0x645
0x52: pri [32] offs = 0x57d
0x54: pri [33] offs = 0x3e2
0x56: pri [34] offs = 0x312
Block_row_dump:
Tab 0, row 0, @ 0x1edf
Tl: 217 fb: -- H-FL -- lb: 0x0 cc: 50
Col 0: [11] 31 33 30 3X 3X 3X 3X 3X 3X 3X (X-is hidden by me to avoid leaking customer information)
Col 1: [6] 32 30 30 35 30 31
Col 2: [4] b9 e3 b6 AB
Col 3: [4] b7 f0 c9 bd
Col 4: [4] 30 37 36 35
Col 5: [19] c2 b7 c7 c5 bb aa c4 cf b9 a4 b3 cc d3 d0 cf de b9 AB cb -- that is, the NAME field stores 19 bytes in total, the NAME of this line of record is "XX South China Engineering Co., Ltd.", a total of 9 characters, obviously the database stores one more byte! This byte causes garbled characters!
Col 6: * NULL *
Col 7: * NULL *
Col 8: * NULL *
Col 9: * NULL *
Col 10: * NULL *
Col 11: * NULL *
Col 12: * NULL *
Col 13: [14] 32 30 35 30 31 30 35 31 31 37 31 34 33 32
Col 14: [14] 32 30 35 30 31 30 35 31 31 37 31 34 33 32
Col 15: [2] c1 02
Col 16: [1] 80
Col 17: [1] 80
Col 18: [2] c1 02
Col 19: [14] 32 30 35 30 31 30 35 31 31 37 31 34 33 32
Col 20: [14] 32 30 35 30 31 30 35 31 31 37 31 34 33 32
Col 21: [1] 80
Col 22: [1] 80
Col 23: [2] c1 04
Col 24: [1] 80
Col 25: [2] c1 06
Col 26: [2] c1 02
Col 27: [4] 30 30 30 30
Col 28: [1] 80
Col 29: * NULL *
Col 30: [14] 32 30 35 30 31 30 35 31 31 37 31 34 33 32
Col 31: [1] 80
Col 32: * NULL *
Col 33: [1] 80
Col 34: [1] 80
Col 35: * NULL *
Col 36: * NULL *
Col 37: * NULL *
Col 38: * NULL *
Col 39: * NULL *
Col 40: * NULL *
Col 41: [1] 80
Col 42: [1] 80
Col 43: [1] 80
Col 44: [1] 80
Col 45: [1] 80
Col 46: [1] 80
Col 47: [1] 80
Col 48: [14] 32 30 35 30 33 30 37 31 37 31 34 33 32
Col 49: [2] c1 0c

Because this field is not very important, update teluser_tbl set name = rtrim (name, substr (name,-1) is used at last. The character at the rightmost end is broken and the system returns to normal.

Later, the developer checked the program in detail and found that the C language did not read one byte when reading data!

So why does this Byte cause such a serious problem? The only explanation currently is the Bug in the ZHS16GBK character set. The character corresponding to this Byte in the ZHS16GBK character set conflicts with the underlying OS communication code, it causes communication faults and database faults.

Come here first, and try again later.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.