某個朋友公司的客戶,友情幫忙分析的。客戶使用的是oracle 12c(12.1.0.1),應用通過jdbc訪問發現時快時慢。但是通過sqlplus訪問發現一切正常。
開始以為是防火牆問題,檢查發現防火牆什麼的都是禁用掉了,甚至我還修改了selinux=disable,發現問題依舊。
由於之前處理過幾個類似的case,都是jdbc版本的問題,因此開始我讓他們換幾個jdbc版本測試下,發現問題依舊。類似如下結果:
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101.jar
2013-11-24 18:19:20
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 18:19:21
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101.jar
2013-11-24 18:19:23
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 18:19:43
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101.jar
2013-11-24 18:19:48
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 18:19:56
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101.jar
2013-11-24 18:19:59
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 18:20:17
後面我通過strace 跟蹤發現了一些蛛絲馬跡,如下的跟蹤的結果:
strace -fr /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101.jar
。。。。。
[pid 17242] 0.000142 stat("/dev/random", {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 8), ...}) = 0
[pid 17242] 0.000084 stat("/dev/urandom", {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0
[pid 17242] 0.000376 lseek(3, 46548757, SEEK_SET) = 46548757
[pid 17242] 0.000037 read(3, "PK\3\4\n\0\0\0\0\0\201^8A\275}\357^K\16\0\0K\16\0\0/\0\0\0", 30) = 30
[pid 17242] 0.000046 lseek(3, 46548834, SEEK_SET) = 46548834
[pid 17242] 0.000032 read(3, "\312\376\272\276\0\0\0001\0\245\10\0\r\10\0\22\10\0)\10\0.\10\0:\10\0@\1\0\3("..., 3659) = 3659
[pid 17242] 0.000275 open("/dev/random", O_RDONLY) = 10
[pid 17242] 0.000042 fstat(10, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 8), ...}) = 0
[pid 17242] 0.000033 fcntl(10, F_GETFD) = 0
[pid 17242] 0.000023 fcntl(10, F_SETFD, FD_CLOEXEC) = 0
[pid 17242] 0.000055 open("/dev/urandom", O_RDONLY) = 11
[pid 17242] 0.000032 fstat(11, {st_mode=S_IFCHR|0666, st_rde
........
[pid 17250] 0.000044 futex(0x7fb5ec0b9528, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 17250] 0.000058 futex(0x40110734, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1385288524, 652223000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 17250] 0.050426 futex(0x7fb5ec0b9528, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 17250] 0.000060 futex(0x40110734, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1385288524, 702709000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 17250] 0.050573 futex(0x7fb5ec0b9528, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 17250] 0.000044 futex(0x40110734, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, {1385288524, 753331000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 17250] 0.050433 futex(0x7fb5ec0b9528, FUTEX_WAKE_PRIVATE, 1) = 0
........
到futex 這裡一直timeout,大概10多秒。其中的open(“/dev/random”, O_RDONLY) 引起了我的注意,google搜了一下,還真有不少人遇到過。
Oracle 從11g開始,對於jdbc這塊兒安全上進行了加強,大概是這樣的一個解釋:
The JDBC 11g needs about 40 bytes of secure random numbers, gathered from /dev/random, to encrypt its connect string.
那麼解決方案就是將java_home下面的 Java.security檔案中的如下內容進行修改:
securerandom.source=file:/dev/random 修改為:securerandom.source=file:/dev/urandom
當我客戶檢查時,發現這個設定檔已經是securerandom.source=file:/dev/urandom 了。到這裡我似乎感覺是jdbc版本的問題了或者是12c本身的問題。
將客戶的jar把傳到自己的12.1.0.1和12.1.0.2環境中進行測試,發現現象一樣,時快時慢。
通過strace跟蹤了一下,發現資訊跟之前在客戶環境中的strace結果類似,這是很怪異的。
後面我懷疑可能是這個設定檔並沒有起作用,最後搜了下mos發現有一篇文檔:
How To Configure Database JVM (JavaVM) To Use /dev/urandom (In Order To Avoid JDBC Connection Delays Due To Lack Of Random Number Entropy) (ID 1594701.1)
裡面的建議是直接修改設定檔,如下:
1) Set a system property in your Java code:
System.setProperty("java.security.egd", "file:///dev/urandom"); // the 3 '/' are important to make it a URL
2) Set a database system property:
declare
c_property varchar2(32767);
begin
c_property := dbms_java.set_property('java.security.egd', '/dev/urandom');
end;
The database user needs a privilege in order to execute this call:
begin
dbms_java.grant_permission(
grantee => '{Database Schema}',
permission_type => 'SYS:java.util.PropertyPermission',
permission_name => 'java.security.egd',
permission_action => 'read,write'
);
end;
根據這個docs進行修改之後,再次測試,我發現一切正常了。。。。 如下是測試過程:
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101_new.jar
2013-11-24 19:06:41
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 19:06:42
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101_new.jar
2013-11-24 19:06:43
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 19:06:43
[oracle@12c_single ~]$ /oracle/product/12.1/db_1/jdk/bin/java -jar a16_12101_new.jar
2013-11-24 19:06:45
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 19:06:45
這個case本身是並不複雜,比較簡單,跟大家分享一下,歡迎交流!
注意:這裡最好是使用oracle自己的java,保持版本一致,我這裡測試發現如果使用os自己的java,版本較低,串連仍然會比較慢。
[oracle@12c_single ~]$ java -jar a16_12101_new.jar
2013-11-24 19:09:20
start:
??????
??????
init
??????
??????
init
end.
2013-11-24 19:09:46
[oracle@12c_single ~]$ java -version
java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10.4) (rhel-1.41.1.10.4.el6-x86_64)
OpenJDK 64-Bit Server VM (build 20.0-b11, mixed mode)
這個版本很明顯是低於Oracle 12.1.0.1 官方文檔中的要求的,必須是1.6.0_37以上版本。