Symptoms:
1. The AMMS process hangs intermittently, causing the network logging success rate to drop.
2. All worker threads of the AMMS process hang; pstack shows the following:
----------------- lwp# 41 / thread# --------------------
 feedb075 read (b6, b2625d80, 8)
 fe58e904 __1cDhpiEread6FipvI_I_ (b6, b2625d80, 8) + a0
 fe58fa72 JVM_Read (b6, b2625d80, 8) + 36
 b2f5b1ff Java_java_net_SocketInputStream_socketRead0 (a30f518, b2627dbc, b2627dc0, b2627de0, 0, 8) + 137
 fb498e62 ???????? (bd23ad10, 0, 8, 0, a30f400, bd23ad28)
 fb4c7820 ???????? ()
In our experience with OLTP production systems, a large number of socket read timeouts on the application side is often related to the database, so the investigation turned to the database. Sure enough, the database (Solaris 10 + Oracle 10g RAC) showed the following anomalies:
1. The CPU usage of the LGWR process was abnormal, reaching 10% (normally below 1%).
2. In the AWR report for the busy period, log file sync and log file parallel write waits reached 80 ms (normally below 5 ms).
[Image: awr1.JPG (http://s3.51cto.com/wyfs02/M00/5C/37/wKioL1Uc6WSTmnIYAAEPZFTIVgM928.jpg)]
3. A script written to extract the logging time from the application debug log Amms1.log showed times of up to 300 ms (normally below 10 ms).
This looks like a database performance problem, but is it really?
After a few weeks of investigation, the cause finally turned out to be Solaris DISM; once DISM was no longer in use, the problem was solved.
Notes:
1. How can you tell whether the database is currently using DISM?
When DISM is in use, a process named ora_dism_<SID> starts when the Oracle instance starts and exits when the instance shuts down.
$ ps -ef | grep -v grep | grep dism
root 10633 1 0 Jun 11 ? 0:42 ora_dism_rwdb
As shown here, Oracle is using DISM.
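The check above can be wrapped in a small function so that it gives a yes/no answer. This is only a sketch: the function name uses_dism is hypothetical, and it assumes that any line containing ora_dism_ in the ps output belongs to the DISM process of an instance.

```shell
# Hypothetical helper (not from the original post): reads `ps -ef` output on
# stdin and reports whether an ora_dism_<SID> process appears in it.
uses_dism() {
    # Drop any grep processes first, then look for the DISM process name.
    # Output goes to /dev/null so that only the exit status is used.
    if grep -v grep | grep 'ora_dism_' >/dev/null; then
        echo "DISM in use"
    else
        echo "DISM not in use"
    fi
}
```

On a live system it would be called as: ps -ef | uses_dism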
2. When does Oracle 10g use DISM?
In Oracle 10g, if sga_max_size and sga_target are the same size, DISM is not used (ISM is used instead). If sga_target is set smaller than sga_max_size, DISM is used.
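The rule can be written down as a tiny comparison. This is a minimal sketch of the rule exactly as stated here, not of Oracle's internal logic; the function name sga_memory_model is hypothetical, and both values are assumed to be whole numbers in the same unit (for example megabytes, as read from "show parameter sga" in SQL*Plus).

```shell
# Hypothetical helper: applies the rule from the text to two sizes given in
# the same unit. It covers only the case described there, where sga_target
# is set.
sga_memory_model() {
    sga_target=$1
    sga_max_size=$2
    if [ "$sga_target" -lt "$sga_max_size" ]; then
        echo "DISM"   # sga_target smaller than sga_max_size
    else
        echo "ISM"    # both parameters the same size
    fi
}
```

For example, sga_memory_model 4096 8192 prints DISM, while sga_memory_model 8192 8192 prints ISM.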
3. What problems can DISM cause?
DISM can cause problems if the Oracle SGA cannot be locked in memory properly. To check:
# /usr/sbin/lockstat -A -n 200000 sleep 10
If the output contains a large number of segspt_softunlock or spt_anon_getpages entries, the SGA may not be locked properly.
With ISM, the OS locks the SGA memory directly, so this problem does not occur.
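To avoid scanning the lockstat output by eye, the matching lines can be counted. This is a rough sketch: the function name count_dism_lock_lines is hypothetical, it only counts lines that mention either function rather than summing lockstat's event counts, and how many lines count as "a large number" is left to the reader because the text gives no threshold.

```shell
# Hypothetical helper: reads lockstat output on stdin and prints how many
# lines mention either of the two kernel functions named in the text.
count_dism_lock_lines() {
    awk '/segspt_softunlock|spt_anon_getpages/ { n++ } END { print n + 0 }'
}
```

One way to use it is to save the report with lockstat's -o option (/usr/sbin/lockstat -A -n 200000 -o /tmp/lockstat.out sleep 10) and then run: count_dism_lock_lines < /tmp/lockstat.out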
After we changed sga_target to the same size as sga_max_size, restarted the instances one by one, and confirmed that Oracle no longer used DISM, the problem was resolved and all symptoms disappeared. The top events in the AWR report returned to normal, as follows:
[Image: AWR2.JPG (http://s3.51cto.com/wyfs02/M01/5C/37/wKioL1Uc7XfSuBqVAAEUAP3sHo8476.jpg)]
Summary:
When the system load has not changed significantly, CPU and I/O anomalies are not necessarily caused by a performance problem; the problem needs to be analyzed comprehensively, from multiple angles and at multiple levels.
This article is from the "Memory Fragments" blog; please keep this source attribution: http://weikle.blog.51cto.com/3324327/1627699
Handling an intermittent process hang problem