Solve the Problem of CPU resource depletion caused by a large number of latch free databases in production. latchfree

Source: Internet
Author: User
Tags high cpu usage

Solve the Problem of CPU resource depletion caused by a large number of latch free databases in production. latchfree

At noon, the cpu of a database we produced remained high.

Using the following SQL statement, we can view the database waiting and contention at that time:

select s.SID,       s.SERIAL#,       'kill -9 ' || p.SPID,       s.MACHINE,       s.OSUSER,       s.PROGRAM,       s.USERNAME,       s.last_call_et,       a.SQL_ID,       s.LOGON_TIME,       a.SQL_TEXT,       a.SQL_FULLTEXT,       w.EVENT,       a.DISK_READS,       a.BUFFER_GETS  from v$process p, v$session s, v$sqlarea a, v$session_wait w where p.ADDR = s.PADDR   and s.SQL_ID = a.sql_id   and s.sid = w.SID   and s.STATUS = 'ACTIVE' order by s.last_call_et desc;

From the event, we can see that it is the cause of latch contention.

Check what latch is through the SQL statement.

select * from v$session_wait where event  like 'latch free'; 

P2 is the name of this latch. You can use the v $ latchname view to know which specific latch

1:45:55 PM SQL> select * from v$latchname where latch#=164;     LATCH# NAME                                                                   HASH---------- ---------------------------------------------------------------- ----------       164 simulator hash latch                                             2233208730

View latch history

2:11:59 PM SQL> select name,gets,misses,sleeps from v$latch where sleeps >0 order by sleeps desc; NAME                                                                   GETS     MISSES     SLEEPS---------------------------------------------------------------- ---------- ---------- ----------simulator hash latch                                             4827860212  135426899   10890947cache buffers chains                                             1619822817 2850976006    4747728gc element                                                       4660052091   25748270     175073resmgr:schema config                                               91872524     153968      95708ges resource hash list                                            174151449    1070556      55459Real-time plan statistics latch                                    40953155     651496      44527call allocation                                                     3301878     265908      43501row cache objects                                                 336300485    4970324      19366

This simulator hash latch is already a significant latch part.

Eagle has an article on his website about simulator.


Simulator means simulation. That is to say, when Oracle processes data blocks in the memory, it will actually record relevant information in the pre-allocated Buffer, such as DBA information, when a data block is aged, if the requested data exists in the Simulator memory during the next read, it is considered meaningful to continue caching the data block. These operations are monitored and simulated, and the weighted calculation result can be used to adjust the memory.
In the simulation process, it is also implemented through Latch. Related Latch include simulator lru latch and simulator hash latch.

For Buffer Cache, if this type of contention is serious in the system, you can consider disabling db_cache_advice to eliminate the impact of these internal operations on performance.
The following is a related BUG. In this Bug, the startup of DB_CACHE_ADVICE causes severe competition for simulator lru latch:

Bug 5918642 Heavy latch contention with DB_CACHE_ADVICE onThis note gives a brief overview of bug 5918642.
The content was last updated on: 01-APR-2008
  Click here for details of each of the sections below.
Product (Component) Oracle Server (Rdbms)
Range of versionsBelievedTo be affected Versions <11.2
VersionsConfirmedAs being affected
Platforms affected Generic (all/most platforms affected)
This issue is fixed in
  • 11.2 (Future Release)
  • (Server Patch Set)
  • (Server Patch Set)
Symptoms: Related:
  • Latch Contention
  • Waits for "latch free"
  • Performance Monitoring
High simulator lru latch contention can occur when db_cache_advice isset to ON if there is a large buffer cache.Workaround:  Set db_cache_advice to OFF

Of course, this is only a temporary solution. It is a manifestation of the problem, the root cause is the problem with this SQL statement.

When a data block is read into sga, the block header (buffer header) of this block will be placed in a hash bucket linked list (hash chain. The memory structure is protected by a series of cache buffers chains sublatch (also known as hash latch or cbc latch ). For blocks in the Buffer cache, to select, update, insert, delete, etc., you must first obtain the cache buffers chains sublatch to ensure exclusive access to the chain. If a competition occurs during the process, the system will wait for the latch: cache buffers chains event.

Cause: 1. inefficient SQL statements (mainly reflected in high read logic) in some environments, Applications open multiple concurrent sessions that execute the same inefficient SQL statements, these SQL statements try to get the same dataset. The main reason is that each execution has an SQL statement with a high BUFFER_GETS (logical read. On the contrary, smaller logical reads mean fewer latch get operations, reducing latch contention and improving performance. Note the BUFFER_GETS/EXECUTIONS statements in v $ SQL. 2. Hot block when multiple sessions repeatedly access one or more blocks protected by the same sub-cache buffers chains lock, Hot blocks will be generated. This wait event occurs when multiple sessions compete for the cache buffers chains sub-latch. Sometimes, even if the SQL is optimized, but multiple sessions execute this SQL statement at the same time, it is afraid that only a few specific blocks will be scanned, and HOT blocks will also appear.

SELECT P935.SEQUENCEID,       null FA_SEQUENCEID,       P935.ORDERID,       P935.ORGORDERID,       P935.PRODUCTNAME,       P935.PRODUCTNUM,       P935.ORDERTIME,       P935.LASTUPDATETIME,       P935.ORDERSTATUS,       P935.MEMO,       935 orderCode,       P935.PAYERACCTCODE,       P935.PAYERACCTTYPE,       P935.PAYEEACCTCODE PLATACCTCODE,       P935.PAYEEACCTTYPE PLATACCTTYPE,       P936.PAYEEACCTCODE,       P936.PAYEEACCTTYPE,       EXT935.PAYER_DISPLAYNAME,       EXT935.PAYER_NAME,       EXT935.PAYER_IDC,       EXT935.PAYER_MEMBERTYPE,       EXT936.PAYER_DISPLAYNAME PLAT_DISPLAYNAME,       EXT936.SUBMITNAME PLAT_NAME,       EXT936.PAYER_IDC PLAT_IDC,       EXT936.PAYER_MEMBERTYPE PLAT_MEMBERTYPE,       EXT936.PAYEE_DISPLAYNAME,       EXT936.PAYEE_NAME,       EXT936.PAYEE_IDC,       EXT936.PAYEE_MEMBERTYPE,       P935.PAYEEDISPLAYNAME WEBSITENAME,       CASE         WHEN (SELECT count(*)                 FROM PAYMENTORDER P936                WHERE P936.Ordercode = 936                  and P936.Orderstatus = 0                  AND <span style="color:#ff0000;">P936.Relatedsequenceid = P935.SEQUENCEID</span>) > 0 THEN          0         ELSE          1       END AS SHARINGRESULT,       CASE D935.Dealcode         WHEN 210 then          14         else          D935.DEALTYPE       end PAYMETHOD,       D935.DEALAMOUNT,       G935.EXT1,       G935.Ext2,       G935.PAYERCONTACTTYPE,       G935.PAYERCONTACT,       NVL(D935.PAYEEFEE, 0) PAYEEFEE,       NVL(D935.PAYERFEE, 0) PAYERFEE,       nvl(MS936.PAYEEFEE, 0) PLATFORMFEE,       P935.VERSION  FROM PAYMENTORDER          P935,       PAYMENTORDER          P936,       DEAL                  D935,       GATEWAYORDER          G935,       MSGATEWAYSHARINGORDER MS936,       PAYMENTORDEREXT       EXT935,       PAYMENTORDEREXT       EXT936 WHERE P936.ORDERCODE = 936   AND P935.ORDERCODE = 935   AND P936.RELATEDSEQUENCEID = to_char(P935.SEQUENCEID)   AND P935.SEQUENCEID = G935.SEQUENCEID(+)   AND P935.SEQUENCEID = D935.ORDERSEQID(+)   AND P935.SEQUENCEID = EXT935.ORDERSEQID(+)   AND P936.SEQUENCEID = EXT936.ORDERSEQID(+)   AND P936.SEQUENCEID = MS936.SEQUENCEID(+)   AND MS936.SHARINGTYPE = 1   AND P935.SEQUENCEID = :1UNIONSELECT P938.SEQUENCEID,       P935.SEQUENCEID FA_SEQUENCEID,       P938.ORDERID,       P938.ORGORDERID,       P935.PRODUCTNAME,       P935.PRODUCTNUM,       P938.ORDERTIME,       P938.LASTUPDATETIME,       P938.ORDERSTATUS,       P938.MEMO,       938 orderCode,       P938.PAYERACCTCODE,       P938.PAYERACCTTYPE,       P938.PAYEEACCTCODE PLATACCTCODE,       P938.PAYEEACCTTYPE PLATACCTTYPE,       P938.PAYEEACCTCODE,       P938.PAYEEACCTTYPE,       EXT938.PAYER_DISPLAYNAME,       EXT938.PAYER_NAME,       EXT938.PAYER_IDC,       EXT938.PAYER_MEMBERTYPE,       EXT938.PAYEE_DISPLAYNAME PLAT_DISPLAYNAME,       EXT938.SUBMITNAME PLAT_NAME,       EXT938.PAYEE_IDC PLAT_IDC,       EXT938.PAYEE_MEMBERTYPE PLAT_MEMBERTYPE,       EXT938.PAYEE_DISPLAYNAME,       EXT938.PAYEE_NAME,       EXT938.PAYEE_IDC,       EXT938.PAYEE_MEMBERTYPE,       P935.PAYEEDISPLAYNAME WEBSITENAME,       null SHARINGRESULT,       D938.DEALTYPE PAYMETHOD,       D938.DEALAMOUNT,       G935.EXT1,       G935.Ext2,       G935.PAYERCONTACTTYPE,       G935.PAYERCONTACT,       NVL(D938.PAYEEFEE, 0) PAYEEFEE,       NVL(D938.PAYERFEE, 0) PAYERFEE,       0 PLATFORMFEE,       P935.VERSION  FROM PAYMENTORDER    P935,       PAYMENTORDER    P938,       DEAL            D938,       GATEWAYORDER    G935,       PAYMENTORDEREXT EXT938 WHERE P935.ORDERCODE = 935   AND P938.ORDERCODE = 938   AND P938.RELATEDSEQUENCEID = to_char(P935.SEQUENCEID)   AND P935.SEQUENCEID = G935.SEQUENCEID(+)   AND P938.SEQUENCEID = D938.ORDERSEQID(+)   AND P938.SEQUENCEID = EXT938.ORDERSEQID(+)   AND P935.SEQUENCEID = :2

Analyze the preceding SQL statement, marked as red above, and the Data Type of varchar2 is on the left side of the equal sign, and the Data Type of number is on the right side of the brackets, which may lead to implicit conversion of the Data Type, resulting in great performance impact.

Contact R & D and modify the SQL statement to solve the problem.

The website cannot be opened because the CPU usage of the website is too large. How can this problem be solved?

Solution 1 is as follows:
Symptom: when the machine is debugging or allows IIS, the service is interrupted due to exceptions (such as power failure). When the IIS runs the page again, the CPU resource usage is 100%, which is invalid even if the server is restarted.
Cause: When the log is interrupted, IIS will write exception logs, but garbled characters are written at this time, which leads to an endless loop of logs written by IIS and depletion of system resources. Find the error log file of the current day under the system path \ System32 \ Logfiles \ W3SVC1, and you will see the above content.
Solution: Delete the error log file of the day under the system path \ System32 \ Logfiles \ W3SVC1, for example, ex060904.log, and restart IIS. Solution 2 is as follows:
Environment: win2003server + IIs + ASP + MSSQL
Phenomenon: Once a website is opened very slowly (sometimes several minutes, sometimes half an hour), or sometimes it times out to open the site. In this case, check the process on the server, the CPU usage reaches 100%, of which w3wp occupies 70 ~ 80%, SQL occupies 20 ~ 30%. Operations on all servers also become slow
Initial solution: log on to the server immediately to stop the w3wp process or restart the IIS Service. On average, the server is stored in the remote data center, all operations are remotely controlled, and sometimes remote connection cannot be performed. You can only call the Data Center Administrator to restart the server. This process causes constant complaints.
After checking the information on the Internet, we found that most of these phenomena are caused by improper webpage code. The following situations may cause such phenomena:
1. application, seesion, and other Server caches are often used in the Code, resulting in excessive use of server data;
2. The Code has unreasonable syntax and endless loops;
3. database corruption, especially ACCESS database;
4. install too many third-party software or plug-ins, which conflict with the IIS or webpage function code.
Phase 1 troubleshooting: Perform one-by-one analysis based on the referenced Materials
1. All site code on the server is written by the company's designers. It can be confirmed that the server cache syntax is not called too much (excluded)
2. Whether the Code has unreasonable syntax (uncertain)
3. Depending on the situation, when the IIS process usage increases, the SQL usage increases at the same time. It should be the SQL database site. Based on the phenomenon, the database or table should be normal. It is estimated that the data may be incorrect; (Not Sure)
4. In addition to basic system services, anti-virus protection, and essential services for website operations, the server does not have any third-party software, which is unlikely (excluded ).

After the above analysis and judgment, the uncertain items are linked to the conclusion that an unreasonable syntax exists in the Web Page code of a website using SQL database, resulting in high CPU usage of IIS and SQL processes.

Phase 2 troubleshooting:
Determine the scope and continue to narrow down the scope.
Because there are not many sites using SQL databases on the server, it is easy to establish an independent process ID for observation. All the sites using SQL databases are separately created in the IIS manager, then, enter the iisapp-a command on the CMD interface to view and record the ID of the processes in each IIS pool, an IIS process ID is the cause of this problem.
Solution 3 is as follows:
In iis6, the memory and CPU usage of w3wp.exe cannot be released in time, resulting in slow server response.

To solve the problem of excessive memory usage, you can configure the following:
1. configure a separate application pool for each website in IIS. That is, they do not affect each other.
2. Set the recycle time of the application pool. The default value is 1720 hours, which can be modified as needed. Then, when the memory usage exceeds 500 mb, the memory is automatically reclaimed.

Solve the problem of excessive CPU usage:
1. configure a separate application pool for each website in IIS. That is, they do not affect each other.
2. Set the CPU monitoring of the application pool to no more than 25% (4 CPU for the server), refresh every minute, and close when the limit is exceeded.

Obtain the full text from w3wp ......>
CPU % 100 Problems

Friends who often use computers must have never met the blue screen. The cold blue face is really annoying. So how did the blue screen come from?

We can explain the causes of the blue screen phenomenon from the soft and hard aspects. In terms of hardware, over-frequency is a major cause of blue screen. Excessive overclock, due to overload operations, resulting in too many internal operations, so that the CPU is overheated, resulting in System Operation errors. If you want to exceed the frequency, and do not want to appear blue screen, only do a good job of cooling measures, change a strong fan, plus some silica gel and other heat dissipation materials will be much better. In addition, a proper amount of overclocking or no overclocking is also one of the solutions. It depends on your choice to ensure stability or speed.

If the memory is physically damaged or the memory is incompatible with other hardware, a blue screen is also generated. The solution is to change the memory.

If you have noticed, you will find that when the optical drive is opened abnormally, it will also lead to a blue screen. This problem does not affect the normal operation of the system. You only need to attach the disc or Press ESC.

Another common cause of hardware blue screen is system hardware conflicts. In practice, we often encounter settings conflicts between sound cards or display cards. In "Control Panel"> "system"> "device management", check whether a device with a yellow question mark or exclamation mark exists. If yes, try to delete the device and restart the computer, it is automatically adjusted by Windows and generally solves the problem. If not, you can manually adjust or upgrade the corresponding driver.

Poor components are another culprit in the appearance of blue screens on computers. When a few unscrupulous merchants assemble compatible computers for customers, they use poor-quality boards, memory, and some even sell fake boards and old CPUs and memory. This will make the machines unstable during operation, A crash is inevitable. Therefore, users should be wary of this aspect when purchasing a machine. Please be familiar with your selection and use some new tools and software to test your computer, continuous machine testing (for example, 72 hours) for a long time, and try to maintain the minimum warranty period.

From the software perspective, virus or hacker attacks, registry errors or damages, too many programs are loaded at startup, version conflicts, insufficient virtual memory, system multi-task operation errors, dynamic link library file loss, excessive Fonts files, too many loaded scheduled tasks, conflicts between system resources, or resource depletion will produce a blue screen. In addition, blue screens are also prone to software and hardware conflicts. If you understand the reason why the blue screen appears "soft", you can remedy the problem.

1. Let's take a look at the strange tricks for eliminating the blue screen.

When a Windows Error occurs, a blue screen may appear. You may have become familiar with this, but can it be a "blue" screen, for example, a "red" screen or a "green" screen? Of course! The method is as follows:

1. first, A blue screen error occurs: you only need to copy A file from disk A or the optical drive to your hard disk (note that the file cannot be too small) and take the disk or CD out during the copy process, windows will change immediately-the blue screen will appear immediately, then press Esc to return to the Windows status.

2. Click "start", click "run", enter "msconfig.exe" in the displayed dialog box, and press enter to bring up the System Configuration Utility. Click the "System. ini" tab.

3. Locate [ENH], click "new", and add "MessageBackColor =" under it (note that no quotation marks are entered). The equals sign is followed by a hexadecimal number 0 ~ F. You can enter it as needed to indicate the background color of the incorrect image.

4. In the same way, add a new string "MessageTextColor =" (note that no quotation marks are entered) under [effecenh]. The equal sign is followed by a hexadecimal number 0 ~ F. You can enter it as needed. It indicates the text color of the incorrect image.

5. Now, restart the computer and try again to see if the test is successful: Repeat Step 1 to see if the blue screen has been replaced? Success!

Note: This method does not really change the stability of Vulnerable Windows. It only changes the background color and text color when a Windows error occurs. From this perspective, this is also a DIY behavior.

Ii. close temporarily unused programs in time

Some programs can be closed to save resources even if they are used later ....... Remaining full text>

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.