In general, when performance problems are detected, we will collect AWR reports covering the time period in which the problem occurred-but it is best to collect AWR reports covering only one hour-if
In general, when performance problems are detected, we will collect AWR reports covering the time period in which the problem occurred-but it is best to collect AWR reports covering only one hour-if
How to proactively avoid problems and collect diagnostic information
Some problems are unpredictable, but most other problems can be avoided if some signs are discovered early. At the same time, if the problem does occur, it is very important to collect information when the problem occurs. Collect information about how to proactively avoid problems and diagnose problems.
The AWR report is a useful diagnostic tool for the overall performance of the database.
In general, when performance problems are detected, we will collect AWR reports covering the time period in which the problem occurred-but it is best to collect AWR reports covering only one hour-if the time is too long, the AWR report does not reflect the problem. An AWR report for a period of time without performance issues should also be collected as a reference to compare AWR reports for a period of time in question. The two AWR reports have the same time period. For example, they are both half an hour or one hour.
Generate and view Oracle AWR reports
Install Oracle 11gR2 (x64) in CentOS 6.4)
Steps for installing Oracle 11gR2 in vmwarevm
Install Oracle 11g XE R2 In Debian
Steps for generating an Oracle AWR report
Interpretation
When dealing with performance problems, we are most concerned about what the database is waiting.
When a process cannot be operated for some reason, it needs to wait. The most time-consuming wait event is what we need to pay attention to most because we can get the greatest benefit if we reduce it.
The "Top 5 Timed Events" section in the AWR report provides this information, allowing us to focus only on the main issues.
•
Top 5 Timed Events
As mentioned above, "Top 5 Timed Events" is the most important part in the AWR report. It indicates the most time-consuming wait event for database sessions, as shown below:
Top 5 Timed Events Avg % Total
~~~~~~~~~~~~~~~~~~ Wait Call
Event Waits Time (s) (MS) Time Wait Class
---------------------------------------------------------------------------
Db file scattered read 10,152,564 81,327 8 29.6 User I/O
Db file sequential read 10,327,231 75,878 7 27.6 User I/O
CPU time 56,207 20.5
Read by other session 4,397,330 33,455 8 12.2 User I/O
PX Deq Credit: send blkd 31,398 26,576 846 9.7 Other
-------------------------------------------------------------
The Top 5 Events section contains some Events related information. It records the total number of waits encountered during this period, the total waiting time, and the average waiting time; this part is sorted by the percentage of each Event in the total call time.
Based on the differences in the Top 5 Events section, next we need to check other sections of the AWR report to verify the detected problems or perform quantitative analysis. The waiting event needs to be evaluated based on the duration of the reporting period and the number of concurrent users in the data base at that time. For example, 10 million wait events within 10 minutes are more problematic than 10 million in 10 hours; the 10 million waiting events caused by 10 users are more problematic than the same waiting events caused by 10,000 users.
As in the above example, nearly 60% of the time is waiting for IO-related events.
• The "db file scattered read" event generally indicates that multiple reads are being performed due to full table scan or index fast full scan.
• The "db file sequential read" event is generally caused by a single read (such as a read index) operation that cannot be performed on multiple reads)
Other 20% of the time is spent on using or waiting for CPU time. High CPU usage is often caused by poor performance SQL statements (or these SQL statements may perform the same operation with fewer resources). For such SQL statements ,, too many IO operations are also a symptom. We will discuss the CPU usage later.
Based on the above, we will investigate whether there is a problem with this wait event. If there is a problem, solve it; if it is normal, check the next wait event.
There are usually two main reasons for excessive IO-related WAITS:
• The database has performed too many read operations.
• Each IO read operation is slow.
The information displayed in the Top 5 Events section will help us check:
• Whether the database has performed a large number of read operations:
The figure above shows that during this period, both types of read operations are greater than 10 million. Whether these operations are too many depends on the reported time being 1 hour or 1 minute. We can check the elapsed time in the AWR report. If there are too many read operations, we need to check the SQL Statistics information in the AWR report, because the read operations are initiated by SQL statements.
• Whether each IO read operation is slow:
The figure above shows that the average wait time of the two types of read operations is less than 8 ms during this period.
Whether 8 ms is fast or slow depends on the underlying hardware device. Generally, less than 20 ms can be considered acceptable.
We can also get more detailed information in the "Tablespace IO Stats" section of the AWR report.
Tablespace IO Stats DB/Inst: VMWREP/VMWREP Snaps: 1-15
-> Ordered by IOs (Reads + Writes) desc
Tablespace
------------------------------
Av Buffer Av Buf
Reads/s Rd (MS) Blks/Rd Writes/s Waits Wt (MS)
----------------------------------------------------------------------
TS_TX_DATA
14,246,367 283 7.6 4.6 145,263,880 2,883 3,844,161
USER
204,834 4 10.7 1.0 17,849,021 354 15,249
UNDOTS1
19,725 0 3.0 1.0 10,064,086 200 1,964
AE _TS
4,287,567 85 5.4 6.7 932 0 465,793
TEMP
2,022,883 40 0.0 5.8 878,049 17 0 0.0
UNDOTS3
1,310,493 26 4.6 1.0 941,675 19 43 0.0
TS_TX_IDX
1,884,478 37 7.3 1.0 23,695 0 73,703 8.3
> SYSAUX
346,094 7 5.6 3.9 112,744 2 0 0.0
SYSTEM
101,771 2 7.9 3.5 25,098 0 653 2.7
For example, we care about Av Rd (MS) metrics. If it is higher than 20 ms and there are many read operations at the same time, we may have to start investigating the potential IO problems from the OS perspective.
For more details, please continue to read the highlights on the next page: