It takes too long to execute SQL statements in oracle databases. The result set cannot be obtained at all. The load on the server is not high, and it is estimated that it is a query that consumes a lot of disks. This problem is solved through the parallel query of one column in oracle.
Sure enough, we found a million table and a hash join with more than million tables.
After a SQL statement is optimized, the SQL statement is as follows:
- select count(ui.usin_uid_fk)
- from table1 av, table2 ui
- where av.av_usse_activatedate >= to_date('20090102', 'yyyymmdd')
- and av.av_usse_activatedate < to_date('20090401', 'yyyymmdd')
- and av.av_usse_uid_fk = ui.usin_uid_fk
- and ui.usin_mcnc_fk =XXX%'
It is hard to imagine that the execution is not ideal. The execution time of the last 20 minutes is really crashing.
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1918591
-
- Elapsed: 00:19:03.07
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 32921639 consistent gets
- 352073 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 503 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
For the large table of table 2 that meets the condition of more than 1100 million), it is decided to increase the execution speed through parallel. The SQL statement is as follows:
- select /*+parallel (tbl_userinfo 4)*/ count(ui.usin_uid_fk)
- from table1 av, table2 ui
- where av.av_usse_activatedate >= to_date('20090101', 'yyyymmdd')
- and av.av_usse_activatedate < to_date('20090401', 'yyyymmdd')
- and av.av_usse_uid_fk = ui.usin_uid_fk
- and ui.usin_mcnc_fk like 'XXX%';
The execution effect is very obvious. From 19 minutes to 1 minute 45 seconds! The consistent gets is reduced by an order of magnitude.
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1918591
-
- Elapsed: 00:01:45.15
-
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 2571109 consistent gets
- 124523 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 504 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
Because the server has a 2x4 Core cpu, it can be regarded as eight CPUs, so we can further reduce the execution time by increasing the degree of parallelism. The following SQL statement:
- SQL> select /*+parallel (tbl_userinfo 8)*/ count(ui.usin_uid_fk)
- 2 from table1 av, table2 ui
- 3 where av.av_usse_activatedate >= to_date('20090101', 'yyyymmdd')
- 4 and av.av_usse_activatedate < to_date('20090401', 'yyyymmdd')
- 5 and av.av_usse_uid_fk = ui.usin_uid_fk
- 6 and ui.usin_mcnc_fk like '460%';
-
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1949033
-
- Elapsed: 00:00:20.60
-
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 2607524 consistent gets
- 55050 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 503 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
It can be said that it is still ideal. It takes only about 20 s. Although the maximum degree of parallelism can reach CPU * 2, the effect may not be good. Perform a test on 16 parallel SQL statements.
- COUNT(UI.USIN_UID_FK)
- ---------------------
- 1949033
-
- Elapsed: 00:00:20.64
-
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 2607524 consistent gets
- 55299 physical reads
- 0 redo size
- 395 bytes sent via SQL*Net to client
- 504 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
-
There is no improvement, and the execution time is slightly higher than the SQL statement with a degree of parallelism of 8.
Through the above tests, we can easily find that:
When processing a large amount of data queries, such as hash join, oracle Parallel Processing is very effective. That is to say, parallel queries can be used in applications such as data warehouses ".
However, there are still many restrictions on the parallel use of oracle. For example, relatively small data queries and connections are counterproductive. Blindly increasing the degree of parallelism is also a big worry. Relatively speaking, the degree of parallelism is better than the number of CPUs. The number of CPUs here should be the number of cores. For example, if the CPU on the server is 4 cores and the degree of parallelism is 4, it is good.
It is difficult to have perfect technologies. The most important thing is that the use of specific technologies should be just right, to ensure the advantages and circumvent weaknesses.
Use of oracle rownum statements
Oracle index type
How to create an Oracle Index
C # connect to the Oracle database to query data
Instances that use oracle Stored Procedure Paging