Oracle delete statement optimization example
When I started work today, I received an email from a customer saying that a delete SQL statement had been running for more than three hours in the production environment. In the end the customer had no choice but to cancel the data cleanup and plan to redo it during the application window, hoping I could tune the SQL statement before the afternoon.
I got the SQL statement. It is a simple delete against a partitioned table holding about 600 million rows, of which more than 9 million are to be deleted.
delete event
 where cycle_code = 25
   and cycle_month = 2
   and cycle_year = 2015
   and customer_id = 5289835;
Let's take a look at the execution plan.
Plan hash value: 2439880320

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT                     |           | 3238K |  135M |  404K  (1)| 01:20:52 |       |       |
|   1 |  DELETE                              | EVENT     |       |       |           |          |       |       |
|   2 |   PARTITION RANGE ITERATOR           |           | 3238K |  135M |  404K  (1)| 01:20:52 |   241 |   261 |
|*  3 |    TABLE ACCESS BY LOCAL INDEX ROWID | EVENT     | 3238K |  135M |  404K  (1)| 01:20:52 |   241 |   261 |
|*  4 |     INDEX RANGE SCAN                 | EVENT_1UQ | 1370K |       | 40255  (1)| 00:08:04 |   241 |   261 |
-------------------------------------------------------------------------------------------------------------------
With an index scan, it seems the performance should not be bad, right? Overall, when deleting 9 million rows out of roughly 600 million, an index scan looks like a good choice.
First, I checked the table's partition rules and basic data distribution. The table is partitioned on three columns: cycle_code, cycle_month, and sub_partition_id. From the query conditions, cycle_code and cycle_month happen to be partition key columns.
TABLE_NAME  PARTITIONING_TYPE  PARTITION_COUNT  COLUMN_LIST                                PART_COUNTS  SUBPAR_COUNT  STATUS
---------------------------------------------------------------------------------------------------------------------------
EVENT       RANGE              721              CYCLE_CODE, CYCLE_MONTH, SUB_PARTITION_ID  3            0             VALID
The data distribution is as follows. According to the partitioning logic, the matching data can only be in these 20 partitions.
PARTITION_NAME  HIGH_VALUE   TABLESPACE_NAME  NUM_ROWS
C25_M2_S1       25, 2, 5     DATAH01          84246910
C25_M2_S2       25, 2, 10    DATAH01           3427570
C25_M2_S3       25, 2, 15    DATAH01           3523431
C25_M2_S4       25, 2, 20    DATAH01           3988140
C25_M2_S5       25, 2, 25    DATAH01           2700687
C25_M2_S6       25, 2, 30    DATAH01           2477792
C25_M2_S7       25, 2, 35    DATAH01           2490349
C25_M2_S8       25, 2, 40    DATAH01          11755212
C25_M2_S9       25, 2, 45    DATAH01           3184953
C25_M2_S10      25, 2, 50    DATAH01           2656802
C25_M2_S11      25, 2, 55    DATAH01           4434668
C25_M2_S12      25, 2, 60    DATAH01           2776079
C25_M2_S13      25, 2, 65    DATAH01           2949885
C25_M2_S14      25, 2, 70    DATAH01           2837790
C25_M2_S15      25, 2, 75    DATAH01           6285172
C25_M2_S16      25, 2, 80    DATAH01           2743439
C25_M2_S17      25, 2, 85    DATAH01           3574228
C25_M2_S18      25, 2, 90    DATAH01           3600820
C25_M2_S19      25, 2, 95    DATAH01           7415434
C25_M2_S20      25, 2, 100   DATAH01           3446285
This information turned out to be quite a harvest. I wrote a script that plugged the customer_id into a per-partition query to check how much data would be deleted from each partition. The results showed that the first partition, C25_M2_S1, holds more than 80 million rows; the query against it took a long time and ultimately found no rows to delete, while the queries against the other partitions ran quickly during testing.
The rows to be deleted, more than 9 million of them, all sit in partition C25_M2_S8; no matching data exists in the other partitions. At the data level, however, I could not rule out that new data might still be inserted into those partitions.
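The per-partition check can be scripted rather than typed by hand. Below is a minimal sketch of that idea; the partition names and predicate come from this case, while the function name and structure are my own illustration, and running the generated SQL through sqlplus is left to the reader:

```python
# Generate one COUNT query per partition so we can see how many rows
# each partition holds for the delete predicate before deleting anything.

PARTITIONS = [f"C25_M2_S{i}" for i in range(1, 21)]

PREDICATE = ("cycle_code = 25 and cycle_month = 2 "
             "and cycle_year = 2015 and customer_id = 5289835")

def count_queries(partitions, predicate):
    """Return a partition-extended SELECT COUNT(*) per partition."""
    return [
        f"select '{p}' part, count(*) cnt from event partition ({p}) "
        f"where {predicate};"
        for p in partitions
    ]

for q in count_queries(PARTITIONS, PREDICATE):
    print(q)
```

Feeding the printed statements to sqlplus gives the per-partition row counts that revealed C25_M2_S8 as the only partition with matching data.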
After analyzing the situation above, I gave partition C25_M2_S1 special treatment. Following the original plan's index scan there was slow, because the range covered by the query conditions is large and the partition holds no matching data, so I tried a full scan with parallelism instead. In testing, the data was filtered out quickly, in about one minute.
So, staying at the data level, I provided the following script, leaving the largest partition for last.
set linesize 200
set timing on
set time on
alter session force parallel dml parallel 16;
delete event partition (C25_M2_S2) where cycle_code = 25 and cycle_month = 2 and cycle_year = 2015 and customer_id = 5289835;
commit;
delete event partition (C25_M2_S3) where cycle_code = 25 and cycle_month = 2 and cycle_year = 2015 and customer_id = 5289835;
commit;
delete event partition (C25_M2_S4) where cycle_code = 25 and cycle_month = 2 and cycle_year = 2015 and customer_id = 5289835;
commit;
.......
delete event partition (C25_M2_S20) where cycle_code = 25 and cycle_month = 2 and cycle_year = 2015 and customer_id = 5289835;
commit;
delete /*+ full(event) parallel(event, 16) */ event partition (C25_M2_S1) where cycle_code = 25 and cycle_month = 2 and cycle_year = 2015 and customer_id = 5289835;
commit;
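A script like the one above is also easy to generate instead of writing out by hand. Here is a minimal sketch under the same assumptions (the partition list and predicate from this case; the function name and defaults are my own), deferring the largest partition to the end with the full-scan parallel hint:

```python
# Generate the batched per-partition delete script: one delete + commit
# per partition, with the largest (empty but slow-to-scan) partition
# handled last using a full-scan + parallel hint.

PARTITIONS = [f"C25_M2_S{i}" for i in range(1, 21)]

PREDICATE = ("cycle_code = 25 and cycle_month = 2 "
             "and cycle_year = 2015 and customer_id = 5289835")

def delete_script(partitions, predicate, largest="C25_M2_S1", parallel=16):
    """Emit per-partition deletes with commits, largest partition last."""
    lines = [f"alter session force parallel dml parallel {parallel};"]
    for p in partitions:
        if p == largest:
            continue  # defer the largest partition to the end
        lines.append(f"delete event partition ({p}) where {predicate};")
        lines.append("commit;")
    lines.append(
        f"delete /*+ full(event) parallel(event, {parallel}) */ "
        f"event partition ({largest}) where {predicate};")
    lines.append("commit;")
    return lines

print("\n".join(delete_script(PARTITIONS, PREDICATE)))
```

Committing after each partition keeps the undo footprint per transaction small, which is part of why the batched script is safer than one giant delete.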
That would normally be the end of it, but after the developer reached out to me, we communicated further. She asked some questions based on the script I had provided and confirmed, from the business side, that the data could only be in partition C25_M2_S8. With business validation, the optimized statement became simpler.
set linesize 200
set timing on
set time on
alter session force parallel dml parallel 16;
delete event partition (C25_M2_S8) where cycle_code = 25 and cycle_month = 2 and cycle_year = 2015 and customer_id = 5289835;
commit;
Checking the execution plan, a full table scan is now used, and reasonably so: the partition holds more than 11 million rows and over 9 million of them are being deleted, so a full table scan is the sensible access path.
Plan hash value: 1742190108

----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT      |          | 9115K |  382M | 19351   (1)| 00:03:53 |       |       |        |      |            |
|   1 |  PX COORDINATOR       |          |       |       |            |          |       |       |        |      |            |
|   2 |   PX SEND QC (RANDOM) | :TQ10000 | 9115K |  382M | 19351   (1)| 00:03:53 |       |       |  Q1,00 | P->S | QC (RAND)  |
|   3 |    DELETE             | EVENT    |       |       |            |          |       |       |  Q1,00 | PCWP |            |
|   4 |     PX BLOCK ITERATOR |          | 9115K |  382M | 19351   (1)| 00:03:53 |   248 |   248 |  Q1,00 | PCWC |            |
|*  5 |      TABLE ACCESS FULL| EVENT    | 9115K |  382M | 19351   (1)| 00:03:53 |   248 |   248 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------------------------------------
This example shows that an execution plan based on an index scan can look good on paper yet execute very inefficiently. By analyzing the partition rules and data distribution of the partitioned table, the original 700-plus partitions could be narrowed down to 20; with confirmation from the business side, the 20-partition delete could be further simplified to a single partition. Performance tuning is like passing a relay baton here: the problem becomes clearer with each round of analysis and confirmation.