Best practices for removing partial data from Oracle Large tables

Last Update:2018-07-23 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

If the business can not stop, there are mainly three kinds of ideas:
=======================================================================================================
idea 1: According to ROWID fragmentation, reuse rowid sorting, batch processing, back table deletion.
When the business can not stop, choose this way, is indeed the best. Generally, you can control the submission within every 10,000 lines without causing too much pressure on the rollback segment (when I'm doing big DML, I usually choose
Select one thousand or two thousand lines a submission). When you choose a business low peak, do not have much impact on the application.
Thanks to HTYANSP, a very simple script is provided on the 49 floor, and if you are unfamiliar with this approach, you can refer to this script:
Declare
Cursor MyCursor is SELECT ROWID from TEST WHERE xxx=xxxx order by ROWID; <--------sorted by ROWID cursor, delete condition is xxx=xxxx, according to the actual situation
Conditions to set.
Type Rowid_table_type is table of ROWID index by Pls_integer;
V_rowid Rowid_table_type;
BEGIN
Open mycursor;
Loop
Fetch MyCursor Bulk collect into V_rowid limit 5000; <--------5000 lines per processing, that is, every 5000 lines a submit
Exit when v_rowid.count=0;
ForAll i in V_rowid.first. V_rowid.last
Delete from test where rowid=v_rowid (i);
Commit
End Loop;
Close MyCursor;
End;
/
The disadvantage of this approach is that sorting has the potential to consume too much temporary tablespace space. There is also a way, first according to ROWID fragmentation. Divide a large table into parts by ROWID, each part separately according to the ROWID row
Order. Another advantage of this approach is that it can be parallel.
Once I need to delete a large table in the DW library that satisfies the criteria in the row. The application side guarantees that no more rows of this condition will appear, and I just need to delete all rows that meet the criteria within a few days. This table is located
Table space has dozens of data files (32G per file), I use the following command to generate the ROWID scope of the table in each file:
Select Dbms_rowid. Rowid_create (1,12227,file_id,min (block_id), 0), Dbms_rowid. Rowid_create (1,12227,file_id,max (block_id+blocks-1), 8192) from
dba_extents where segment_name= ' Dml_tst ' GROUP by the file_id order by file_id;
Data_oid in this command is the data_object_id column value in Dba_objects.
Then, the target table is manipulated according to the ROWID range obtained above. In fact, the second line in the stored procedure will be HTYANSP, based on the generated ROWID modified as follows:
Cursor MyCursor is SELECT ROWID the from TEST WHERE ROWID between ' ROWID ' and ' ROWID ' and xxx=xxxx ' the Order by ROWID;

Other rows in the stored procedure are basically unchanged.
Make dozens of of these stored procedures, run in parallel with a few sessions.
In addition, Tom in the 9I&AMP;10G Programming art 648 page to 652 pages have a good example, where 650 pages automatically generate ROWID part, you can refer to.
The biggest advantage of this approach is that the performance is controllable, you need to quickly, you can set a few more parallel. If you want to slow down, you'll have less parallelism. Also, the number of rows processed at a time is limited, and the sort of rowid does not
Burst temporary table space.

Case reference: http://www.askmaclean.com/archives/%e5%88%a9%e7%94%a8rowid%e5%88%86%e5%9d%97%e5%ae%9e%e7%8e%b0%e9%9d%9e% E5%88%86%e5%8c%ba%e8%a1%a8%e7%9a%84%e5%b9%b6%e8%a1%8cupdate%e4%b8%8edelete.html

=======================================================================================================
idea two: According to ROWID fragmentation, non-batch processing, back table Delete
For example, to delete a row with ID equal to value in Dml_tst, the most basic stored procedure is as follows:
Declare
CURSOR Test2_cs (value number,rid1 rowid,rid2 rowid)
is SELECT ID from dml_tst
where Id=value and rowid between Rid1 and Rid2
For UPDATE;
K number:=0;
BEGIN
For C1_rec in Test2_cs (3338, ' aaac/daaeaaaabjaaa ', ' AAAC/DAAEAAAABQCAA ') loop
Delete Dml_tst where current of Test2_cs;
End LOOP;
End;
/
This method can also be based on rowid fragmentation, only one scan of the table. But not batch processing, performance instead of above.
=======================================================================================================
idea three: on prebuilt materialized View method

This way, Ali migration data do use more, but also a good way. There is no way to speed, but it is simpler and has little impact on the business. In addition, for a delete operation, you can release
Deleted space. The disadvantage is that you need a primary key.
Suppose the target table is P3, and the primary key column is ID1, to delete the row ID2 listed in small 1000:
Step 1, establish the middle table p3_m:
CREATE TABLE P3_m as SELECT * from P3 where 0=1;
Step 2, build and materialized views with the same name as the middle table, be sure to have the on prebuilt option:
CREATE materialized VIEW P3_m
On prebuilt TABLE as
SELECT * FROM P3 where id2>=1000; <--------place rows that do not meet the deletion criteria into materialized views
Step 3: Add materialized View logs:
CREATE materialized VIEW LOG on P3 with PRIMARY key,sequence (ID2,ID3,CC1,CC2) including NEW VALUES;
Step 4: When the database is idle, make a full refresh:
exec dbms_mview.refresh (' p3_m ', ' C ');
After a full refresh, you can create indexes, constraints, and so on on the intermediate table that are the same as the target table
Step 5: Make one or two incremental refreshes:
exec dbms_mview.refresh (' p3_m ', ' F ');
Step 6: Lock the original table, make an incremental refresh, and immediately rename the target table to another name
Lock table P3 in EXCLUSIVE mode;
exec dbms_mview.refresh (' p3_m ', ' F ');
Drop materialized VIEW LOG on P3;
ALTER TABLE P3 Rename to P3_n;
Step 7: Delete the materialized view, modify the middle table to the original target table name:
Drop materialized VIEW p3_m;
ALTER TABLE p3_m Rename to P3;
Step 8: Determine the original table if it is not used, you can delete the changed name of the original table
You can also use the line redefinition, and the idea is similar to this one.

Online redefinition case: http://www.askmaclean.com/archives/%e5%88%a9%e7%94%a8oracle%e5%9c%a8%e7%ba%bf%e9%87%8d%e5%ae%9a%e4%b9% 89online-redefinition%e6%b8%85%e7%90%86%e5%8e%86%e5%8f%b2%e6%95%b0%e6%8d%ae.html
======================================================================================
If it doesn't affect the application, it's the usual method. These three kinds of ideas can also be used for update.
According to these three kinds of ideas, we can combine their own application situation to change. Always find a suitable way to apply.
Note is
1, pay attention to backup
2, do not pay attention to not too fierce, once a colleague because of the update too fierce, affected us an important foreground application. It is important to note that the number of rows submitted at a time cannot be too high.

If the application can be stopped, which method is too much.
1, CTAs method to create a new table, excluding the data to delete, and then renamed. To improve speed, you can also disable indexing, and then rebuild after DML.
2, only export the data not deleted, and then import, and then renamed
3, if the table space can be set as read-only, there are altar friends of the table space migration, moved to test platform, slowly deleted and then imported back to the method

Wait a minute. There are a lot of ways to stop the application.

http://blog.csdn.net/robbie1314520/article/details/8272676

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Best practices for removing partial data from Oracle Large tables

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

Best practices for removing partial data from Oracle Large tables

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support