A temporary workaround for a stuck delete of 800,000 rows



Recently, while deduplicating a business table with a couple of million rows, the delete statement got stuck. The following is a brief review of what happened.

1. Query the row count of the business table. It holds just over 2.5 million records.

SQL> select count(*) from tb_bj_banker_etl;

2552381

2. Query how many duplicate rows should be removed, i.e. for each duplicated id, every row except the one with the largest rowid. There are a little over 830,000 of them.

SQL> select count(*) from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*) > 1) and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1);

830099

3. So, before leaving work in the evening, I kicked off the following statement to remove the duplicates:

SQL> delete from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*) > 1) and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1);

SQL> commit;

4. The next day, on arriving on site, I found that the statement started the previous evening in PL/SQL Developer was still running.

My first reaction: a delete of 800,000-odd duplicate rows had not finished after an entire night? Something had to be wrong.

I suspected the table was locked.

So I checked whether any locks were being held on the table:

 
 
SELECT
    A.OWNER,              -- owner of the locked object
    A.OBJECT_NAME,        -- object name
    B.XIDUSN,
    B.XIDSLOT,
    B.XIDSQN,
    B.SESSION_ID,         -- session holding the lock
    B.ORACLE_USERNAME,    -- Oracle username of the locking session
    B.OS_USER_NAME,       -- operating system login of the locking session
    B.PROCESS,
    B.LOCKED_MODE,
    C.MACHINE,            -- machine name of the locking session
    C.STATUS,             -- session status
    C.SERVER,
    C.SID,
    C.SERIAL#,
    C.PROGRAM             -- client program used by the locking session
FROM
    ALL_OBJECTS A,
    V$LOCKED_OBJECT B,
    SYS.GV_$SESSION C
WHERE
    A.OBJECT_ID = B.OBJECT_ID
    AND B.PROCESS = C.PROCESS
ORDER BY 1, 2;

From the results below, only the sessions belonging to the deduplication statement itself hold locks on the table; no other user is locking it. Did that mean the statement really was still executing? With that question in mind, I set about fixing the problem.

(Columns: OWNER, OBJECT_NAME, XIDUSN, XIDSLOT, XIDSQN, SESSION_ID, ORACLE_USERNAME, OS_USER_NAME, PROCESS, LOCKED_MODE, MACHINE, STATUS, SERVER, SID, SERIAL#, PROGRAM)

1 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\backdb active dedicated 913 3381 plsqldev.exe
2 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\backdb inactive dedicated 649 41791 plsqldev.exe
3 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\backdb inactive dedicated 817 27777 plsqldev.exe
4 BJHYL tb_bj_banker_ETL 15 18 9000 913 BJHYL Administrator 4036:972 3 WORKGROUP\backdb inactive dedicated 841 1981 plsqldev.exe
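To double-check whether a session like this is still doing work rather than merely holding locks, a query along the following lines can help. This is a minimal sketch, not part of the original troubleshooting; it assumes Oracle 10g or later and reuses SID 913 from the output above. The undo record count from V$TRANSACTION keeps growing while a large DELETE is actually making progress, and the wait event shows what the session is currently waiting on.

SELECT S.SID,
       S.SERIAL#,
       S.STATUS,
       S.EVENT,            -- what the session is currently waiting on
       S.SECONDS_IN_WAIT,
       T.USED_UREC         -- undo records used by its transaction; grows while the DELETE works
FROM   V$SESSION S
       LEFT JOIN V$TRANSACTION T ON T.ADDR = S.TADDR
WHERE  S.SID = 913;        -- SID taken from the lock query output above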

5. Removing the duplicates in batches

Since the one-shot delete would not go through, I switched to deleting the duplicates in batches, as shown below.

 
 
First batch:
delete from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*) > 1) and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1) and rownum <= 100000;
commit;

Second batch:
delete from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*) > 1) and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1) and rownum <= 100000;
commit;

......

Eighth batch:
delete from tb_bj_banker_etl where (id) in (select id from tb_bj_banker_etl group by id having count(*) > 1) and rowid not in (select max(rowid) from tb_bj_banker_etl group by id having count(*) > 1);
commit;

Result: after splitting the 800,000-plus duplicate rows into deletes of 100,000 rows per batch, the whole job took a little over 140 seconds in total, and all of the duplicates were removed. Why the single large delete hung (appearing to zombie) is something that still needs to be tracked down and analysed later.
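For future reference, the eight manual batches could also be wrapped in a small PL/SQL loop that keeps deleting and committing 100,000 rows at a time until no duplicates remain. This is only a sketch of the same batching idea, not the script that was actually run:

DECLARE
  v_rows NUMBER;
BEGIN
  LOOP
    DELETE FROM tb_bj_banker_etl
     WHERE (id) IN (SELECT id FROM tb_bj_banker_etl GROUP BY id HAVING COUNT(*) > 1)
       AND ROWID NOT IN (SELECT MAX(ROWID) FROM tb_bj_banker_etl GROUP BY id HAVING COUNT(*) > 1)
       AND ROWNUM <= 100000;
    v_rows := SQL%ROWCOUNT;   -- rows removed in this batch
    COMMIT;                   -- commit each batch to keep the transaction (and undo usage) small
    EXIT WHEN v_rows = 0;     -- stop once no duplicates remain
  END LOOP;
END;
/

Committing after each batch keeps every transaction short, which is exactly what made the manual batching finish quickly where the single overnight delete did not.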

Blog Source: http://blog.csdn.net/huangyanlong/article/details/46041735
 
