How to update a large table record in Oracle


SQL is a concise yet deceptive language, mainly because of its set-oriented nature. An UPDATE statement looks exactly the same whether it touches one row or 100 million rows, but the actual execution behavior (and whether a result ever comes back) differs enormously.
 
The author's guiding principle as a DBA is that developers should treat the database and its data with respect. Before issuing a statement, consider at least two questions: how large is the target table in total (once in production)? And how much data does this particular operation touch? Different answers call for different approaches.
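Both questions can usually be answered with two quick queries before the update is ever issued. The following is only a sketch; the schema name, table name, and predicate are placeholders, not objects from this article:

-- Total size of the target table (segment size in GB); owner and table are placeholders
select bytes/1024/1024/1024 as size_gb
  from dba_segments
 where owner = 'APP_OWNER'
   and segment_name = 'T_TARGET';

-- How many rows would the planned update actually touch?
select count(*)
  from app_owner.t_target
 where status = 'OLD';   -- placeholder for the intended update predicate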
 
Updating data in a big table is a common scenario in development and operations, especially in data migration work. For that scenario the answers to the two questions above are: the target table is large, and the scope of the update is also large. In theory a single SQL statement can do the job, but in practice this approach runs into many problems.
 
This article introduces several common strategies for processing big tables and analyzes their advantages and disadvantages. For developers and DBAs, the selection criteria are flexible: they depend on the type of operation (an O&M task or a routine system job), the runtime environment (whether the hardware supports parallel execution), and the scheduling environment (whether the operation can have exclusive use of all resources).
 
First, we need to prepare a large table.

1. Prepare the environment

We chose Oracle 11.2 for testing.

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

 

 

Prepare a large table.

 

 

SQL> create table t as select * from dba_objects;

Table created

 

SQL> insert into t select * from t;

72797 rows inserted

 

SQL> insert into t select * from t;

145594 rows inserted

 

(Intermediate doubling inserts omitted to save space ......)

SQL> commit;

Commit complete

 

 

SQL> select bytes/1024/1024/1024 from dba_segments where owner = 'SYS' and segment_name = 'T';
 
 

BYTES/1024/1024/1024

--------------------

1.0673828125

 

SQL> select count(*) from t;

 

COUNT(*)

----------

9318016

 

Executed in 14.711 seconds
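For reference, the doubling inserts elided above could be scripted as a small PL/SQL loop. This is only a sketch of one way to build a table of the same size, not necessarily the author's exact steps:

begin
  -- each pass doubles the row count: 7 doublings of the initial
  -- 72,797 rows yield 9,318,016 rows, matching the count shown above
  for i in 1 .. 7 loop
    insert into t select * from t;
  end loop;
  commit;
end;
/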

 

 

Data table T serves as the data source. It contains more than 9 million records and occupies about 1 GB of space. The author's test environment is a virtual machine with a single virtual CPU, so the results for the parallel approaches are indicative only, not representative.
 
Next, let's look at the simplest method: Direct update.

 

2. Method 1: Update directly

 

The simplest, and most tempting, approach, for novices and veterans alike, is to update the table directly with a single statement. Many experienced programmers and DBAs still choose this strategy. In truth, even when a result does come out, a fair amount of luck is involved.
 
Let's first walk through the author's experiment and then discuss the reasons. Start by creating an experimental table t_target.

 

 

SQL> create table t_target tablespace users as select * from t;

Table created

 

 

SQL> update t_target set owner = to_char(length(owner));

(Long wait ......)

 

 

While waiting, I observed the following (a query for checking these waits from another session is sketched after this list):

• The database server ran very slowly, many connection operations slowed down, and for a while it was not even possible to log on to the database;

• Background session waits were concentrated on data reads, log space buffer, space allocation, and similar events;

• After a long wait, an anomaly appeared at the operating-system level: the undo tablespace kept expanding;

• Redo logs switched frequently.

In addition, anyone who chooses this strategy may also run into symptoms such as foreground errors being thrown, client connections being dropped, and so on.
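One simple way to check these waits is to query v$session from another connection while the update runs. A minimal sketch; the SID value is a placeholder for the session executing the update:

-- What is the long-running update session waiting on right now?
select sid, event, wait_class, state, seconds_in_wait
  from v$session
 where sid = 123;   -- placeholder: SID of the updating session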

I have run into exactly this scenario, and it is an awkward situation. First, a long wait (even a whole night) may still produce no result. Worse, you hardly dare to cancel the operation, because Oracle must then roll back everything the update has already done. After about an hour, I gave up.
 
 

 

update t_target set owner = to_char(length(owner))

ORA-01013: user requested cancel of current operation

(cancelled after nearly an hour, still unfinished)

 

 

What followed was a rollback wait of roughly the same length. In general, the longer a transaction has run, the longer its rollback takes. During this period, you can query the x$ktuxe internal table to observe and estimate the rollback progress. In this phase, all we can do is wait.
 
 

 

SQL> select KTUXESIZ from x$ktuxe where KTUXESTA <> 'INACTIVE';

 

KTUXESIZ

----------

62877

(......)

 

SQL> select KTUXESIZ from x$ktuxe where KTUXESTA <> 'INACTIVE';

 

KTUXESIZ

----------

511

 

 

The typical outcome of this strategy: complaints from colleagues (because their jobs are affected), anxiety (you cannot tell how far the statement has progressed), and resource exhaustion (CPU, memory, or I/O maxed out). In a production environment, you also have to answer for the impact on the business.
 
Let us analyze the problems with this strategy in detail.

First, to be fair, this method has the virtues of simplicity and, in one narrow sense, efficiency. Compared with the other methods described in this article, it requires the least code. It also hands the entire workload to the database SQL engine at once, letting the system drive one resource dimension (CPU, I/O, or memory) to its maximum.
 
If the table is small and the scope of the change is limited, this method is appropriate and worth considering.

On the other hand, we need to look at the other side of an Oracle update: the Undo, Redo, and process workload it generates. Oracle users know that Undo and Redo are central to DML operations. When we update or delete data, the "before image" of each modified block is saved in the undo tablespace. Note that the undo tablespace is a special tablespace that must be stored on disk. Undo exists mainly to support "consistent read" in other sessions of the database: as long as the transaction has neither committed nor rolled back, its undo data is retained in the database and cannot be overwritten.
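While such a transaction is running, the undo it has accumulated can be watched from another session. A sketch using the standard v$transaction and v$session views (not a query from this article):

-- Undo blocks and undo records consumed by each active transaction
select s.sid, s.username, t.used_ublk, t.used_urec
  from v$transaction t
  join v$session s on s.taddr = t.addr;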
 
Redo records the "after image" of DML operations. The amount of redo generated is proportional to the volume of data we modify, and unless we resort to the nologging option, the redo for the records to be modified is unavoidable. If the individual redo log members are small, Oracle generates redo faster than the logs can absorb it, the redo groups switch at a high frequency, and the system faces large numbers of "log file switch" and "log buffer space" related wait events.
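The log-switch pressure described here can be confirmed by counting switches per hour in v$log_history. A sketch:

-- Log switches per hour for the current day
select to_char(first_time, 'YYYY-MM-DD HH24') as hour,
       count(*) as switches
  from v$log_history
 where first_time >= trunc(sysdate)
 group by to_char(first_time, 'YYYY-MM-DD HH24')
 order by 1;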
 
If we choose the first method, the undo tablespace becomes the big bottleneck. A large amount of before-image data piles up in the undo tablespace and cannot be released, which keeps forcing the undo datafiles to grow. If the undo datafiles cannot grow (autoextend = no), the DML operation will eventually fail with an error. Even if growth is allowed, DBWR has to write large amounts of data to disk. This is why, during a massive update, we see many DBWR write waits among the wait events: those writes are by no means only to the table being updated; many of them are writes to the undo tablespace.
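Before attempting such an update, it is worth checking how much room the undo tablespace has to grow. A sketch; UNDOTBS1 is a common default name and may differ on your system:

-- Current size and autoextend headroom of the undo tablespace datafiles
select file_name,
       bytes/1024/1024    as current_mb,
       autoextensible,
       maxbytes/1024/1024 as max_mb
  from dba_data_files
 where tablespace_name = 'UNDOTBS1';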
 
At the same time, such a long-running operation can push Oracle and the operating system toward their load limits, and strange things may happen: processes dying, connections being dropped, and so on.

The biggest problem with this method, however, is the rollback. If an exception occurs partway through a long transaction, typically a data error, the whole change must be rolled back. Rollback is how Oracle protects itself and preserves transaction integrity. When a long-running DML update is interrupted, Oracle enters its rollback phase and continues until it finishes; during that time the system runs slowly. Even restarting the server does not help: the rollback must still run to completion.
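If the rollback ends up being performed by recovery after an instance restart, its progress can be observed in v$fast_start_transactions. The query below is only a sketch:

-- Progress of transaction rollback handled by recovery
select state, undoblocksdone, undoblockstotal
  from v$fast_start_transactions;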
 
Therefore, this method must be used with great caution on large tables. At the very least, evaluate the risk first.

The remaining strategies are discussed on the following pages.

