Introduction to Using Subqueries in UPDATE Statements


One, basic knowledge
1, correlated and uncorrelated subqueries

In an uncorrelated subquery, the inner query executes only once and returns its value to the outer query, which then uses that value in its own processing. In a correlated subquery, the inner query executes once for each row returned by the outer query, and information flows in both directions: each row of the outer query passes a value to the subquery, the subquery runs for that row and returns its records, and the outer query then makes its decision based on the records returned.

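For example:

SELECT O1.CustomerID, O1.OrderID, O1.OrderDate
FROM Orders O1
WHERE O1.OrderDate = (SELECT MAX(OrderDate)
                      FROM Orders O2
                      WHERE O2.CustomerID = O1.CustomerID)

is a correlated subquery: the inner query references O1.CustomerID, so it is evaluated for each row of the outer query.

SELECT O1.CustomerID, O1.OrderID, O1.OrderDate
FROM Orders O1
WHERE O1.OrderDate IN
      (SELECT TOP 2 O2.OrderDate
       FROM Orders O2
       WHERE O2.CustomerID = O1.CustomerID)
ORDER BY CustomerID

is also a correlated subquery (written in SQL Server syntax, hence TOP), because the inner query again references O1.CustomerID. A subquery is uncorrelated only if it makes no reference to the outer query and could be run on its own.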
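
For contrast, here is a minimal sketch of a subquery that makes no reference to the outer query, reusing the Orders table from the examples above. The inner query can be run by itself and conceptually needs to be evaluated only once, after which its single value is compared against every row of the outer query.

```sql
-- The inner SELECT does not mention any column of the outer query,
-- so its result (the latest order date overall) is the same for every row.
SELECT O1.CustomerID, O1.OrderID, O1.OrderDate
FROM Orders O1
WHERE O1.OrderDate = (SELECT MAX(O2.OrderDate)
                      FROM Orders O2);
```

This returns the orders placed on the most recent order date across all customers, rather than the latest order of each customer.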

2, Hints

Normally the Oracle optimizer decides the execution path of a statement, whether it is using the rule-based or the cost-based approach. The path it chooses is not necessarily the best one, so Oracle provides a mechanism called hints. A hint lets the programmer choose the execution path and instructs the optimizer to execute the current statement accordingly, which can perform better than the optimizer's own decision.

Programmers can use hints to influence the optimizer's decisions. A hint can specify the following:


- the optimization approach for a SQL statement;
- the goal of the cost-based optimizer for a SQL statement;
- the access path for a table accessed by the statement;
- the join order for a join statement;
- the join operation in a join statement.

To make the optimizer do what the programmer wants, put the hint in the statement. A hint has limited scope: it applies only to the statement block in which it appears. Hints can be specified in the following statement blocks:

- a simple SELECT, UPDATE, or DELETE statement;
- the parent statement or a subquery of a complex statement;
- one part of a compound query (for example, a UNION).

A hint is written as an ordinary comment with a "+" added at its start. The syntax is as follows:

[SELECT | DELETE | UPDATE] /*+ [hint | text] */

Or

[SELECT | DELETE | UPDATE] --+ [hint | text]

Note: the "+" must follow "/*" immediately, with no space in between; likewise "--+" must be written without spaces.

Warning: if a hint is not written correctly, Oracle ignores it without raising an error.

The common hints are:

ORDERED: join the tables in the order in which they appear in the FROM clause
USE_NL: join the specified tables using nested loops
USE_HASH: join the specified tables using a hash join
USE_MERGE: join the specified tables using a sort-merge join
PUSH_SUBQ: evaluate uncorrelated subqueries ahead of time
INDEX: force the use of a specified index
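
As an illustration of how such hints are placed, here is a minimal sketch that reuses the Orders table from the earlier examples. The index name orders_cust_idx and the bind variables :cust_id and :order_date are assumptions made for this example and do not come from the original text.

```sql
-- Assumption: an index named orders_cust_idx exists on Orders (CustomerID).
-- The INDEX hint names the table alias first, then the index.
select /*+ index(o orders_cust_idx) */ o.OrderID, o.OrderDate
  from Orders o
 where o.CustomerID = :cust_id;

-- ORDERED joins in FROM-clause order (o1 first);
-- USE_NL(o2) asks for o2 to be joined with nested loops.
select /*+ ordered use_nl(o2) */ o1.OrderID, o2.OrderID as other_order_id
  from Orders o1, Orders o2
 where o2.CustomerID = o1.CustomerID
   and o1.OrderDate = :order_date;
```

When a table has an alias, the hint must use the alias rather than the table name; otherwise the hint is ignored.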

3, Execution plan

In the SQL Window of PL/SQL Developer, select a SQL statement with the mouse or keyboard and press F5; a window showing the parsed execution plan appears.
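
Outside PL/SQL Developer, a plan can also be obtained with Oracle's EXPLAIN PLAN statement and the DBMS_XPLAN package. The following is a minimal sketch, assuming the Orders table from the earlier examples and a plan table available to the current user.

```sql
-- Store the optimizer's plan for the statement without executing it.
explain plan for
select o1.CustomerID, o1.OrderID, o1.OrderDate
  from Orders o1
 where o1.OrderDate = (select max(o2.OrderDate)
                         from Orders o2
                        where o2.CustomerID = o1.CustomerID);

-- Display the most recently explained plan.
select * from table(dbms_xplan.display);
```

The same approach works for UPDATE statements, which makes it easy to compare the plan with and without a hint.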

4, characteristics of UPDATE

For how the system processes an update internally, see the attachment: An internal analysis of the update transaction.doc

The basic points of using UPDATE are:

1 Use an index on the table being updated wherever possible, and avoid unnecessary updates.
2 Keep the time spent fetching the source data for the update as short as possible. If that cannot be done, insert the source data into an intermediate table, index the intermediate table, and then update from it.
3 If the column being updated is a primary key, deleting the row and inserting it again is recommended.

5, tables used in the examples

The discussion below is based on the following tables:

create table tab1 (workdate varchar2(8), cino varchar2(), val1 number, val2 number);
create table tab2 (workdate varchar2(8), cino varchar2(), val1 number, val2 number);
create table tab3 (workdate varchar2(8), cino varchar2(), val1 number, val2 number);
create table tab4 (workdate varchar2(8), cino varchar2(), val1 number, val2 number);

workdate and cino are the key columns of these tables; by default, no primary key or index is created on them.

Two, two kinds of update

There are two scenarios when updating a table with UPDATE: setting columns from a correlated subquery, and limiting the scope of the update with an uncorrelated subquery. Any other case is a combination of these two.

1, updating columns from a correlated subquery

update tab1 t
set (val1, val2) = (select val1, val2
                    from tab2
                    where workdate = t.workdate
                      and cino = t.cino);

This updates the corresponding columns of tab1 from tab2. When the statement executes, the system reads tab1 row by row and, for each row, uses the correlated subquery to find the values to set. Optimization therefore depends on whether the correlated subquery can quickly find the matching record using the values from tab1, so tab2 generally needs a unique index or a highly selective ordinary index. The execution time is approximately (time to read one record from tab1 + time to look up one record in tab2) × number of records in tab1.
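
A minimal sketch of the index this calls for, using the example tables; the index name tab2_ind01 is hypothetical. The update is repeated with an added EXISTS condition, which is not part of the original statement: without a WHERE clause, a correlated update sets val1 and val2 to NULL for every tab1 row that has no match in tab2.

```sql
-- Hypothetical index name. A unique index also guarantees that the
-- subquery returns at most one row for each tab1 row.
create unique index tab2_ind01 on tab2 (workdate, cino);

-- Same update as above, restricted to tab1 rows that have a match in tab2,
-- so that unmatched rows keep their current values.
update tab1 t
   set (val1, val2) = (select val1, val2
                         from tab2
                        where workdate = t.workdate
                          and cino = t.cino)
 where exists (select 1
                 from tab2
                where workdate = t.workdate
                  and cino = t.cino);
```

If (workdate, cino) is not unique in tab2, create an ordinary index instead; the update will then fail whenever the subquery returns more than one row for a tab1 row.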

If the subquery condition is more complex, such as the following statement:

update tab1 t
set (val1, val2) = (select val1, val2
                    from tab2 tt
                    where exists (select 1
                                  from tab3
                                  where workdate = tt.workdate
                                    and cino = tt.cino)
                      and workdate = t.workdate
                      and cino = t.cino);

The time spent in the subquery for each tab1 record multiplies, and if tab1 has many records, this update statement will almost never finish.

The solution is to extract the subquery into an intermediate table, index that table, and use it in place of the subquery, which greatly improves speed:

insert into tab4
select workdate, cino, val1, val2
from tab2 tt
where exists (select 1
              from tab3
              where workdate = tt.workdate
                and cino = tt.cino);

create index tab4_ind01 on tab4 (workdate, cino);

update tab1 t
set (val1, val2) = (select val1, val2
                    from tab4 tt
                    where workdate = t.workdate
                      and cino = t.cino);

2, limiting the scope of the update with an uncorrelated subquery

update tab1 t
set val1 = 1
where (workdate, cino) in (select workdate, cino from tab2);

This updates the val1 column of the tab1 records that fall within the range of data supplied by tab2.

In this case, the system usually executes the subquery select workdate, cino from tab2 first and forms an internal system view from the result. It then reads tab1 record by record and searches the system view for the matching (workdate, cino) combination: if one exists, the tab1 record is updated; otherwise the next record is read. The total time is therefore approximately: subquery time + (time to read one record from tab1 + time to scan the system view for one record) × number of records in tab1. The time to scan the system view varies with the size of tab2. If tab2 has few records, the system can read the table straight into the system area. If tab2 has many records, the system cannot form the system view and instead runs the subquery again for every record it updates, which is very slow.

There are two optimizations for this situation.

1) Create an index on the workdate and cino columns of tab1, and add a hint.

The SQL statement is modified as follows:

update /*+ ordered use_nl(sys, t) */ tab1 t
set val1 = 1
where (workdate, cino) in (select workdate, cino from tab2);

Here sys stands for the system view. Without the ORDERED hint, the system takes tab1 as the driving table by default, which means a full table scan of tab1. With the hint, the system view (that is, select workdate, cino from tab2) becomes the driving table, and under normal circumstances the speed improves considerably.

2) Create an index on the workdate and cino columns of tab2, and rewrite the SQL statement:

update tab1 t
set val1 = 1
where exists (select 1
              from tab2
              where workdate = t.workdate
                and cino = t.cino);

Three, indexing problems

Index usage in UPDATE statements has a peculiarity: a statement can appear to use the whole composite index while actually using only part of it. It is therefore recommended to write the columns of a composite index together in the condition.

For example:

update /*+ ordered use_nl(sys, t) */ tab1 t
set val1 = 1
where cino in (select cino from tab2)
  and workdate = '200506';

This statement cannot make full use of the composite index on tab1 (workdate, cino); only the workdate = '200506' condition can use it.

Written as follows, there is no problem:

update /*+ ordered use_nl(sys, t) */ tab1 t
set val1 = 1
where (workdate, cino) in (select workdate, cino from tab2);
