Usage of bulkcollect in Oracle

Source: Internet
Author: User
A key improvement of FORALL statements, which greatly simplifies code and significantly improves performance for programs that want to update many rows of data in PLSQL programs.

A key improvement of FORALL statements, which greatly simplifies code and significantly improves performance for programs that want to update many rows of data in PL/SQL programs.

A key improvement of FORALL statements, which greatly simplifies code and significantly improves performance for programs that want to update many rows of data in PL/SQL programs.
1:
Use FORALL to enhance DML processing capabilities
Oracle introduces two new data manipulation language (DML) Statements for PL/SQL in Oracle8i: BULK COLLECT and FORALL. These two statements perform an array processing inside PL/SQL.
Bulk collect provides high-speed data retrieval. FORALL can greatly improve the performance of INSERT, UPDATE, and DELETE operations. Oracle databases use these statements to greatly reduce
PL/SQL and SQL statement execution engine environment switching times, thus significantly improving its performance.
With bulk collect, you can introduce multiple rows into one or more sets, instead of individual variables or records. The following BULKCOLLECT instance includes
All books with "PL/SQL" are retrieved and placed in an associated array of records, all of which are located in a single channel to the database.
DECLARE
TYPE books_aat
Is table of book % ROWTYPE
Index by PLS_INTEGER;
Books books_aat;
BEGIN
SELECT *
Bulk collect into book
FROM books
WHERE title LIKE '% PL/SQL % ';
...
END;
Similarly, FORALL transfers data from a PL/SQL set to a specified table using the set. The following code example provides a process for receiving a nested table of Book Information and
Insert all contents of the Set (bind array) into the book table. Note: This example also uses the enhanced FORALL function of Oracle9i to insert a record directly into the table.
Bulk collect and FORALL are both very useful. They not only improve performance, but also simplify the code written for SQL operations in PL/SQL. The following multi-row forall insert is equivalent
Clearly explains why PL/SQL is considered the best programming language for Oracle databases.
Create type books_nt
Is table of book % ROWTYPE;
/
Create or replace procedure add_books (
Books_in IN books_nt)
IS
BEGIN
FORALL book_index
IN books_in.FIRST .. books_in.LAST
Insert into book
VALUES books_in (book_index );
...
END;
However, before the Oracle database is 10 Gb, there is an important restriction to use the set IN the FORAll mode: the database reads the content of the set from the first row to the last row IN the IN range clause.
. If an undefined row is encountered within this range, the Oracle database raises a ORA-22160 exception event:
ORA-22160: element at index [N] does notexist
For simple FORALL applications, this rule will not cause any trouble. However, if you want to make full use of FORALL, you need to fill any FORALL driver array in sequence.
Can increase the complexity of the program and reduce the performance.
In Oracle Database 10 Gb, PL/SQL now provides two new clauses in the FORALL statement: INDICES OF and VALUES OF. They enable you to carefully select the extension in the driver array.
DML statement.
When the bound array is a sparse array or contains gaps, indices of is very useful. The syntax structure of this statement is:
FORALL indx IN INDICES
OF sparse_collection
Insert into my_table
VALUES sparse_collection (indx );
Values of is used in a different situation: the bound array can be a sparse array or not, but I only want to use a subset OF the elements in the array. Then I can use VALUES
OF to point to the value that I want to use in the DML operation. The syntax structure of this statement is:
FORALL indx in values of pointer_array
Insert into my_table
VALUES binding_array (indx );
Use FORALL instead of FOR Loop
Suppose I need to write a program to raise a salary for a qualified employee (determined by the comp_analysis.is_eligible function), write a report on the employee who does not meet the raise condition, and write it
The employee_history table. I work in a very large company, and we have a lot of employees.
This is not a very difficult task for a PL/SQL developer. I can do this without even using BULKCOLLECT or FORALL, as shown in Listing 1.
, I use a CURSORFOR loop and a separate INSERT and UPDATE statement. This code is concise and clear; unfortunately, it took me 10 minutes to run the code, my "old-fashioned" Method
It takes 30 minutes or longer to run.
Listing 1:
Create or replace PROCEDUREgive_raises_in_department (
Dept_in employee. department_id % TYPE
, Newsal IN employee. salary % TYPE
)
IS
CURSOR emp_cur
IS
SELECT employee_id, salary, hire_date
FROM employee
WHERE department_id = dept_in;
BEGIN
FOR emp_rec IN emp_cur
LOOP
IF comp_analysis.is_eligible (emp_rec.employee_id)
THEN
UPDATE employee
SET salary = newsal
WHERE employee_id = emp_rec.employee_id;
ELSE
Insert into employee_history
(Employee_id, salary
, Hire_date, activity
)
VALUES (emp_rec.employee_id, emp_rec.salary
, Emp_rec.hire_date, 'raise DENIED'
);
End if;
End loop;
END give_raises_in_department;
Fortunately, my company's database has been upgraded to Oracle9i. Fortunately, at the recent Oracle Seminar (and a very good demonstration provided by the Oracle technology website), I understand
To the batch processing method. So I decided to rewrite the program using the set and batch processing methods. The program is shown in List 2.
Listing 2:
1 create or replace PROCEDUREgive_raises_in_department (
2 dept_in IN employee. department_id % TYPE
3, newsal IN employee. salary % TYPE
4)
5 IS
6 TYPE employee_aat is table of employee. employee_id % TYPE
7 index by PLS_INTEGER;
8 TYPE salary_aat is table of employee. salary % TYPE
9 index by PLS_INTEGER;
10 TYPE hire_date_aat is table of employee. hire_date % TYPE
11 index by PLS_INTEGER;
12
13. employee_ids employee_aat;
14 salaries salary_aat;
15 hire_dates hire_date_aat;
16
17 approved_employee_ids employee_aat;
18
19 denied_employee_ids employee_aat;
20 denied_salaries salary_aat;
21 denied_hire_dates hire_date_aat;
22
23 PROCEDURE retrieve_employee_info
24 IS
25 BEGIN
26 SELECT employee_id, salary, hire_date
27 bulk collect into employee_ids, salaries, hire_dates
28 FROM employee
29 WHERE department_id = dept_in;
30 END;
31
32 PROCEDURE partition_by_eligibility
33 IS
34 BEGIN
35 FOR indx IN employee_ids.FIRST... employee_ids.LAST
36 LOOP
37 IF comp_analysis.is_eligible (employee_ids (indx ))
38 THEN
39 approved_employee_ids (indx): = employee_ids (indx );
40 ELSE
41 denied_employee_ids (indx): = employee_ids (indx );
42 denied_salaries (indx): = salaries (indx );
43 denied_hire_dates (indx): = hire_dates (indx );
44 end if;
45 end loop;
46 END;
47
48 PROCEDURE add_to_history
49 IS
50 BEGIN
51 FORALL indx IN denied_employee_ids.FIRST .. denied_employee_ids.LAST
52 insert into employee_history
53 (employee_id
54, salary
55, hire_date, activity
56)
57 VALUES (denied_employee_ids (indx)
58, denied_salaries (indx)
59, denied_hire_dates (indx), 'raise DENIED'
60 );
61 END;
62
63 PROCEDURE give_the_raise
64 IS
65 BEGIN
66 FORALL indx IN approved_employee_ids.FIRST .. approved_employee_ids.LAST
67 UPDATE employee
68 SET salary = newsal
69 WHERE employee_id = approved_employee_ids (indx );
70 END;
71 BEGIN
72 retrieve_employee_info;
73 partition_by_eligibility;
74 add_to_history;
75 give_the_raise;
76 END give_raises_in_department;
By taking a look at listing 1 and Listing 2, you will be aware that switching to collections and batch processing methods will increase the amount of code and complexity. However, if you need to significantly improve the performance, it is worth
. Next, let's not look at these codes. Let's take a look at what we use to process the conditional logic in the CURSORFOR loop when we use FORALL.
Define Collection types and sets
In Listing 2, the first part of the declaration section (rows 6th to rows 11th) defines several different set types that correspond to the columns I will retrieve from the employee table. I prefer employee %
ROWTYPE is used to declare a set type, but FORALL does not support operations on certain record sets. In such records, I will reference individual fields. Therefore, I must also be an employee ID,
The salary and employment date respectively declare their respective sets.
Next, declare the required set for each column (13th rows to 21st rows ). First, define the set corresponding to the queried column (13th rows to 15th rows ):
Employee_ids employee_aat;
Salaries salary_aat;
Hire_dates hire_date_aat;
Then I need a new set to store the ID of the employee who has been granted a raise (Row 1 ):
Approved_employee_ids employee_aat;
Finally, I declare a set in each column (rows 19th to rows 21st) to record employees who are not eligible for a raise:
Denied_employee_ids employee_aat;
Denied_salaries salary_aat;
Denied_hire_dates hire_date_aat;
Deep understanding of code
After the data structure is determined, we will skip the execution part of the program (lines 72nd to 75th) to learn how to use these sets to accelerate the process.
Retrieve_employee_info;
Partition_by_eligibility;
Add_to_history;
Give_the_raise;
I wrote this program using the progressive refinement method (also known as "Top-Down Design "). So the execution part is not very long, it is not hard to understand, there are only four rows, each step in the process by name
Line description. First, retrieve the employee information (all employees in the specified department ). Then, the employees will be divided into salary increases and employees who do not receive salary increases. After that, I can
Add the employee to the employee history table to raise the salary for other employees.
Writing code in this way greatly enhances the readability of the final result. So I can go deep into any part of the program that makes sense to me.
With the declared set, I can now use BULK COLLECT to retrieve employee information (lines 23rd to 30th ). This part effectively replaces the cursor for loop. So far, count
Data is loaded to the collection.
The Division logic (32nd to 46th rows) requires that each row in the just-filled set be checked to see if it meets the salary increase conditions. If yes, I will fill the employee ID from the query
Copy the set to the set of qualified employees. If it does not match, copy the employee ID, salary, and employment date, because these need to be inserted into the employee_history table.
The initial data is now divided into two sets, which can be used as the drive for two different FORALL statements (starting from lines 51st and 66th. A collection of unqualified employees
Data in the batch is inserted into the employee_history (add_to_history) table, and the information of qualified employees is updated in the employee table in batch through the give_the_raise process.
Finally, let's take a closer look at add_to_history (lines 48th to 61st) to end the analysis of this re-compiled program. The FORALL Statement (row 51st) contains an IN clause.
It specifies the range of row numbers to be used for batch INSERT. In the second rewrite of the program, I will call the set used to define the range "driver set ". However
In this version of add_to_history, I simply assume that all rows defined in denied_employee_ids are used. In INSERT itself, there are three sets of unqualified employees
These collections are called "data sets ". We can see that the driver set does not need to match the data set. When learning the new features of Oracle Database 10 Gb, this is a concern
Key.
As a result, the number of rows in Listing 2 is about twice the number of rows in Listing 1, but the code in Listing 2 runs within the required time. Before using the Oracle database for 10 Gb, in this case,
I am only happy to be able to run code and start the next task at this time.
However, with the latest PL/SQL version in the Oracle Database 10 Gb, I can now make more improvements in performance, readability, and code size.
Use values of in this process
In the Oracle Database 10 Gb, you can specify a subset of rows in the driver set used by the FORALL statement. You can use either of the following two methods to define the subset:
Match the row number in the dataset with the row number in the driver set. You need to use the indices of clause.
Match the row number in the dataset with the value found in the row defined in the driver set. This requires the values of clause.
In the second and last rewriting OF give_raises_in_department, I will use the VALUES OF clause. Listing 3 contains all the code for this version. I will skip this program
The same as the previous version.
Starting from the declaration set, please note that I no longer define another set to store qualified and unqualified employee information, but in listing 3 (rows 17th to 21st) declare two "Bootstrap" sets in:
One is used for employees who meet the salary increase requirements, and the other is used for employees who do not meet the salary increase requirements. The Data Types of these two sets are Boolean. We will soon see the data types of these sets.
It has nothing to do with the FORALL statement. The FORALL statement only defines the rows. Percentage of three execution methods of give_raises_in_department with 50 000 rows in the employee table
Time used for executing the Method
Cursor for loop 00:00:38. 01
Batch Processing before 10 Gb in Oracle Database 00:00:06. 09
Batch Processing of Oracle Database 10 Gb 00:00:02. 06
When the three execution methods of give_raises_in_department with 100,000 rows of data are used in the employee table
Cursor for loop 00:00:58. 01
Batch Processing before 10 Gb in Oracle Database 00:00:12. 00
Batch Processing of Oracle Database 10 Gb 00:00:05. 05
Table 1: Test results when processing 50,000 rows and 100,000 rows of data
The retrieve_employee_info subprogram is the same as the previous one, but the data is divided in a completely different way (32nd to 44th rows ). I did not copy records from one set to another
A collection (this operation is relatively slow), but only determines the rows in the corresponding Bootstrap set that match the row number in the employee ID set (by specifying a TRUE value for it ).
You can now use the approved_list and denied_list collections as the driver set in two different FORALL statements (starting from row 49th and row 65th.
To insert data to the employee_history table, I use the following statement:
FORALL indx in values of denied_list
In order to update (give employees a raise), I use this format:
FORALL indx in values of approved_list
In the two DML statements, the data set is the initial set filled in the bulk collect search step; no replication has been performed. Using values of, the Oracle database
To filter the rows in the set. Only the rows whose row number matches the row number in the driver set are used.
By using VALUESOF in this program, you can avoid copying all records, but replace them with a simple list of row numbers. Overhead of copying large arrays
Is very impressive. To test the superiority of Oracle Database 10 Gb, I load the employee table and run the test on the data of rows 50,000 and 100,000. To simulate more practical situations
I modified the execution method of batch processing before the Oracle Database 10 Gb to replicate the set content multiple times. Then I use SQL * Plus SET TIMING ON to display the running items
The time used for different execution methods. Table 1 shows the results.
The conclusion obtained from these time measurements is very clear: changing from a single DML statement to batch processing will greatly shorten the consumption time. When the data is 50,000 rows, the time is reduced from 38 seconds to 6 seconds, and the data is
When the number of rows is 100,000, the value is reduced from 58 seconds to 12 seconds. Furthermore, we can avoid copying data by using values of, which can be reduced by about half.
Even if there is no performance improvement, VALUESOF and Its similar clauses-INDICESOF also improve the flexibility of PL/SQL language, so that developers can more easily write more intuitive and easier
Maintenance code.
PL/SQL is a mature and powerful language for product life. Therefore, many of its new features are gradually increased and improved. However, these new features make applications
The program performance and development efficiency have changed significantly. Values of is a good example OF this feature.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.