The Bulk Collect feature was introduced in Oracle8i. It allows PL/SQL to fetch query results in batches, which can significantly improve query performance in some cases. Below we run a few simple tests on this feature and analyze the results.
1. First, create a table and insert 100,000 records.
Execute the following script in SQL*Plus:
Drop table empl_tbl
/
Create table empl_tbl (last_name  varchar2(20),
                       first_name varchar2(10),
                       salary     number(10))
/
Begin
  For i in 3000 .. 102999 loop
    Insert into empl_tbl (last_name, first_name, salary)
      values ('Carl' || (i - 3000), 'wu' || (103000 - i), i);
  End loop;
End;
/
Commit
/
Select count(*) from empl_tbl
/
2. Use three methods to count the number of distinct values in one column of the table
2.1 Using a regular Distinct query
SQL> select count (distinct last_name) "Distinct Last Name" from empl_tbl;
Distinct Last Name
------------------
100000
Executed in 0.36 seconds
We can see that the regular Distinct query takes 0.36 seconds to find that the table contains 100,000 distinct last_name values.
2.2 Using a cursor
Run the following block to count the number of distinct last_name values:
Declare
  All_rows       number(10);
  Temp_last_name empl_tbl.last_name%type;
Begin
  All_rows := 0;
  Temp_last_name := '';  -- note: '' is NULL in Oracle
  For cur in (select last_name from empl_tbl order by last_name) loop
    -- the IS NULL test handles the first row, whose comparison with NULL would otherwise fail
    If Temp_last_name is null or cur.last_name != Temp_last_name then
      All_rows := All_rows + 1;
    End if;
    Temp_last_name := cur.last_name;
  End loop;
  Dbms_output.put_line('All_rows are ' || All_rows);
End;
Note that the code above uses a cursor FOR loop; to keep the program readable, we do not explicitly declare a cursor variable (an equivalent version with an explicit cursor is sketched after the timing below).
Execution result:
All_rows are 100000
PL/SQL procedure successfully completed
Executed in 1.402 seconds
The cursor approach takes about 1.4 seconds to find the 100,000 distinct last_name values, more than three times as long as the Distinct query.
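For readers who prefer an explicitly declared cursor, the loop above can also be written as follows. This is only an illustrative sketch of the equivalent form, not part of the original timing test:
Declare
  -- Explicit cursor equivalent to the cursor FOR loop above
  Cursor last_name_cur is
    select last_name from empl_tbl order by last_name;
  Curr_last_name empl_tbl.last_name%type;
  Temp_last_name empl_tbl.last_name%type;
  All_rows       number(10) := 0;
Begin
  Open last_name_cur;
  Loop
    Fetch last_name_cur into Curr_last_name;
    Exit when last_name_cur%notfound;
    If Temp_last_name is null or Curr_last_name != Temp_last_name then
      All_rows := All_rows + 1;
    End if;
    Temp_last_name := Curr_last_name;
  End loop;
  Close last_name_cur;
  Dbms_output.put_line('All_rows are ' || All_rows);
End;
/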
2.3 Using a Bulk Collect batch query
The sample code is as follows:
Declare
  All_rows number(10);
  -- First, define an index-by table type.
  Type last_name_tab is table of empl_tbl.last_name%type index by binary_integer;
  -- Then declare a collection variable of that index-by table type.
  Last_name_arr last_name_tab;
  Temp_last_name empl_tbl.last_name%type;
Begin
  All_rows := 0;
  Temp_last_name := '';  -- note: '' is NULL in Oracle
  -- Use a Bulk Collect query to fill the collection variable in a single fetch
  Select last_name bulk collect into Last_name_arr from empl_tbl;
  For i in 1 .. Last_name_arr.count loop
    If Temp_last_name is null or Temp_last_name != Last_name_arr(i) then
      All_rows := All_rows + 1;
    End if;
    Temp_last_name := Last_name_arr(i);
  End loop;
  Dbms_output.put_line('All_rows are ' || All_rows);
End;
Note that in the code above we first define an index-by table type, last_name_tab, then declare a collection variable, last_name_arr, of that type, and finally use a Bulk Collect query to fill last_name_arr in a single statement. Pay attention to the syntax of the bulk collect into clause.
Execution result:
All_rows are 100000
PL/SQL procedure successfully completed
Executed in 0.28 seconds
From the execution results above, we can see that the Bulk Collect query takes only 0.28 seconds to find the 100,000 distinct last_name values: about one fifth of the time taken by the cursor, and even faster than the ordinary Distinct query.
3. Test Result Analysis
Why do we get these results? We can use Oracle's SQL_TRACE facility to analyze them. In a SQL*Plus session, enable tracing with alter session set sql_trace = true, execute the three queries above, and then format the resulting trace file into a report with the TKPROF tool.
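A minimal sketch of the steps follows; the trace file name below is a placeholder, and the actual file will be found in the directory pointed to by user_dump_dest:
-- In SQL*Plus: turn tracing on for the current session
alter session set sql_trace = true;

-- ... run the three test queries here ...

alter session set sql_trace = false;

-- From the operating system prompt: format the raw trace file with TKPROF
-- (ora_12345.trc is a hypothetical name; use the trace file your session produced)
tkprof ora_12345.trc bulk_test.prf sys=no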
3.1 Regular Distinct query result analysis
********************************************************************************
select count (distinct last_name)
from
 empl_tbl

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.28       0.32        198        425          4           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.28       0.32        198        425          4           1

Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 62

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT GROUP BY
 100000   TABLE ACCESS FULL EMPL_TBL
********************************************************************************
The query above first performs a full table scan of empl_tbl and then sorts and groups the results. The time spent on SQL parsing and execution is negligible; the main cost is reading the data, since this SQL is a pure query with no insert or update work. During the fetch phase, 198 Oracle data blocks are read from disk and 425 blocks are read in consistent mode (the "query" column, i.e. consistent gets). Because physical disk reads are expensive, the query is not particularly fast.
3.2 Cursor query result analysis
********************************************************************************
SELECT LAST_NAME
FROM
 EMPL_TBL order by LAST_NAME

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch   100001      0.71       0.62        198        425          4      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   100002      0.71       0.62        198        425          4      100000

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 62     (recursive depth: 1)
********************************************************************************
This method also spends most of its time in the data-reading phase. Although the number of disk blocks read and the number of consistent gets are the same as for the Distinct query, the cursor loop runs 100,001 times, so the statement above is fetched 100,001 times to return a total of 100,000 rows. That is why the cursor approach takes about 1.4 seconds. Now let us see what happens with Bulk Collect.
3.3 Bulk Collect query efficiency analysis
********************************************************************************
SELECT LAST_NAME
FROM
 EMPL_TBL

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.08       0.07          0        425          0      100000
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.08       0.07          0        425          0      100000

Misses in library cache during parse: 0
Optimizer goal: CHOOSE
Parsing user id: 62     (recursive depth: 1)
********************************************************************************
Although this method also returns 100,000 rows, it does so in a single fetch call: the Bulk Collect statement pulls all of the required data into a PL/SQL collection in memory at once. In this run the disk column is also 0, because the data blocks were already in the buffer cache from the earlier tests, so no physical reads were needed. With only one fetch and no physical I/O, this method outperforms the other two and executes the fastest.
4. Conclusion
From the tests and analysis above, we can see that a Bulk Collect batch query can improve query efficiency to a certain extent: it reads the required data into memory in one pass and then performs the statistics there. Note, however, that the fetched collection is held in the session's memory (the PGA); if the result set is very large relative to the memory available, holding the entire collection at once (and any paging that causes) can make the Bulk Collect approach slower than the other two methods. Interested readers can test this further.
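A common way to bound that memory use is to fetch from an explicit cursor with bulk collect ... limit, processing the data in fixed-size batches. The following is only a hedged sketch against the same empl_tbl table, with an arbitrarily chosen batch size of 1000; it was not part of the original tests:
Declare
  Type last_name_tab is table of empl_tbl.last_name%type index by binary_integer;
  Last_name_arr last_name_tab;
  Cursor c is select last_name from empl_tbl;
  All_rows number(10) := 0;
Begin
  Open c;
  Loop
    -- Fetch at most 1000 rows per round trip, so the collection stays small
    Fetch c bulk collect into Last_name_arr limit 1000;
    Exit when Last_name_arr.count = 0;
    -- Process the current batch here; for illustration we simply count the rows
    All_rows := All_rows + Last_name_arr.count;
  End loop;
  Close c;
  Dbms_output.put_line('Rows processed: ' || All_rows);
End;
/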
In addition to Bulk Collect batch queries, we can also use the FORALL statement to perform batch inserts, deletes, and updates, which can significantly improve the efficiency of bulk DML operations, as sketched below.
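The following is only an illustrative sketch of FORALL against the same table, using a hypothetical salary adjustment; it is not part of the original article's tests:
Declare
  Type name_tab is table of empl_tbl.last_name%type index by binary_integer;
  Names name_tab;
Begin
  -- Collect a (hypothetical) set of employees to give a raise to
  Select last_name bulk collect into Names from empl_tbl where rownum <= 100;
  -- FORALL sends all 100 updates to the SQL engine as a single batched call
  Forall i in 1 .. Names.count
    Update empl_tbl set salary = salary + 100 where last_name = Names(i);
  Dbms_output.put_line(SQL%rowcount || ' rows updated');
  Rollback;  -- undo the illustrative change
End;
/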
Author: "Java Walker"