Comparing explicit-transaction and CTE data insertion
Sometimes we need to construct some data for testing, as shown below:
IF OBJECT_ID(N'T14') IS NOT NULL
BEGIN
    DROP TABLE T14
END
GO
CREATE TABLE T14 (t14_id INT)
GO
DECLARE @i INT = 1
WHILE @i <= 1000
BEGIN
    INSERT INTO T14 (t14_id)
    SELECT @i
    SET @i = @i + 1
END
GO
Code-1
There is a problem here: every INSERT statement is implicitly committed as its own transaction. With a small data volume this is fine, but what if we want to insert 1 million, 2 million, 10 million, or even more rows? Since each INSERT commits on its own, wrapping the loop in an explicit transaction can significantly improve insert performance. Another approach is to use a CTE to insert all the data into the table in a single statement. Let's compare the performance of these two insertion methods. Before seeing the results, which do you guess is faster? Or are they about the same?
First, add transactions and insert 1 million records:
IF OBJECT_ID(N'T14') IS NOT NULL
BEGIN
    DROP TABLE T14
END
GO
CREATE TABLE T14 (t14_id INT)
GO
DBCC FREESESSIONCACHE
DBCC DROPCLEANBUFFERS
GO
SET NOCOUNT ON;
BEGIN TRAN
DECLARE @i INT = 1
WHILE @i <= 1000000
BEGIN
    INSERT INTO T14 (t14_id)
    SELECT @i
    SET @i = @i + 1
END
COMMIT TRAN;
SET NOCOUNT OFF;
GO
Code-2
I ran this multiple times on my machine and took the average: inserting 1 million records took about 22 seconds, which is quite fast. (How long would it take without the explicit transaction? Try it if you are interested.)
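The timings above were taken by hand. If you want the server to report elapsed time itself, a minimal sketch (using only standard T-SQL date functions; the variable name is my own) is to bracket the workload with timestamps inside one batch:

```sql
-- Minimal timing sketch: capture a start timestamp, run the workload,
-- then report elapsed milliseconds.
DECLARE @start DATETIME2 = SYSDATETIME();
-- ... place the insert loop or the CTE insert here ...
SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS elapsed_ms;
```

Alternatively, SET STATISTICS TIME ON makes SQL Server print CPU and elapsed time for each statement.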
The following uses CTE:
IF OBJECT_ID(N'T15') IS NOT NULL
BEGIN
    DROP TABLE T15
END
GO
CREATE TABLE T15 (t15_id INT)
GO
DBCC FREESESSIONCACHE
DBCC DROPCLEANBUFFERS
GO
WITH CTE1 AS (
    SELECT a.[object_id] FROM master.sys.all_objects AS a, master.sys.all_objects AS b
)
, CTE2 AS (
    SELECT ROW_NUMBER() OVER (ORDER BY [object_id]) AS row_no FROM CTE1
)
INSERT INTO T15 (t15_id)
SELECT row_no FROM CTE2 WHERE row_no <= 1000000
GO
Code-3
This figure is also an average over multiple runs: it actually completed in about 5 seconds, which did not surprise me! Now insert 10 million records and see how the results compare. For the former, just change 1000000 in Code-2 to 10000000 and run it. The latter needs two UNION ALLs, because CTE1 alone does not produce enough rows. The code is as follows:
IF OBJECT_ID(N'T15') IS NOT NULL
BEGIN
    DROP TABLE T15
END
GO
CREATE TABLE T15 (t15_id INT)
GO
DBCC FREESESSIONCACHE
DBCC DROPCLEANBUFFERS
GO
WITH CTE1 AS (
    SELECT a.[object_id] FROM master.sys.all_objects AS a, master.sys.all_objects AS b
    UNION ALL
    SELECT a.[object_id] FROM master.sys.all_objects AS a, master.sys.all_objects AS b
    UNION ALL
    SELECT a.[object_id] FROM master.sys.all_objects AS a, master.sys.all_objects AS b
)
, CTE2 AS (
    SELECT ROW_NUMBER() OVER (ORDER BY [object_id]) AS row_no FROM CTE1
)
INSERT INTO T15 (t15_id)
SELECT row_no FROM CTE2 WHERE row_no <= 10000000
GO
Code-4
Test results: the explicit-transaction version takes about three minutes, while the CTE completes in about one and a half minutes. CTE is clearly more efficient! During the test, memory usage stayed modest, but CPU usage rose noticeably. In addition, when inserting a large amount of data into a table, indexes on the table and the database's recovery model also affect insert performance.
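Regarding the note on the recovery model: you can check each database's current setting with a standard catalog query against sys.databases:

```sql
-- List each database and its current recovery model
-- (FULL, BULK_LOGGED, or SIMPLE), which affects how much
-- transaction log is written during large inserts.
SELECT name, recovery_model_desc
FROM sys.databases;
```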
------- Supplement -----
A note on generating rows in CTE1: if you only need around 1 million rows, a single cross join of master.sys.all_objects with itself is enough, and it is better to pick tables whose cross join lands close to the target row count than to stack UNION ALL several times. If you need 10 million or more rows, cross join an additional small table, for example:
;WITH CTE3 AS (
    SELECT a.[object_id]
    FROM master.sys.all_objects AS a, master.sys.all_objects AS b, master.sys.databases AS c
)
SELECT COUNT(*) AS counts, LEN(COUNT(*)) AS counts_length FROM CTE3
GO
Code-5
Figure-1
On my machine this generated over 100 million records.