MySQL NOT IN, LEFT JOIN ... IS NULL, and NOT EXISTS: notes on an efficiency problem

Last Update:2017-01-19 Source: Internet

Author: User

NOT IN, LEFT JOIN ... IS NULL, and NOT EXISTS: efficiency comparison

Statement one: SELECT COUNT(*) FROM A WHERE A.a NOT IN (SELECT a FROM B)

Statement two: SELECT COUNT(*) FROM A LEFT JOIN B ON A.a = B.a WHERE B.a IS NULL

Statement three: SELECT COUNT(*) FROM A WHERE NOT EXISTS (SELECT a FROM B WHERE A.a = B.a)
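
As a quick sanity check of the three statements, here is a minimal sketch that runs them against tiny in-memory tables. It uses Python's built-in sqlite3 module only so that it is self-contained; the sample rows and the NOT NULL constraints are assumptions for illustration, and the sketch says nothing about MySQL performance.

```python
import sqlite3

# Hypothetical sample data; A and B mirror the table names in the statements.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER NOT NULL);
    CREATE TABLE B (a INTEGER NOT NULL);
    INSERT INTO A VALUES (1), (2), (3), (4), (5);
    INSERT INTO B VALUES (2), (4), (4);
""")

STATEMENTS = {
    "not_in": "SELECT COUNT(*) FROM A WHERE A.a NOT IN (SELECT a FROM B)",
    "left_join": "SELECT COUNT(*) FROM A LEFT JOIN B ON A.a = B.a WHERE B.a IS NULL",
    "not_exists": "SELECT COUNT(*) FROM A WHERE NOT EXISTS (SELECT a FROM B WHERE A.a = B.a)",
}

# Run each form and collect the single COUNT(*) value it returns.
counts = {name: conn.execute(sql).fetchone()[0] for name, sql in STATEMENTS.items()}
print(counts)  # {'not_in': 3, 'left_join': 3, 'not_exists': 3}
```

On this data all three forms count the same three rows of A (1, 3 and 5) that have no match in B.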

I have long known that these three statements return the same result, but I had never compared their efficiency carefully. I always assumed that statement two was the fastest.
At work today I had to purge data from a database holding tens of millions of rows, deleting more than 20 million rows in total. The purge relies heavily on the three statement forms above. I started with statement one: the run took 1 hour 32 minutes and the log file grew to 21 GB. The time was acceptable, but the disk usage was a real problem, so I replaced every statement with statement two, expecting it to be faster. After more than 40 minutes, not even the first 50,000 rows had been deleted, and SQL Server crashed, which was a surprise. I then ran the statements on their own against a table of nearly 10 million rows: statement one took 4 seconds and statement two took 18 seconds, a very large gap. Statement three performed about the same as statement one.

The second form should be avoided wherever possible. The first and third forms behave almost identically.
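
One caveat to "almost identical" is worth keeping in mind: under standard SQL three-valued logic, NOT IN matches nothing once the subquery returns a NULL, while NOT EXISTS and the LEFT JOIN form are unaffected. The sketch below illustrates this with Python's built-in sqlite3 and hypothetical sample rows; MySQL follows the same NULL rules, but the tables here are assumptions, not the author's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER);
    CREATE TABLE B (a INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (NULL);  -- note the NULL in B
""")

def count(sql):
    """Return the single COUNT(*) value produced by sql."""
    return conn.execute(sql).fetchone()[0]

# 1 NOT IN (2, NULL) evaluates to NULL rather than TRUE, so no row qualifies.
not_in = count("SELECT COUNT(*) FROM A WHERE A.a NOT IN (SELECT a FROM B)")
not_exists = count(
    "SELECT COUNT(*) FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.a = A.a)"
)
left_join = count(
    "SELECT COUNT(*) FROM A LEFT JOIN B ON A.a = B.a WHERE B.a IS NULL"
)
print(not_in, not_exists, left_join)  # 0 2 2
```

If the compared column is declared NOT NULL on both sides, the three forms return the same rows.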

Assuming the buffer pool is large enough, form two has the following drawbacks relative to form one:
(1) The LEFT JOIN itself consumes more resources (more are needed to handle the intermediate result set it produces).
(2) The intermediate result set of the LEFT JOIN is never smaller than table A.
(3) Form two must also apply the IS NULL filter to the intermediate result produced by the LEFT JOIN, whereas form one filters while the two sets are being joined, so form two pays for an additional step.
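
Point (2) can be checked on a toy data set. The sketch below (Python's built-in sqlite3, hypothetical rows) counts the logical rows of the LEFT JOIN before the IS NULL filter is applied. It only shows logical row counts; whether an engine actually materializes that intermediate set depends on its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER NOT NULL);
    CREATE TABLE B (a INTEGER NOT NULL);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2), (2), (2), (3);  -- duplicates fan out the join
""")

rows_in_a = conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]

# Logical size of the join before the IS NULL filter: every row of A appears
# at least once, and once per match when B holds duplicates.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM A LEFT JOIN B ON A.a = B.a"
).fetchone()[0]

# Rows that survive the IS NULL filter (the actual answer of statement two).
unmatched_rows = conn.execute(
    "SELECT COUNT(*) FROM A LEFT JOIN B ON A.a = B.a WHERE B.a IS NULL"
).fetchone()[0]

print(rows_in_a, joined_rows, unmatched_rows)  # 3 5 1
```

Here the join yields five rows for a three-row table A, of which only one survives the filter.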

Combined, these three points make a significant difference on massive data sets, mainly in memory and CPU overhead. I suspect that the buffer pool was already saturated when the original poster ran the test, so the extra overhead of form two spilled into virtual memory on disk; once SQL Server started paging, the slow I/O made the gap even more obvious.

A very large log file is also normal, because so many records were deleted. Depending on what the database is used for, you can consider setting the recovery model to SIMPLE, or truncating the log after the deletion and shrinking the file.

An earlier script of mine deleted from this database unconditionally, that is, it removed every row from several large tables. At the customer's request TRUNCATE TABLE could not be used, for fear of damaging the existing database structure, so plain DELETE was the only option. The log file grew too large then as well, and the fix was to delete in batches: SET ROWCOUNT @chunk in SQL Server 2000, and DELETE TOP (@chunk) in SQL Server 2005. This not only cut the deletion time greatly but also kept log growth small, at only about 1 GB.
This time, however, the cleanup deletes with a condition, DELETE a FROM a WHERE ...., and the same batch-deletion method had no effect.
Does anyone know why?
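
The batch-deletion idea described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3, not the author's script: the function name delete_in_chunks, the tables and the chunk size are all hypothetical. SQLite has neither SET ROWCOUNT nor DELETE TOP, so the chunk is selected by rowid with a LIMIT; MySQL offers DELETE ... LIMIT for single-table deletes. The sketch does not attempt to explain the problem reported above.

```python
import sqlite3

def delete_in_chunks(conn, chunk_size):
    """Delete rows of A with no match in B, committing after every chunk."""
    total = 0
    while True:
        cur = conn.execute(
            """
            DELETE FROM A
            WHERE rowid IN (
                SELECT A.rowid FROM A
                WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.a = A.a)
                LIMIT ?
            )
            """,
            (chunk_size,),
        )
        conn.commit()  # short transactions keep each log/journal burst small
        if cur.rowcount == 0:
            break  # nothing left that satisfies the condition
        total += cur.rowcount
    return total

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a INTEGER NOT NULL);
    CREATE TABLE B (a INTEGER NOT NULL);
""")
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO B VALUES (?)", [(i,) for i in range(0, 100, 10)])
conn.commit()

deleted = delete_in_chunks(conn, 25)
remaining = conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]
print(deleted, remaining)  # 90 10
```

Each pass re-evaluates the condition, so the cost of finding the next chunk matters as much as the chunk size.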

MySQL NOT IN and LEFT JOIN efficiency notes

First, the purpose of this SQL is to find the rows in set A that do not appear in set B.
The NOT IN form:

SELECT add_tb.ruid
FROM (SELECT DISTINCT ruid
      FROM usermsg
      WHERE subjectid = 12
        AND createtime > '2009-8-14 15:30:00'
        AND createtime <= '2009-8-17 16:00:00'
     ) add_tb
WHERE add_tb.ruid
      NOT IN (SELECT DISTINCT ruid
              FROM usermsg
              WHERE subjectid = 12
                AND createtime < '2009-8-14 15:30:00'
             )

The LEFT JOIN form, joining two derived tables:

SELECT a.ruid, b.ruid
FROM (SELECT DISTINCT ruid
      FROM usermsg
      WHERE subjectid = 12
        AND createtime >= '2009-8-14 15:30:00'
        AND createtime <= '2009-8-17 16:00:00'
     ) a LEFT JOIN (
      SELECT DISTINCT ruid
      FROM usermsg
      WHERE subjectid = 12 AND createtime < '2009-8-14 15:30:00'
     ) b ON a.ruid = b.ruid
WHERE b.ruid IS NULL

The LEFT JOIN form, joining the table to itself directly:

SELECT DISTINCT a.ruid
FROM usermsg a
LEFT JOIN usermsg b
  ON a.ruid = b.ruid
 AND b.subjectid = 12 AND b.createtime < '2009-8-14 15:30:00'
WHERE a.subjectid = 12
  AND a.createtime >= '2009-8-14 15:30:00'
  AND a.createtime <= '2009-8-17 16:00:00'
  AND b.ruid IS NULL;

The NOT EXISTS form:

SELECT DISTINCT a.ruid
FROM usermsg a
WHERE a.subjectid = 12
  AND a.createtime >= '2009-8-14 15:30:00'
  AND a.createtime <= '2009-8-17 16:00:00'
  AND NOT EXISTS (
      SELECT DISTINCT ruid
      FROM usermsg
      WHERE subjectid = 12 AND createtime < '2009-8-14 15:30:00'
        AND ruid = a.ruid
  )

The LEFT JOIN form with one derived table, without the subjectid condition:

SELECT a.ruid, b.ruid
FROM (SELECT DISTINCT ruid
      FROM usermsg
      WHERE createtime >= '2009-8-14 15:30:00'
        AND createtime <= '2009-8-17 16:00:00'
     ) a LEFT JOIN usermsg b
  ON a.ruid = b.ruid
 AND b.createtime < '2009-8-14 15:30:00'
WHERE b.ruid IS NULL;
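
To check that the formulations above agree, here is a minimal sketch that runs simplified NOT IN, LEFT JOIN and NOT EXISTS variants over a tiny usermsg table. It uses Python's built-in sqlite3 with hypothetical rows. SQLite compares the TEXT timestamps as strings, so the sketch assumes zero-padded values such as 2009-08-14, and it uses >= for the window start in all three variants. It checks results only, not speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usermsg ("
    "ruid INTEGER NOT NULL, subjectid INTEGER NOT NULL, createtime TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO usermsg VALUES (?, ?, ?)",
    [
        (1, 12, "2009-08-10 09:00:00"),  # user 1 posted before the window...
        (1, 12, "2009-08-15 10:00:00"),  # ...and again inside it
        (2, 12, "2009-08-15 11:00:00"),  # user 2: only inside the window
        (2, 12, "2009-08-16 12:00:00"),
        (3, 12, "2009-08-16 08:00:00"),  # user 3: only inside the window
        (4, 12, "2009-08-01 08:00:00"),  # user 4: only before the window
        (5, 7, "2009-08-15 08:00:00"),   # different subject, ignored
    ],
)
params = {"start": "2009-08-14 15:30:00", "end": "2009-08-17 16:00:00"}

QUERIES = {
    "not_in": """
        SELECT DISTINCT ruid FROM usermsg
        WHERE subjectid = 12 AND createtime >= :start AND createtime <= :end
          AND ruid NOT IN (SELECT ruid FROM usermsg
                           WHERE subjectid = 12 AND createtime < :start)
    """,
    "left_join": """
        SELECT DISTINCT a.ruid FROM usermsg a
        LEFT JOIN usermsg b
          ON a.ruid = b.ruid AND b.subjectid = 12 AND b.createtime < :start
        WHERE a.subjectid = 12 AND a.createtime >= :start
          AND a.createtime <= :end AND b.ruid IS NULL
    """,
    "not_exists": """
        SELECT DISTINCT a.ruid FROM usermsg a
        WHERE a.subjectid = 12 AND a.createtime >= :start AND a.createtime <= :end
          AND NOT EXISTS (SELECT 1 FROM usermsg b
                          WHERE b.subjectid = 12 AND b.createtime < :start
                            AND b.ruid = a.ruid)
    """,
}

results = {
    name: sorted(row[0] for row in conn.execute(sql, params))
    for name, sql in QUERIES.items()
}
print(results)  # each form returns [2, 3]
```

Users 2 and 3 are the only ones who appear in the window for subject 12 without an earlier message.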

This article is an English version of an article originally published in Chinese on aliyun.com.
