Is null and is not null in SQL SERVER cause index failure ?, Sqlnull

Source: Internet
Author: User

Is null and is not null in SQL SERVER cause index failure ?, Sqlnull

In fact, there is nothing to say about this problem. I encountered an SQL statement during optimization today. Because it is interesting, I have cut and simplified the SQL statement and demonstrated it for you, as shown below:

declare @bamboo_Code varchar(3);
 
set @bamboo_Code='-01';
 
 
SELECT DISTINCT yarn_lot
FROM   dbo.rsjob WITH ( nolock )
WHERE  RIGHT(ges_no, 3) = @bamboo_Code
       AND Isnull(yarn_lot, '') <> ''; 

As shown above, Isnull (yarn_lot, '') <>'' is used for the column yarn_log in SQL, I guess the person who wrote this SQL statement should be convinced that "is null and is not null will cause index failure" is a widely-spread rule on the Internet. Where can this suggestion be spread, cannot be verified. In this case, we verify whether is null or is not null will cause index failure.

Table rsjob is a heap table with an index yarn_lot on the column yarn_lot. Let's verify it through experiment.

SELECT DISTINCT yarn_lot
FROM   dbo.rsjob WITH(nolock)
WHERE  yarn_lot IS NOT NULL;
 
 
SELECT DISTINCT yarn_lot
FROM   dbo.rsjob WITH(nolock)
WHERE  yarn_lot IS NULL 

As shown in the preceding figure, no matter whether it is null or is not null, an index IS used for search.

declare @bamboo_Code varchar(3);
 
set @bamboo_Code='-01';
 
 
SELECT DISTINCT yarn_lot
FROM   dbo.rsjob WITH ( nolock )
WHERE  RIGHT(ges_no, 3) = @bamboo_Code
       AND Isnull(yarn_lot, '') <> ''; 
       
       
SELECT DISTINCT yarn_lot
FROM   dbo.rsjob WITH ( nolock )
WHERE  RIGHT(ges_no, 3) = @bamboo_Code
       AND yarn_lot IS NOT NULL;

In addition, let's take a look at the overhead ratio of the two original SQL Execution plans as 52: 48. That IS to say, the performance of using is not null is better. Because the first SQL statement IS converted, as a result, it performs an index scan, while is not null performs an index search.

"Is null and is not null will lead to index failure. So those who believe in this dogma should verify it by themselves.

Next, we can verify through experiments that the actual environment may be complicated. We can build the following scenarios. In fact, the real environment is more complicated. However, there are basically the following scenarios:

Case 1: Separate index columns on the heap table predicates

USE Test;
GO
DROP TABLE TEST;
GO
  
CREATE TABLE TEST (OBJECT_ID  INT, NAME VARCHAR(12));
 
CREATE INDEX PK_TEST ON TEST(OBJECT_ID) INCLUDE(NAME);
 
DECLARE @Index INT =0;
 
WHILE @Index < 10000
BEGIN
    INSERT INTO TEST
    SELECT @Index, 'kerry'+ CAST(@Index AS VARCHAR);
   
    SET @Index = @Index +1;
END
 
INSERT INTO TEST
SELECT NULL, 'only test1' UNION ALL
SELECT NULL, 'only test2'
 
UPDATE STATISTICS TEST WITH FULLSCAN;
 
SELECT * FROM TEST WHERE OBJECT_ID IS NULL;
SELECT * FROM TEST WHERE  OBJECT_ID IS NOT NULL;

Delete the index and create the following index. As shown below

Drop index PK_TEST on test;

Create index PK_TEST on test (OBJECT_ID)

It can be seen that the execution plan of is null or is not null is related to the index and data distribution.

Case 2: No index on the predicate of the heap table

USE Test;
GO
DROP TABLE TEST;
GO
 
CREATE TABLE TEST (OBJECT_ID  INT, NAME VARCHAR(12));
 
 
 
DECLARE @Index INT =0;
 
WHILE @Index < 10000
BEGIN
    INSERT INTO TEST
    SELECT @Index, 'kerry'+ CAST(@Index AS VARCHAR);
   
    SET @Index = @Index +1;
END
 
INSERT INTO TEST

SELECT NULL, 'only test1' UNION ALL

 

SELECT NULL, 'only test2'
 
UPDATE STATISTICS TEST WITH FULLSCAN;
 
 
SELECT * FROM TEST WHERE OBJECT_ID IS NULL;
SELECT * FROM TEST WHERE  OBJECT_ID IS NOT NULL;

As shown above, if a heap table does NOT have any indexes, the use of is null or is not null must undergo a full table scan, but this is not covered in our discussion. Then let's take a look at the index built on other fields (mainly to compare it with the clustered index table), which still scans the entire table.

CREATE INDEX PK_TEST ON TEST(OBJECT_ID) INCLUDE(NAME);
 
INSERT INTO TEST
 SELECT 10000, NULL UNION ALL
 SELECT 10001, NULL ;
 
SELECT * FROM TEST WHERE NAME  IS NULL;
SELECT * FROM TEST WHERE  NAME IS NOT NULL;

Case 3: Composite Index columns in the heap table

USE Test;
GO
DROP TABLE TEST;
GO
   
CREATE TABLE TEST (OBJECT_ID  INT, NAME VARCHAR(12), AGE INT);
 
CREATE INDEX IDX_TEST_N1 ON TEST(NAME, AGE);
 
DECLARE @Index INT =0;
 
WHILE @Index < 10000
BEGIN
    INSERT INTO TEST
    SELECT @Index, 'kerry'+ CAST(@Index AS VARCHAR), floor(rand()*100) ;
   
    SET @Index = @Index +1;
END
 
 
INSERT INTO TEST
SELECT NULL, 'only test1', 12 UNION ALL
SELECT NULL, 'only test2',24
UPDATE STATISTICS TEST WITH FULLSCAN;
 
 
SELECT * FROM TEST WHERE NAME IS NULL;
SELECT * FROM TEST WHERE  NAME IS NOT NULL;

If the predicate in the Union index is at the second or later position of the Union Index, what is the situation? We can see from the following that SQL is scanned in the full table.

DROP INDEX IDX_TEST_N1 ON TEST;
 
CREATE INDEX IDX_TEST_N1 ON TEST( AGE,NAME);
 
UPDATE STATISTICS TEST WITH FULLSCAN;

4. Separate index columns in the clustered index table

USE Test;
GO
DROP TABLE TEST;
GO
   
CREATE TABLE TEST (OBJECT_ID  INT, NAME VARCHAR(12));
 
CREATE CLUSTERED INDEX PK_TEST ON TEST(OBJECT_ID)
 
DECLARE @Index INT =0;
 
WHILE @Index < 10000
BEGIN
    INSERT INTO TEST
    SELECT @Index, 'kerry'+ CAST(@Index AS VARCHAR);
   
    SET @Index = @Index +1;
END
INSERT INTO TEST
SELECT NULL, 'only test1' UNION ALL
SELECT NULL, 'only test2'
SELECT * FROM TEST WHERE OBJECT_ID IS NULL;
SELECT * FROM TEST WHERE  OBJECT_ID IS NOT NULL;

If I use IS NULL or IS NOT NULL for query on the column NAME, you will find that the execution plan IS changed from clustered index search to clustered index scan.

INSERT INTO TEST 
 
SELECT 10000, NULL UNION ALL 
 
SELECT 10001, NULL ; 
 
SELECT * FROM TEST WHERE NAME IS NULL; 
 
SELECT * FROM TEST WHERE NAME IS NOT NULL;

4. Composite Index columns in the clustered index table

USE Test;
GO
DROP TABLE TEST;
GO
   
CREATE TABLE TEST (OBJECT_ID  INT, NAME VARCHAR(12), AGE INT);
 
CREATE CLUSTERED INDEX PK_TEST ON TEST(OBJECT_ID)
 
DECLARE @Index INT =0;
 
WHILE @Index < 10000
BEGIN
    INSERT INTO TEST
    SELECT @Index, 'kerry'+ CAST(@Index AS VARCHAR), floor(rand()*100) ;
   
    SET @Index = @Index +1;
END
 
 
INSERT INTO TEST
SELECT 10001, 'NULL', 12 UNION ALL
SELECT 10002, 'NULL',24
 
CREATE INDEX IDX_TEST_N2 ON TEST(NAME,AGE);
UPDATE STATISTICS TEST WITH FULLSCAN;

If the predicate in the Union index is not in the first column, will it be indexed if it is null or is not null?

DROP INDEX IDX_TEST_N2 ON TEST; 
 
CREATE INDEX IDX_TEST_N2 ON TEST(AGE,NAME); 
 
UPDATE STATISTICS TEST WITH FULLSCAN; 

As shown above, it changes from index search to index scan.

Conclusion: 1: "is null and is not null will lead to index failure." This dogma is completely shit. The SQL Server Index contains the null value, oracle indexes do not contain null values. Different databases are different.

2: if there is an index on the predicate, the index will basically be taken, so there is a certain relationship between index search or index scan and the index type, it is also related to the position where the field is located in the Union index. In addition, the wide skew of data distribution will also lead to full table scanning without indexing, but this does NOT mean that is null and is not null cause index failure. It is very important to infer the execution plan by observing the SQL statement. You need to comprehensively examine the index, data distribution, and statistical information of the table involved in the SQL statement to make a comprehensive judgment, in plain words, it should be combined with specific scenarios.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.