The difference between select COUNT (*) and COUNT (1) in SQL Server and how to perform it _mssql

Source: Internet
Author: User

Either count (*) or COUNT (1) or count ([column]) in SQL Server is perhaps the most commonly used aggregate function. Many people actually distinguish between the three. This article will explain the role of these three, relations and the underlying principles.

I often see some so-called optimization recommendations that use COUNT (1) instead of Count (*) to improve performance, given the reason that count (*) brings a full table scan. There's no difference in how you actually write count.

The actual meaning of count (1) and COUNT (*) is to evaluate whether the expression in count () is null, and if it is null, it is counted rather than null. For example, if we look at code 1, specify NULL in count (the optimizer does not allow an explicit null designation, so you need to assign a value to a variable to specify it).

DECLARE @xx INT
SET @xx =null
 
SELECT COUNT (@xx) from [adventureworks2012].[ Sales]. [SalesOrderHeader]

Code Listing 1. Specify NULL in Count

Because all rows are null, the result is not counted to 0, as shown in Figure 1.

Figure 1. Obviously, the result is 0.

So when you specify count (*) or COUNT (1) or count (' anything '), the result will be the same, because these values are not NULL, as shown in Figure 2.

Figure 2. As long as you specify a non-null expression in count, the result is no different

What about the Count column?

For Count (column), the same applies to the above rule, evaluates whether the value of each row in the column is null, and if NULL is not counted, not null. Therefore count (column) calculates the count of the column or the combination of this column is not empty.

How is count (*) specifically implemented?

When Count () has a value that is not NULL, it is only necessary in SQL Server to find the number of rows that are not NULL in the specific table, that is, all rows (the row is equivalent to nonexistent if the row value is all null). The simplest way to do this is to find a column of NOT NULL, and if the column has an index, use that index, and of course, for performance, SQL Server chooses the narrowest index to reduce IO.

We are Adventureworks2012 the sample database [person]. To delete all nonclustered indexes on the Address table and to index the modifydate of this data type datetime, we look at the execution plan, as shown in Figure 3:

Figure 3. Index with CreateDate

We continue indexing on the StateProvinceID column, which is an int column that is 4 bytes shorter than the previous 8-byte datetime type column, so SQL Server chooses the StateProvinceID index. As shown in Figure 4.

Figure 4. A shorter StateProvinceID index is selected

Therefore, if the count (*) on a table is used more than once, consider creating a single-column index in a shortest column, which can greatly improve performance.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.