Either count (*) or COUNT (1) or count ([column]) in SQL Server is perhaps the most commonly used aggregate function. Many people actually distinguish between the three. This article will explain the role of these three, relations and the underlying principles.
I often see some so-called optimization recommendations that use COUNT (1) instead of Count (*) to improve performance, given the reason that count (*) brings a full table scan. There's no difference in how you actually write count.
The actual meaning of count (1) and COUNT (*) is to evaluate whether the expression in count () is null, and if it is null, it is counted rather than null. For example, if we look at code 1, specify NULL in count (the optimizer does not allow an explicit null designation, so you need to assign a value to a variable to specify it).
DECLARE @xx INT
SET @xx =null
SELECT COUNT (@xx) from [adventureworks2012].[ Sales]. [SalesOrderHeader]
Code Listing 1. Specify NULL in Count
Because all rows are null, the result is not counted to 0, as shown in Figure 1.
Figure 1. Obviously, the result is 0.
So when you specify count (*) or COUNT (1) or count (' anything '), the result will be the same, because these values are not NULL, as shown in Figure 2.
Figure 2. As long as you specify a non-null expression in count, the result is no different
What about the Count column?
For Count (column), the same applies to the above rule, evaluates whether the value of each row in the column is null, and if NULL is not counted, not null. Therefore count (column) calculates the count of the column or the combination of this column is not empty.
How is count (*) specifically implemented?
When Count () has a value that is not NULL, it is only necessary in SQL Server to find the number of rows that are not NULL in the specific table, that is, all rows (the row is equivalent to nonexistent if the row value is all null). The simplest way to do this is to find a column of NOT NULL, and if the column has an index, use that index, and of course, for performance, SQL Server chooses the narrowest index to reduce IO.
We are Adventureworks2012 the sample database [person]. To delete all nonclustered indexes on the Address table and to index the modifydate of this data type datetime, we look at the execution plan, as shown in Figure 3:
Figure 3. Index with CreateDate
We continue indexing on the StateProvinceID column, which is an int column that is 4 bytes shorter than the previous 8-byte datetime type column, so SQL Server chooses the StateProvinceID index. As shown in Figure 4.
Figure 4. A shorter StateProvinceID index is selected
Therefore, if the count (*) on a table is used more than once, consider creating a single-column index in a shortest column, which can greatly improve performance.