Select count (*), Count (1), and Count (column)
In SQL Server, Count (*), Count (1), or Count ([column]) is perhaps the most common aggregate function. Many people cannot tell the difference between the three. This article will explain the functions, relationships, and principles behind these three.
I often see some so-called optimization suggestions that use Count (*) instead of Count (*) but Count (1) to improve performance. The reason is Count (*) this will cause a full table scan. In fact, there is no difference in how to write Count.
Count (1) and Count (*) actually mean to evaluate whether the expression in Count () is NULL. If it is NULL, It is not counted, but not NULL. For example, as shown in code 1, specify NULL in Count (the optimizer cannot explicitly specify NULL, so you need to assign a value to the variable to specify ).
Copy codeThe Code is as follows:
DECLARE @ xx INT
SET @ xx = NULL
Select count (@ xx) FROM [AdventureWorks2012]. [Sales]. [SalesOrderHeader]
Code List 1. specify NULL in Count
Because all rows are NULL, the result does not count as 0, as shown in result 1.
Figure 1. Obviously, the result is 0
Therefore, when you specify Count (*), Count (1), or whatever Count ('anything '), the results will be the same, because these values are not NULL, as shown in 2.
Figure 2. As long as a non-NULL expression is specified in Count, the result is no different.
What about the Count column?
For Count (column), it also applies to the above rule. It evaluates whether the value of each row in the column is NULL. If it is NULL, it does not Count. If it is not NULL, it counts. Therefore, Count (column) calculates the Count of a column or a combination of these columns that are not empty.
How to execute Count?
When Count () has a non-NULL value, you only need to find the number of rows not NULL in the specific table in SQL Server, that is, all rows (if the value of a row is NULL, the row does not exist ). The simplest way to do this is to find a column not null. If the column has an index, use this index. Of course, for performance, SQL Server selects the narrower index to reduce IO.
In the Adventureworks2012 sample database, we [Person]. [Address] delete all non-clustered indexes on the table. Create an index on the column where the data type of ModifyDate is DateTime. Let's look at the execution plan, as shown in 3:
Figure 3. CreateDate index used
We continue to create an index on the StateProvinceID column, which occupies 4 bytes and is shorter than the DateTime column of the previous 8 bytes. Therefore, SQL Server selects the StateProvinceID index. 4.
Figure 4. Select a shorter StateProvinceID Index
Therefore, if Count (*) is used for a large number of tables, it is recommended that you create a single column index in the shortest column, which greatly improves the performance.