SQLServer 集合函數 COUNT 最佳化分析，sqlservercount

最後更新：2015-03-24 來源：互聯網

上載者：User

創建阿里雲帳戶，並獲得超過 40 款產品的免費試用版；而企業帳戶則可以享有總值 $1200 的免費試用版。立即註冊！

目前的版本：

Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (Intel X86) Apr 2 2010 15:53:02 Copyright (c) Microsoft Corporation Enterprise Edition on Windows NT 5.2 <X86> (Build 3790: Service Pack 2) (Hypervisor)

--建立測試表--drop table tb_CountTestcreate table tb_CountTest([uniqueidentifier] [uniqueidentifier] not null,[bigint] [bigint] not null,[tinyint] [tinyint] not null,[int] [int] not null,[int0] [int] null)go--uniqueidentifier(16 位元組),bigint(8 位元組),int(4 位元組),smallint(2 位元組),tinyint(1 位元組)--插入2000行測試資料insert into tb_CountTest([uniqueidentifier],[bigint],[tinyint],[int],[int0])select NEWID(),number*3-1,number*2%256,number,case when number%6=0 then null else number endfrom (select distinct number from master.dbo.spt_values where number between 1 and 2000)tabgo--建立叢集索引 ([uniqueidentifier])--drop index ix_tb_CountTest_uniqueidentifier on tb_CountTestcreate clustered index ix_tb_CountTest_uniqueidentifier on tb_CountTest([uniqueidentifier])go--建立非叢集索引 ([int])--drop index ix_tb_CountTest_int on tb_CountTestcreate index ix_tb_CountTest_int on tb_CountTest([int])go

--執行以下語句,查看執行計畫.結果如下:select count(*) from dbo.tb_counttestselect count(1) from dbo.tb_counttestselect count([uniqueidentifier]) from dbo.tb_counttestselect count([bigint]) from dbo.tb_counttestselect count([tinyint]) from dbo.tb_counttestselect count([int]) from dbo.tb_counttest

可以看到，統計資訊都是一致的。以上的查詢統計結果都為2000，全都是使用非叢集索引(ix_tb_CountTest_int)掃描，上面6中方法統計的開銷都是一樣的。而下面這個統計，卻是使用叢集索引掃描（ix_tb_CountTest_uniqueidentifier），結果為1667行，篩選了空值。

select count([int0]) from dbo.tb_counttest

兩個問題：

Q1.為什麼都是使用非叢集索引掃描？

Q2.為什麼count([int0])使用的是叢集索引？

A1. 為什麼都是使用非叢集索引掃描？

因為使用非叢集索引返回的資料頁更少。使用使用的都是索引，下面可以搜尋到，按索引查詢時，返回的資料頁有多少。

如：

DBCC TRACEON(3604,-1)

DBCC IND(TestDB,tb_counttest,-1)

DBCC PAGE(TestDB,1,590,3) --叢集索引(根節點)

DBCC PAGE(TestDB,1,959,3) --非叢集索引(根節點)

上面可以看到，叢集索引尋找資料有11頁，加上2頁的IAM頁，IO讀取的頁總數是13頁。

而非叢集索引頁的子分葉節點，有6頁的索引頁，加上2頁的IAM頁，IO讀取的頁總數是8頁。

按理說，非叢集索引中包括了叢集索引的鍵列才對，但是有索引的情況下，尋找資料只要訪問到上一級的頁就行，沒有實際訪問到子葉的資料頁（叢集索引）或者索引頁（非叢集索引）。因此使用非叢集索引(ix_tb_CountTest_int)統計的資料，即時使用count([uniqueidentifier])統計，走的還是非叢集索引掃描。資料庫引擎自動最佳化了。

當我們使用【set statistics io on】查看時，前6中情況count(*)中，讀取資料頁8也，而count([int0])讀取了13頁。

A2.為什麼count([int0])使用的是叢集索引？

因為列[int0]中有空值(null)，當執行下面這個時，我們就發現效能非常不好了。

強制使用非叢集索引！（結果是排除了null值的）

select count([int0])fromdbo.tb_counttestwith(index(ix_tb_CountTest_int))

看到IO讀取4008頁，也就是先讀取非叢集索引子葉2000行資料進行索引掃描，再讀取叢集索引子葉2000行進行鍵尋找，加上2次中每次讀取的 2 IAM頁+2索引中間節點頁，共4008頁。且執行計畫也不好。所以count([int0])用了叢集索引.

--現在再建立另一個索引：--建立非叢集索引([tinyint])--drop index ix_tb_CountTest_tinyint on tb_CountTestcreate index ix_tb_CountTest_tinyint on tb_CountTest([tinyint])go

總共只有5(索引頁)+2(IAM頁)=7頁，這時效能更好些了！

同樣執行count統計時，使用了這個非叢集索引(ix_tb_CountTest_tinyint)掃描。這次比上面使用的INT做索引少了一頁。其實主要是索引中列類型長度減少了，這樣一頁資料中就能夠儲存更多的資料。查詢時取出的資料頁更少，IO更好一些，資料庫引擎自動選擇了統計。

附：

COUNT 與COUNT_BIG函數類似。兩個函數唯一的差別是它們的返回值。COUNT始終返回int資料類型值。COUNT_BIG始終返回bigint 資料類型值。

總結：

1. Count統計，只要該列不含空值，統計效能都是一樣的，系統預設選擇最優索引。

2. 如果表中有更小的欄位做索引，統計將使用它並更快統計。

因此，當count統計表中的資料很慢時，除了給表加上nolock，同時可以找一個長度很小的欄位建立索引。

參考：http://bbs.csdn.net/topics/390635419（個人與樓主分析有出入）

本文章原先以中文撰寫並發佈於 aliyun.com，亦設英文版本，僅作資訊用途。本網站不對文章的準確性，完整性或可靠性或其任何翻譯作出任何明示或暗示的陳述或保證。如對該文章有任何疑慮或投訴，請傳送電郵至 info-contact@alibabacloud.com 並提供相關疑慮或投訴的詳細說明。職員會於 5 個工作天內與您聯絡，一經驗證之後，即會刪除該侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More