近期有人提出想查看Postgresql的執行計畫,下面分析下PG執行計畫中的cost等相關值是怎麼計算出來的:
PG的版本是9.1.2
1.終端工具PGADMIN,對執行的語句按F7即可,然後看資料輸出和解釋
2.命令列分析:explain select * from table_name;
一般我們會比較關注消耗值cost和掃描的方式,如走索引或者full scan全表掃描.當COST值消耗比較大時需要注意是否有最佳化的可能。
與執行計畫相關的幾個參數,參看下面的樣本:
kenyon=# select count(1) from dba.website ; --普通堆棧表,無任何索引約束 count------- 20(1 row)kenyon=# explain select * from dba.website ; QUERY PLAN -------------------------------------------------------- Seq Scan on website (cost=0.00..1.20 rows=20 width=4)(1 row) --relpages磁碟頁,reltuples是行數(與實際不一定相符,一般略小)kenyon=# select relpages,reltuples from pg_class where relname = 'website'; relpages | reltuples----------+----------- 1 | 20(1 row)kenyon=# select 1*1+20*0.01; --cost = relpages * seq_page_cost + reltuples * cpu_tuple_cost ?column?---------- 1.20(1 row)kenyon=# show cpu_tuple_cost ; cpu_tuple_cost---------------- 0.01(1 row)kenyon=# show seq_page_cost; seq_page_cost--------------- 1(1 row)--加限制條件的執行計畫kenyon=# select count(1) from dba.website where hits >15; count------- 5(1 row)kenyon=# explain select * from dba.website where hits >15; QUERY PLAN ------------------------------------------------------- Seq Scan on website (cost=0.00..1.25 rows=5 width=4) Filter: (hits > 15)(2 rows)kenyon=# show cpu_operator_cost ; cpu_operator_cost------------------- 0.0025(1 row)因為掃描的總數是20行,不變的,所以COST不會下降,相反反而增加了0.05,這是因為額外消耗了CPU的時間去檢查符合約束條件資料,即cost 在原來的基礎上再增加 20 * 0.0025 = 0.05 (reltuples * cpu_operator_cost)--加索引的執行計畫kenyon=# select count(1) from dba.website_2 ; count------- 8000(1 row)kenyon=# explain select * from dba.website_2 ; QUERY PLAN -------------------------------------------------------------- Seq Scan on website_2 (cost=0.00..112.00 rows=8000 width=4)(1 row)kenyon=# select relpages,reltuples from pg_class where relname = 'website_2'; relpages | reltuples----------+----------- 32 | 8000(1 row)kenyon=# explain select * from dba.website_2 where hits >7900; --走的索引 QUERY PLAN ---------------------------------------------------------------------------------- Index Scan using ind_website_2 on website_2 (cost=0.00..10.00 rows=100 width=4) Index Cond: (hits > 7900)(2 rows)()kenyon=# explain select * from dba.website_2 where hits >10; --未走索引(不滿足索引條件,full scan) QUERY PLAN -------------------------------------------------------------- Seq Scan on website_2 (cost=0.00..132.00 rows=7991 width=4) -- 132 = 112+8000*0.0025 Filter: (hits > 10)(2 rows)雖然讀取的COST更大,但是因為索引的緣故,訪問的資料量變小了,所以總體COST是下降的。
--多表JOIN的執行計畫 樣本: 若想看實際的一個執行時間,可以加上 analyze 參數
kenyon=# explain analyze select * from dba.website a ,dba.website_2 b where a.hits = b.hits and a.hits >18; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=1.26..1.90 rows=2 width=8) (actual time=0.070..0.075 rows=2 loops=1) Merge Cond: (b.hits = a.hits) -> Index Scan using ind_website_2 on website_2 b (cost=0.00..235.25 rows=8000 width=4) (actual time=0.013..0.020 rows=21 loops=1) -> Sort (cost=1.26..1.26 rows=2 width=4) (actual time=0.035..0.037 rows=2 loops=1) Sort Key: a.hits Sort Method: quicksort Memory: 17kB -> Seq Scan on website a (cost=0.00..1.25 rows=2 width=4) (actual time=0.009..0.011 rows=2 loops=1) Filter: (hits > 18) Total runtime : 0.120 ms (9 rows)
total runtime 是執行器啟動和關閉的時間,但不包括解析,重寫和規劃的時間
注意: pg_class中的relpages,reltuples資料不是即時更新的,一般在vacuum analyze和少部分DDL(如建立索引)後更新。
樣本1:
kenyon=# insert into dba.website select generate_series(8000,9000);INSERT 0 1001kenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind----------+-----------+---------------+--------- 1 | 20 | website | r 32 | 8000 | website_2 | r 20 | 8000 | ind_website_2 | i(3 rows)kenyon=# vacuum analyze dba.website;VACUUMkenyon=# vacuum analyze dba.website;VACUUMkenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind----------+-----------+---------------+--------- 5 | 1021 | website | r 36 | 8999 | website_2 | r 22 | 8999 | ind_website_2 | i(3 rows)
樣本2:
kenyon=# insert into dba.website select generate_series(8000,9000);INSERT 0 1001kenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind----------+-----------+---------------+--------- 1 | 21 | website | r 36 | 8999 | website_2 | r 22 | 8999 | ind_website_2 | i(3 rows)kenyon=# create index ind_website on dba.website(hits);CREATE INDEXkenyon=# select relpages,reltuples,relname,relkind from pg_class where relname like '%website%'; relpages | reltuples | relname | relkind----------+-----------+---------------+--------- 5 | 1022 | website | r 36 | 8999 | website_2 | r 22 | 8999 | ind_website_2 | i 5 | 1022 | ind_website | i(4 rows)
所涉及的系統資料表:
pg_stats
pg_statistic
pg_class
pg_stat是任何人都可以看的,而且可讀性高,比較直觀,pg_statistic只有superuser才能讀,並且可讀性差,普通人員建議看pg_stats,pg_stats是pg_statistic的視圖。 這兩個表也不是即時更新的,需要vacuum analyze時會更新
所涉及的系統變數:
default_statistics_target
geqo_threshold
join_collapse_limit
from_collapse_limit
kenyon=# show default_statistics_target ; default_statistics_target--------------------------- 100(1 row)kenyon=# show geqo_threshold ; --這個參數的大小會設定執行計畫從窮舉搜尋到機率選擇性搜尋的臨界值 geqo_threshold---------------- 12(1 row)kenyon=# show join_collapse_limit ; --join串連走執行計畫上限 join_collapse_limit--------------------- 8(1 row)kenyon=# show from_collapse_limit ; from_collapse_limit--------------------- 8(1 row)
EXPLAIN
Name
EXPLAIN— show the execution plan of a statement
Synopsis
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
where option can be one of:
ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
例子:
kenyon=# explain (analyze,verbose,costs,buffers) select id from dba.test222 order by id desc limit 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Limit (cost=1807.80..1807.80 rows=1 width=4) (actual time=87.167..87.168 rows=1 loops=1) Output: id Buffers: shared hit=393 -> Sort (cost=1807.80..2043.60 rows=94320 width=4) (actual time=87.165..87.165 rows=1 loops=1) Output: id Sort Key: test222.id Sort Method: top-N heapsort Memory: 17kB Buffers: shared hit=393 -> Seq Scan on dba.test222 (cost=0.00..1336.20 rows=94320 width=4) (actual time=0.036..42.847 rows=100000 loops=1) Output: id Buffers: shared hit=393 Total runtime: 87.183 ms(12 rows)kenyon=# explain (analyze,verbose,costs,buffers) select max(id) from dba.test222; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=1572.00..1572.01 rows=1 width=4) (actual time=77.679..77.680 rows=1 loops=1) Output: max(id) Buffers: shared hit=393 -> Seq Scan on dba.test222 (cost=0.00..1336.20 rows=94320 width=4) (actual time=0.012..36.908 rows=100000 loops=1) Output: id Buffers: shared hit=393 Total runtime: 77.701 ms(7 rows)
explain參數解釋:
ANALYZE :執行命令並顯示執行事件,預設false
VERBOSE :對執行計畫提供額外的資訊,如查詢欄位資訊等,預設false
COSTS :顯示執行計畫的,預設true
BUFFERS :預設false,前置條件是analyze
FORMAT :預設格式是text