Postgresql 資料在一張表中存在,另一張表不滿足完整性的尋找

來源:互聯網
上載者:User

標籤:postgresql   效能   sql   

有兩張表T1,T2,表結構和資料如下:
create table t1 (  id int);create table t2 (  id int,  finished int);insert into t1 values (1);insert into t1 values (2);insert into t1 values (3);insert into t1 values (4);insert into t1 values (5);insert into t1 values (6);insert into t2 values (1,1);insert into t2 values (2,0);insert into t2 values (3,1);insert into t2 values (4,0);
SQL Fiddle 可以測試SQL語句的執行。
想要實現T1中存在前提下,T2中不存在或者finished=0,也就是查詢結果是:2, 4, 5, 6.

一、效能測試: 1. not in:
explain (analyze,verbose,costs,buffers) select ID from T1 where ID not in (select ID from T2 where finished=1);
Total runtime: 0.128 ms

2. not exists:
explain (analyze,verbose,costs,buffers) select ID from T1 where not exists (select 1 from T2 where T1.ID=T2.ID and T2.finished=1);
Total runtime: 0.105 ms

3. left join:
explain (analyze,verbose,costs,buffers) select T1.ID from T1 left join T2 on T1.ID=T2.ID and T2.finished=1 where T2.ID is null;
Total runtime: 0.096 ms

4. 網上還看到一種更快方法,但測試下來此方法不對,所以不討論:
select ID from T2 where (select count(1) from T1 where T1.ID=T2.ID) = 0; 這條語句查詢結果為空白


因此在postgresql 9.3 上語句執行速度 left join > not exists > not in
當T1和T2表中ID出現null時,not in 語句會有不同的表現,所以推薦總是用not exists 代替 not in.



二、大資料量效能測試:在大量資料的時候,not in有嚴重性能下降的問題,下面是我在i5 2.4GHz MAC pro 13吋上的測試。
department(T1) 為59280條資料,資料長度29字元;dept(T2) 為23633條資料,資料長度29字元。

1. explain analyze select department.id from department where department.id not in (select id from dept where finished=true);
Total runtime: 447073.065 ms

2. explain analyze select department.id from department where not exists (select 1 from dept where department.id=dept.id and finished=true);
Total runtime: 325.732 ms

3. explain analyze select department.id from department left join dept on department.id=dept.id and dept.finished=true where dept.id is null;
Total runtime: 319.869 ms


三、總結:在Postgresql 9.3上:
not in 不僅效能差,而且邏輯可能有問題。
not exists 效能不錯,思考起來比較容易。
left join 效能最好,但總體跟not exists 比也快不了多少,思考稍微有點繞。


下面是一張網上的left join 的圖,但找不到出處,有助於理解 left join 的過程:


Postgresql 資料在一張表中存在,另一張表不滿足完整性的尋找

相關文章

聯繫我們

該頁面正文內容均來源於網絡整理,並不代表阿里雲官方的觀點,該頁面所提到的產品和服務也與阿里云無關,如果該頁面內容對您造成了困擾,歡迎寫郵件給我們,收到郵件我們將在5個工作日內處理。

如果您發現本社區中有涉嫌抄襲的內容,歡迎發送郵件至: info-contact@alibabacloud.com 進行舉報並提供相關證據,工作人員會在 5 個工作天內聯絡您,一經查實,本站將立刻刪除涉嫌侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.