Test the selectivity of multiple DK Columns

Source: Internet
Author: User
Tags psql
To test whether multi-Table association can be filtered by Image segment when multiple DK distribution keys exist, the test is as follows: [gpadmin @ gtlions50 ~] $ Psqlgtlionspsql (8.2.15) Typehelpforhelp. gtlions # createtablegtt1 (idint, namecharactervarying (5) distributedby (id, name); CREATETAB

To test whether multi-Table association can be filtered by Image segment when multiple DK distribution keys exist, the test is as follows: [gpadmin @ gtlions50 ~] $ Psql gtlions psql (8.2.15) Type help for help. gtlions = # create table gtt1 (id int, name character varying (5) distributed by (id, name); CREATE TAB

To test whether multi-Table association can be filtered by Image segment when multiple DK distribution keys exist, the test is as follows:
[Gpadmin @ gtlions50 ~] $ Psql gtlions
Psql (8.2.15)
Type "help" for help.


Gtlions = # create table gtt1 (id int, name character varying (5) distributed by (id, name );
CREATE TABLE
Gtlions = # create table gtt2 (id int, name character varying (5) distributed by (id, name );
CREATE TABLE
Gtlions = # insert into gtt1 values (1, 'A ');
INSERT 0 1
Gtlions = # insert into gtt1 values (11, 'B ');
INSERT 0 1
Gtlions = # select gp_segment_id, * from gtt1 order by 1;
Gp_segment_id | id | name
--------------- + ---- + ------
2 | 11 | B
3 | 1 |
(2 rows)


Gtlions = # insert into gtt1 values (111, 'C ');
INSERT 0 1
Gtlions = # insert into gtt1 values (1111, 'D ');
INSERT 0 1
Gtlions = # select gp_segment_id, * from gtt1 order by 1;
Gp_segment_id | id | name
--------------- + ------
Zero | 1111 | d
2 | 11 | B
(3) | 111 | c
3 | 1 |
(4 rows)


Gtlions = # insert into gtt2 select * from gtt1;
INSERT 0 4
Gtlions = # explain analyze select * from gtt1 where id = 11;
QUERY PLAN
Bytes --------------------------------------------------------------------------------------------------------
Gather Motion (slice1; segments: 4) (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Rows out: 1 rows at destination with 3.593 ms to first row, 4.067 ms to end, start offset by 28 ms.
-> Seq Scan on gtt1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Filter: id = 11
Rows out: 1 rows (seg2) with 0.159 ms to first row, 0.160 ms to end, start offset by 359 ms.
Slice statistics:
(Slice0) Executor memory: 139 K bytes.
(Slice1) Executor memory: 157 K bytes avg x 4 workers, 157 K bytes max (seg0 ).
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 32.320 MS
(11 rows)


Gtlions = # explain analyze select * from gtt1 where name = 'B ';
QUERY PLAN
Bytes --------------------------------------------------------------------------------------------------------
Gather Motion (slice1; segments: 4) (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Rows out: 1 rows at destination with 3.537 ms to first row, 3.538 ms to end, start offset by 31 ms.
-> Seq Scan on gtt1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Filter: name: text = 'B': text
Rows out: 1 rows (seg2) with 0.231 ms to first row, 0.233 ms to end, start offset by-250 ms.
Slice statistics:
(Slice0) Executor memory: 139 K bytes.
(Slice1) Executor memory: 157 K bytes avg x 4 workers, 157 K bytes max (seg0 ).
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 34.830 MS
(11 rows)


Gtlions = # explain analyze select * from gtt1 where id = 11 and name = 'B ';
QUERY PLAN
Bytes -----------------------------------------------------------------------------------------------------------
Gather Motion (slice1; segments: 1) (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Rows out: 1 rows at destination with 2.032 ms to first row, 2.033 ms to end, start offset by 0.500 ms.
-> Seq Scan on gtt1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Filter: id = 11 AND name: text = 'B': text
Rows out: 1 rows with 0.062 ms to first row, 0.065 ms to end, start offset by 328 ms.
Slice statistics:
(Slice0) Executor memory: 139 K bytes.
(Slice1) Executor memory: 131 K bytes (seg2 ).
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 2.987 MS
(11 rows)


Gtlions = # explain analyze select * from gtt1 t1, gtt2 t2 where t1.id = t2.id;
QUERY PLAN

Certificate -------------------------------------------------------------------------------------------------------------------------------------------------------------
-----
Gather Motion (slice2; segments: 4) (cost = 1. 11 .. 4.20 rows = 1 width = 12)
Rows out: 4 rows at destination with 27 ms to first row, 29 ms to end, start offset by 54 ms.
-> Hash Join (cost = 4.20 .. rows = 1 width = 12)
Hash Cond: t2.id = t1.id
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 18 ms to first row, 21 ms to end, start offset by 367 ms.
Executor memory: 1 K bytes avg, 1 K bytes max (seg0 ).
Work_mem used: 1 K bytes avg, 1 K bytes max (seg0). Workfile: (0 spilling, 0 reused)
(Seg3) Hash chain length 1.0 avg, 1 max, using 4 of 524341 buckets.
-> Seq Scan on gtt2 t2 (cost = 0. 00 .. 3.04 rows = 1 width = 6)
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 0.219 ms to first row, 0.220 ms to end, start offset by 385 ms.
-> Hash (cost = 1. 06 .. 1.06 rows = 1 width = 6)
Rows in: Avg 4.0 rows x 4 workers. Max 4 rows (seg0) with 0.197 ms to end, start offset by-607 ms.
-> Broadcast Motion (slice1; segments: 4) (cost = 0. 00 .. 1.06 rows = 1 width = 6)
Rows out: Avg 4.0 rows x 4 workers at destination. Max 4 rows (seg0) with 0.107 MS to first row, 0.118 MS to end, start offset by-607
Ms.
-> Seq Scan on gtt1 t1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 3.709 ms to first row, 3.711 ms to end, start offset by 369 ms.
Slice statistics:
(Slice0) Executor memory: 184 K bytes.
(Slice1) Executor memory: 212 K bytes avg x 4 workers, 216 K bytes max (seg0 ).
(Slice2) Executor memory: 8401 K bytes avg x 4 workers, 8401 K bytes max (seg0). Work_mem: 1 K bytes max.
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 84.302 MS
(23 rows)


Gtlions = # explain analyze select * from gtt1 t1, gtt2 t2 where t1.name = t2.name;
QUERY PLAN

Certificate -------------------------------------------------------------------------------------------------------------------------------------------------------------
------
Gather Motion (slice2; segments: 4) (cost = 1. 11 .. 4.20 rows = 1 width = 12)
Rows out: 4 rows at destination with 24 ms to first row, 32 ms to end, start offset by 50 ms.
-> Hash Join (cost = 4.20 .. rows = 1 width = 12)
Hash Cond: t2.name: text = t1.name: text
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 10 ms to first row, 13 ms to end, start offset by-253 ms.
Executor memory: 1 K bytes avg, 1 K bytes max (seg0 ).
Work_mem used: 1 K bytes avg, 1 K bytes max (seg0). Workfile: (0 spilling, 0 reused)
(Seg3) Hash chain length 1.0 avg, 1 max, using 4 of 524341 buckets.
-> Seq Scan on gtt2 t2 (cost = 0. 00 .. 3.04 rows = 1 width = 6)
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 0.187 ms to first row, 0.189 ms to end, start offset by-243 ms.
-> Hash (cost = 1. 06 .. 1.06 rows = 1 width = 6)
Rows in: Avg 4.0 rows x 4 workers. Max 4 rows (seg0) with 3.477 ms to end, start offset by-1556 ms.
-> Broadcast Motion (slice1; segments: 4) (cost = 0. 00 .. 1.06 rows = 1 width = 6)
Rows out: Avg 4.0 rows x 4 workers at destination. Max 4 rows (seg0) with 0.161 MS to first row, 0.177 MS to end, start offset by-155
6 ms.
-> Seq Scan on gtt1 t1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 0.143 ms to first row, 0.146 ms to end, start offset by-260 ms.
Slice statistics:
(Slice0) Executor memory: 184 K bytes.
(Slice1) Executor memory: 212 K bytes avg x 4 workers, 216 K bytes max (seg0 ).
(Slice2) Executor memory: 8401 K bytes avg x 4 workers, 8401 K bytes max (seg0). Work_mem: 1 K bytes max.
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 82.710 MS
(23 rows)


Gtlions = # explain analyze select * from gtt1 t1, gtt2 t2 where t1.id = t2.id and t1.name = t2.name;
QUERY PLAN
Certificate --------------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion (slice1; segments: 4) (cost = 1. 02 .. 4.12 rows = 1 width = 12)
Rows out: 4 rows at destination with 15 ms to first row, 27 ms to end, start offset by 29 ms.
-> Hash Join (cost = 1. 02 .. 4.12 rows = 1 width = 12)
Hash Cond: t2.id = t1.id AND t2.name: text = t1.name: text
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 16 ms to first row, 23 ms to end, start offset by 344 ms.
Executor memory: 1 K bytes avg, 1 K bytes max (seg3 ).
Work_mem used: 1 K bytes avg, 1 K bytes max (seg3). Workfile: (0 spilling, 0 reused)
(Seg0) Hash chain length 1.0 avg, 1 max, using 1 of 524341 buckets.
(Seg3) Hash chain length 1.0 avg, 1 max, using 2 of 524341 buckets.
-> Seq Scan on gtt2 t2 (cost = 0. 00 .. 3.04 rows = 1 width = 6)
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 0.186 ms to first row, 0.188 ms to end, start offset by 344 ms.
-> Hash (cost = 1. 01 .. 1.01 rows = 1 width = 6)
Rows in: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 0.706 ms to end, start offset by 359 ms.
-> Seq Scan on gtt1 t1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Rows out: Avg 1.3 rows x 3 workers. Max 2 rows (seg3) with 0.493 ms to first row, 0.499 ms to end, start offset by 359 ms.
Slice statistics:
(Slice0) Executor memory: 184 K bytes.
(Slice1) Executor memory: 6333 K bytes avg x 4 workers, 8385 K bytes max (seg0). Work_mem: 1 K bytes max.
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 56.840 MS
(21 rows)
Gtlions = # explain analyze select * from gtt1 t1, gtt2 t2 where t1.id = t2.id and t1.name = t2.name and t1.id = 11 and t1.name = 'B ';
QUERY PLAN
Bytes -------------------------------------------------------------------------------------------------------
Gather Motion (slice1; segments: 1) (cost = 0. 00 .. 4.11 rows = 4 width = 12)
Rows out: 1 rows at destination with 1.874 ms to end, start offset by 29 ms.
-> Nested Loop (cost = 0. 00 .. 4.11 rows = 1 width = 12)
Rows out: 1 rows with 0.298 ms to first row, 0.305 ms to end, start offset by-313 ms.
-> Seq Scan on gtt1 t1 (cost = 0. 00 .. 1.02 rows = 1 width = 6)
Filter: id = 11 AND name: text = 'B': text AND 'B': text = name: text AND 11 = id
Rows out: 1 rows with 0.187 ms to first row, 0.189 ms to end, start offset by-313 ms.
-> Seq Scan on gtt2 t2 (cost = 0. 00 .. 3.08 rows = 1 width = 6)
Filter: 'B': text = name: text AND 11 = id AND name: text = 'B': text AND id = 11
Rows out: 1 rows with 0.037 ms to first row, 0.039 ms to end, start offset by-312 ms.
Slice statistics:
(Slice0) Executor memory: 184 K bytes.
(Slice1) Executor memory: 202 K bytes (seg2 ).
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 31.186 MS
(16 rows)


Gtlions = # explain analyze select * from gtt1 t1, gtt2 t2 where t1.id = t2.id and t1.name = t2.name and t1.id = 11 and t1.name = 'B' and t2.id = 11 and t2.name = 'B ';
QUERY PLAN
Bytes --------------------------------------------------------------------------------------------------------
Gather Motion (slice1; segments: 1) (cost = 0. 00 .. 4.08 rows = 4 width = 12)
Rows out: 1 rows at destination with 2.433 ms to first row, 2.434 ms to end, start offset by 32 ms.
-> Nested Loop (cost = 0. 00 .. 4.08 rows = 1 width = 12)
Rows out: 1 rows with 0.270 ms to first row, 0.277 ms to end, start offset by-315 ms.
-> Seq Scan on gtt1 t1 (cost = 0. 00 .. 1.01 rows = 1 width = 6)
Filter: id = 11 AND name: text = 'B': text
Rows out: 1 rows with 0.183 ms to first row, 0.184 ms to end, start offset by-315 ms.
-> Seq Scan on gtt2 t2 (cost = 0. 00 .. 3.06 rows = 1 width = 6)
Filter: id = 11 AND name: text = 'B': text
Rows out: 1 rows with 0.017 ms to first row, 0.019 ms to end, start offset by-315 ms.
Slice statistics:
(Slice0) Executor memory: 184 K bytes.
(Slice1) Executor memory: 176 K bytes (seg2 ).
Statement statistics:
Memory used: 128000 K bytes
Total runtime: 35.278 MS
(16 rows)
From the test results:
0. For Single-table queries, where one dk column is used as the query condition, the Gather Motion step cannot filter segment;
1. to query a single table and use all dk columns, you can select a segment with only data in the Gather Motion step;
2. join queries between two tables. If one dk column is used as the query condition, you must execute the Broadcast Motion action to re-distribute data;
3. join queries between two tables. Using all dk columns can avoid Broadcast Motion;
-EOF-

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.