A case of performance improvement for PostgreSQL hstore columns

Source: Internet
Author: User
Tags create index modifiers postgresql


PostgreSQL supports hstore to store data such as Key->value, which is in fact similar to array or JSON type. Efficient indexing is essential to the efficient use of this type of data. Let's take a look at the performance issues of two different types of indexes for the same retrieval request today.








If we had such an original table. There is a Btree index based on the Str1 field.

t_girl = # \ d status_check;
          Table "ytt.status_check"
 Column | Type | Modifiers
-------- + ----------------------- + -----------
 is_yes | boolean | not null
 str1 | character varying (20) | not null
 str2 | character varying (20) | not null
Indexes:
    "index_status_check_str1" btree (str1)



There are 10W records. The data is probably as follows.
t_girl = # select * from status_check limit 2;
 is_yes | str1 | str2
-------- + ------ + ----------------------
 f | 0 | cfcd208495d565ef66e7
 t | 1 | c4ca4238a0b923820dcc
(2 rows)


Time: 0.617 ms
t_girl = #




Store the status_check_hstore table structure of hstore type, based on the str1_str2 field has a GIST index.
 Table "ytt.status_check_hstore"
  Column | Type | Modifiers
----------- + --------- + -----------
 is_yes | boolean |
 str1_str2 | hstore |
Indexes:
    "idx_str_str2_gist" gist (str1_str2)



t_girl = # select * from status_check_hstore limit 2;
 is_yes | str1_str2
-------- + -----------------------------
 f | "0" => "cfcd208495d565ef66e7"
 t | "1" => "c4ca4238a0b923820dcc"
(2 rows)


Time: 39.874 ms




Next we have to get the same result as querying the original table, of course the query of the original table is very efficient. Table statements and results are as follows,
t_girl = # select * from status_check where str1 in (‘10‘, ‘23‘, ‘33‘);
 is_yes | str1 | str2
-------- + ------ + ----------------------
 t | 10 | d3d9446802a44259755d
 t | 23 | 37693cfc748049e45d87
 f | 33 | 182be0c5cdcd5072bb18
(3 rows)


Time: 0.690 ms


The above statement took less than 1 millisecond.


Next we query the hstore table.


t_girl = # select is_yes, skeys (str1_str2), svals (str1_str2) from status_check_hstore where str1_str2? | array [‘10 ‘,‘ 23 ‘,‘ 33 ‘];
 is_yes | skeys | svals
-------- + ------- + ----------------------
 t | 10 | d3d9446802a44259755d
 t | 23 | 37693cfc748049e45d87
 f | 33 | 182be0c5cdcd5072bb18
(3 rows)


Time: 40.256 ms


Oh my God. Queries that are dozens of times slower than the original table.


Look at the query plan and scan all the rows.
                                    QUERY PLAN
-------------------------------------------------- ---------------------------------
 Bitmap Heap Scan on status_check_hstore (cost = 5.06..790.12 rows = 100000 width = 38)
   Recheck Cond: (str1_str2?
| ‘{10,23,33}’ :: text [])
   -> Bitmap Index Scan on idx_str_str2_gist (cost = 0.00..5.03 rows = 100 width = 0)
         Index Cond: (str1_str2? | ‘{10,23,33}‘ :: text [])
(4 rows)


Time: 0.688 ms






We think of ways to optimize this statement, assuming that this statement becomes the same as the original statement. Is it possible to use the BTREE index?
Next, create a function index based on BTREE,


t_girl = # create index idx_str1_str2_akeys on status_check_hstore using btree (array_to_string (akeys (str1_str2), ‘,‘));
CREATE INDEX
Time: 394.123 ms



OK, change the statement to run the same search,
t_girl = # select is_yes, skeys (str1_str2), svals (str1_str2) from status_check_hstore where array_to_string (akeys (str1_str2), ‘,‘) in (‘10‘, ‘23‘, ‘33’);
 is_yes | skeys | svals
-------- + ------- + ----------------------
 t | 10 | d3d9446802a44259755d
 t | 23 | 37693cfc748049e45d87
 f | 33 | 182be0c5cdcd5072bb18
(3 rows)


Time: 0.727 ms




This time as fast as the original query.
An example of PostgreSQL hstore column performance improvement

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.