PostgreSQL hstore Column Performance Improvement example

Source: Internet
Author: User

PostgreSQL provides the hstore type for storing sets of key => value pairs in a single column, similar in spirit to ARRAY or JSON. To query this kind of data efficiently, you need a suitable index. Let's compare the performance of two different index types for the same lookup.
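For readers new to the type, here is a minimal sketch of how hstore values are built and queried. The keys and values (k1, v1 and so on) are made up for illustration and do not come from the tables used below.

```sql
-- hstore ships as an extension and must be enabled once per database.
CREATE EXTENSION IF NOT EXISTS hstore;

-- Build an hstore value from a text literal and read one key back.
SELECT 'k1=>v1, k2=>v2'::hstore -> 'k1' AS value_of_k1;

-- Test for key existence: ? checks a single key,
-- ?| checks whether any key from the array is present.
SELECT 'k1=>v1'::hstore ?  'k1'               AS has_k1,
       'k1=>v1'::hstore ?| ARRAY['k1', 'k9']  AS has_any;
```

The ?| operator is the one used in the hstore query later in this article.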


Suppose we have an original table with a BTREE index based on the str1 field.


t_girl=# \d status_check;
          Table "ytt.status_check"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 is_yes | boolean               | not null
 str1   | character varying(20) | not null
 str2   | character varying(20) | not null
Indexes:
    "index_status_check_str1" btree (str1)



There are 10 million records. The data is roughly as follows,
t_girl=# select * from status_check limit 2;
 is_yes | str1 |         str2
--------+------+----------------------
 f      | 0    | cfcd208495d565ef66e7
 t      | 1    | c4ca4238a0b923820dcc
(2 rows)

Time: 0.617 ms




The status_check_hstore table stores the same data using the hstore type: str1 becomes the key and str2 the value. It has a GiST index on the str1_str2 field.
   Table "ytt.status_check_hstore"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 is_yes    | boolean |
 str1_str2 | hstore  |
Indexes:
    "idx_str_str2_gist" gist (str1_str2)



t_girl=# select * from status_check_hstore limit 2;
 is_yes |          str1_str2
--------+-----------------------------
 f      | "0"=>"cfcd208495d565ef66e7"
 t      | "1"=>"c4ca4238a0b923820dcc"
(2 rows)

Time: 39.874 ms
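The article does not show how status_check_hstore was created or loaded. One plausible way to derive it from the original table is sketched here; the statements are an assumption based on the table and index definitions shown, not the author's actual setup.

```sql
-- Assumed setup, reconstructed from the \d output; not shown in the article.
CREATE TABLE status_check_hstore (
    is_yes    boolean,
    str1_str2 hstore
);

-- hstore(key, value) builds a single-pair hstore, so each row
-- ends up with exactly one key (str1) mapped to one value (str2).
INSERT INTO status_check_hstore (is_yes, str1_str2)
SELECT is_yes, hstore(str1::text, str2::text)
FROM status_check;

-- The GiST index listed in the table description.
CREATE INDEX idx_str_str2_gist ON status_check_hstore USING gist (str1_str2);
```

Creating the index after the bulk load, as in this sketch, is usually cheaper than maintaining it row by row during the insert.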




Next we run a query against each table that returns the same rows. As expected, the query on the original table is very efficient. The statement and its result are as follows,
t_girl=# select * from status_check where str1 in ('10','23','33');
 is_yes | str1 |         str2
--------+------+----------------------
 t      | 10   | d3d9446802a44259755d
 t      | 23   | 37693cfc748049e45d87
 f      | 33   | 182be0c5cdcd5072bb18
(3 rows)

Time: 0.690 ms


The preceding statement takes less than 1 ms.


Next we will query the hstore table,


t_girl=# select is_yes,skeys(str1_str2),svals(str1_str2) from status_check_hstore where str1_str2 ?| array['10','23','33'];
 is_yes | skeys |        svals
--------+-------+----------------------
 t      | 10    | d3d9446802a44259755d
 t      | 23    | 37693cfc748049e45d87
 f      | 33    | 182be0c5cdcd5072bb18
(3 rows)

Time: 40.256 ms


At about 40 ms, this is dozens of times slower than the query against the original table.


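Check the query plan. The query does use the GiST index, through a bitmap index scan, but the bitmap heap scan is estimated at 100,000 rows and the condition has to be rechecked against the rows fetched from the heap.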
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Bitmap Heap Scan on status_check_hstore  (cost=5.06..790.12 rows=100000 width=38)
   Recheck Cond: (str1_str2 ?| '{10,23,33}'::text[])
   ->  Bitmap Index Scan on idx_str_str2_gist  (cost=0.00..5.03 rows=100 width=0)
         Index Cond: (str1_str2 ?| '{10,23,33}'::text[])
(4 rows)

Time: 0.688 ms
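The plan above contains planner estimates only. To see where the 40 ms actually goes, the same statement can be run under EXPLAIN with the ANALYZE and BUFFERS options, as in this sketch. No output is reproduced here, because the article does not include it.

```sql
-- ANALYZE executes the query and reports actual row counts and timings;
-- BUFFERS adds the number of shared buffers hit and read per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT is_yes, skeys(str1_str2), svals(str1_str2)
FROM status_check_hstore
WHERE str1_str2 ?| ARRAY['10', '23', '33'];
```

In the output, comparing the estimated and actual row counts of the Bitmap Heap Scan node, and checking how many rows the recheck discards, should show how selective the GiST index really is for this predicate.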





We want to optimize this statement. If we rewrite the condition in the same form as the original statement, can it use a B-tree index?
Next, create an expression (functional) index using B-tree,


t_girl=# create index idx_str1_str2_akeys on status_check_hstore using btree (array_to_string(akeys(str1_str2),','));
CREATE INDEX
Time: 394.123 ms



OK. Change the statement to execute the same search,
t_girl=# select is_yes,skeys(str1_str2),svals(str1_str2) from status_check_hstore where array_to_string(akeys(str1_str2),',') in ('10','23','33');
 is_yes | skeys |        svals
--------+-------+----------------------
 t      | 10    | d3d9446802a44259755d
 t      | 23    | 37693cfc748049e45d87
 f      | 33    | 182be0c5cdcd5072bb18
(3 rows)

Time: 0.727 ms




This is as fast as the original query.
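One caveat is worth spelling out: the rewritten predicate matches the ?| query only because every row in this table holds exactly one key. The sketch below illustrates why, using one value from the table and one hypothetical two-key value that does not occur in the article's data.

```sql
-- Single-key hstore, as in status_check_hstore:
-- the indexed expression reduces to the key itself.
SELECT array_to_string(akeys('10=>d3d9446802a44259755d'::hstore), ',');

-- Hypothetical two-key hstore: the keys are joined into one
-- comma-separated string, so a predicate such as IN ('10','23','33')
-- would no longer match this row, while ?| still would.
SELECT array_to_string(akeys('10=>a, 23=>b'::hstore), ',');
```

For tables where rows can carry several keys, the expression index is therefore not a drop-in replacement for an index on the hstore column itself.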
