A case of performance improvement for PostgreSQL hstore columns

Source: Internet
Author: User
Tags modifiers

PostgreSQL supports hstore to store data such as Key->value, which is similar to the array or JSON type. Efficient indexing is essential to the efficient use of this type of data. Let's take a look at the performance issues of two different types of indexes for the same retrieval request today.


If we have such an original table, there is a Btree index based on the Str1 field.


t_girl=# \d Status_check;          Table "Ytt.status_check" Column |         Type          | Modifiers--------+-----------------------+-----------Is_yes | Boolean               | NOT null str1   | character varying (ALL) | NOT null str2   | character varying (ALL) | not nullindexes:    



There are 10W records inside. The data is probably as follows
t_girl=# select * from Status_check limit 2; Is_yes | str1 |         STR2         --------+------+----------------------F      | 0    | cfcd208495d565ef66e7 T      | 1    




The Status_check_hstore table structure that holds the Hstore type has a GIST index based on the Str1_str2 field.
Table "Ytt.status_check_hstore"  Column   |  Type   | Modifiers-----------+---------+-----------Is_yes    | boolean |  STR1_STR2 | Hstore  | Indexes:    



t_girl=# select * from Status_check_hstore limit 2; Is_yes |          STR1_STR2          --------+-----------------------------F      | "0" = "cfcd208495d565ef66e7" T      | "1" = "C4CA4238A0B923820DCC" (2 rows) time:39.874 ms




Next we have the same results as querying the original table, and the original table query is very efficient. Table statements and the results are as follows,
t_girl=# SELECT * from Status_check where str1 in (' Ten ', ' Max ', ' $ ');         Is_yes | str1 |         STR2         --------+------+----------------------T      | |   d3d9446802a44259755d T      | |   37693cfc748049e45d87 F      |   182be0c5cdcd5072bb18 (3 rows) time:0.690 ms


The above statement took less than 1 milliseconds.


Next we query the Hstore table,


t_girl=# Select Is_yes,skeys (STR1_STR2), Svals (STR1_STR2) from Status_check_hstore where str1_str2? | array[' 10 ', ' 23 ', ' 33 ']; Is_yes | Skeys |        Svals         --------+-------+----------------------T      | |    d3d9446802a44259755d T      | |    37693cfc748049e45d87 F      |    182be0c5cdcd5072bb18 (3 rows) time:40.256 ms


My God, it's dozens of times times slower than the original table query.


Look at the query plan and scan all the lines.
                                    QUERY PLAN                                     -----------------------------------------------------------------------------------Bitmap Heap Scan on Status_check_hstore  (cost=5.06..790.12 rows=100000 width=38)   recheck Cond: (str1_str2? | ' {10,23,33} ':: text[])   ,  Bitmap Index Scan on Idx_str_str2_gist  (cost=0.00..5.03 rows=100 width=0)         Index Cond: (str1_str2? | ' {10,23,33} ':: text[]) (4 rows) time:0.688 ms





We would like to optimize this statement, if the statement becomes the same as the original statement, then whether the Btree index can be used?
Next, create a function index based on the Btree,


t_girl=# CREATE index Idx_str1_str2_akeys on Status_check_hstore using Btree (array_to_string (Akeys (STR1_STR2), ', ')); CREATE indextime:394.123 ms



OK, change the statement to perform the same search,
t_girl=# Select Is_yes,skeys (STR1_STR2), Svals (STR1_STR2) from Status_check_hstore where Array_to_string (Akeys (str1_ STR2), ', ') in (' Ten ', ' Max ', ' + ');         Is_yes | Skeys |        Svals         --------+-------+----------------------T      | |    d3d9446802a44259755d T      | |    37693cfc748049e45d87 F      |    182be0c5cdcd5072bb18 (3 rows) time:0.727 ms




This time is as fast as the original query.
Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.