PostgreSQL supports hstore to store data such as Key->value, which is similar to the array or JSON type. Efficient indexing is essential to the efficient use of this type of data. Let's take a look at the performance issues of two different types of indexes for the same retrieval request today.
If we have such an original table, there is a Btree index based on the Str1 field.
t_girl=# \d Status_check; Table "Ytt.status_check" Column | Type | Modifiers--------+-----------------------+-----------Is_yes | Boolean | NOT null str1 | character varying (ALL) | NOT null str2 | character varying (ALL) | not nullindexes:
There are 10W records inside. The data is probably as follows
t_girl=# select * from Status_check limit 2; Is_yes | str1 | STR2 --------+------+----------------------F | 0 | cfcd208495d565ef66e7 T | 1
The Status_check_hstore table structure that holds the Hstore type has a GIST index based on the Str1_str2 field.
Table "Ytt.status_check_hstore" Column | Type | Modifiers-----------+---------+-----------Is_yes | boolean | STR1_STR2 | Hstore | Indexes:
t_girl=# select * from Status_check_hstore limit 2; Is_yes | STR1_STR2 --------+-----------------------------F | "0" = "cfcd208495d565ef66e7" T | "1" = "C4CA4238A0B923820DCC" (2 rows) time:39.874 ms
Next we have the same results as querying the original table, and the original table query is very efficient. Table statements and the results are as follows,
t_girl=# SELECT * from Status_check where str1 in (' Ten ', ' Max ', ' $ '); Is_yes | str1 | STR2 --------+------+----------------------T | | d3d9446802a44259755d T | | 37693cfc748049e45d87 F | 182be0c5cdcd5072bb18 (3 rows) time:0.690 ms
The above statement took less than 1 milliseconds.
Next we query the Hstore table,
t_girl=# Select Is_yes,skeys (STR1_STR2), Svals (STR1_STR2) from Status_check_hstore where str1_str2? | array[' 10 ', ' 23 ', ' 33 ']; Is_yes | Skeys | Svals --------+-------+----------------------T | | d3d9446802a44259755d T | | 37693cfc748049e45d87 F | 182be0c5cdcd5072bb18 (3 rows) time:40.256 ms
My God, it's dozens of times times slower than the original table query.
Look at the query plan and scan all the lines.
QUERY PLAN -----------------------------------------------------------------------------------Bitmap Heap Scan on Status_check_hstore (cost=5.06..790.12 rows=100000 width=38) recheck Cond: (str1_str2? | ' {10,23,33} ':: text[]) , Bitmap Index Scan on Idx_str_str2_gist (cost=0.00..5.03 rows=100 width=0) Index Cond: (str1_str2? | ' {10,23,33} ':: text[]) (4 rows) time:0.688 ms
We would like to optimize this statement, if the statement becomes the same as the original statement, then whether the Btree index can be used?
Next, create a function index based on the Btree,
t_girl=# CREATE index Idx_str1_str2_akeys on Status_check_hstore using Btree (array_to_string (Akeys (STR1_STR2), ', ')); CREATE indextime:394.123 ms
OK, change the statement to perform the same search,
t_girl=# Select Is_yes,skeys (STR1_STR2), Svals (STR1_STR2) from Status_check_hstore where Array_to_string (Akeys (str1_ STR2), ', ') in (' Ten ', ' Max ', ' + '); Is_yes | Skeys | Svals --------+-------+----------------------T | | d3d9446802a44259755d T | | 37693cfc748049e45d87 F | 182be0c5cdcd5072bb18 (3 rows) time:0.727 ms
This time is as fast as the original query.