Introduction to the JSONB data types in PostgreSQL _ database other

Source: Internet
Author: User
Tags create index json postgresql

PostgreSQL 9.4 is loading a new feature called JSONB, a new type of data that can store JSON data that supports gin indexes. In other words, this feature, in the upcoming update, most importantly, if this is not important, then put Postgres in the file as the recommended location for the database system.

Since 9.2, a consolidated JSON data type already exists with a complete set of features (such as data generation and data deconstruction) and 9.3 new operators. When you use the JSON data type, the data is stored as a copy of exactly the same, and the function is on it, and a back-end operation is required.

This experience JSONB data types are stored in a degraded 2-dollar format, so inserting this data is more efficient than JSON because the background no longer needs to be parsed, so it runs faster and takes care of the gin index. It is for this last reason that we actually advise the reader to use JSONB instead of the JSON program (you can also use JSON if you want). Keep in mind that JSONB uses the same operator and function, readers can read my previous posts to give you some inspiration (or simply look at the Postgres file).

Now let's take a look at how JSONB works, and compare it to JSON. The test data used is 8.6 million of the GeoBase type data, about 1.1G of size, including the city name, country code (can see the complete list here) and many other fields. This data is first stored in a new table in the database by the underlying copy (raw copy), which is then converted to json/jsonb by a table with a set of fill factor 100, and then to see how much space they each occupy.

=# COPY geodata from ' $HOME/downloads/allcountries.txt ';
COPY 8647839
=# CREATE TABLE geodata_jsonb (Data jsonb) with (fillfactor=100);
CREATE TABLE
=# CREATE TABLE Geodata_json (data json) with (fillfactor=100);
CREATE TABLE
=# \timing
Timing is on.
=# INSERT into Geodata_json SELECT Row_to_json (geodata) from Geodata;
Insert 0 8647839
time:287158.457 ms
=# INSERT INTO geodata_jsonb SELECT Row_to_json (geodata):: Jsonb from Geoda TA;
INSERT 0 8647839
time:425825.967 ms

It takes a little longer to generate JSONB data, does it differ in size?

=# SELECT Pg_size_pretty (pg_relation_size (' Geodata_json ':: Regclass)) as JSON,
     Pg_size_pretty (pg_relation_size (' geodata_jsonb ':: Regclass)) as JSONB;
 JSON  | jsonb 
---------+---------
 3274 MB | 3816 MB
(1 row)

Indexing the JSON data starts with the 9.3 version, such as using an operator (note that '->> ' is adopted because it returns text, and the index uses different keywords depending on the query)

=# CREATE INDEX Geodata_index on Geodata_json ((data->> ' Country_code '), (data->> ' asciiname '));
 CREATE INDEX =# SELECT pg_size_pretty (pg_relation_size (' Geodata_index ':: Regclass)) as Json_index; Json_index------------310 MB (1 row) =# SELECT (data->> ' population '):: int as population, data-> ' latitude
    ' As latitude, data-> ' longitude ' as longitude from Geodata_json WHERE data->> ' country_code ' = ' JP ' and
 Data->> ' asciiname ' = ' Tokyo ' and (data->> ' population '):: int!= 0; Population | Latitude | Longitude------------+----------+-----------8336599 | 35.6895 | 139.69171 (1 row) =#-Explain of previous query query plan----- --------------------------------------------------------------------------------------------------------------- -----Bitmap Heap Scan on Geodata_json (cost=6.78..865.24 rows=215 width=32) Recheck Cond: ((Data->> ' Country_c Ode ':: Text) = ' JP ':: Text) and ((Data->> ' asciiname ':: Text) = ' Tokyo ':: Text) Filter: (((Data->> ' population ':: text):: I Nteger <> 0)-> Bitmap index Scan on Geodata_index (cost=0.00..6.72 rows=216 width=0) Index Cond: ((data ->> ' Country_code ':: Text) = ' JP ':: Text) and ((Data->> ' asciiname ':: Text) = ' Tokyo ':: text)) planning time:0.
 172 MS (6 rows)

In this case, the plan (planner) can use the bitmap index scan and use the previously generated index.


Now, a new feature of JSONB is to check the data capacity that contains the operator @>, which can be indexed by gin, and this operator data also includes?? | and?& (to check for the existence of a given keyword). The gin index works on two types of operators:

The default operator class, four previously listed;

Jsonb_hash_ops only supports @>, but performs well when searching for data, and takes up less disk space;

Here's how it works:

=# CREATE INDEX geodata_gin on geodata_jsonb USING gin (data jsonb_hash_ops); CREATE INDEX =# SELECT (data->> ' population '):: int as population, data-> ' latitude ' as latitude, Data-> ' lo Ngitude ' as longitude from geodata_jsonb WHERE data @> ' {' Country_code ': ' JP ', ' asciiname ': ' Tokyo '} ' and (data-&
 Gt;> ' population '):: int!= 0; Population | Latitude | Longitude------------+----------+-----------8336599 | 35.6895 |
 139.69171 (1 row) =# SELECT pg_size_pretty (pg_relation_size (' Geodata_gin ':: Regclass)) as Jsonb_gin; Jsonb_gin-----------1519 MB (1 row) =#-EXPLAIN of previous query query plan- -----------------------------------------------------------------------------------Bitmap Heap Scan on Geodata_ JSONB (cost=131.01..31317.76 rows=8605 width=418) Recheck Cond: (Data @> ' {"Asciiname": "Tokyo", "Country_code": "JP" } ':: JSONB) Filter: (((Data->> ' population ':: text):: integer <> 0)->Bitmap index Scan on geodata_gin (cost=0.00..128.86 rows=8648 width=0) Index Cond: (Data @> ' {"Asciiname": "Tokyo"
 , "Country_code": "JP"} ':: JSONB) Planning time:0.134 ms

Depending on the requirements of your application, you may want to use an index of low space consumption, such as the type of index that btree builds on JSON data; Gin index has more advantages because it overwrites all JSON fields and checks capacity;

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.