Is PostgreSQL your next JSON database?

Source: Internet
Author: User
Tags create index



According to Betteridge's law (any headline question can be answered in one word: no), unless your JSON data is rarely modified and has many queries.



The latest version of PostgreSQL adds more support for JSON, and we've asked if PostgreSQL can replace MongoDB as a JSON database, the answer is obvious, but hopefully, aha, the question is asked by the reader.


"Isn't PostgreSQL already having some JSON support?" ”


Yes, the version before PostgreSQL 9.4 also has JSON data type, you can do this:

CREATE TABLE justjson (id INTEGER, doc JSON)
> INSERT INTO justjson VALUES (1, ‘{
    "name": "fred",
    "address": {
        "line1": "52 The Elms",
        "line2": "Elmstreet",
        "postcode": "ES1 1ES"
        }
    } ‘);
The original text of JSON was saved to the database, including blank rows and key sequences and new keys. Let's take a look at the saved data:

> SELECT * FROM justjson;
 id | doc
---- + ---------------------------------
  1 | {+
    | "name": "fred", +
    | "address": {+
    | "line1": "52 The Elms", +
    | "line2": "Elmstreet", +
    | "postcode": "ES1 1ES" +
    |} +
    |}
(1 row)
It is exactly the same as the text before saving, but we can still parse out the specific data. PostgreSQL provides a set of JSON operation methods for searching. For example, we only need to find out the address information. If we do?

select doc->> ‘address’ FROM justjson;
            ? column?
---------------------------------
 {+
         "line1": "52 The Elms", +
         "line2": "Elmstreet", +
         "postcode": "ES1 1ES" +
         }
(1 row)
The->> operator of the doc field is to query a field of the JSON object and return the text. The number can also be used as the index of the array, but the text is still returned. Similar to->>, there is the-> operator, which returns the content without text conversion. You can use it to navigate and search for JSON objects, such as:

select doc-> ‘address’->> ‘postcode’ FROM justjson;
 ? column?
----------
 ES1 1ES
(1 row)
There is a shorter way to specify the search path, using the # >> operator, like a dream:

select doc # >> ‘{address, postcode}‘ FROM justjson;
 ? column?
----------
 ES1 1ES
(1 row)
By saving the complete JSON data type, it can be exactly the same as the source data and will not lose content, but it also brings costs, lack of performance, and cannot be indexed to maintain complete consistency ... all, although it can be easily maintained Consistency and preservation of JSON documents, but there is still much room for improvement, so JSONB was introduced.

"What's different about JSONB?"
JSONB can transfer the entire JSON document to hierarchical KEY / VALUE data pairs, all whitespace characters are removed, duplicate keys are retained only for the last time, and keys are not sorted, but are stored using HASH. In the above example, JSONB is used. For the version, it looks like this:

> CREATE TABLE justjsonb (id INTEGER, doc JSONB)
> INSERT INTO justjsonb VALUES (1, ‘{
    "name": "fred",
    "address": {
        "line1": "52 The Elms",
        "line2": "Elmstreet",
        "postcode": "ES1 1ES"
        }
    } ‘);
> SELECT * FROM justjsonb;
 id | doc
---- + --------------------------------------------- -------------------------------------------------- -----
  1 | {"name": "fred", "address": {"line1": "52 The Elms", "line2": "Elmstreet", "postcode": "ES1 1ES"}}
(1 row)
It can be seen that all non-text content has disappeared and replaced with the minimum format required by the JSON document. This compression method means that when the data is inserted, it will be automatically formatted, which can reduce the workload of accessing data analysis and processing in the future.

"This kind of data for PostgreSQL is a bit like HSTORE"
Seeing key-value pairs, JSONB is really a bit like the PostgreSQL HSTORE extension. It can also store key-value pairs, but it is an extension. However, JSONB (and JSON) is in the PostgreSQL kernel. HSTORE has only one level, but PostgreSQL can have nested elements, and HSTORE can only store strings, and JSONB can also store JSON numeric types.

"So what benefit does JSONB give me?"
Indexes are used everywhere. You cannot create real indexes on JSON types in PostgreSQL. You can create expression indexes, but only for the content you want to index.

create index justjson_postcode on justjson ((doc-> ‘address’->> ‘postcode’));
Only the postcode is indexed, the others are not.

JSONB supports GIN index, a type of Generalized Inverted Index. PostgreSQL provides another set of index operators to support, including @> including JSON, <@ the most contained,? Test whether the string exists, | Does any string exist?? & All large strings.

There are two types of indexes available. The default is json_ops. It supports all operators (translator: refers to ordinary json operators) and a jsonb_path_ops index (translator: refers to index operators) that supports only the> operator. Each key value in the index is created. In fact, jsonb_path_ops only creates a hash table index that is more complex and more compressed than the default, but the default index can perform more operations and increase the cost of space. Add some data to the table, let's take a look at a certain zip code, if we create a default GIN JSON index and then query:

explain select * from justjsonb where doc @> ‘{" address ": {" postcode ":" HA36CC "}}‘;
                           QUERY PLAN
-------------------------------------------------- ---------------
 Seq Scan on justjsonb (cost = 0.00..3171.14 rows = 100 {"address": {"postcode": "HA36CC"}} ‘:: jsonb)
(2 rows)
It can be seen that the scan table is sequential. If we add a default JSON GIN index and see what is the difference?

create index justjsonb_gin on justjsonb using gin (doc);
> explain select * from justjsonb where doc @> ‘{" address ": {" postcode ":" HA36CC "}}‘;
                                  QUERY PLAN
-------------------------------------------------- -----------------------------
 Bitmap Heap Scan on justjsonb (cost = 40.78..367.62 rows = 100 {"address": {"postcode": "HA36CC"}} ‘:: jsonb)
   -> Bitmap Index Scan on justjsonb_gin (cost = 0.00..40.75 rows = 100 {"address": {"postcode": "HA36CC"}} ‘:: jsonb)
(4 rows)
The search performance is greatly improved, but the space consumption is hidden. In the example, the data size is 41%. Let us delete the index and repeat the jsonb_path_ops GIN index.

create index justjsonb_gin on justjsonb using gin (doc jsonb_path_ops);
> explain select * from justjsonb where doc @> ‘{" address ": {" postcode ":" HA36CC "}}‘;
                                  QUERY PLAN
-------------------------------------------------- -----------------------------
 Bitmap Heap Scan on justjsonb (cost = 16.78..343.62 rows = 100 {"address": {"postcode": "HA36CC"}} ‘:: jsonb)
   -> Bitmap Index Scan on justjsonb_gin (cost = 0.00..16.75 rows = 100 {"address": {"postcode": "HA36CC"}} ‘:: jsonb)
(4 rows)
The total cost is a bit lower, and the index volume is much smaller. This is a typical method for creating index speed and space balance, but it has much higher performance than sequential scanning.



"Should I use it as my JSON database?"
If you frequently update your JSON documents, the answer is no, PostgreSQL is best at storing and extracting JSON documents and their fields, but even so you can take out a single field, you cannot update a single field; in fact you can Parse the entire JSON, add new fields and write it back to let the JSON parser handle the repetition, but you obviously don't want to rely on this.

If your main data is well used in a relational database, and JSON data is just a bunch of supplements (static data), then PostgreSQL is fine, and the JSONB representation and indexing capabilities will be more efficient. In addition, if your data model is a collection of variable content, then you may look for mainstream industrial json document databases such as MongoDB or RethinkDB.

reference

PostgreSQL vs MongoDB http://my.oschina.net/Suregogo/blog/358277

Query JSON in PostgreSQL http://schinckel.net/2014/05/25/querying-json-in-postgres/

Original: https://www.compose.io/articles/is-postgresql-your-next-json-database/

<Translation: zhu jin [email protected]>

Is PostgreSQL your next JSON database?

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.