EDB Blog

An Illustration of JSONB Capabilities in Postgres 9.5

marc.linster_enterprisedb.com's picture
Author: Marc Linster
06/22/2016

Postgres introduced JSON and JSONB in versions 9.3 and 9.4, respectively, with JSONB (Binary Javascript Object Notation) being the canonical form of JSON, stored as binary objects with improved compression and more capabilities. Early releases of JSONB had a limited set of operators, but 9.5 introduced a complete set of JSONB capabilities, including operators for updating and deleting, and very good indexing capabilities.

In this blog, I will review some of those capabilities and show that in Postgres, JSONB is a first order data type that can be used easily and efficiently in standard SQL queries, indexes, and constraints.

To illustrate these capabilities, I have created a simple data model of customers, products, orders and order lines. This example primarily addresses the customer table.

 

 

 

jsonb=# \d customer

                 Table "example.customer"

     Column     |            Type             | Modifiers

----------------+-----------------------------+-----------

 id             | character varying(40)       | not null

 name           | jsonb                       |

 customer_since | timestamp without time zone |

 

Indexes:

    "customer_pkey" PRIMARY KEY, btree (id)

 

Check constraints:

    "validate_customer_name" CHECK ((name ->> 'first name'::text) IS NOT NULL AND (name ->> 'last name'::text) IS NOT NULL)

 

Referenced by:

    TABLE "customer_order" CONSTRAINT "customer_order_customer_id_fkey" FOREIGN KEY (customer_id) REFERENCES customer(id)

 

The customer table has 4 columns, and most of our example will deal with the ‘name’ column, which is of type JSONB.

A typical customer record looks like this:

 

jsonb=# \x

Expanded display is on.

jsonb=# select * from customer where id = 'cust000190943';

-[ RECORD 1 ]--+----------------------------------------------------------------------------------

id             | cust000190943

name           | {"id": "cust000190943", "initial": "R", "last name": "Smith", "first name": "Hans"}

customer_since | 17-JUN-16 11:25:53.44615

 

Alternatively, I can also use the jsonb_pretty function to make the output more readable:

 

jsonb=# SELECT id, customer_since, jsonb_pretty(name)

from customer where id = 'cust000190943';

-[ RECORD 1 ]--+---------------------------

id             | cust000190943

customer_since | 17-JUN-16 11:25:53.44615

jsonb_pretty   | {                         +

               |     "id": "cust000190943",+

               |     "initial": "R",       +

               |     "last name": "Smith", +

               |     "first name": "Hans"  +

               | }

 

My sample data set has 1,000,000 records. For illustration purposes, I have replicated the primary key ‘id’ in the JSONB column ‘name’. Please note: all query performance is measured on a VM on OSX running Windows 10.

Query performance for JSONB columns

When I run a query based on the primary key ‘id’, then response time is shorter than  0.6 milliseconds, as the primary key is supported by a btree index.

 

jsonb=# \timing

Timing is on.

jsonb=# select * from customer where id = 'cust000000001';

-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------

id             | cust000000001

name           | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}

customer_since |

 

Time: 0.569 ms

 

When I run the same query against the value of the key ‘id’ in the JSONB column ‘name’, I get an average performance of approximately 430 milliseconds, as that column is not indexed.

 

jsonb=# select * from customer where name->>'id' = 'cust000000001';

-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------

id             | cust000000001

name           | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}

customer_since |

 

Time: 430.838 ms

 

Once I index that column with a btree index, I get similar performance on the JSONB data element as I get in the primary key defined as character varying(40).

 

jsonb=# CREATE INDEX customer_jsonbid_idx ON customer ((name->>'id'));

 

jsonb=# select * from customer where name->>'id' = 'cust000000001';

-[ RECORD 1 ]--+----------------------------------------------------------------------------------------------------

id             | cust000000001

name           | {"id": "cust000000001", "title": "Dr.", "initial": "Q", "last name": "Kordon", "first name": "Bob"}

customer_since |

 

Time: 0.640 ms

 

This allows me to use JSONB data in complex queries efficiently and effectively, just as if it were a conventional SQL data type. In the sample query below, I am joining on a JSONB field (c.name->>id) and a conventional SQL data type (co.customer_id) to find all the customers who have bought products made in New Zealand.

 

SELECT

       c.name->>'title' "Title",

       c.name->>'first name' "First Name",

       c.name->>'last name' "Last Name",

       to_char(co.order_date, 'Day - DD, Mon, YYYY') "Purchase Date",

       col.quantity "Quantity",

       p.name "Product Name"

FROM customer c

JOIN customer_order co ON c.name->>'id' = co.customer_id

JOIN customer_order_line col ON co.id = col.customer_order_id

JOIN product p ON p.id = col.product_id

WHERE p.country_iso3 = 'NZL';

 

 Title | First Name | Last Name |       Purchase Date       | Quantity | Product Name

-------+------------+-----------+---------------------------+----------+--------------

 Mr.   | Haley      | Horn      | Friday    - 24, Jun, 2016 |       24 | Green

 Dr.   | Joe        | Kordon    | Tuesday   - 31, May, 2016 |       52 | Green

 Dr.   | Sue        | Smith     | Sunday    - 05, Jun, 2016 |       74 | Yellow

 Hon.  | Breanna    | Brown     | Sunday    - 15, May, 2016 |       62 | Green

 Dr.   | Marc       | Wagner    | Wednesday - 18, May, 2016 |       84 | White

 Mr.   | Haley      | Tinker    | Sunday    - 26, Jun, 2016 |        9 | White

 Mr.   | Sue        | Tinker    | Saturday  - 18, Jun, 2016 |       31 | Green

 Mr.   | Sue        | Tinker    | Saturday  - 18, Jun, 2016 |       50 | White

       | Kahleesi   | Horn      | Thursday  - 26, May, 2016 |       16 | Green

 Dr.   | Kahleesi   | Wagner    | Thursday  - 28, Apr, 2016 |       42 | Yellow

 Ms.   | Marc       | Kordon    | Tuesday   - 31, May, 2016 |       58 | White

 Hon.  | Joe        | Wagner    | Saturday  - 16, Apr, 2016 |       63 | Yellow

Integrity Constraints on JSONB columns

JSONB in Postgres 9.5 also supports integrity constraints. In the customer table defined above, we have a check constraint ‘validate_customer_name’ to make sure that customers cannot have first names or last names that are undefined, i.e., the JSONB keys ‘first name’ and ‘last name’ cannot be null.

 

Check constraints:

    "validate_customer_name" CHECK ((name ->> 'first name'::text) IS NOT NULL AND (name ->> 'last name'::text) IS NOT NULL)

 

If I run the following query:

 

INSERT INTO customer VALUES (

       'test2',

       '{"id": "test2","first name": "Hans"}')

 

I get an error as that constraint is violated

 

ERROR:  new row for relation "customer" violates check constraint "validate_customer_name"

DETAIL:  Failing row contains (test2, {"id": "test2", "first name": "Hans"}, null, null).

 

This query, which adds a first name and a last name, succeeds, as it meets the check constraint

 

INSERT INTO customer VALUES (

       'test2',

       '{"id": "test2","first name": "Hans", "last name": "Prince"}')

JSONB data manipulation

Postgres 9.5 also provides numerous capabilities to manipulate JSONB data elements. The previous queries created a record with the following data:

 

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |        jsonb_pretty

-------+----------------------------

 test2 | {                         +

       |     "id": "test2",        +

       |     "last name": "Prince",+

       |     "first name": "Hans"  +

       | }

 

I now add a new key ‘title’ with value ‘Hon’ using ‘||’ operator to concatenate two JSONB values.

 

UPDATE customer

       SET name = name || '{"title": "Hon"}'

       WHERE id = 'test2';

 

This  results in:

 

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |        jsonb_pretty

-------+----------------------------

 test2 | {                         +

       |     "id": "test2",        +

       |     "title": "Hon",       +

       |     "last name": "Prince",+

       |     "first name": "Hans"  +

       | }

 

If I want to change the value of the key ‘title’, I run a simple update query that replaces the value for the key ‘title’:

 

UPDATE customer

       SET name = name || '{"title": "His Excellence"}'

       WHERE id = 'test2';

 

Resulting in

 

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |          jsonb_pretty

-------+--------------------------------

 test2 | {                             +

       |     "id": "test2",            +

       |     "title": "His Excellence",+

       |     "last name": "Prince",    +

       |     "first name": "Hans"      +

       | }

(1 row)

 

The same operator can be used to add nested documents

 

UPDATE customer

       SET name = name ||

'{"contact": {"fax": "617-123-5678", "cell": "617-123-4567"}}'

       WHERE id = 'test2';

 

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |          jsonb_pretty

-------+--------------------------------

 test2 | {                             +

       |     "id": "test2",            +

       |     "title": "His Excellence",+

       |     "contact": {              +

       |         "fax": "617-123-5678",+

       |         "cell": "617-123-4567"+

       |     },                        +

       |     "last name": "Prince",    +

       |     "first name": "Hans"      +

       | }

(1 row)

 

The operator jsonb_set allows me to update values inside the nested documents

 

UPDATE customer

       SET name = jsonb_set (name, '{contact, cell}', '"617-123-0101"')

              WHERE id = 'test2';

 

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |          jsonb_pretty

-------+--------------------------------

 test2 | {                             +

       |     "id": "test2",            +

       |     "title": "His Excellence",+

       |     "contact": {              +

       |         "fax": "617-123-5678",+

       |         "cell": "617-123-0101"+

       |     },                        +

       |     "last name": "Prince",    +

       |     "first name": "Hans"      +

       | }

(1 row)

 

The ‘#-’ operator allows me to delete keys and their values inside a nested JSONB document.

 

UPDATE customer

       SET name = name #- '{contact, fax}'

              WHERE id = 'test2';

 

jsonb=# SELECT id, jsonb_pretty(name)from customer where id = 'test2';

  id   |          jsonb_pretty

-------+--------------------------------

 test2 | {                             +

       |     "id": "test2",            +

       |     "title": "His Excellence",+

       |     "contact": {              +

       |         "cell": "617-123-0101"+

       |     },                        +

       |     "last name": "Prince",    +

       |     "first name": "Hans"      +

       | }

(1 row)

 

Make JSONB data look like a standard SQL table

The following query uses the LATERAL command (see https://www.postgresql.org/docs/9.5/static/sql-select.html), which allows columns in the FROM clause in a JOIN or a sub-select to refer to columns in the SELECT clause, combined with jsonb_to_record, which builds a record from a JSONB object.

 

SELECT 

      c.id "Customer Id",

      l.title "Title",

      l."first name" "First Name",

      l.initial "Initials",

      l."last name" "Last Name",

      l."postfix" "Postfix"

FROM  customer c

JOIN  LATERAL

      ( SELECT *

             FROM jsonb_to_record(name)

                      AS detail(

                             id text,

                             title text,

                             initial text,

                             postfix text,

                             "last name" text,

                             "first name" text

                             )) l

ON (l.id = c.id)

ORDER BY c.id ASC

LIMIT 20;

 

  Customer Id  | Title | First Name | Initials | Last Name | Postfix

---------------+-------+------------+----------+-----------+---------

 cust000000001 | Dr.   | Bob        | Q        | Kordon    |

 cust000000002 | Mr.   | James      | F        | Schiller  |

 cust000000003 | Dr.   | Haley      | V        | Buchholz  | III

 cust000000004 | Hon.  | Marc       | V        | Goethe    | Sr

 cust000000005 |       | James      | W        | Schiller  | II

 cust000000006 | Ms.   | Breanna    | J        | Horn      | II

 cust000000007 | Ms.   | Haley      | H        | Cleland   | Jr.

 cust000000008 | Mr.   | James      | E        | Tinker    | Jr.

 cust000000009 |       | Hans       | V        | Kordon    | Jr.

Conclusions

JSONB in Postgres 9.5 provides a rich array of capabilities that makes it easy to implement solutions that need the flexibility of NoSQL, without giving up on the strengths of the relational model. The fact that standard SQL and JSONB can be used in the same transactional context, with a tight integration that allows converting JSONB to records (and vice-versa), creates a perfect NoSQL (Not only SQL) solution.

For a complete definition of the JSONB capabilities in Postgres 9.5, please review the JSON and JSONB Functions and Operators page in the Postgres documentation. 

To learn more about NoSQL and JSON, please also view the video Greater Developer Flexibility with NoSQL in EDB Postgres or contact us directly at sales@enterprisedb.com.

Marc Linster, Ph.D., is Senior Vice President, Products and Services, at EnterpriseDB.

SHARE