On 12-Dec-2017, 2ndQuadrant held the first in a series of PostgreSQL webinars. The session was conducted by Andrew Dunstan, Principal Contributor of JSON functionality to PostgreSQL and Senior Developer & PostgreSQL Committer at 2ndQuadrant.
As promised, the recording of the Webinar is now available. Those who couldn’t make it to the live session, can now view Introduction to JSON data types in PostgreSQL here.
While Andrew was able to address many questions live, there were several queries that couldn’t be answered due to time restrictions. Andrew Dunstan has taken time to answer those questions below.
Q1: Since PostgreSQL 9.4 JSONB supports GIN index, does that means that GIN index has a better performance on JSONB? What about JSON?
A: There is no direct indexing support at all for the JSON data type, only for the JSONB data type. However, you can use expression indexes on the JSON type using the operators that return text. In general, if you want indexing, use JSONB.
Q2: Assuming that I have a following relationship:
role -> permission
In relational database I would create a pivot table to hold all the combinations:
Id, role_id, permission_id
Assuming that I could do the same with a jsonb field in the role table:
Role->permissions = jsonb[permission_id, permission_id, permission_id]
But if I remove or add a permission, it would always perform an update of the whole column instead of updating just the element in the permission JSON.
In addition, I am not sure if I can join the permission_ids with the permission table from a JSONB. I wonder if it would be best to stick to the relational model for this scenario or if this would be a valid scenario.
A: In general permissions, schemes are likely to be done best by fairly normalised tables, possibly with the addition of some array fields to avoid the need for join tables. Some specialised and highly dynamic permissions schemes might benefit from being done using JSON, but without knowing the precise details of the case it’s difficult to make a judgement.
Q3: Are there any maximum length data/keys?
A: Not practically. For example, the following works:
create table foo as select jsonb_build_object(repeat('x',10000),repeat('y',10000000))
Indexing might impose smaller limits.
Q4: What is the difference between JSON and JSONB? For example, JSONB stores structured data whereas JSON stores unstructured data. Do you have any example? It is hard to understand as I’m from SQL RDBMS background.
A: JSON and JSONB are both PostgreSQL data types. You can have a column of one type or the other. They store the data in different ways – JSON stores the data as text and JSONB stored the data in a decomposed format, which allows us to process it far more efficiently.
Q5: Is it more efficient to put some fields together in a JSON field and index them, compared to having multiple fields each with their own index?
A: It’s possible that it could be more efficient, but it’s very dependent on the data. This would be a very good case for doing thorough benchmarking.
Q6: If there are data points that will be queried more often and also sortable, is it still ok to store that data in JSONB?
A: In general, data that has a known data type and well understood values (e.g. Social Security Numbers) should not be stored in JSON. It would be far better to store such data in a regular text field and index it with a btree index.
Q7: I have a JSONB column, which has few 100’s of array elements. How do i update and what is the cost of this update?
A: PostgreSQL does not ever update a row or a field directly. Instead, it creates a new version of the row. That’s how Multi Version Concurrency Control works. It’s why PostgreSQL requires far fewer locks than many other database systems. We never update in place. So if you update a few values in a 600 key JSON field, it rewrites not just the whole field but the whole row. The cost is therefore the same as to cost of updating any field in the table.
Q8: I am working on ERP project which has all its business logic at Database layer. How could this JSON data type be useful for me? Till now, I haven’t used this in my application.
A: Not every application will benefit from storing data in a JSON format in the database. Many ERP applications are fairly fixed in structure, and these are not likely to benefit a great deal from such storage. Applications with more dynamic data structure requirements are likely to benefit more. However, static applications can still benefit from the JSON facilities in PostgreSQL by being able to import JSON data from elsewhere very easily, and by being able to export data to JSON very easily.
Q9: You said, Much better alternative to Entity-Attribute-Value..
Like for this example:
Product: id, name
Attribute: Id, name
Value: id, name
Attribute_value: id, attribute_id, value_id
Product_attribute_value: product_id, Attribute_value_id
Here I would have foreign keys as well. Assuming I want to allow only certain values for an attribute. In addition, if I want to re-arrange a value to another attribute I would need to perform an update to the whole database and change each JSON field, instead of just changing the relationships in the pivot.
How would this work in JSONB?
A: If you’re using EAV then you can’t use Foreign Keys with it, since different attributes will have different value ranges, and foreign keys don’t work like that. Apart from that I don’t understand the question.
For any questions, comments, or feedback regarding the 2ndQuadrant PostgreSQL Webinar series, please visit our website or send an email to webinar@2ndquadrant.com