Bruce Momjian, PostgreSQL core team member and VP - Postgres Evangelist at EnterpriseDB, recently spoke at the Boston Postgres User Group meeting with a fascinating presentation about eight ways Postgres provides flexibility to store non-relational data.
Why would you want to do that? Bruce has the answer: “Because, believe it or not, relational is not always ideal. It’s really that simple.”
Here is a quick summary of non-relational options handled by Postgres.
#1 Arrays are a sequence of the same data types. As an example, Bruce cited setting up a database of employees and their certifications. The employee is ‘Bill’ and his certifications listed are in one field: ‘CCNA, ACSP, CISSP’.
With Postgres, it is possible to look for employees with ACSP certifications, as well as both ACSP and CISSP certifications – or any sort of combination.
#2 Range Types can be described as putting two fields into one. The example used here is a car rental with checkout and check-in times.
Why would anyone want to use Range Types? For efficient querying and great performance when there are millions – or even billions – of rows in the database. Postgres has great concurrency capabilities so queries can be run without having to lock the database.
#3 Geometry is a data type used to plot location and implemented as a PostgreSQL extension called PostGIS (as in, geographic information system). Typically, without using Postgres, a query would designate an area (in this example, darts on a board) then ask for a specific number of items to be returned. If not enough or too many is the result, then the query is run again either expanding or reducing the area until the right number was found. With Postgres, it is much simpler to put in the query a specific number, or limit, to get the desired result and Postgres will determine the area required so the query is run just once.
#4 XML, which stands for EXtensible Markup Language, can be loaded into Postgres then it is possible to manipulate the data in lots of ways using Postgres functions, such as parsing the complex XML data in discrete pieces for queries.
#5 JSON (JavaScript Object Notation) data is treated much the same as XML so that the data can be loaded into Postgres to take advantage of the many ways that Postgres is able to manipulate the data. Postgres provides greater flexibility for searches of JSON data while delivering high performance.
#6 JSONB is a binary representation of JSON data where Postgres indexes everything for any type of query. A product catalogue is a good example where this data type is very applicable because the data for different products can vary a lot and you would want the flexibility to use any combination of data.
#7 Row Types are unordered data with the example of a driver’s license where there is a combination of text and numbers.
As shown here, one field includes the state (PA), license number (175319) and expiration date (2017-03-12). As cited previously, using Postgres to store this data type offers the advantage of greater flexibility for queries while providing high performance.
#8 Character Strings are combinations of data where using Postgres it is possible to search on prefixes (not full text), as well as adjacent letters where any number of letters can be specified in a query.
Summary
For all these non-relational data types, Postgres delivers high performance by using the index in queries without requiring sequential searches that are very compute-intensive and have a negative impact on performance. Only six different index types are used in all the cases cited here.
Are there other DBMSes that can handle all eight different data types? No, there is only one and that is Postgres which has greater extensibility than any other DBMS.
Bruce offered this final bit of advice: only use non-relational data types when required. Don’t be overzealous using non-relational data types, which can get complicated.
To see examples Bruce used throughout his talk, the presentation slides are freely available to anyone.