Dynamic Columns

April 10, 2013

Sql has never been good at handling dynamic columns — it can easily process tables with a fixed number of columns, but dynamic column handling has always been a challenge. (Dynamic columns are often used to implement user-defined fields.) One approach has been to define columns for every possible value that might be stored in a row; this leads to sparse tables containing mostly null values. Because Postgres stores null values in a single bit, there is little overhead to this approach. However, it is hard to call these dynamic columns since every new column requires an alter table ... add column command.

Another approach is to use entity/attribute/value layout, where each value is stored in its own row. This obviously increases storage requirements, has performance challenges, and adds complexity to sql queries.

Fortunately, Postgres has various options to efficiently handle dynamic columns. The simplest solution is arrays which allow multiple values of the same data type to be stored in a single field. This assumes that the values are independent, or that their ordinal position in the array is sufficiently meaningful.

A more complex approach is to associate a key with each value. (Values internally are strings, rather than bound to a specific data type.) Hstore is a very popular key/value store for Postgres, and has index lookup support for arbitrary key/value pairs. (It does not allow hierarchical storage, i.e. key/values inside a key).

Xml and more recently json allow hierarchical storage and more naturally match client application languages. Unfortunately, none of the Postgres native index types understand the structure of json or xml data. (Json and xml documents are stored internally as single text strings.) To index json, it is necessary to create an expression index on a plv8 function (which understands json). Andrew Dunstan has a good blog entry using a plv8 function (jmember) to index specific json keys. Postgres 9.3 will add functions (blog, blog, blog, blog) to access json values directly — this will eliminate the need to use plv8 in simple cases.

For arbitrary json key lookups, it is possible to use a Postgres full text search index to filter out most non-matching rows and use a json-aware function to compare the remaining rows — this avoids the requirement of creating an index per lookup key.

Xml has similar capabilities to json — it can store hierarchical data, and can use full text search to filter matching rows. While many server-side languages understand xml, e.g. PL/Java, PL/Perl, most people use the built-in Postgres xml functions in expression indexes.

As you can see, Postgres offers a variety of options to store dynamic columns while maintaining a fixed-column table format that is easily accessed via sql — this gives you the best of both worlds when storing dynamic data.

Share this

Relevant Blogs

Postgres 15 is here!

Another year is passing by and another version of Postgres is here! The PostgreSQL project roadmap is quite simple for any Postgres user: there is a new major version each...
October 10, 2022

More Blogs

How to Choose Which Database to Use

Some notes  What is SQL database? What is NoSQL database? NoSQL types of databases  (Document oriented, columnar, keyvalue and Graph) Relational Vs Document (NoSQL) key differences.  Today, many people are selecting...
October 21, 2020

Which Postgres Is Right for Me?

Open source software has been saving organizations money for years in operating systems with Linux, in the middleware layer with tools like JBoss, and virtualization with Xen and other options...
November 23, 2015