Dynamic Columns

April 10, 2013

Sql has never been good at handling dynamic columns — it can easily process tables with a fixed number of columns, but dynamic column handling has always been a challenge. (Dynamic columns are often used to implement user-defined fields.) One approach has been to define columns for every possible value that might be stored in a row; this leads to sparse tables containing mostly null values. Because Postgres stores null values in a single bit, there is little overhead to this approach. However, it is hard to call these dynamic columns since every new column requires an alter table ... add column command.

Another approach is to use entity/attribute/value layout, where each value is stored in its own row. This obviously increases storage requirements, has performance challenges, and adds complexity to sql queries.

Fortunately, Postgres has various options to efficiently handle dynamic columns. The simplest solution is arrays which allow multiple values of the same data type to be stored in a single field. This assumes that the values are independent, or that their ordinal position in the array is sufficiently meaningful.

A more complex approach is to associate a key with each value. (Values internally are strings, rather than bound to a specific data type.) Hstore is a very popular key/value store for Postgres, and has index lookup support for arbitrary key/value pairs. (It does not allow hierarchical storage, i.e. key/values inside a key).

Xml and more recently json allow hierarchical storage and more naturally match client application languages. Unfortunately, none of the Postgres native index types understand the structure of json or xml data. (Json and xml documents are stored internally as single text strings.) To index json, it is necessary to create an expression index on a plv8 function (which understands json). Andrew Dunstan has a good blog entry using a plv8 function (jmember) to index specific json keys. Postgres 9.3 will add functions (blog, blog, blog, blog) to access json values directly — this will eliminate the need to use plv8 in simple cases.

For arbitrary json key lookups, it is possible to use a Postgres full text search index to filter out most non-matching rows and use a json-aware function to compare the remaining rows — this avoids the requirement of creating an index per lookup key.

Xml has similar capabilities to json — it can store hierarchical data, and can use full text search to filter matching rows. While many server-side languages understand xml, e.g. PL/Java, PL/Perl, most people use the built-in Postgres xml functions in expression indexes.

As you can see, Postgres offers a variety of options to store dynamic columns while maintaining a fixed-column table format that is easily accessed via sql — this gives you the best of both worlds when storing dynamic data.

Share this

Relevant Blogs

Postgres 15 is here!

The PostgreSQL project roadmap is quite simple: there is a new major version each year around October/November and minor versions each quarter (unless there is a critical bug to fix)...
October 10, 2022

More Blogs

EDB Contributions to PostgreSQL 15

EDB thanks everyone who has contributed to this great new release in any way! EDB is committed to helping the Community improve Postgres with each release, as it has done...
September 28, 2022

EDB Contributions to Postgres 15

The first public beta for Postgres was released on May 19, 2022 by the PostgreSQL Global Development Group. The beta signifies that no new features will be added to Postgres...
June 07, 2022

How to Choose Which Database to Use

Some notes What is SQL database? What is NoSQL database? NoSQL types of databases (Document oriented, columnar, keyvalue and Graph) Relational Vs Document (NoSQL) key differences. Today, many people are...
October 21, 2020