What day could be better than Valentine’s Day to pen a blog about TOAST and 10 other things to love about Postgres? And this year is a good year for it because advances in recent releases have expanded the solution’s capacity to support larger workloads, new data types and unstructured or semi-structured data. So much more to love, including support for NoSQL-like workloads.
For years, Postgres lagged many features that enterprises required as the database industry was taking shape and adoption was on the rise. In just the past few years, however, the Postgres community has propelled Postgres forward significantly to increase its relevance to larger enterprises. EnterpriseDB has also contributed significantly to the features and capabilities of Postgres. If you'd like to learn more about EDB's Postgres Plus Advanced Server, please visit: http://bit.ly/1nyZr9j
So, to celebrate Valentine's Day, here’s my tribute to TOAST and 10 other things to love about Postgres:
TOAST
Otherwise known as The Oversized-Attribute Storage Technique, TOAST enables Postgres to automatically compress and store very large files external to a table while only storing the necessary information to retrieve the data when needed. This makes incorporating large datatypes into your application a snap!
JSON
JSON (JavaScript Object Notation) is one of the most popular data-interchange formats on the web today. Some NoSQL systems, such as MongoDB, even use JSON as their native data storage format. Postgres offers robust support for JSON: it has a JSON data type, which validates and stores JSON text and provides functions for extracting elements from JSON values; and it offers the ability to easily encode query result sets using JSON. This last piece of functionality is particularly important, as it means that applications that prefer to work natively with JSON can easily obtain their data from Postgres in that format. In addition to the native JSON data type, Postgres has a JSON parser and various JSON functions. All together, the JSON data, operator and function enhancements mean that developing NoSQL-like applications is much easier.
HSTORE
Much like JSON, HSTORE is a mechanism for storing semi-structured key-value data in a Postgres database. It is particularly useful for sparse attributes – instead of creating a large number of columns, each of which will be non-null for only a small percentage of the records in the table, you can create a single HSTORE column and include, for each row, only those keys which pertain to that record. It is also useful in cases where new attributes may need to be added on the fly and without advancing planning. Again like JSON, it can be used to emulate a schema-less database when that is desirable. Ultimately, this fills a unique need in relational table storage when you don't want to waste a lot of space for attributes that will never have a value for many records. This can be another very handy tool for a web developer or someone needing the ACID properties of Postgres along with some NoSQL-like behaviors too.
Extensibility and PostGIS
Postgres has been designed with extensibility in mind from the ground up, allowing any user to write new code in C that can be dynamically loaded and integrated with the database backend. This allows the addition of new data types and new indexing capabilities without modification of the core code. As a result, a rich ecosystem of extensions has grown up around Postgres, and the most important of these is probably PostGIS, which adds industry-leading geospatial capabilities to the database.
Transactional DDL Did you ever want to be able to recover from changes to your schema that were made by mistake? Postgres has the ability to do this through what's called transactional DDL (Data Definition Language). Using the logs, you can easily recover from even very large changes. You might think that this is true for all databases, but it just isn't.
Unlogged Tables
Postgres allows you to create a table without the WAL (Write Ahead Log) overhead, effectively making them unlogged (what the NoSQL world calls non-durable). This speeds up the response times for table writes and updates by effectively letting the DBA prioritize data and designate some as OK to lose if the system fails.
CTE
Writeable Common Table Expressions allows INSERT, UPDATE and DELETE statements to be executed in the WITH clause of an SQL statement. This dramatically reduces the amount and complexity of code and time needed to perform some common tasks like creating an Order record with multiple Line Item records at the same time, performing an UPSERT operation (if the record exists just update it and if it doesn't exist then INSERT it), or archiving and deleting data at the same time.
Query XML Data
Many organizations are working with a lot of XML data these days. If you are in this position, you should take a look at Postgres' XML features, which allow users to query XML data. Using Postgres, you can directly query XML data stored in the database and extract elements from the data stored in your database.
Replication
For many years, Postgres did not have a built-in replication solution, greatly limiting the environments where it could be used. However, progress over the last few years has been rapid. Postgres 9.0, released in 2010, allowed changes to be streamed from the master to standby servers in real time, eliminating the multi-minute replication delays characteristic of prior releases. It also allowed standby servers to service read-only queries, opening up many new possibilities for load balancing. In 2011, Postgres 9.1 added synchronous replication, a critical feature for preservation of very high-value data. In 2012, Postgres 9.2 added cascading replication, so that a standby database can in turn replicate to other standbys, reducing load on the master and allowing more flexible network topologies. Postgres 9.3, released in September 2013, speeds up standby promotion and simplifies remastering. This fast failover feature to replication makes the promotion of a replica to a master much faster for organizations with stringent uptime requirements. Future releases promise to continue the rapid pace of development in this area.
Authentication
One of the challenges that teams face when deploying any service is making sure that the service supports the organizational authentication that's in use. While Postgres may not support every type of authentication imaginable, it certainly comes close. Current Postgres releases supports 12 authentication systems.
SQL/MED and Foreign Data Wrappers
Last but not least, Postgres supports SQL/MED (SQL Management of External Data) and has foreign data wrappers (FDWs), which enables users to integrate structured and unstructured data with Postgres – a bridge between the world of structured data and unstructured data. That is a unique capability that allows enterprises to leverage the strength of relational and unstructured data systems, without having to compromise. There are FDWs available for Postgres, Oracle, MySQL, ODBC, JDBC, CouchDB, MongoDB, Redis, Neo4j, JSON, flat files, LDAP, Twitter and more.
It was hard to stop at TOAST and 10 cool things. Note how I cleverly inserted an 11th into the usual top 10 format (12 if you count the extensibility framework that allows anyone to build extensions for Postgres)!
It’s a fabulous time to be connected with Postgres, and the ranks of Postgres lovers are growing every day.
Marc Linster is senior vice president, products and services at EnterpriseDB.