Data Alignment in PostgreSQL

Linux x86-64 (RHEL 8) 14 SQL Profiler

Kuntal Ghosh Senior Software Engineer

When data are naturally aligned, CPU can perform read and write to memory efficiently. Hence, each data type in PostgreSQL has a specific alignment requirement. When multiple attributes are stored consecutively in a tuple, padding is inserted before an attribute so that it begins from the required aligned boundary. A better understanding of these alignment requirements may help minimizing the amount of padding required while storing a tuple on disk, thus saving disk space.


Data types in Postgres are divided into following categories:

  1. 1. Pass-by-value, fixed length: Data types that are passed by values to Postgres internal routines and have fixed lengths fall into this category.. The length can be 1, 2,  or 4 (or 8 on 64-bit systems) bytes.
  2. 2. Pass-by-reference, fixed length: For these data types, an address reference from the in-memory heap page is sent to internal Postgres routines. They also have fixed lengths.
  3. 3. Pass-by_reference, variable length: For variable length data types, Postgres prepends a varlena header before the actual data. It stores some information about how the data is actually stored on-disk (uncompressed, compressed or TOASTed) and the actual length of the data. For TOASTed attributes, the actual data is stored in a separate relation. In these cases, the varlena headers follow some information about the actual location of the data in their corresponding TOAST relation.

 Typically, on-disk size of a varlena header is 1-byte. But, if the data cannot be toasted and size of the uncompressed data crosses 126 bytes, it uses a 4-bytes header. For example,

 , a varchar
insert into t1 values(repeat('a',126));
insert into t1 values(repeat('a',127));
select pg_column_size(a) from t1;

 Besides, attributes having 4-bytes varlena header need to be aligned to a 4-bytes aligned memory location. It may waste upto 3-bytes of additional padding space. So, some careful length restrictions on such columns may save space.

        4. Pass-by_reference, variable length(cstring, unknown): Finally, there are two data types, viz. ctring and unknown, which are string literals. They can be stored from any 1-byte aligned boundary. Also, they do not require any varlena header.

You can check the alignment requirements of each type using the following query,

select typname,typbyval,typlen,typalign from pg_type;

Where typname is name of the data type; typbyval is true if the data type is accessed as passed-by-value, else false; typlen is the actual length of the data type, however for data-types of variable length it has a value < 0 (-2 for cstring and unknown, -1 otherwise); typalign is the required alignment for the data type.

After checking the alignment requirement of each data type, you can reduce the padding space by positioning them in their alignment-favourable way. For example, following table schema wastes a lot of disk space for padding:

 , a char
 , b int2    -- 1 byte of padding after a
 , c char
 , d int4    -- 3 bytes of padding after c
 , e char
 , f int8    -- 7 bytes of padding after e

If you reorder the columns as double aligned column first, int aligned, short aligned and char aligned column in last, you can save (1+3+7)=11 bytes of space per tuple. 

 , f int8
 , d int4
 , b int2
 , a char
 , c char
 , e char

Before the tuple attributes, a fixed size 23-byte tuple header followed by an optional null-bitmap and an optional object ID are stored. The attributes always starts from a MAXALIGN-ed boundary - typically 8 bytes on a 64-bit OS (or 4 bytes on a 32-bit OS). So, the effective size of a minimal tuple header is 24-bytes (23-bytes header + 1-byte padding). When the null-bitmap is present, it occupies enough bytes to have one bit per data column. In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null. For example[3],


This includes the 23-bytes fixed-sized tuple header and a null-bitmap consisting of 8-bits. Now, if we increase the number of columns to nine, it includes 23-bytes fixed-sized tuple header, a null-bitmap consisting of 16-bits and 7 bytes of padding.

Saving a few bytes per tuple in a relation consisting of millions of rows may result in saving some significant storage space. Besides, if we can fit more tuple in a data page, performance can be improved due to lesser I/O activity.





Kuntal GhoshSenior Software Engineer