How to improve PostgreSQL performance by fitting more tuples in a data page

January 20, 2023

When data are naturally aligned, CPU can perform read and write to memory efficiently. Hence, each data type in PostgreSQL has a specific alignment requirement. When multiple attributes are stored consecutively in a tuple, padding is inserted before an attribute so that it begins from the required aligned boundary. A better understanding of these alignment requirements may help to minimize the amount of padding required while storing a tuple on disk, thus saving disk space.

Data types in Postgres are divided into the following categories:

  • Pass-by-value, fixed-length: Data types that are passed by values to Postgres internal routines and have fixed lengths fall into this category.. The length can be 1, 2,  or 4 (or 8 on 64-bit systems) bytes.
  • Pass-by-reference, fixed-length: For these data types, and address reference from the in-memory heap page is sent to internal Postgres routines. They also have fixed lengths.
  • Pass-by_reference, variable-length: For variable-length data types, Postgres prepends a varlena header before the actual data. It stores some information about how the data is actually stored on-disk (uncompressed, compressed or TOASTed) and the actual length of the data. For TOASTed attributes, the actual data is stored in a separate relation. In these cases, the varlena headers follow some information about the actual location of the data in their corresponding TOAST relation.

Typically, the on-disk size of a varlena header is 1-byte. But, if the data cannot be toasted and the size of the uncompressed data crosses 126 bytes, it uses a 4-bytes header. For example:


, a varchar


insert into t1 values(repeat('a',126));

insert into t1 values(repeat('a',127));

select pg_column_size(a) from t1;






Besides, attributes having 4-bytes varlena header need to be aligned to a 4-bytes aligned memory location. It may waste upto 3-bytes of additional padding space. So, some careful length restrictions on such columns may save space.

  • Pass-by_reference, variable length(cstring, unknown): Finally, there are two data types, viz. ctring and unknown, which are string literals. They can be stored from any 1-byte aligned boundary. Also, they do not require any varlena header.

You can check the alignment requirements of each type using the query below, where typname is name of the data type; typbyval is true if the data type is accessed as passed-by-value, else false; typlen is the actual length of the data type, however for data-types of variable length it has a value < 0 (-2 for cstring and unknown, -1 otherwise); typalign is the required alignment for the data type.

select typname,typbyval,typlen,typalign from pg_type;


After checking the alignment requirement of each data type, you can reduce the padding space by positioning them in their alignment-favorable way. For example, the following table schema wastes a lot of disk space for padding:


 , a char

 , b int2    -- 1 byte of padding after a

 , c char

 , d int4    -- 3 bytes of padding after c

 , e char

 , f int8    -- 7 bytes of padding after e



If you reorder the columns as double aligned column first, int aligned, short aligned and char aligned column in last, you can save (1+3+7)=11 bytes of space per tuple.


 , f int8

 , d int4

 , b int2

 , a char

 , c char

 , e char



Before the tuple attributes, a fixed size 23-byte tuple header followed by an optional null-bitmap and an optional object ID are stored. The attributes always starts from a MAXALIGN-ed boundary - typically 8 bytes on a 64-bit OS (or 4 bytes on a 32-bit OS). So, the effective size of a minimal tuple header is 24-bytes (23-bytes header + 1-byte padding). When the null-bitmap is present, it occupies enough bytes to have one bit per data column. In this list of bits, a 1 bit indicates not-null, a 0 bit is a null. When the bitmap is not present, all columns are assumed not-null. For example[3]:






This includes the 23-bytes fixed-sized tuple header and a null-bitmap consisting of 8-bits. Now, if we increase the number of columns to nine, it includes 23-bytes fixed-sized tuple header, a null-bitmap consisting of 16-bits and 7 bytes of padding.






Saving a few bytes per tuple in a relation consisting of millions of rows may result in saving some significant storage space. Besides, if we can fit more tuple in a data page, performance can be improved due to lesser I/O activity.





Share this

Relevant Blogs

Surviving Without a Superuser: Coming to v16

EDB is proud to present the following excerpt from Surviving Without a Superuser from EDB VP, Chief Data Scientist Robert Haas' personal blog. You can read the entire article here...
January 30, 2023

More Blogs

How does VACUUM work in PostgreSQL

Introduction I recently had a customer who was trying to figure out why his VACUUM-for-wraparound was taking so long–he had a 6TB database, and the VACUUM had been running...
January 23, 2023