Chapter 57. Writing a Table Sampling Method
Table of Contents
PostgreSQL's implementation of the
clause supports custom table sampling methods, in addition to
SYSTEM methods that are required
by the SQL standard. The sampling method determines which rows of the
table will be selected when the
TABLESAMPLE clause is used.
At the SQL level, a table sampling method is represented by a single SQL function, typically implemented in C, having the signature
method_name(internal) RETURNS tsm_handler
The name of the function is the same method name appearing in the
TABLESAMPLE clause. The
internal argument is a dummy
(always having value zero) that simply serves to prevent this function from
being called directly from a SQL command.
The result of the function must be a palloc'd struct of
TsmRoutine, which contains pointers to support functions for
the sampling method. These support functions are plain C functions and
are not visible or callable at the SQL level. The support functions are
described in Section 57.1.
In addition to function pointers, the
TsmRoutine struct must
provide these additional fields:
This is an OID list containing the data type OIDs of the parameter(s) that will be accepted by the
TABLESAMPLEclause when this sampling method is used. For example, for the built-in methods, this list contains a single item with value
FLOAT4OID, which represents the sampling percentage. Custom sampling methods can have more or different parameters.
true, the sampling method can deliver identical samples across successive queries, if the same parameters and
REPEATABLEseed value are supplied each time and the table contents have not changed. When this is
REPEATABLEclause is not accepted for use with the sampling method.
true, the sampling method can deliver identical samples across successive scans in the same query (assuming unchanging parameters, seed value, and snapshot). When this is
false, the planner will not select plans that would require scanning the sampled table more than once, since that might result in inconsistent query output.
TsmRoutine struct type is declared
src/include/access/tsmapi.h, which see for additional
The table sampling methods included in the standard distribution are good
references when trying to write your own. Look into
src/backend/access/tablesample subdirectory of the source
tree for the built-in sampling methods, and into the
subdirectory for add-on methods.