How to Auto Generate Hash Partitions In EDB Postgres Advanced Server

Advanced How-tos Beginner

Beena Emerson Senior Software Engineer

This article covers the new feature to auto create hash partitions in EDB Postgres Advanced Server 13.

  1. Partitions number
  2. Subpartitions number
  3. Store in command
  4. Add partition behaviour
  5. Modifying subpartition template

 

Creating a large number of partitions can be tedious as we have to specify all the details for each partition separately. Among the three types of partitions that Postgres supports - list, range and hash - it is possible to automate the creation of hash partitions by only specifying the number of partitions required. Hash partitions enable uniform distribution of data across the specified number of partitions using a hashing algorithm which uses modulus and remainder as input.

Partitions number

The following syntax enables us to create a desired number of hash partitions for a table.

CREATE TABLE <table_name>(<column_list>)

PARTITION BY hash (<column>) PARTITIONS <number>;

 

The partitions created will have an auto generated name of the format <table_name>_SYSnnnn where nnnn is an integer starting from 0101. The modulus for  all partitions will be the number of partitions specified and the remainder will range from 0 to (partition count -1). 

CREATE TABLE tbl1 (id int, col1 int)

PARTITION BY hash (col1) PARTITIONS 3;

 

The above command will create three partitions for table tbl1 with names SYS0101, SYS102, SYS103 each with modulus as 3 and remainder 0, 1 and 2 respectively. 

=# \d+ tbl1

                             Partitioned table "public.tbl1"

 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 

--------+---------+-----------+----------+---------+---------+--------------+-------------

 id     | integer |           |          |         | plain   |              | 

 col1   | integer |           |          |         | plain   |              | 

Partition key: HASH (col1)

Partitions: tbl1_sys0101 FOR VALUES WITH (modulus 3, remainder 0),

            tbl1_sys0102 FOR VALUES WITH (modulus 3, remainder 1),

            tbl1_sys0103 FOR VALUES WITH (modulus 3, remainder 2)

 

Subpartitions Number

Similar to partitions, even subpartitions can be auto created by just specifying a number. This is implemented by specifying SUBPARTITIONS keyword with a number after the SUBPARTITION BY clause or instead of the subpartition specification in the partition description of the CREATE TABLE command. The former becomes a subpartition template which is used even by the partitions added later to the table.

CREATE TABLE tbl1 (id int, col1 int, col2 int)

PARTITION BY list (col1)

SUBPARTITION BY hash (col2) SUBPARTITIONS 3

(

  PARTITION p1 values (1,2) (SUBPARTITION s1, SUBPARTITION s2),

  PARTITION p2 VALUES (4,5) SUBPARTITIONS 4,

  PARTITION p3 VALUES (3)

);

 

The above command will create a table tbl1 with three partitions - p1, p2, p3. Partition p1 will have two subpartitions s1 and s2 with modulus 2 for reminders 0 and 1 respectively. Partition p2 will have four subpartitions with auto generated names having modulus 4 for remainders ranging from 0 to 3. Partition p3 will use the template provided since it does not have any explicit subpartition description and will have 3 subpartitions with auto generated names and modulus 3 for remainders ranging from 0 to 2. 

=# SELECT partition_name, backing_table FROM SYS.ALL_TAB_SUBPARTITIONS WHERE table_name = 'TBL1';

 partition_name | backing_table 

----------------+---------------

 P1             | tbl1_s1

 P1             | tbl1_s2

 P2             | tbl1_sys0101

 P2             | tbl1_sys0102

 P2             | tbl1_sys0103

 P2             | tbl1_sys0104

 P3             | tbl1_sys0105

 P3             | tbl1_sys0106

 P3             | tbl1_sys0107

(9 rows)



\d+ tbl1_p2

                            Partitioned table "public.tbl1_p2"

 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 

--------+---------+-----------+----------+---------+---------+--------------+-------------

 id     | integer |           |          |         | plain   |              | 

 col1   | integer |           |          |         | plain   |              | 

 col2   | integer |           |          |         | plain   |              | 

Partition of: tbl1 FOR VALUES IN (4, 5)

Partition constraint: ((col1 IS NOT NULL) AND (col1 = ANY (ARRAY[4, 5])))

Partition key: HASH (col2)

Partitions: tbl1_sys0101 FOR VALUES WITH (modulus 4, remainder 0),

            tbl1_sys0102 FOR VALUES WITH (modulus 4, remainder 1),

            tbl1_sys0103 FOR VALUES WITH (modulus 4, remainder 2),

            tbl1_sys0104 FOR VALUES WITH (modulus 4, remainder 3)

 

Store In Command

The STORE IN clause along with a list of tablespaces separated by comma can be specified after the PARTITIONS or SUBPARTITIONS clause to uniformly spread the auto generated partitions or subpartitions across the given tablespace list.  At least one tablespace should be specified and if the number of tablespaces exceeds the number of partitions or subpartitions provided then the extra data is ignored.

Partitions

For a given list of tablespaces, the partitions generated are stored in the specified tablespaces in a round-robin manner. The first partition will be in the first entry of tablespace list, the second partition in second tablespace and so on.

CREATE TABLE tbl1 (id int, col1 int)

PARTITION BY hash(col1) PARTITIONS 5 STORE IN (ts1, ts2);

 

The above command will create a table tbl1 with five partitions alternatively stored in tablespace ts1 and ts2.

=# SELECT backing_table, tablespace_name FROM SYS.ALL_TAB_PARTITIONS WHERE table_name = 'TBL1';

 backing_table | tablespace_name 

---------------+-----------------

 tbl1_sys0101  | TS1

 tbl1_sys0102  | TS2

 tbl1_sys0103  | TS1

 tbl1_sys0104  | TS2

 tbl1_sys0105  | TS1

(5 rows)

 

Subpartitions

The behaviour of STORE IN is similar to that of partitions number. It is to be noted that, unlike the subpartitions number, the STORE IN command used in the template is not stored in the system catalog, hence its scope is only limited to the partitions created in the CREATE TABLE command and any partitions added later cannot use this information (example in next section).

CREATE TABLE tbl1 (id int, col1 int, col2 int)

PARTITION BY list (col1)

SUBPARTITION BY hash (col2) SUBPARTITIONS 3 STORE IN (ts1, ts2, ts3)

(

  PARTITION p1 values (1,2),

  PARTITION p2 VALUES (4,5) SUBPARTITIONS 4 STORE IN (ts4, ts5)

);

 

The above command will create a table tbl1 with two partitions - p1 using the template has three subpartitions spread across tablespaces ts1, ts2 and ts3 and p2 uses the subpartition specified with four subpartitions spread across ts4, ts5.

=# SELECT partition_name, backing_table, tablespace_name FROM SYS.ALL_TAB_SUBPARTITIONS WHERE table_name = 'TBL1';

 partition_name | backing_table | tablespace_name 

----------------+---------------+-----------------

 P1             | tbl1_sys0101  | TS1

 P1             | tbl1_sys0102  | TS2

 P1             | tbl1_sys0103  | TS3

 P2             | tbl1_sys0104  | TS4

 P2             | tbl1_sys0105  | TS5

 P2             | tbl1_sys0106  | TS4

 P2             | tbl1_sys0107  | TS5

(7 rows)    

 

Add Partition

When we add a new partition to the table with a subpartitions template, it will use the template unless overridden by an explicit subpartition specification. If no template is defined, a single subpartition will be created.

Considering the table in the above example, the following add partition command auto generates three partitions as specified in the template in the default tablespace. 

=# ALTER TABLE tbl1 ADD PARTITION p3 VALUES (7,8);



=# SELECT partition_name, backing_table, tablespace_name FROM SYS.ALL_TAB_SUBPARTITIONS WHERE table_name = 'TBL1' AND partition_name = 'P3';

 partition_name | backing_table | tablespace_name 

----------------+---------------+-----------------

 P3             | tbl1_sys0109  | 

 P3             | tbl1_sys0110  | 

 P3             | tbl1_sys0111  | 

(3 rows)

 

We can explicitly specify the subpartitions count for a hash subpartition in the ADD PARTITION command which will override the template behaviour. As in the following example, the partition will have only 2 subpartitions.

=# ALTER TABLE tbl1 ADD PARTITION p4 VALUES (10) SUBPARTITIONS 2;



=# SELECT partition_name, backing_table, tablespace_name FROM SYS.ALL_TAB_SUBPARTITIONS WHERE table_name = 'TBL1' AND partition_name = 'P4';

 partition_name | backing_table | tablespace_name 

----------------+---------------+-----------------

 P4             | tbl1_sys0113  | 

 P4             | tbl1_sys0114  | 

(2 rows)

 

Modifying SUBPARTITIONS template

The template information is stored in the pg_partitioned_table and visible in the ‘\d’ and ‘\d+’ output of the table.

=# SELECT partrelid::regclass, partsubparttemplate FROM pg_partitioned_table ;

 partrelid | partsubparttemplate 

-----------+---------------------

 tbl1      | 3

 tbl1_p1   | 

 tbl1_p2   | 

(3 rows)



=# \d tbl1

          Partitioned table "public.tbl1"

 Column |  Type   | Collation | Nullable | Default 

--------+---------+-----------+----------+---------

 id     | integer |           |          | 

 col1   | integer |           |          | 

 col2   | integer |           |          | 

Partition key: LIST (col1) SUBPARTITIONS 3

Number of partitions: 2 

 

This template information can be modified by using the ALTER TABLE … SET SUBPARTITION TEMPLATE command. The new value will be used for the partitions added after this command; the existing partitions will be unaffected.

=# ALTER TABLE tbl1 SET SUBPARTITION TEMPLATE 7;



=# ALTER TABLE tbl1 ADD PARTITION p5 VALUES (50);



=# SELECT partition_name, backing_table, tablespace_name FROM SYS.ALL_TAB_SUBPARTITIONS WHERE table_name = 'TBL1' AND partition_name = 'P5';

 partition_name | backing_table | tablespace_name 

----------------+---------------+-----------------

 P5             | tbl1_sys0116  | 

 P5             | tbl1_sys0117  | 

 P5             | tbl1_sys0118  | 

 P5             | tbl1_sys0119  | 

 P5             | tbl1_sys0120  | 

 P5             | tbl1_sys0121  | 

 P5             | tbl1_sys0122  | 

(7 rows)

 

To reset the subpartitions count to 1, you can use empty parentheses instead of a number in the alter table command.  The following example shows that resetting the template generates only a single subpartition.

=# ALTER TABLE tbl1 SET SUBPARTITION TEMPLATE ();



=# ALTER TABLE tbl1 ADD PARTITION p6 VALUES (60);



=# SELECT partition_name, backing_table, tablespace_name FROM SYS.ALL_TAB_SUBPARTITIONS WHERE table_name = 'TBL1' AND partition_name = 'P6';

 partition_name | backing_table | tablespace_name 

----------------+---------------+-----------------

 P6             | tbl1_sys0124  | 

(1 row)

 

This feature saves a lot of effort when we intend to generate a large number of hash partitions or subpartitions. Also the template feature allows to have a blueprint for partitions which will be added later for the table.

 

 

Beena EmersonSenior Software Engineer