Parallelism Becomes a Reality in Postgres

December 14, 2015

This post originally appeared on Amit Kapila's personal blog

Parallelism is now a reality in PostgreSQL. With 9.6, I hope we will see many different forms of queries that can use parallelism during execution. For now, I will limit this discussion to what we can already do: Parallel Sequential Scans.

Parallel Sequential Scans are used to scan a relation in parallel with the help of background workers, which in turn improves the performance of such scans. I will discuss the scenarios where users can expect a performance boost due to this feature later in this blog, but first let us understand the basic feature and how it works. Three new GUC parameters have been added to tune the usage of this feature.

  1. max_parallel_degree - This sets the maximum number of workers that can be used for an individual parallel operation. It is quite possible that the requested number of workers is not available at execution time: parallel workers are taken from the pool established by max_worker_processes, which means the value of max_parallel_degree should be less than max_worker_processes. It is also not usually useful to set this parameter higher than the number of CPUs on your system.
  2. parallel_tuple_cost - This is used by the planner to estimate the cost of transferring a tuple from a parallel worker process to the master backend. The default is 0.1. The more tuples that need to be passed from the worker processes to the master backend, the higher this cost becomes, and the higher the overall cost of the parallel sequential scan plan.
  3. parallel_setup_cost - This is used by the planner to estimate the cost of launching the parallel worker processes and setting up the dynamic shared memory they use to communicate. The default is 1000.
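To build intuition for how the two cost GUCs interact, here is a rough Python sketch of how a planner could combine them when costing a Gather over a parallel sequential scan. The function name and the assumption that the scan's run cost is split evenly across the workers plus the leader are illustrative simplifications, not PostgreSQL's exact costing code.

```python
# Simplified sketch (not PostgreSQL's actual code) of combining the
# parallel cost GUCs when estimating a parallel sequential scan plan.
def parallel_seq_scan_cost(serial_run_cost, rows_returned, workers,
                           parallel_setup_cost=1000.0,
                           parallel_tuple_cost=0.1):
    # Assume the scan's work is divided among the workers plus the
    # leader, so per-process run cost shrinks as workers are added.
    run_cost = serial_run_cost / (workers + 1)
    # The Gather node pays a one-time startup charge plus a per-tuple
    # charge for every row shipped from a worker to the leader.
    gather_cost = parallel_setup_cost + parallel_tuple_cost * rows_returned
    return run_cost + gather_cost

# A selective qual that returns few rows amortizes the setup cost well...
print(parallel_seq_scan_cost(150000.0, rows_returned=1, workers=5))
# ...while shipping every tuple to the leader makes parallelism costly.
print(parallel_seq_scan_cost(150000.0, rows_returned=1_000_000, workers=5))
```

This is why, as we will see below, parallelism pays off most when the qualification filters out most rows inside the workers.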

Now let us look at a simple example to demonstrate how a parallel sequential scan works:

create table tbl_parallel_test(c1 int, c2 char(1000));
insert into tbl_parallel_test values(generate_series(1,1000000),'aaaaa');
analyze tbl_parallel_test;

explain analyze select * from tbl_parallel_test
    where c1 < 10000 and c2 like '%bb%';

                        QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tbl_parallel_test
     (cost=0.00..157858.09 rows=1 width=1008)
     (actual time=378.414..378.414 rows=0 loops=1)
   Filter: ((c1 < 10000) AND (c2 ~~ '%bb%'::text))
   Rows Removed by Filter: 1000000
 Planning time: 0.075 ms
 Execution time: 378.431 ms
(5 rows)

Set max_parallel_degree to enable the use of parallelism in queries.

set max_parallel_degree = 6;

explain analyze select * from tbl_parallel_test
    where c1 < 10000 and c2 like '%bb%';

                        QUERY PLAN
-------------------------------------------------------------
 Gather (cost=1000.00..29701.57 rows=1 width=1008)
     (actual time=182.708..182.708 rows=0 loops=1)
   Number of Workers: 5
   ->  Parallel Seq Scan on tbl_parallel_test
         (cost=0.00..28701.47 rows=1 width=1008)
         (actual time=179.496..1081.120 rows=0 loops=1)
         Filter: ((c1 < 10000) AND (c2 ~~ '%bb%'::text))
         Rows Removed by Filter: 1000000
 Planning time: 0.078 ms
 Execution time: 200.610 ms
(7 rows)

Here, we can see how setting max_parallel_degree allows parallel workers to be used to perform parallel sequential scans. Note that even though we set max_parallel_degree to 6, only 5 workers are used; the reason is that the number of parallel workers is currently chosen based on the size of the relation.
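As a back-of-the-envelope illustration of this size-based choice, here is a hedged Python sketch of the kind of heuristic involved: one worker once the relation passes a minimum size, plus one more for each tripling of that size, capped by max_parallel_degree. The threshold (1024 8kB pages, i.e. 8MB) and the tripling factor are assumptions modeled on the 9.6-era behavior, not the exact server code.

```python
# Hedged sketch of a size-based worker-count heuristic: the threshold
# and tripling factor are assumptions, not PostgreSQL's exact logic.
def planned_workers(heap_pages, max_parallel_degree,
                    min_parallel_pages=1024, scale_factor=3):
    if heap_pages < min_parallel_pages or max_parallel_degree <= 0:
        return 0  # relation too small: stay with a serial plan
    workers = 1
    threshold = min_parallel_pages * scale_factor
    # Add one worker for each tripling of the relation size,
    # never exceeding max_parallel_degree.
    while heap_pages >= threshold and workers < max_parallel_degree:
        workers += 1
        threshold *= scale_factor
    return workers

# The ~1GB test table above is roughly 143,000 8kB pages:
print(planned_workers(143000, 6))  # 5, matching the plan output above
```

Under these assumptions, the 1,000,000-row table lands at 5 workers even though max_parallel_degree is 6, which is what the EXPLAIN output shows.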

Next, let us discuss the use of functions in parallel queries. A new clause, PARALLEL, has been added to the CREATE FUNCTION statement. There are three valid values a user can specify with this clause.

1. PARALLEL Unsafe - This indicates that the function can't be executed in parallel mode and the presence of such a function in a SQL statement forces a serial execution plan.
2. PARALLEL Restricted - This indicates that the function can be executed in parallel mode, but its execution is restricted to the parallel group leader. As of now, if the qualification for any particular relation contains anything that is parallel restricted, that relation won't be chosen for parallelism.
3. PARALLEL Safe - This indicates that the function is safe to run in parallel mode without restriction.

The default for a function is PARALLEL Unsafe. Now let us see the impact of using Parallel Safe and Unsafe functions in queries. I will continue using the query from the previous examples to explain the concept.

Create a Parallel Safe function

create or replace function calc_factorial(a integer, fact_val integer)
returns integer as $$
begin
  perform (fact_val)!;
  return a;
end;
$$ language plpgsql PARALLEL Safe;

Use it in a query

explain analyze select * from tbl_parallel_test
    where c1 < calc_factorial(10000, 10)
      and c2 like '%bb%';

                        QUERY PLAN
-------------------------------------------------------------
 Gather (cost=1000.00..75154.99 rows=1 width=1008)
     (actual time=120566.456..120566.456 rows=0 loops=1)
   Number of Workers: 5
   ->  Parallel Seq Scan on tbl_parallel_test
         (cost=0.00..74154.89 rows=1 width=1008)
         (actual time=119635.421..359721.498 rows=0 loops=1)
         Filter: ((c2 ~~ '%bb%'::text) AND (c1 < calc_factorial(10000, 10)))
         Rows Removed by Filter: 1000000
 Planning time: 54.904 ms
 Execution time: 120622.631 ms
(7 rows)

Here we can see that a parallel plan is chosen and the parallel safe function is pushed to the workers for evaluation of the quals. Now let's change the function to PARALLEL Unsafe and see how the above query behaves.

alter function calc_factorial(integer, integer) PARALLEL Unsafe;

explain analyze select * from tbl_parallel_test
    where c1 < calc_factorial(10000, 10)
      and c2 like '%bb%';

                        QUERY PLAN
-------------------------------------------------------------
 Seq Scan on tbl_parallel_test
     (cost=0.00..407851.91 rows=1 width=1008)
     (actual time=33166.138..33166.138 rows=0 loops=1)
   Filter: ((c2 ~~ '%bb%'::text) AND (c1 < calc_factorial(10000, 10)))
   Rows Removed by Filter: 1000000
 Planning time: 0.162 ms
 Execution time: 33166.208 ms
(5 rows)

So using parallel unsafe functions in queries leads to serial plans. Next, let us look at the performance characteristics of parallelism.

Non-default settings used to collect performance data:

shared_buffers = 32GB
min_wal_size = 5GB
max_wal_size = 10GB
checkpoint_timeout = 30min
max_connections = 300
max_worker_processes = 100

Test setup

create table tbl_perf(c1 int, c2 char(1000));
insert into tbl_perf values(generate_series(1,30000000),'aaaaa');

explain analyze select c1 from tbl_perf
    where c1 > calc_factorial($1, 10)
      and c2 like '%aa%';

The function calc_factorial is the same as in the previous example, and the values passed to it are chosen so that the desired percentage of rows is selected. Example:

-- to select 1% of rows, the below query can be used
explain analyze select c1 from tbl_perf
    where c1 > calc_factorial(29700000, 10) and c2 like '%aa%';

-- to select 10% of rows, the below query can be used
explain analyze select c1 from tbl_perf
    where c1 > calc_factorial(27000000, 10) and c2 like '%aa%';

-- to select 25% of rows, the below query can be used
explain analyze select c1 from tbl_perf
    where c1 > calc_factorial(22500000, 10) and c2 like '%aa%';
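The constants above come from simple arithmetic: c1 holds the sequential values 1 through 30,000,000, so a predicate c1 > N keeps 30,000,000 - N rows. A small helper (hypothetical, purely for illustration) computes the cutoff for a desired selectivity:

```python
# c1 contains 1..total_rows, so "c1 > cutoff" keeps total_rows - cutoff
# rows. Helper name is illustrative; integer arithmetic avoids rounding.
def cutoff_for_percent(total_rows, percent):
    # keep `percent`% of rows => cutoff = total_rows * (100 - percent) / 100
    return total_rows * (100 - percent) // 100

print(cutoff_for_percent(30_000_000, 1))   # 29700000, the 1% query above
print(cutoff_for_percent(30_000_000, 10))  # 27000000
print(cutoff_for_percent(30_000_000, 25))  # 22500000
```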

Performance Data 

1. As the degree of parallelism increases (more parallel workers), the time to complete the execution decreases.
2. Along with the workers, the master backend also participates in the execution, which is why you can see a larger time reduction in some cases.
3. Beyond a certain point, increasing max_parallel_degree doesn't help.

The cases we have seen in this blog are mostly cases where parallel query helps by using additional workers; however, there are some cases, such as when the qualification is very cheap, where employing more workers hurts or doesn't help. More investigation is needed to ensure the planner won't choose such plans for parallelism.

Amit Kapila is a Senior Database Architect at EnterpriseDB. 
