Performance of parallel index scans in PostgreSQL

July 24, 2018

This blog continues the discussion of parallel query in PostgreSQL. In the previous blog of this series, we learned about parallel index scans, their design in PostgreSQL, and the performance improvement they achieved for a few queries of the industry-standard TPC-H benchmark. There, we analyzed the improvement only at a small scale factor of 20 (a database of approximately 20 GB). However, the benefits of parallel operators do not become truly significant until they are exercised at higher scale factors. Hence, in this blog, we analyze the performance of parallel index scans at scale factor 300.

For this experiment we used a TPC-H inspired benchmark for PostgreSQL at scale factor 300, which produces a database of 300+ GB depending on the available indexes, etc. The additional indexes we created were on the columns (l_shipmode, l_shipdate, o_orderdate, o_comment). We tuned the following parameters (a sketch of the corresponding setup statements follows the list):

  • random_page_cost = 0.1
  • seq_page_cost = 0.1
  • effective_cache_size = 10GB
  • shared_buffers = 10GB
  • work_mem = 1GB
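
A minimal sketch of this setup, assuming index names of the form idx_<table>_<column> (the exact names and commands used in the experiment are an assumption); note that shared_buffers only takes effect after a server restart:

    -- Additional indexes on the TPC-H schema (names are assumptions)
    CREATE INDEX idx_lineitem_shipmode ON lineitem (l_shipmode);
    CREATE INDEX idx_lineitem_shipdate ON lineitem (l_shipdate);
    CREATE INDEX idx_orders_orderdate  ON orders (o_orderdate);
    CREATE INDEX idx_orders_comment    ON orders (o_comment);

    -- Planner and memory settings listed above
    ALTER SYSTEM SET random_page_cost = 0.1;
    ALTER SYSTEM SET seq_page_cost = 0.1;
    ALTER SYSTEM SET effective_cache_size = '10GB';
    ALTER SYSTEM SET shared_buffers = '10GB';   -- restart required to take effect
    ALTER SYSTEM SET work_mem = '1GB';
    SELECT pg_reload_conf();                    -- applies the reloadable settings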

In the table below, we compare the performance of the queries that use parallel index scans in v10 to their performance in v9.6. Note that parallel sequential scans and parallel nested loop joins were already available in v9.6. Additionally, we tabulate the total contribution of the parallel index scan to each query, which is the execution time of the parallel index scan node in the EXPLAIN ANALYZE output of the respective query. All timings are in seconds.

TPC-H query | On v9.6 (s) | On v10 (s) | Contribution of PIS (s)
Q6          |         883 |        155 |                     137
Q12         |         740 |        717 |                     122
Q14         |         422 |        111 |                      53
Q15         |        2295 |       1138 |                     157
Q17         |       17724 |       3376 |                      16

From the above table, it is clear that the benefits of parallel index scans are significant on large volumes of data as well. To better understand how this scan method benefits the queries in question, we studied their query plans on both versions and found two primary explanations for the gains:

  • Contribution of the parallel index scan to the total execution time of the query: If the most time-consuming operation in a query is the index scan, then parallelizing it is likely to improve query performance significantly, e.g. Q6 and Q12. To be more precise, have a look at the query plan of Q6:

        Q6 Query plan on v9.6
        ———————————
        Limit
          ->  Aggregate
                ->  Bitmap Heap Scan on lineitem
                      ->  Bitmap Index Scan on idx_lineitem_shipdate
        Execution time: 882812.376 ms

        Q6 Query plan on v10
        ———————————
        Limit
          ->  Finalize Aggregate
                ->  Gather
                      ->  Partial Aggregate
                            ->  Parallel Index Scan using idx_lineitem_shipdate on lineitem
        Execution time: 155619.579 ms

Here, the scan is the only time-consuming operator, hence parallelizing it improves the performance of the query. Similarly, for Q12, the index scan is on lineitem, which is the largest table in the database. Once it is scanned in parallel, the join above it can also be done in parallel, hence the improvement in performance.
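
To make this concrete, below is a sketch of Q6 as it appears in the standard TPC-H template (parameter values are placeholders; the benchmark kit substitutes its own). Running it under EXPLAIN ANALYZE yields a plan like the v10 one shown above, and the "Contribution of PIS" figure in the table is the actual time reported on the Parallel Index Scan node of that output:

    EXPLAIN (ANALYZE)
    SELECT sum(l_extendedprice * l_discount) AS revenue
    FROM lineitem
    WHERE l_shipdate >= DATE '1994-01-01'
      AND l_shipdate <  DATE '1994-01-01' + INTERVAL '1 year'
      AND l_discount BETWEEN 0.06 - 0.01 AND 0.06 + 0.01
      AND l_quantity < 24;

    -- In the output, look for the line
    --   ->  Parallel Index Scan using idx_lineitem_shipdate on lineitem (actual time=...)
    -- The upper bound of that node's actual time (in ms) is what the table
    -- above reports, converted to seconds.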

On the other hand, the benefits are not as pronounced when the plan is more complex and involves a number of operators, with the index scan being one of the less significant operators in terms of the total execution time of the query.

  • Operators above it can also leverage parallelism: In some queries, we observed that even when the parallel index scan does not contribute much to the total execution time, the benefits are still significant, e.g. Q15 and Q17. In such cases, the benefits do not come from the parallel index scan alone but from the fact that more operators can now be pushed to workers, extending parallelism to higher levels of the plan. To explain this observation, the query plan for Q15 is used:

        Q15 Query plan on v9.6
        —————————————
        Limit
          InitPlan 1 (returns $0)
          ->  Merge Join
                Merge Cond: (lineitem.l_suppkey = supplier.s_suppkey)
                ->  GroupAggregate
                      Group Key: lineitem.l_suppkey
                      ->  Sort
                            ->  Bitmap Heap Scan on lineitem
                                  ->  Bitmap Index Scan on idx_l_shipdate
                ->  Index Scan using supplier_pkey on supplier
        Execution time: 2297222.717 ms

        Q15 Query plan on v10
        —————————————
        Limit
          InitPlan 1 (returns $0)
          ->  Nested Loop
                ->  Finalize GroupAggregate
                      Group Key: lineitem.l_suppkey
                      ->  Gather Merge
                            ->  Sort
                                  ->  Partial HashAggregate
                                        Group Key: lineitem.l_suppkey
                                        ->  Parallel Index Scan using idx_lineitem_shipdate on lineitem
                ->  Index Scan using supplier_pkey on supplier
        Execution time: 1133723.985 ms


In this case, the plan structure changed altogether with the introduction of the parallel index scan. With the parallel index scan, the aggregates are pushed to the workers and the join method also changes, eventually improving query performance. This shows that the parallel index scan need not be the most time-consuming operator to improve query performance; rather, having it may allow other operators to use parallelism that they otherwise could not.
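
Whether the planner considers such parallel plans at all is governed by a handful of settings. A minimal sketch follows, showing the PostgreSQL 10 parallel-query parameters most relevant to parallel index scans with their default values; these are illustrative and not necessarily the values used in this experiment:

    SET max_parallel_workers_per_gather = 2;     -- workers a Gather / Gather Merge node may launch
    SET max_parallel_workers = 8;                -- cluster-wide cap on parallel workers
    SET min_parallel_index_scan_size = '512kB';  -- smallest index considered for a parallel scan
    SET parallel_setup_cost = 1000;              -- planner cost charged for launching workers
    SET parallel_tuple_cost = 0.1;               -- planner cost per tuple passed from worker to leader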

Overall, we saw that parallel index scans deliver significant benefits on large volumes of data as well, taking PostgreSQL a step further towards being a more OLAP-friendly and scalable database. Stay tuned to learn about more groundbreaking features in PostgreSQL.
