Pgbench: Performance Benchmark of PostgreSQL 12 and EDB Advanced Server 12

Vibhor Kumar May 26, 2020


In January, one of my colleagues, Kuntal Ghosh, shared his benchmark of current and recent versions of PostgreSQL on an AWS m5.metal instance. In his benchmark, he showed how PostgreSQL’s performance has improved for the pgbench-type workload over the last few years.

Using Kuntal’s benchmark scripts, we decided to perform another benchmark for PostgreSQL 12.2 on a similar machine equipped with multiple SSDs. Kuntal’s work is described in this blog entry.

For the new benchmark, we made some changes, which we share below. After tweaking a few parameters and adding extra mount points for more IOPS, we were able to achieve 70,000 TPS for the pgbench read-write workload.

After performing the benchmark with PostgreSQL 12.2, we decided to run the same benchmark on the same machine and configuration for EDB Postgres Advanced Server 12.2. EDB Postgres Advanced Server was initialized in Redwood mode, which enables native compatibility with Oracle’s PL/SQL.
The results, presented in the tables later in this post, show that EDB Postgres Advanced Server provides nominally the same performance as PostgreSQL for the pgbench workload. Environment details, including hardware, OS parameters, and Postgres configuration, were identical for PostgreSQL and EDB Postgres Advanced Server; they are listed below.

 

Environment Details: Machine Type, Operating System & Database Version

Machine Type and Mount Points

For this benchmark, we used the pgbench utility, which comes with the EDB Postgres Advanced Server binaries, and the same machine type as in Kuntal’s blog.

Instance Details

| Instance Name | Sustained All-Core Turbo | Logical Processors | Memory | Local Storage | EBS-Optimized Bandwidth | Network Bandwidth |
|---|---|---|---|---|---|---|
| m5d.metal | Up to 3.1 GHz | 96 | 384 GiB | 4 x 900 GB NVMe SSD | 14 Gbps | 25 Gbps |

For better I/O throughput, we created three mount points with additional provisioned IOPS, as shown in the table below. The mounted disks were configured with a read-ahead value of 8192 and the deadline I/O scheduler. The following table lists the mount point, size, provisioned IOPS, read-ahead setting, and I/O scheduler of each disk used for the benchmark.

| Mount Point | Size | Provisioned IOPS | Read-Ahead Setting | I/O Scheduler |
|---|---|---|---|---|
| /pg_indexes | 200 GiB | 10000 | 8192 | deadline |
| /pg_data | 600 GiB | 30000 | 8192 | deadline |
| /pg_wal | 200 GiB | 10000 | 8192 | deadline |
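The read-ahead and scheduler settings above are applied per block device; a sketch of the commands involved, with an illustrative NVMe device name (substitute the device backing each mount point):

```shell
# Illustrative device name; replace with the device backing the mount point.
# Set the read-ahead value to 8192 for the volume.
sudo blockdev --setra 8192 /dev/nvme1n1

# Switch the I/O scheduler to deadline for the same device.
echo deadline | sudo tee /sys/block/nvme1n1/queue/scheduler
```

Note that these settings do not survive a reboot; a udev rule or boot-time script is needed to make them persistent.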

Operating System

For this benchmark, we used CentOS Linux release 7.8.2003 (Core) and tuned the following kernel parameters. We disabled Transparent Huge Pages (THP) because, in our past experience, we have seen no benefit from THP for the pgbench workload; in some cases, we observed THP reducing database performance by 2-3%.

Operating System Configuration

| Parameter | Value |
|---|---|
| vm.nr_hugepages | 52600 |
| vm.swappiness | 1 |
| vm.max_map_count | 3112960 |
| net.core.somaxconn | 1024 |
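The kernel parameters above can be persisted via a sysctl drop-in file (the file name below is illustrative). As a sanity check on the huge pages setting: vm.nr_hugepages = 52600 reserves about 103 GiB of 2 MiB huge pages (52600 × 2 MiB), which comfortably covers the 80GB shared_buffers used for this benchmark. THP is not a sysctl and was disabled separately.

```shell
# /etc/sysctl.d/99-pgbench.conf  -- illustrative file name
vm.nr_hugepages = 52600
vm.swappiness = 1
vm.max_map_count = 3112960
net.core.somaxconn = 1024

# Apply without rebooting:
#   sudo sysctl --system
# Disable THP (sketch; not persistent across reboots):
#   echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
```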

 

Database

We used the following initdb command with a custom WAL segment size and dedicated mount points for WAL and indexes. We chose a custom WAL segment size of 128 MB because it reduces the number of WAL files in the WAL directory, and with fewer files Postgres performs better on write-heavy workloads.

${PGSUDO} ${INITDB} --pgdata=/pg_data/data \
                    --waldir=/pg_wal/wal  \
                    --wal-segsize=128 \
                    --encoding=UTF-8
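To illustrate the effect: at the max_wal_size of 100GB used in this benchmark, 128 MB segments cap the WAL directory at one-eighth the file count of the 16 MB default. A quick back-of-the-envelope check:

```shell
# Number of WAL segment files needed to hold 100 GB of WAL,
# at the default 16 MB segment size vs. the custom 128 MB size.
awk 'BEGIN {
    printf "16 MB segments: %d files\n", 100 * 1024 / 16
    printf "128 MB segments: %d files\n", 100 * 1024 / 128
}'
```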

Database size and tablespaces used for this benchmark are given in the following table:

EDB Postgres Advanced Server (12.2)

| Setting | Value |
|---|---|
| WAL segment size | 128MB |
| Database size | 200GB |
| Tablespace | pgindexes -> /pg_indexes/indexes |

 

Benchmark Method & Process

Pgbench

As mentioned above, we used pgbench for this benchmark. pgbench runs the same set of SQL commands repeatedly and can run the transactions over multiple concurrent database connections. After running the transactions, pgbench reports the average transaction rate (transactions per second). For more information, see the pgbench documentation.
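For reference, pgbench’s default read-write script is TPC-B-like; each transaction a client issues looks like the following, where the :aid, :bid, :tid, and :delta variables are chosen randomly by pgbench:

```sql
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
    VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
```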

Database Size of 200GB

To keep the benchmarking process smooth, we created a 200GB database using the following pgbench command:

${PGSUDO} ${PGBIN}/pgbench --initialize \
                           --scale=${F_SCALE_FACTOR} \
                           --index-tablespace=pgindexes \
                           --username=${PGUSER} \
                           --port=${PGPORT} \
                           ${PGDATABASE}

where the F_SCALE_FACTOR value was 15994.3800.
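As a rough sanity check on that scale factor: each pgbench scale-factor unit adds 100,000 rows to pgbench_accounts, on the order of 13 MB of table data on disk. The per-unit cost used below (12.8 MB) is an approximation, not a figure from the original post:

```shell
# Approximate scale factor needed for a ~200 GB pgbench database,
# assuming ~12.8 MB of on-disk data per scale-factor unit (an estimate).
awk 'BEGIN { printf "scale factor = %.0f\n", 200 * 1024 / 12.8 }'
```

The estimate lands close to the 15994.38 actually used.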

After creating a database of 200GB, we stopped Postgres and took a backup of the database using the following command:

${PGSUDO} rsync -ahW --no-compress ${PGDATA}/ ${PGDATA_BCKUP} &
${PGSUDO} rsync -ahW --no-compress ${PGINDEXES}/ ${PGINDEXES_BCKUP} &
${PGSUDO} rsync -ahW --no-compress ${PGWAL}/ ${PGWAL_BCKUP} &
wait  # the three copies run in parallel; block until all of them finish

We needed the backup to restore the database to the same state after each run. This helped us remove any unknown factor (like caching) impacting the benchmarking process.

We used the database parameters mentioned in Appendix - I and used pg_prewarm to warm the pgbench_accounts table’s data. The following are the SQL queries we used for warming the cache with the pgbench_accounts table:

SELECT pg_prewarm('pgbench_accounts');
SELECT pg_prewarm('pgbench_accounts_pkey');

After pre-warming the tables, we ran the pgbench command for each set of connections and repeated the process three times per connection count. The following is sample bash code from the set of scripts we executed for this benchmark:

for ((run = 1 ; run <= ${NO_OF_RUNS} ; run++))
do
    # print the run number
    plog "RUN => ${run}"
    # create the run directory
    plog "Creating log directory for run"
    mkdir -p ${LOGDIR}/${run}
    # start the database if it's not running
    plog "Starting Postgres if it's not started yet"
    start_stop_pg start
    for threads in ${PGBENCH_CONNECTIONS_LIST}
    do
        RUN_LOGFILE=${LOGDIR}/${run}/${threads}_conn.log
        plog "Running pgbench for ${threads} number of connections"
        pgbench_steps "${threads}" "${threads}" "${RUN_LOGFILE}"
        # backup the postgresql log file
        sudo mv ${PGDATA}/log/postgresql.log ${LOGDIR}/${run}
        sudo chown ${USER}:${USER} ${LOGDIR}/${run}/postgresql.log
        restore_pg
        update_pg_parameters
    done
done

The definition of pgbench_steps used in the sample code above is given below. Note that we used prepared statements (--protocol=prepared) and set ${F_THREADS} equal to ${F_CONNECTIONS}.

function pgbench_steps()
{
    typeset -r F_CONNECTIONS="$1"
    typeset -r F_THREADS="$2"
    typeset -r F_LOGFILE="$3"

    plog "Starting Postgres if it's not started yet"
    start_stop_pg start
    plog "Perform pg_prewarm"
    cat ${LIBSQL}/pgbench_prewarm.sql \
     | ${PGSUDO} ${PSQL} --username=${PGUSER} \
                         --port=${PGPORT} \
                         --host=${PGHOST} \
                         --dbname=${PGDATABASE}

     plog "Perform checkpoint"
     ${PGSUDO} ${PSQL} --command="CHECKPOINT;" \
                       --username=${PGUSER} \
                       --port=${PGPORT} \
                       --host=${PGHOST} \
                       --dbname=${PGDATABASE}

     plog "Perform pgbench"
     ${PGBIN}/pgbench --client=${F_CONNECTIONS} \
                      --jobs=${F_THREADS} \
                      --time=${DURATION} \
                      --protocol=prepared \
                      --username=${PGUSER} \
                      --port=${PGPORT} \
                      --host=${PGHOST} \
                      ${PGDATABASE} >${F_LOGFILE}
}

The following results show that, with pgbench, we were able to achieve 70,000 TPS with both PostgreSQL 12.2 and EDB Postgres Advanced Server 12.2.

PostgreSQL 12.2 Results

| Connections | Run 1 | Run 2 | Run 3 | Median (TPS) |
|---|---|---|---|---|
| 1 | 876.333246 | 981.34652 | 988.981369 | 981.34652 |
| 16 | 10708.21359 | 10809.04874 | 10872.91065 | 10809.04874 |
| 32 | 18983.0072 | 19491.23698 | 20097.51526 | 19491.23698 |
| 64 | 33765.67255 | 34041.20951 | 34586.48111 | 34041.20951 |
| 128 | 54893.14464 | 56039.1552 | 55958.60076 | 55958.60076 |
| 256 | 67945.34531 | 68343.49373 | 69428.45183 | 68343.49373 |
| 512 | 70119.65119 | 69027.60012 | 70865.81639 | 70119.65119 |
| 550 | 70954.48097 | 70003.17815 | 70748.97134 | 70748.97134 |
| 600 | 69669.67524 | 70208.44087 | 71150.52718 | 70208.44087 |

EDB Postgres Advanced Server 12.2 Results

| Connections | Run 1 | Run 2 | Run 3 | Median (TPS) |
|---|---|---|---|---|
| 1 | 993.50 | 991.66 | 991.15 | 991.66 |
| 16 | 11,139.46 | 10,893.96 | 10,945.17 | 10,945.17 |
| 32 | 20,086.52 | 19,489.74 | 19,534.98 | 19,534.98 |
| 64 | 35,416.14 | 34,586.93 | 34,773.43 | 34,773.43 |
| 128 | 57,238.75 | 56,985.54 | 56,848.32 | 56,985.54 |
| 256 | 67,327.77 | 67,343.50 | 67,474.48 | 67,343.50 |
| 512 | 70,515.09 | 70,581.31 | 70,122.77 | 70,515.09 |
| 550 | 71,271.32 | 70,769.01 | 70,223.46 | 70,769.01 |
| 600 | 71,093.54 | 70,878.41 | 70,890.84 | 70,890.84 |

Please note: All the parameter settings and configuration mentioned in this document are specific to the pgbench test. We highly recommend tuning the system based on a benchmark defined for your workload and infrastructure.

Appendix - I: Database parameters used for benchmarking

| Parameter | Value |
|---|---|
| max_connections | 1000 |
| shared_buffers | 80GB |
| work_mem | 32 |
| maintenance_work_mem | 1 |
| effective_cache_size | 200 |
| effective_io_concurrency | 200 |
| wal_level | replica |
| max_wal_senders | 0 |
| synchronous_commit | on |
| checkpoint_timeout | 15 |
| checkpoint_completion_target | 0.9 |
| max_wal_size | 100GB |
| min_wal_size | 20GB |
| wal_compression | on |
| wal_buffers | 512MB |
| wal_writer_delay | 200ms |
| wal_writer_flush_after | 1MB |
| bgwriter_delay | 200ms |
| bgwriter_lru_maxpages | 100 |
| bgwriter_lru_multiplier | 2 |
| bgwriter_flush_after | 0 |
| max_worker_processes | 72 |
| max_parallel_workers_per_gather | 36 |
| max_parallel_maintenance_workers | 36 |
| max_parallel_workers | 72 |
| parallel_leader_participation | on |
| enable_partitionwise_join | on |
| enable_partitionwise_aggregate | on |
| log_filename | postgresql.log |
| log_line_prefix | %t |
| log_lock_waits | on |
| log_temp_files | 0 |
| log_autovacuum_min_duration | 0 |
| log_checkpoints | on |

 

 

Vibhor Kumar

Vibhor Kumar is Chief Performance Architect with 12+ years of leadership experience in designing innovative business solutions for customers and leads the performance engineering team at EnterpriseDB.