Benchmarking PostgreSQL: Setting up a RAID Array from Scratch

As part of my work on the CTO Team here at EDB, I do daily benchmarking of the PostgreSQL development branch with 500 concurrent users in order to make sure performance does not go down and to identify when in case it does. This also helps us determine which features improve performance for this TPROC-C workload.

We’ve noticed that we get a wide variety of results, even though I’m using a bare metal instance on AWS. In order to get more control over the environment (and therefore hopefully more stability), I am moving my benchmarking of PostgreSQL off of AWS and onto a dedicated machine in one of EDB's datacenters.

I am not the best when it comes to system administration, so the goal of this blog post is to help others who don't normally do this, and also to remind myself of what I did.


Steps to set up a RAID array

The first thing I need to do is mount the disks I'll be using.  While the system drive is a RAID 10 array of four SSD drives on a hardware controller, I want to use RAID 0 arrays of NVMe drives for the write ahead log and data directory, and they require a software RAID setup.  I don't need mirroring because all of the data is recreatable should one of the disks fail.  If this were a production system, I would use RAID 10.  To do this, I need to install mdadm.

dnf install mdadm

The first mount point is for the wal files and the second is the data directory.

mkdir /pgwal
mkdir /pgdata

The wal files will be on two drives and the data directory will be on four.  Does it make any sense to put a continuous append-only stream on two disks, or would one be sufficient?  I don't know!  I'll put on my todo list to benchmark both ways, so expect a new blog post for that in the future.

mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mdadm --create /dev/md1 --level=0 --raid-devices=4 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1
mkfs.xfs /dev/md0
mkfs.xfs /dev/md1

I decided to use XFS for these drives, for no particular reason.  That, too, might be varied in the future to see if there is something better.

In order to mount them, I need their UUIDs and I get that like this:

lsblk -o NAME,UUID /dev/md0
lsblk -o NAME,UUID /dev/md1

And now I can put them in my /etc/fstab:

UUID=c28bbe22-b9bb-4973-90b8-a0d11293e982 /pgwal    xfs   
defaults,noatime    1 1

UUID=a8d2a8ef-de0a-40e8-8d9f-daa47f21ff64 /pgdata    xfs   
defaults,noatime    1 1

Now I can mount them, load the data, and start benchmarking!

mount -a

I will give the full specs of this server next time when I have some results to show. Here are the results of my daily benchmarking of the PostgreSQL development tree for the month of March 2021, performed on the current AWS environment.

PostgreSQL March Benchmarks

As can be seen from the graph, the average performance of PostgreSQL HEAD has stayed fairly consistent throughout March. As usual, all of the raw data is available on GitHub.

Stay tuned for next month’s benchmarks!

Vik Fearing

Vik Fearing is a PostgreSQL expert at EDB, and lives in France. He has been in the PostgreSQL community since 2008 and is the founder and co-organizer of pgDay Paris, co-organizer of PostgreSQL Conference Europe, and a volunteer and speaker at many other conferences around the world. He i ...