Benchmarking PostgreSQL: Setting up a RAID Array from Scratch

April 15, 2021

As part of my work on the CTO Team here at EDB, I do daily benchmarking of the PostgreSQL development branch with 500 concurrent users in order to make sure performance does not go down and to identify when in case it does. This also helps us determine which features improve performance for this TPROC-C workload.

We’ve noticed that we get a wide variety of results, even though I’m using a bare metal instance on AWS. In order to get more control over the environment (and therefore hopefully more stability), I am moving my benchmarking of PostgreSQL off of AWS and onto a dedicated machine in one of EDB's datacenters.

I am not the best when it comes to system administration, so the goal of this blog post is to help others who don't normally do this, and also to remind myself of what I did.

 

Steps to set up a RAID array

The first thing I need to do is mount the disks I'll be using.  While the system drive is a RAID 10 array of four SSD drives on a hardware controller, I want to use RAID 0 arrays of NVMe drives for the write ahead log and data directory, and they require a software RAID setup.  I don't need mirroring because all of the data is recreatable should one of the disks fail.  If this were a production system, I would use RAID 10.  To do this, I need to install mdadm.

dnf install mdadm

The first mount point is for the wal files and the second is the data directory.

mkdir /pgwal
mkdir /pgdata

The wal files will be on two drives and the data directory will be on four.  Does it make any sense to put a continuous append-only stream on two disks, or would one be sufficient?  I don't know!  I'll put on my todo list to benchmark both ways, so expect a new blog post for that in the future.

mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mdadm --create /dev/md1 --level=0 --raid-devices=4 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1
mkfs.xfs /dev/md0
mkfs.xfs /dev/md1

I decided to use XFS for these drives, for no particular reason.  That, too, might be varied in the future to see if there is something better.

In order to mount them, I need their UUIDs and I get that like this:

lsblk -o NAME,UUID /dev/md0
lsblk -o NAME,UUID /dev/md1

And now I can put them in my /etc/fstab:

UUID=c28bbe22-b9bb-4973-90b8-a0d11293e982 /pgwal    xfs   
defaults,noatime    1 1

UUID=a8d2a8ef-de0a-40e8-8d9f-daa47f21ff64 /pgdata    xfs   
defaults,noatime    1 1

Now I can mount them, load the data, and start benchmarking!

mount -a

I will give the full specs of this server next time when I have some results to show. Here are the results of my daily benchmarking of the PostgreSQL development tree for the month of March 2021, performed on the current AWS environment.

PostgreSQL March Benchmarks

As can be seen from the graph, the average performance of PostgreSQL HEAD has stayed fairly consistent throughout March. As usual, all of the raw data is available on GitHub.

Stay tuned for next month’s benchmarks!
 

Share this

More Blogs

PGVector as Embedding Store in PrivateGPT

EDB has a long history of open source contributions, and while we’re best known for our contributions to Postgres, that’s not the only project we contribute to. e.g Barman, CloudNativePG...
June 05, 2024

pgAdmin User Management in Server Mode

pgAdmin can be deployed as a web application by configuring the app to run in server mode. One can check out server deployment on how to run pgAdmin in server...
August 24, 2023

pgAdmin CI/CD

Almost exactly three years ago I wrote a blog on my personal page entitled Testing pgAdmin which went into great detail discussing how we test pgAdmin prior to releases. Back...
August 24, 2023