As part of my work on the CTO Team here at EDB, I run daily benchmarks of the PostgreSQL development branch with 500 concurrent users to make sure performance does not regress, and to identify when it does. This also helps us determine which features improve performance for this TPROC-C workload.
We’ve noticed that we get a wide variety of results, even though I’m using a bare metal instance on AWS. To get more control over the environment (and therefore, hopefully, more stability), I am moving my PostgreSQL benchmarking off AWS and onto a dedicated machine in one of EDB's datacenters.
I am not the best when it comes to system administration, so the goal of this blog post is to help others who don't normally do this, and also to remind myself of what I did.
Steps to set up a RAID array
The first thing I need to do is mount the disks I'll be using. While the system drive is a RAID 10 array of four SSDs on a hardware controller, I want to use RAID 0 arrays of NVMe drives for the write-ahead log and the data directory, and those require a software RAID setup. I don't need mirroring because all of the data can be recreated should one of the disks fail; if this were a production system, I would use RAID 10. To do this, I need to install mdadm.
dnf install mdadm
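Before building anything, it's worth confirming which device names belong to the NVMe drives; a quick listing like this shows the candidates (the device names below are from my machine, so adjust for yours):
lsblk -d -o NAME,SIZE,MODEL,TYPE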
The first mount point is for the WAL files and the second is for the data directory.
mkdir /pgwal
mkdir /pgdata
The WAL files will be on two drives and the data directory on four. Does it make any sense to put a continuous, append-only stream on two disks, or would one be sufficient? I don't know! Benchmarking both ways is on my todo list, so expect a new blog post about that in the future.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mdadm --create /dev/md1 --level=0 --raid-devices=4 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1
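To sanity-check that the arrays assembled with the expected geometry, and so they come back under the same names after a reboot, something like the following should work (on Red Hat-style systems the config file is /etc/mdadm.conf):
cat /proc/mdstat
mdadm --detail /dev/md0
mdadm --detail /dev/md1
mdadm --detail --scan >> /etc/mdadm.conf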
mkfs.xfs /dev/md0
mkfs.xfs /dev/md1
I decided to use XFS for these drives, for no particular reason; that, too, might be varied in the future to see if there is something better.
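If you're curious whether mkfs.xfs picked up the RAID geometry automatically, xfs_info reports the stripe unit (sunit) and stripe width (swidth) it chose; with older xfsprogs versions you may need to run it against a mounted path rather than the device:
xfs_info /dev/md0
xfs_info /dev/md1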
To mount them, I need their UUIDs, which I can get like this:
lsblk -o NAME,UUID /dev/md0
lsblk -o NAME,UUID /dev/md1
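For reference, the output looks something like this (md1 will show its own UUID):
NAME UUID
md0  c28bbe22-b9bb-4973-90b8-a0d11293e982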
And now I can put them in my /etc/fstab:
UUID=c28bbe22-b9bb-4973-90b8-a0d11293e982 /pgwal  xfs defaults,noatime 1 1
UUID=a8d2a8ef-de0a-40e8-8d9f-daa47f21ff64 /pgdata xfs defaults,noatime 1 1
Now I can mount them, load the data, and start benchmarking!
mount -a
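To confirm both filesystems landed where expected, and as a sketch of how PostgreSQL gets pointed at them (the data directory layout and the postgres OS user here are illustrative assumptions, not from my setup notes):
df -h /pgwal /pgdata
chown postgres:postgres /pgwal /pgdata
sudo -u postgres initdb -D /pgdata/data --waldir=/pgwal/pg_wal
The --waldir option creates pg_wal on the dedicated array and leaves a symlink to it inside the data directory, which is exactly the split these two mount points are for.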
I will share the full specs of this server next time, when I have some results to show. In the meantime, here are the results of my daily benchmarking of the PostgreSQL development tree for March 2021, performed on the current AWS environment.
As can be seen from the graph, the average performance of PostgreSQL HEAD has stayed fairly consistent throughout March. As usual, all of the raw data is available on GitHub.
Stay tuned for next month’s benchmarks!