Benchmarking PostgreSQL with NOPM: The Daily 500 Users

February 18, 2021

Usually, benchmarks are measured in transactions per second, but the TPC-C and TPROC-C benchmarks are measured in new orders per minute (NOPM). What is a new order? It’s simply a predefined operation on the database, one that is even designed to fail 1% of the time. This is a better metric than transactions because the database can process transactions that are not part of the benchmark; autoanalyze uses a transaction, for example.

I do TPROC-C benchmarking, the unofficial version of TPC-C that forgoes requirements such as think time. HammerDB can do both types.

Benchmarking with NOPM also allows me to use monitoring tools that cause transactions but obviously don’t create new orders. If we were measuring transactions, we could just send a solitary semicolon to the database in a tight loop and rack up a huge count for free. By measuring new orders, we are testing how fast the database can do actual work. A side effect of this is that the NOPM figure is much lower than the TPM figure. That doesn’t matter; what matters is that the NOPM stay relatively constant from run to run.
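To make this concrete, here is a minimal sketch (my own illustration, not part of HammerDB) of how trivial statements inflate PostgreSQL’s transaction counters without producing a single new order; the connection string is hypothetical:

```python
import time
import psycopg2  # assumes a reachable PostgreSQL instance

conn = psycopg2.connect("dbname=tpcc")  # hypothetical DSN
conn.autocommit = True
cur = conn.cursor()

def committed_xacts():
    # pg_stat_database tracks committed transactions per database
    cur.execute("SELECT xact_commit FROM pg_stat_database"
                " WHERE datname = current_database();")
    return cur.fetchone()[0]

before = committed_xacts()
for _ in range(1000):
    cur.execute("SELECT 1;")  # no useful work, but still a transaction
time.sleep(1)  # the statistics collector reports with a slight delay
print(f"transactions gained: {committed_xacts() - before:,}")
```

Every one of those thousand statements counts toward TPM, yet none of them moves NOPM.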


PostgreSQL Development with Daily Users

Every day, I run a two-hour HammerDB benchmark with 500 users against the first commit of the day (UTC). I do this to detect whether a new patch has inadvertently caused a drop in performance.
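For illustration, here is a hedged sketch of how the first commit of a UTC day can be picked out with git rev-list; the function name and repository path are stand-ins, not my actual automation:

```python
import subprocess

def first_commit_of_day(repo, day):
    """Return the earliest commit hash on `day` (YYYY-MM-DD, UTC), or None."""
    hashes = subprocess.run(
        ["git", "-C", repo, "rev-list", "--reverse",
         f"--since={day}T00:00:00+00:00",
         f"--until={day}T23:59:59+00:00",
         "master"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return hashes[0] if hashes else None  # None on days without commits

print(first_commit_of_day("/path/to/postgres", "2021-01-01"))
```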

Here is the graph for January 2021:

This is the raw data:

| date       | nopm    | catversion | git hash                                 |
|------------|---------|------------|------------------------------------------|
| 2021-01-01 | 490,109 | 202012293  | 4d3f03f42227bb351c2021a9ccea2fff9c023cfc |
| 2021-01-02 | 516,539 | 202012293  | ca3b37487be333a1d241dab1bbdd17a211a88f43 |
| 2021-01-03 |         | 202012293  |                                          |
| 2021-01-04 | 476,715 | 202012293  | a271a1b50e9bec07e2ef3a05e38e7285113e4ce6 |
| 2021-01-05 | 512,718 | 202012293  | fe05b6b620066aec313c43b6b4d6c169d0a346f7 |
| 2021-01-06 | 490,625 | 202012293  | 14d49f483d4c8a5a356e25d5e5ff5726ca43abff |
| 2021-01-07 | 497,247 | 202012293  | 55fe26a4b580b17d721c5accb842cc6a08295273 |
| 2021-01-08 | 521,308 | 202012293  | 9ffe2278372d7549547176c23564a5b3404d072e |
| 2021-01-09 | 423,147 | 202012293  | e33d004900f76c35759293fdedd4861b198fbf5b |
| 2021-01-10 |         | 202012293  |                                          |
| 2021-01-11 | 472,975 | 202012293  | 13a021f3e8c99915b3cc0cb2021a948d9c71ff32 |
| 2021-01-12 | 423,274 | 202012293  | d5ab79d815783fe60062cefc423b54e82fbb92ff |
| 2021-01-13 | 527,537 | 202012293  | fce7d0e6efbef304e81846c75eddf73099628d10 |
| 2021-01-14 | 484,433 | 202101131  | aef8948f38d9f3aa58bf8c2d4c6f62a7a456a9d1 |
| 2021-01-15 | 454,665 | 202101131  | 5e5f4fcd89c082bba0239e8db1552834b4905c34 |
| 2021-01-16 | 450,128 | 202101131  | c95765f47673b16ed36acbfe98e1242e3c3822a3 |
| 2021-01-17 | 509,422 | 202101171  | 960869da0803427d14335bba24393f414b476e2c |
| 2021-01-18 | 497,267 | 202101171  | a3dc926009be833ea505eebd77ce4b72fe708b18 |
| 2021-01-19 | 469,663 | 202101181  | ed43677e20369040ca4e50c698010c39d5ac0f47 |
| 2021-01-20 | 505,070 | 202101181  | 21378e1fefedcaed3d855ae7aa772555295d05d6 |
| 2021-01-21 | 467,455 | 202101181  | 733d670073efd2c3a9df07c225006668009ab793 |
| 2021-01-22 | 429,751 | 202101181  | af0e79c8f4f4c3c2306855045c0d02a6be6485f0 |
| 2021-01-23 | 486,797 | 202101181  | 3fc81ce459e1696f7e5e5b3b8229409413bf64b4 |
| 2021-01-24 | 476,970 | 202101181  | 39b66a91bdebb00af71a2c6218412ecfc89a0e13 |
| 2021-01-25 | 422,180 | 202101181  | 40ab64c1ec1cb9bd73695f519cf66ddbb97d8144 |
| 2021-01-26 | 491,500 | 202101181  | ee895a655ce4341546facd6f23e3e8f2931b96bf |
| 2021-01-27 | 488,290 | 202101181  | 4c9c359d38ff1e2de388eedd860785be6a49201c |
| 2021-01-28 | 451,614 | 202101181  | f854c69a5b36ba7aa85bee9e9590c3e517970156 |
| 2021-01-29 | 494,281 | 202101181  | 514b411a2b5226167add9ab139d3a96dbe98035d |
| 2021-01-30 | 471,347 | 202101181  | f77717b2985aa529a185e6988de26b885ca10ddb |
| 2021-01-31 | 445,362 | 202101181  | 0c4f355c6a5fd437f71349f2f3d5d491382572b7 |

(There were no commits at all on the 3rd and the 10th.)

Because each data point comes from a single run rather than an average of five runs or so, there is some day-to-day variation, but no major impact. This is good news. A quick way to quantify that variation is sketched below.
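As a back-of-the-envelope check (my own, using the NOPM values copied from the table above), the mean and coefficient of variation can be computed in a few lines:

```python
import statistics

# NOPM values from the January 2021 table, skipping the two commit-less days
nopm = [490109, 516539, 476715, 512718, 490625, 497247, 521308, 423147,
        472975, 423274, 527537, 484433, 454665, 450128, 509422, 497267,
        469663, 505070, 467455, 429751, 486797, 476970, 422180, 491500,
        488290, 451614, 494281, 471347, 445362]

mean = statistics.mean(nopm)
cv = statistics.stdev(nopm) / mean  # coefficient of variation
print(f"runs: {len(nopm)}, mean NOPM: {mean:,.0f}, spread (CV): {cv:.1%}")
```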


Making Sense of the Data

If you would like to study all of the data that I collect before, during, and after a run, I have set up a public repository. My goal for the near future is to write some scripts that extract the data and make pretty graphs, making these runs easier to analyze; a rough sketch of the kind of script I have in mind follows.
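A minimal sketch, assuming the results were exported to a results.csv with date and nopm columns (a hypothetical layout, not the repository’s actual format):

```python
import csv
import matplotlib.pyplot as plt

dates, nopm = [], []
with open("results.csv") as f:  # hypothetical export of the table above
    for row in csv.DictReader(f):
        if row["nopm"]:  # skip the days without commits
            dates.append(row["date"])
            nopm.append(int(row["nopm"]))

plt.plot(dates, nopm, marker="o")
plt.xticks(rotation=45)
plt.ylabel("NOPM (new orders per minute)")
plt.title("Daily 500-user TPROC-C runs")
plt.tight_layout()
plt.savefig("nopm.png")
```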

Tuning PostgreSQL is very important, but it is equally important to resist tuning for the benchmark. This is why I am not concerned with the exact NOPM figure I get, only that it remains relatively stable.

In a future blog post, I will share the scripts that I use to run these benchmarks so that you can run them yourself. Stay tuned!
