Going Deep on Stats

February 28, 2017

Postgres includes lots of monitoring tools that allow you to see what is happening, but there are a few settings that really go into detail, e.g. log_statement_stats:

 

SET client_min_messages = log;

SET log_statement_stats = true;

SELECT 100;
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
1 !       0.000000 s user, 0.000000 s system, 0.000201 s elapsed
2 !       [0.000000 s user, 0.000000 s system total]
3 !       0/0 [0/0] filesystem blocks in/out
4 !       0/14 [0/363] page faults/reclaims, 0 [0] swaps
5 !       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
6 !       0/0 [3/0] voluntary/involuntary context switches
 ?column?
----------
      100

What do all those numbers mean? Well, the Linux getrusage() manual page has a clue — these are kernel resource usage statistics. The first two lines show the user and kernel ("system") cpu time used, as well as elapsed time. (The numbers in brackets on line #2 and following are totals for this process.) Line #3 shows I/O from/to the storage device (not from the kernel cache). Line #4 covers memory pages faulted into the process address space and reclaimed. Line #5 shows signal and ipc message activity. Line #6 shows process context switches.

Let's look at some more interesting queries by populating a new table:
 

CREATE TABLE test(x INTEGER);
INSERT INTO test SELECT x FROM generate_series(1, 1000) t(x);

By clearing the kernel buffers with echo 3 > /proc/sys/vm/drop_caches and restarting the server we can see reads from the storage device (via log_statement_stats) and reads from the kernel cache (via explain (buffers true):

EXPLAIN (ANALYZE true, BUFFERS true, TIMING false, COSTS false)
SELECT COUNT() FROM test;
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
!       0.000000 s user, 0.000000 s system, 0.019259 s elapsed
!       [0.008000 s user, 0.000000 s system total]
!       1712/0 [5840/280] filesystem blocks in/out
!       7/94 [21/586] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       29/1 [73/3] voluntary/involuntary context switches
                    QUERY PLAN
---------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   Buffers: shared read=5
   ->  Seq Scan on test (actual rows=1000 loops=1)
         Buffers: shared read=5
 Planning time: 3.673 ms
 Execution time: 2.221 ms


EXPLAIN (ANALYZE true, BUFFERS true, TIMING false, COSTS false)
SELECT COUNT(
) FROM test;
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
!       0.000000 s user, 0.000000 s system, 0.000408 s elapsed
!       [0.008000 s user, 0.000000 s system total]
!       0/0 [5840/288] filesystem blocks in/out
!       0/3 [21/591] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [74/3] voluntary/involuntary context switches
                    QUERY PLAN
---------------------------------------------------
 Aggregate (actual rows=1 loops=1)
   Buffers: shared hit=5
   ->  Seq Scan on test (actual rows=1000 loops=1)
         Buffers: shared hit=5
 Planning time: 0.027 ms
 Execution time: 0.250 ms

Notice the first count reads 1712 512-byte blocks from storage and 5 8kB blocks from the kernel cache into the shared buffer cache. (The unrecorded reads are probably system table information.) The second query shows no I/O from the storage device, and a hit on 5 8kB shared buffer pages.

As you might know, Postgres executes queries in stages. Fortunately, you can get rusage statistics for each stage with log_parser_stats, log_planner_stats, and log_executor_stats. (explain (analyze true) also shows planning and execution durations, which you can see above.)

As you can see, you can access a wealth of information about Postgres and how it is using the cpus, storage, and caches. These have a large impact on how quickly queries execute.

Bruce Momjian is a Senion Database Architect at EnterpriseDB. 

This post originally appeared on Bruce's personal blog.

Share this