pg_stat_statements v7.4
pg_stat_statements is a core PostgreSQL module that provides a means to track planning and execution statistics of all SQL statements executed by the server.
In WarehousePG (WHPG) distributed architecture, query execution is parallelized across the coordinator and multiple segments. The WHPG implementation of this module provides cluster-wide visibility, collecting performance metrics from every segment node. This allows administrators to analyze buffer usage, I/O operations, and execution patterns across the entire distributed cluster rather than just the coordinator.
By extending standard functionality into the parallel execution layer, pg_stat_statements enables:
- Comprehensive distributed tracking across all query executors.
- Unified visibility through specialized system views for both raw and aggregated data.
- Centralized management for resetting performance metrics across the entire cluster simultaneously.
Loading the module
pg_stat_statements is shipped with WarehousePG 7.4 and later. You must load the extension as a shared library before you can use it within your databases.
Load the extension as a shared library before you can use it within your databases. Check for existing shared libraries:
gpconfig -s shared_preload_librariesUse the output of the previous command to enable `pg_stat_statements1, along any other shared libraries, and restart WHPG:
gpconfig -c shared_preload_libraries -v '<other_libraries>, pg_stat_statements' gpstop -ar
Create the extension in your database:
CREATE EXTENSION pg_stat_statements;
Analyzing performance with distributed views
The statistics gathered by the module are made available via two specialized views:
whpg_stat_statements
This view displays raw statistics from the coordinator and all segments in a single table.
SELECT * FROM whpg_stat_statements;
| Column | Type | Description |
|---|---|---|
gp_segment_id | integer | Segment identifier: -1 for the coordinator, 0...N-1 for segments. |
userid | oid | OID of the user who executed the statement. |
dbid | oid | OID of the database in which the statement was executed. |
queryid | bigint | Internal hash code computed from the statement's parse tree. |
query | text | Normalized representative query text. |
plans | bigint | Number of times the statement was planned. |
total_plan_time | double precision | Total time spent planning the statement (milliseconds). |
calls | bigint | Number of times the statement was executed. |
total_exec_time | double precision | Total time spent executing the statement (milliseconds). |
rows | bigint | Total number of rows retrieved or affected by the statement. |
shared_blks_hit | bigint | Total number of shared block cache hits. |
shared_blks_read | bigint | Total number of shared blocks read from disk. |
shared_blks_dirtied | bigint | Total number of shared blocks dirtied. |
shared_blks_written | bigint | Total number of shared blocks written to disk. |
local_blks_hit | bigint | Total number of local block cache hits. |
local_blks_read | bigint | Total number of local blocks read from disk. |
temp_blks_read | bigint | Total number of temporary blocks read from disk. |
temp_blks_written | bigint | Total number of temporary blocks written to disk. |
blk_read_time | double precision | Total time spent reading blocks (if track_io_timing is enabled). |
blk_write_time | double precision | Total time spent writing blocks (if track_io_timing is enabled). |
wal_records | bigint | Total number of WAL records generated by the statement. |
wal_bytes | numeric | Total number of WAL bytes generated by the statement. |
whpg_stat_statements_aggregated
This view provides a unified summary of total resource consumption per query across the entire cluster.
SELECT * FROM whpg_stat_statements_aggregated;
| Column | Type | Description |
|---|---|---|
userid | oid | OID of the user. |
dbid | oid | OID of the database. |
queryid | bigint | Internal hash code. |
query | text | Normalized representative query text. |
calls | bigint | Total number of executions. |
total_time | double precision | Total time spent executing. |
min_time | double precision | Minimum time spent executing. |
max_time | double precision | Maximum time spent executing. |
mean_time | double precision | Mean time spent executing. |
stddev_time | double precision | Population standard deviation of execution time. |
rows | bigint | Total rows retrieved/affected. |
shared_blks_hit | bigint | Total shared block cache hits. |
shared_blks_read | bigint | Total shared blocks read from disk. |
shared_blks_dirtied | bigint | Total shared blocks dirtied. |
shared_blks_written | bigint | Total shared blocks written to disk. |
local_blks_hit | bigint | Total local block cache hits. |
local_blks_read | bigint | Total local blocks read from disk. |
local_blks_dirtied | bigint | Total local blocks dirtied. |
local_blks_written | bigint | Total local blocks written to disk. |
temp_blks_read | bigint | Total temporary blocks read from disk. |
temp_blks_written | bigint | Total temporary blocks written to disk. |
blk_read_time | double precision | Total block read time. |
blk_write_time | double precision | Total block write time. |
To provide a consolidated view of cluster performance, the metrics in this view are processed as follows:
- Buffer and I/O statistics (
shared_blks_*,local_blks_*,temp_blks_*,blk_*_time) are summed across all nodes to show total cluster effort. - Timing statistics (
total_time,min_time,max_time,mean_time) reflect the values recorded at the coordinator level, representing the actual duration experienced by the user.
Managing statistics globally
The pg_stat_statements module includes functions to manage and clear gathered performance data.
whpg_stat_statements_reset
Resets all collected statistics on the coordinator and all segments simultaneously.
-- Global reset for all users and databases SELECT whpg_stat_statements_reset(); -- Targeted reset for specific user, database, or query SELECT whpg_stat_statements_reset(userid, dbid, queryid);
| Parameter | Type | Description |
|---|---|---|
p_userid | Oid | Reset entries for a specific user (use 0 for all users). |
p_dbid | Oid | Reset entries for a specific database (use 0 for all databases). |
p_queryid | bigint | Reset a specific query signature (use 0 for all queries). |
Note
This function requires superuser privileges.
Examples
View per-segment buffer usage
Check if specific segments are performing significantly more physical reads than others, which often indicates poor data distribution.
SELECT gp_segment_id, query, shared_blks_read, shared_blks_hit, blk_read_time FROM whpg_stat_statements WHERE query LIKE '%my_table%' ORDER BY gp_segment_id;
Get aggregated statistics
Identify queries consuming the most total I/O time summed across the entire cluster.
SELECT query, calls, rows, shared_blks_hit, blk_read_time + blk_write_time AS total_io_time_ms FROM whpg_stat_statements_aggregated ORDER BY total_io_time_ms DESC LIMIT 10;
Reset statistics before benchmarking
To ensure accurate results for a new workload, reset all cluster statistics first.
-- 1. Clear cluster-wide history SELECT whpg_stat_statements_reset(); -- 2. Run workload -- [Queries] -- 3. Analyze aggregated cluster costs SELECT * FROM whpg_stat_statements_aggregated;