51.50. pg_statistic
The catalog pg_statistic stores statistical data about the contents
of the database. Entries are created by ANALYZE and subsequently used
by the query planner. Note that all the statistical data is inherently
approximate, even assuming that it is up-to-date.

Normally there is one entry, with stainherit = false, for each table
column that has been analyzed. If the table has inheritance children,
a second entry with stainherit = true is also created. This row
represents the column's statistics over the inheritance tree, i.e.,
statistics for the data you'd see with SELECT column FROM table*,
whereas the stainherit = false row represents the results of
SELECT column FROM ONLY table.

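The two kinds of entries can be observed with a small experiment. The following is a minimal sketch, not a definitive recipe: the table names measurements and measurements_2024 are hypothetical, and reading pg_statistic directly normally requires superuser privileges.

```sql
-- Hypothetical parent and child tables, used only for illustration.
CREATE TABLE measurements (reading int);
CREATE TABLE measurements_2024 () INHERITS (measurements);

INSERT INTO measurements
    SELECT g FROM generate_series(1, 100) AS g;
INSERT INTO measurements_2024
    SELECT g FROM generate_series(101, 1000) AS g;

-- Analyzing the parent gathers statistics for the parent alone
-- and for the parent together with its children.
ANALYZE measurements;

-- For the parent's column, one row per stainherit value is expected:
-- false for ONLY measurements, true for the whole inheritance tree.
SELECT starelid::regclass AS relation, staattnum, stainherit
FROM pg_statistic
WHERE starelid = 'measurements'::regclass
ORDER BY stainherit;
```

The child table gets its own stainherit = false entry only once it has been analyzed itself.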
pg_statistic also stores statistical data about the values of index
expressions. These are described as if they were actual data columns;
in particular, starelid references the index. No entry is made for an
ordinary non-expression index column, however, since it would be
redundant with the entry for the underlying table column. Currently,
entries for index expressions always have stainherit = false.

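As an illustration of the difference between expression and plain index columns, the sketch below creates one index of each kind on a hypothetical table people. The object names are assumptions made for this example, and direct access to pg_statistic normally requires superuser privileges.

```sql
-- Hypothetical table and indexes, used only for illustration.
CREATE TABLE people (name text);
INSERT INTO people
    SELECT 'Person ' || g FROM generate_series(1, 1000) AS g;

CREATE INDEX people_name_idx       ON people (name);         -- plain column
CREATE INDEX people_lower_name_idx ON people (lower(name));  -- expression

-- Analyzing the table also gathers statistics for index expressions.
ANALYZE people;

-- Only the expression index should appear here, with starelid
-- referencing the index and stainherit = false.
SELECT starelid::regclass AS relation, staattnum, stainherit
FROM pg_statistic
WHERE starelid IN ('people_name_idx'::regclass,
                   'people_lower_name_idx'::regclass);
```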
Since different kinds of statistics might be appropriate for different
kinds of data, pg_statistic is designed not to assume very much about
what sort of statistics it stores. Only extremely general statistics
(such as nullness) are given dedicated columns in pg_statistic.
Everything else is stored in “slots”, which are groups of associated
columns whose content is identified by a code number in one of the
slot's columns. For more information see
src/include/catalog/pg_statistic.h.

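To see which slots are in use for the columns of a table, the kind codes can be read side by side. This is a sketch under stated assumptions: my_table is a hypothetical, already analyzed table; the query relies on there being five slots per row (columns stakind1 through stakind5); and the kind codes noted in the comments are recalled from pg_statistic.h and should be checked against the header of the server version in use.

```sql
-- my_table is a hypothetical table name; superuser privileges are
-- normally needed to read pg_statistic directly.
SELECT a.attname,
       s.stakind1, s.stakind2, s.stakind3, s.stakind4, s.stakind5
FROM pg_statistic AS s
JOIN pg_attribute AS a
  ON a.attrelid = s.starelid
 AND a.attnum   = s.staattnum
WHERE s.starelid = 'my_table'::regclass
  AND NOT s.stainherit;

-- Commonly seen codes (assumption, see pg_statistic.h):
--   0 = slot unused
--   1 = most common values
--   2 = histogram
--   3 = correlation
```

A slot's other columns (operator, collation, numbers, values) are only meaningful once its kind code is known, which is why the code is read first.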
pg_statistic should not be readable by the public, since even
statistical information about a table's contents might be considered
sensitive. (Example: minimum and maximum values of a salary column
might be quite interesting.) pg_stats is a publicly readable view on
pg_statistic that only exposes information about those tables that
are readable by the current user.

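For everyday inspection the view is usually the more convenient entry point. A minimal sketch, assuming a hypothetical analyzed table my_table in the public schema:

```sql
-- my_table is a hypothetical table name. Rows are returned only for
-- tables the current user is allowed to read.
SELECT attname, inherited, null_frac, n_distinct, histogram_bounds
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename  = 'my_table';
```

The first and last elements of histogram_bounds approximate the minimum and maximum of the sampled values, which is the kind of information the salary example above is concerned with.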
Table 51.50. pg_statistic Columns
Name  Type  References  Description 

starelid  oid  pg_class.oid  The table or index that the described column belongs to
staattnum  int2  pg_attribute.attnum  The number of the described column
stainherit  bool  If true, the stats include inheritance child columns, not just the values in the specified relation  
stanullfrac  float4  The fraction of the column's entries that are null  
stawidth  int4  The average stored width, in bytes, of nonnull entries  
stadistinct  float4  The number of distinct nonnull data values in the column.
A value greater than zero is the actual number of distinct values.
A value less than zero is the negative of a multiplier for the number
of rows in the table; for example, a column in which about 80% of the
values are nonnull and each nonnull value appears about twice on
average could be represented by stadistinct = -0.4.
A zero value means the number of distinct values is unknown.
 
stakindN  int2  A code number indicating the kind of statistics stored in the Nth “slot” of the pg_statistic row.
staopN  oid  pg_operator.oid  An operator used to derive the statistics stored in the Nth “slot”. For example, a histogram slot would show the < operator that defines the sort order of the data.
stacollN  oid  pg_collation.oid  The collation used to derive the statistics stored in the Nth “slot”. For example, a histogram slot for a collatable column would show the collation that defines the sort order of the data. Zero for noncollatable data.
stanumbersN  float4[]  Numerical statistics of the appropriate kind for the Nth “slot”, or null if the slot kind does not involve numerical values.
stavaluesN  anyarray  Column data values of the appropriate kind for the Nth “slot”, or null if the slot kind does not store any data values. Each array's element values are actually of the specific column's data type, or a related type such as an array's element type, so there is no way to define these columns' type more specifically than anyarray.
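The sign convention of stadistinct can be turned back into an absolute estimate by multiplying negative values by the table's row count. The query below is a sketch of that arithmetic, assuming a hypothetical analyzed table my_table in the public schema; it reads the same number through the n_distinct column of pg_stats and takes the row count from pg_class.reltuples, which is itself only an estimate.

```sql
-- my_table is a hypothetical table name.
SELECT s.attname,
       s.n_distinct,
       c.reltuples,
       CASE
         WHEN s.n_distinct < 0
           THEN -s.n_distinct * c.reltuples  -- multiplier of the row count
         ELSE s.n_distinct                   -- absolute count, or 0 = unknown
       END AS estimated_distinct
FROM pg_stats AS s
JOIN pg_namespace AS n
  ON n.nspname = s.schemaname
JOIN pg_class AS c
  ON c.relnamespace = n.oid
 AND c.relname      = s.tablename
WHERE s.schemaname = 'public'
  AND s.tablename  = 'my_table'
  AND NOT s.inherited;
```

With the figures from the stadistinct description, a table of 1000 rows in which 800 values are nonnull and each appears twice has 400 distinct values, which is 0.4 times the row count and is therefore stored as -0.4.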
