# Aggregate functions v16

Aggregate functions compute a single result value from a set of input values.

## Built-in aggregate functions

The built-in aggregate functions are listed in the following tables.

Function | Argument type | Return type | Description |
---|---|---|---|

`AVG (expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `NUMBER` for any integer type, `DOUBLE PRECISION` for a floating-point argument, otherwise the same as the argument data type | The average (arithmetic mean) of all input values |

`COUNT(*)` | `BIGINT` | Number of input rows | |

`COUNT (expression)` | Any | `BIGINT` | Number of input rows for which the value of expression is not null |

`MAX (expression)` | Any numeric, string, date/time, or bytea type | Same as argument type | Maximum value of expression across all input values |

`MIN (expression)` | Any numeric, string, date/time, or bytea type | Same as argument type | Minimum value of expression across all input values |

`SUM (expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `BIGINT` for `SMALLINT` or `INTEGER` arguments, `NUMBER` for `BIGINT` arguments, `DOUBLE PRECISION` for floating-point arguments, otherwise the same as the argument data type | Sum of expression across all input values |

Except for `COUNT`

, these functions return a null value when no rows are selected. In particular, `SUM`

of no rows returns null, not zero. You can use the `COALESCE`

function to substitute zero for null when necessary.

## Aggregate functions for statistical analysis

The following table shows the aggregate functions typically used in statistical analysis. Where the description mentions `N`

, it means the number of input rows for which all the input expressions are non-null. In all cases, null is returned if the computation is meaningless, for example when `N`

is zero.

Function | Argument type | Return type | Description |
---|---|---|---|

`CORR(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Correlation coefficient |

`COVAR_POP(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Population covariance |

`COVAR_SAMP(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Sample covariance |

`REGR_AVGX(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Average of the independent variable (sum`(X) / N` ) |

`REGR_AVGY(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Average of the dependent variable (sum`(Y) / N` ) |

`REGR_COUNT(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Number of input rows in which both expressions are nonnull |

`REGR_INTERCEPT(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | y-intercept of the least-squares-fit linear equation determined by the `(X, Y)` pairs |

`REGR_R2(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Square of the correlation coefficient |

`REGR_SLOPE(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Slope of the least-squares-fit linear equation determined by the `(X, Y)` pairs |

`REGR_SXX(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Sum (X^{2}) – sum (X) ^{2} / N (“sum of squares” of the independent variable) |

`REGR_SXY(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Sum (X*Y) – sum (X) * sum (Y) / N (“sum of products” of independent times dependent variable) |

`REGR_SYY(Y, X)` | `DOUBLE PRECISION` | `DOUBLE PRECISION` | Sum (Y^{2}) – sum (Y) ^{2} / N (“sum of squares” of the dependent variable) |

`STDDEV(expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER` | Historic alias for `STDDEV_SAMP` |

`STDDEV_POP(expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER` | Population standard deviation of the input values |

`STDDEV_SAMP(expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER` | Sample standard deviation of the input values |

`VARIANCE(expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER` | Historical alias for `VAR_SAMP` |

`VAR_POP(expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER` | Population variance of the input values (square of the population standard deviation) |

`VAR_SAMP(expression)` | `INTEGER, REAL, DOUBLE PRECISION, NUMBER` | `DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER` | Sample variance of the input values (square of the sample standard deviation) |

## LISTAGG

`LISTAGG`

is an aggregate function that concatenates data from multiple rows into a single row in an ordered manner. You can optionally include a custom delimiter for your data.

The `LISTAGG`

function mandates the use of an `ORDER BY`

clause under a `WITHIN GROUP`

clause to concatenate values of the measure column and then generate the ordered aggregated data.

### Objective

- You can use
`LISTAGG`

without any grouping. In this case, the`LISTAGG`

function operates on all rows in a table and returns a single row. - You can use
`LISTAGG`

with the`GROUP BY`

clause. In this case, the`LISTAGG`

function operates on each group and returns an aggregated output for each group. - You can use
`LISTAGG`

with the`OVER`

clause. In this case, the`LISTAGG`

function partitions a query result set into groups based on the expression in the`query_partition_by_clause`

and then aggregates data in each group.

### Synopsis

### Parameters

`measure_expr`

`measure_expr`

(mandatory) specifies the column or expression that assigns a value to aggregate. `NULL`

values are ignored.

`delimiter`

`delimiter`

(optional) specifies a string that separates the concatenated values in the result row. The `delimiter`

can be a `NULL`

value, string, character literal, column name, or constant expression. If ignored, the `LISTAGG`

function uses a `NULL`

value by default.

`order_by_clause`

`order_by_clause`

(mandatory) determines the sort order in which the concatenated values are returned.

`query_partition_by_clause`

`query_partition_by_clause`

(optional) allows the `LISTAGG`

function to be used as an analytic function and sets the range of records for each group in the `OVER`

clause.

### Return type

The `LISTAGG`

function returns a string value.

### Examples

This example concatenates the values in the `emp`

table and lists all the employees separated by a `delimiter`

comma. First, create a table named `emp`

. Then insert records into the `emp`

table.

This example uses a `PARTITION BY`

clause with `LISTAGG`

in the `emp`

table. It generates output based on a partition by `deptno`

that applies to each partition and not on the entire table.

This example includes the `GROUP BY`

clause.

## MEDIAN

The `MEDIAN`

function calculates the middle value of an expression from a given range of values. `NULL`

values are ignored. The `MEDIAN`

function returns an error if a query doesn't reference the user-defined table.

### Objective

- You can use
`MEDIAN`

without any grouping. In this case, the`MEDIAN`

function operates on all rows in a table and returns a single row. - You can use
`MEDIAN`

with the`OVER`

clause. In this case, the`MEDIAN`

function partitions a query result set into groups based on the`expression`

specified in the`PARTITION BY`

clause. It then aggregates data in each group.

### Synopsis

### Parameters

`median_expression`

`median_expression`

(mandatory) is a target column or expression that the `MEDIAN`

function operates on and returns a median value. It can be a numeric, datetime, or interval data type.

`PARTITION BY`

`PARTITION BY`

clause (optional) allows you to use `MEDIAN`

as an analytic function and sets the range of records for each group in the `OVER`

clause.

### Return types

The return type is determined by the input data type of `expression`

. The following table shows the return type for each input type.

Input type | Return type |
---|---|

`BIGINT` | `NUMERIC` |

`FLOAT, DOUBLE PRECISION` | `DOUBLE PRECISION` |

`INTEGER` | `NUMERIC` |

`INTERVAL` | `INTERVAL` |

`NUMERIC` | `NUMERIC` |

`REAL` | `REAL` |

`SMALLINT` | `NUMERIC` |

`TIMESTAMP` | `TIMESTAMP` |

`TIMESTAMPTZ` | `TIMESTAMPTZ` |

### Examples

In this example, a query returns the median salary for each department in the `emp`

table:

This example uses the `PARTITION BY`

clause with `MEDIAN`

in the `emp`

table and returns the median salary based on a partition by `deptno`

:

You can compare the `MEDIAN`

function with `PERCENTILE_CONT`

. In this example, `MEDIAN`

generates the same result as `PERCENTILE_CONT`

:

## STATS_MODE

The `STATS_MODE`

function takes a set of values as an argument and returns the value that occurs with the highest frequency. If multiple values appear with the same frequency, the `STATS_MODE`

function arbitrarily chooses the first value and returns only that one value.

### Objective

- You can use
`STATS_MODE`

without any grouping. In this case, the`STATS_MODE`

function operates on all the rows in a table and returns a single value. - You can use
`STATS_MODE`

as an ordered-set aggregate function using the`WITHIN GROUP`

clause. In this case, the`STATS_MODE`

function operates on the ordered data set. - You can use
`STATS_MODE`

with the`GROUP BY`

clause. In this case, the`STATS_MODE`

function operates on each group and returns the most frequent and aggregated output for each group.

### Synopsis

Or

### Parameters

`expr`

An expression or value to assign to the column.

### Return type

The `STATS_MODE`

function returns a value that appears frequently. However, if all the values of a column are `NULL`

, `STATS_MODE`

returns `NULL`

.

### Examples

This example returns the mode of salary in the `emp`

table:

This example uses `GROUP BY`

and `ORDER BY`

clauses with `STATS_MODE`

in the `emp`

table and returns the salary based on a partition by `deptno`

:

This example uses the `WITHIN GROUP`

clause with `STATS_MODE`

to perform aggregation on the ordered data set.

## BIT_AND_AGG

`BIT_AND_AGG`

is a bitwise aggregation function that performs a bitwise `AND`

operation and returns a value based on the data type of input argument.

### Objective

- You can use the
`BIT_AND_AGG`

function with the`GROUP BY`

clause. In this case, the`BIT_AND_AGG`

function operates on a group and returns the result of a bitwise`AND`

operation. - You can use the
`DISTINCT`

or`UNIQUE`

keywords with the`BIT_AND_AGG`

function to ensure that unique values in the`expr`

are used for computation.

### Synopsis

### Parameters

`expr`

An expression or value to assign to the column.

### Return Type

The `BIT_AND_AGG`

function returns the same value as the data type of the input argument. However, if all the values of a column are `NULL`

, the `BIT_AND_AGG`

returns `NULL`

.

### Examples

This example applies the `BIT_AND_AGG`

function to the `sal`

column in an `emp`

table and groups the result by `deptno`

:

## BIT_OR_AGG

`BIT_OR_AGG`

is a bitwise aggregation function that performs a bitwise `OR`

operation and returns a value based on the data type of input argument.

### Objective

- You can use the
`BIT_OR_AGG`

function with the`GROUP BY`

clause. In this case, the`BIT_OR_AGG`

function operates on a group and returns the result of a bitwise`OR`

operation. - You can use the
`DISTINCT`

or`UNIQUE`

keywords with the`BIT_OR_AGG`

function to ensure that unique values in the`expr`

are used for computation.

### Synopsis

### Parameters

`expr`

An expression or value to assign to the column.

### Return type

The `BIT_OR_AGG`

function returns the same value as the data type of the input argument. However, if all the values of a column are `NULL`

, `BIT_OR_AGG`

returns `NULL`

.

### Examples

This example applies `BIT_OR_AGG`

to the `sal`

column in an `emp`

table and groups the result by `deptno`

: