# Aggregate Functionsv13

`Aggregate` functions compute a single result value from a set of input values. The built-in aggregate functions are listed in the following tables.

FunctionArgument TypeReturn TypeDescription
`AVG (expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``NUMBER` for any integer type, `DOUBLE PRECISION` for a floating-point argument, otherwise the same as the argument data typeThe average (arithmetic mean) of all input values
`COUNT(*)``BIGINT`Number of input rows
`COUNT (expression)`Any`BIGINT`Number of input rows for which the value of expression is not null
`MAX (expression)`Any numeric, string, date/time, or bytea typeSame as argument typeMaximum value of expression across all input values
`MIN (expression)`Any numeric, string, date/time, or bytea typeSame as argument typeMinimum value of expression across all input values
`SUM (expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``BIGINT` for `SMALLINT` or `INTEGER` arguments, `NUMBER` for `BIGINT` arguments, `DOUBLE PRECISION` for floating-point arguments, otherwise the same as the argument data typeSum of expression across all input values

It should be noted that except for `COUNT`, these functions return a null value when no rows are selected. In particular, `SUM` of no rows returns null, not zero as one might expect. The `COALESCE` function may be used to substitute zero for null when necessary.

The following table shows the aggregate functions typically used in statistical analysis. (These are separated out merely to avoid cluttering the listing of more-commonly-used aggregates.) Where the description mentions `N`, it means the number of input rows for which all the input expressions are non-null. In all cases, null is returned if the computation is meaningless, for example when `N` is zero.

FunctionArgument TypeReturn TypeDescription
`CORR(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Correlation coefficient
`COVAR_POP(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Population covariance
`COVAR_SAMP(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Sample covariance
`REGR_AVGX(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Average of the independent variable (sum`(X) / N`)
`REGR_AVGY(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Average of the dependent variable (sum`(Y) / N`)
`REGR_COUNT(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Number of input rows in which both expressions are nonnull
`REGR_INTERCEPT(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`y-intercept of the least-squares-fit linear equation determined by the `(X, Y)` pairs
`REGR_R2(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Square of the correlation coefficient
`REGR_SLOPE(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Slope of the least-squares-fit linear equation determined by the `(X, Y)` pairs
`REGR_SXX(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Sum (X2) – sum (X) 2 / N (“sum of squares” of the independent variable)
`REGR_SXY(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Sum (X*Y) – sum (X) * sum (Y) / N (“sum of products” of independent times dependent variable)
`REGR_SYY(Y, X)``DOUBLE PRECISION``DOUBLE PRECISION`Sum (Y2) – sum (Y) 2 / N (“sum of squares” of the dependent variable)
`STDDEV(expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER`Historic alias for `STDDEV_SAMP`
`STDDEV_POP(expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER`Population standard deviation of the input values
`STDDEV_SAMP(expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER`Sample standard deviation of the input values
`VARIANCE(expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER`Historical alias for `VAR_SAMP`
`VAR_POP(expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER`Population variance of the input values (square of the population standard deviation)
`VAR_SAMP(expression)``INTEGER, REAL, DOUBLE PRECISION, NUMBER``DOUBLE PRECISION` for floating-point arguments, otherwise `NUMBER`Sample variance of the input values (square of the sample standard deviation)

## LISTAGG

Advanced Server has added the `LISTAGG` function to support string aggregation. `LISTAGG` is an aggregate function that concatenates data from multiple rows into a single row in an ordered manner. You can optionally include a custom delimiter for your data.

The `LISTAGG` function mandates the use of an `ORDER BY` clause under a `WITHIN GROUP` clause to concatenate values of the measure column, and then generate the ordered aggregated data.

Objective

• `LISTAGG` can be used without any grouping. In this case, the `LISTAGG` function operates on all rows in a table and returns a single row.
• `LISTAGG` can be used with the `GROUP BY` clause. In this case, the `LISTAGG` function operates on each group and returns an aggregated output for each group.
• `LISTAGG` can be used with the `OVER` clause. In this case, the `LISTAGG` function partitions a query result set into groups based on the expression in the `query_partition_by_clause` and then aggregates data in each group.

Synopsis

Parameters

`measure_expr`

`measure_expr` (mandatory) specifies the column or expression that assigns a value to aggregate. `NULL` values are ignored.

`delimiter`

`delimiter` (optional) specifies a string that separates the concatenated values in the result row. The `delimiter` can be a `NULL` value, string, character literal, column name, or constant expression. If ignored, the `LISTAGG` function uses a `NULL` value by default.

`order_by_clause`

`order_by_clause` (mandatory) determines the sort order in which the concatenated values are returned.

`query_partition_by_clause`

`query_partition_by_clause` (optional) allows `LISTAGG` function to be used as an analytic function and sets the range of records for each group in the `OVER` clause.

Return Type

The `LISTAGG` function returns a string value.

Examples

The following example concatenates the values in the `emp` table and lists all the employees separated by a `delimiter` comma.

First, create a table named `emp` and then insert records into the `emp` table.

The following example uses `PARTITION BY` clause with `LISTAGG` in `emp` table and generates output based on a partition by `deptno` that applies to each partition and not on the entire table.

The following example is identical to the previous example, except it includes the `GROUP BY` clause.

## MEDIAN

The `MEDIAN` function calculates the middle value of an expression from a given range of values; `NULL` values are ignored. The `MEDIAN` function returns an error if a query does not reference the user-defined table.

Objective

• `MEDIAN` can be used without any grouping. In this case, the `MEDIAN` function operates on all rows in a table and returns a single row.
• `MEDIAN` can be used with the `OVER` clause. In this case, the `MEDIAN` function partitions a query result set into groups based on the `expression` specified in the `PARTITION BY` clause and then aggregates data in each group.

Synopsis

Parameters

`median_expression`

`median_expression` (mandatory) is a target column or expression that the `MEDIAN` function operates on and returns a median value. It can be a numeric, datetime, or interval data type.

`PARTITION BY`

`PARTITION BY` clause (optional) allows a `MEDIAN` function to be used as an analytic function and sets the range of records for each group in the `OVER` clause.

Return Types

The return type is determined by the input data type of `expression`. The following table illustrates the return type for each input type.

Input TypeReturn Type
`BIGINT``NUMERIC`
`FLOAT, DOUBLE PRECISION``DOUBLE PRECISION`
`INTEGER``NUMERIC`
`INTERVAL``INTERVAL`
`NUMERIC``NUMERIC`
`REAL``REAL`
`SMALLINT``NUMERIC`
`TIMESTAMP``TIMESTAMP`
`TIMESTAMPTZ``TIMESTAMPTZ`

Examples

In the following example, a query returns the median salary for each department in the `emp` table:

The following example uses `PARTITION BY` clause with `MEDIAN` in `emp` table and returns the median salary based on a partition by `deptno`:

The `MEDIAN` function can be compared with `PERCENTILE_CONT`. In the following example, `MEDIAN` generates the same result as `PERCENTILE_CONT`:

## STATS_MODE

The `STATS_MODE` function takes a set of values as an argument and returns the value that occurs with the highest frequency. If multiple values are appearing with the same frequency, the `STATS_MODE` function arbitrarily chooses the first value and returns only that one value.

Objective

• `STATS_MODE` function can be used without any grouping. In this case, the `STATS_MODE` function operates on all the rows in a table and returns a single value.
• `STATS_MODE` can be used as an ordered-set aggregate function using the `WITHIN GROUP` clause. In this case, the `STATS_MODE` function operates on the ordered data set.
• `STATS_MODE` can be used with the `GROUP BY` clause. In this case, the `STATS_MODE` function operates on each group and returns the most frequent and aggregated output for each group.

Synopsis

OR

Parameters

`expr`

An expression or value to assign to the column.

Return Type

The `STATS_MODE` function returns a value that appears frequently. However, if all the values of a column are `NULL`, the `STATS_MODE` returns `NULL`.

Examples

The following example returns the mode of salary in the `emp` table:

The following example uses `GROUP BY` and `ORDER BY` clause with `STATS_MODE` in `emp` table and returns the salary based on a partition by `deptno`:

The following example uses the `WITHIN GROUP` clause with the `STATS_MODE` function to perform aggregation on the ordered data set.