GROUPING Function v11

When using the ROLLUP, CUBE, or GROUPING SETS extensions to the GROUP BY clause, it may sometimes be difficult to differentiate between the various levels of subtotals generated by the extensions as well as the base aggregate rows in the result set. The GROUPING function provides a means of making this distinction.

The general syntax for use of the GROUPING function is shown by the following.

SELECT [ <expr> ...,] GROUPING( <col_expr> ) [, <expr> ] ...
FROM ...
GROUP BY [...,]
  { ROLLUP | CUBE | GROUPING SETS }( [...,] <col_expr>
  [, ...] ) [, ...]

The GROUPING function takes a single parameter that must be an expression of a dimension column specified in the expression list of a ROLLUP, CUBE, or GROUPING SETS extension of the GROUP BY clause.

The return value of the GROUPING function is either a 0 or 1. In the result set of a query, if the column expression specified in the GROUPING function is null because the row represents a subtotal over multiple values of that column then the GROUPING function returns a value of 1. If the row returns results based on a particular value of the column specified in the GROUPING function, then the GROUPING function returns a value of 0. In the latter case, the column can be null as well as non-null, but in any case, it is for a particular value of that column, not a subtotal across multiple values.

The following query shows how the return values of the GROUPING function correspond to the subtotal lines.

SELECT loc, dname, job, COUNT(*) AS "employees",
  GROUPING(loc) AS "gf_loc",
  GROUPING(dname) AS "gf_dname",
  GROUPING(job) AS "gf_job"
FROM emp e, dept d
WHERE e.deptno = d.deptno
GROUP BY ROLLUP (loc, dname, job)
ORDER BY 1, 2, 3;

In the three right-most columns displaying the output of the GROUPING functions, a value of 1 appears on a subtotal line wherever a subtotal is taken across values of the corresponding columns.

loc      |   dname    |    job    | employees | gf_loc | gf_dname | gf_job
---------+------------+-----------+-----------+--------+----------+--------
BOSTON   | OPERATIONS | ANALYST   |         1 |      0 |        0 |      0
BOSTON   | OPERATIONS | CLERK     |         1 |      0 |        0 |      0
BOSTON   | OPERATIONS | MANAGER   |         1 |      0 |        0 |      0
BOSTON   | OPERATIONS |           |         3 |      0 |        0 |      1
BOSTON   | RESEARCH   | ANALYST   |         2 |      0 |        0 |      0
BOSTON   | RESEARCH   | CLERK     |         2 |      0 |        0 |      0
BOSTON   | RESEARCH   | MANAGER   |         1 |      0 |        0 |      0
BOSTON   | RESEARCH   |           |         5 |      0 |        0 |      1
BOSTON   |            |           |         8 |      0 |        1 |      1
CHICAGO  | SALES      | CLERK     |         1 |      0 |        0 |      0
CHICAGO  | SALES      | MANAGER   |         1 |      0 |        0 |      0
CHICAGO  | SALES      | SALESMAN  |         4 |      0 |        0 |      0
CHICAGO  | SALES      |           |         6 |      0 |        0 |      1
CHICAGO  |            |           |         6 |      0 |        1 |      1
NEW YORK | ACCOUNTING | CLERK     |         1 |      0 |        0 |      0
NEW YORK | ACCOUNTING | MANAGER   |         1 |      0 |        0 |      0
NEW YORK | ACCOUNTING | PRESIDENT |         1 |      0 |        0 |      0
NEW YORK | ACCOUNTING |           |         3 |      0 |        0 |      1
NEW YORK |            |           |         3 |      0 |        1 |      1
         |            |           |        17 |      1 |        1 |      1
(20 rows)

These indicators can be used as screening criteria for particular subtotals. For example, using the previous query, you can display only those subtotals for loc and dname combinations by using the GROUPING function in a HAVING clause.

SELECT loc, dname, job, COUNT(*) AS "employees",
  GROUPING(loc) AS "gf_loc",
  GROUPING(dname) AS "gf_dname",
  GROUPING(job) AS "gf_job"
FROM emp e, dept d
WHERE e.deptno = d.deptno
GROUP BY ROLLUP (loc, dname, job)
HAVING GROUPING(loc) = 0
  AND  GROUPING(dname) = 0
  AND  GROUPING(job) = 1
ORDER BY 1, 2;

This query produces the following result:

loc      |   dname    | job | employees | gf_loc | gf_dname | gf_job
---------+------------+-----+-----------+--------+----------+--------
BOSTON   | OPERATIONS |     |         3 |      0 |        0 |      1
BOSTON   | RESEARCH   |     |         5 |      0 |        0 |      1
CHICAGO  | SALES      |     |         6 |      0 |        0 |      1
NEW YORK | ACCOUNTING |     |         3 |      0 |        0 |      1
(4 rows)

The GROUPING function can be used to distinguish a subtotal row from a base aggregate row or from certain subtotal rows where one of the items in the expression list returns null as a result of the column on which the expression is based being null for one or more rows in the table, as opposed to representing a subtotal over the column.

To illustrate this point, the following row is added to the emp table. This provides a row with a null value for the job column.

INSERT INTO emp (empno,ename,deptno) VALUES (9004,'PETERS',40);

The following query is issued using a reduced number of rows for clarity.

SELECT loc, job, COUNT(*) AS "employees",
  GROUPING(loc) AS "gf_loc",
  GROUPING(job) AS "gf_job"
FROM emp e, dept d
WHERE e.deptno = d.deptno AND loc = 'BOSTON'
GROUP BY CUBE (loc, job)
ORDER BY 1, 2;

Note that the output contains two rows containing BOSTON in the loc column and spaces in the job column (fourth and fifth entries in the table).

loc    |   job   | employees | gf_loc | gf_job
-------+---------+-----------+--------+--------
BOSTON | ANALYST |         3 |      0 |      0
BOSTON | CLERK   |         3 |      0 |      0
BOSTON | MANAGER |         2 |      0 |      0
BOSTON |         |         1 |      0 |      0
BOSTON |         |         9 |      0 |      1
       | ANALYST |         3 |      1 |      0
       | CLERK   |         3 |      1 |      0
       | MANAGER |         2 |      1 |      0
       |         |         1 |      1 |      0
       |         |         9 |      1 |      1
(10 rows)

The fifth row where the GROUPING function on the job column (gf_job) returns 1 indicates this is a subtotal over all jobs. Note that the row contains a subtotal value of 9 in the employees column.

The fourth row where the GROUPING function on the job column as well as on the loc column returns 0 indicates this is a base aggregate of all rows where loc is BOSTON and job is null, which is the row inserted for this example. The employees column contains 1, which is the count of the single such row inserted.

Also note that in the ninth row (next to last) the GROUPING function on the job column returns 0 while the GROUPING function on the loc column returns 1 indicating this is a subtotal over all locations where the job column is null, which again, is a count of the single row inserted for this example.