 |
|
Advanced Oracle SQL: Nested Aggregates
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
Aggregate functions can be nested:
SELECT
AVG(
MAX(SAL)
)
FROM
EMP
GROUP BY
DEPTNO;
AVG(MAX(SAL))
-------------
3616.66667
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 4 (25)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | 4 (25)| 00:00:01 |
| 2 | SORT GROUP BY | | 1 | 7 | 4 (25)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMP | 14 | 98 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
The average of the maximum salary of each department is
returned. A query containing a nested aggregation requires one group by clause
and returns one row.
Subtotals
The CUBE, ROLLUP and GROUPING SETS functions are used in the
GROUP BY clause to generate totals and subtotals.
SELECT
DEPTNO,
SUM(SAL)
FROM
EMP
GROUP BY
ROLLUP(DEPTNO);
DEPTNO SUM(SAL)
---------- ----------
10 8750
20 10875
30 9400
29025
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 21 | 4 (25)| 00:00:01 |
| 1 | SORT GROUP BY ROLLUP| | 3 | 21 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL | EMP | 14 | 98 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
The sum of salaries is aggregated per department, and then
an additional row containing the total of all employees' salaries is returned.
The ROLLUP has generated four rows:
ROLLUP can have more than one dimension, generating grand
totals and subtotals:
SELECT
DEPTNO,
JOB,
SUM(SAL)
FROM
EMP
GROUP BY
ROLLUP(DEPTNO,JOB);
DEPTNO JOB
SUM(SAL)
-------- --------- ----------
10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
10 8750
20 CLERK 1900
20 ANALYST 6000
20 MANAGER 2975
20 10875
30 CLERK 950
30 MANAGER 2850
30 SALESMAN 5600
30 9400
29025
The sum generates a grand total, a subtotal per department
and a sub-subtotal per department and job.
It also possible to mix ROLLUP groups with normal groups in
the GROUP BY clause:
SELECT
DEPTNO,
JOB,
SUM(SAL)
FROM
EMP
GROUP BY
DEPTNO,ROLLUP(JOB);
DEPTNO JOB
SUM(SAL)
-------- --------- ----------
10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
10 8750
20 CLERK 1900
20 ANALYST 6000
20 MANAGER 2975
20 10875
30 CLERK 950
30 MANAGER 2850
30 SALESMAN 5600
30 9400
A subtotal per department and a sub-subtotal per department
and job are returned. As DEPTNO is a standard GROUP BY expression, there is no
grand total.
SELECT
MAX(ENAME)
KEEP
(
DENSE_RANK
FIRST
ORDER BY
SAL DESC
) ENAME,
DEPTNO,
JOB,
MAX(SAL) SAL
FROM
EMP
GROUP BY
CUBE(DEPTNO,JOB);
ENAME DEPTNO
JOB SAL
---------- ---------- --------- ----------
KING 5000
MILLER CLERK 1300
SCOTT ANALYST 3000
JONES MANAGER 2975
ALLEN SALESMAN 1600
KING PRESIDENT 5000
KING 10 5000
MILLER 10 CLERK 1300
CLARK 10 MANAGER 2450
KING 10 PRESIDENT 5000
SCOTT 20 3000
ADAMS 20 CLERK 1100
SCOTT 20 ANALYST 3000
JONES 20 MANAGER 2975
BLAKE 30 2850
JAMES 30 CLERK 950
BLAKE 30 MANAGER 2850
ALLEN 30 SALESMAN 1600
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 231 | 4 (25)| 00:00:01 |
| 1 | SORT GROUP BY | | 11 | 231 | 4 (25)| 00:00:01 |
| 2 | GENERATE CUBE | | 11 | 231 | 4 (25)| 00:00:01 |
| 3 | SORT GROUP BY | | 11 | 231 | 4 (25)| 00:00:01 |
| 4 | TABLE ACCESS FULL| EMP | 14 | 294 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------
CUBE generates nine rows for the various job/departments,
three rows for the subtotal per department, three rows for the subtotal per job
and one row for the grand total. For each job and department, the maximum salary
is returned, and for this salary, the name of the corresponding employee is
returned. In case of duplicates, the MAX function is applied to the name.
With this result set, a superaggregation is performed for
each dimension, the department, the job, and the overall. In addition to the
rows that are aggregated by ROLLUP, CUBE also produces a row for each job.
GROUPING SETS simplifies the management of the subtotals:
SELECT
DEPTNO,
JOB,
MIN(SAL)
FROM
EMP
GROUP BY
GROUPING SETS
(
(JOB),
(DEPTNO)
);
DEPTNO JOB
MIN(SAL)
---------- --------- ----------
ANALYST 3000
CLERK 800
MANAGER 2450
PRESIDENT 5000
SALESMAN
1250
10 1300
20 800
30 950
GROUPING SET is a subset of CUBE. The minimum salary in each
job and the minimum salary in each department are returned.
Compare with:
SELECT
CASE
WHEN GROUPING_ID(JOB, DEPTNO)=1
THEN 'Count per job'
WHEN GROUPING_ID(JOB, DEPTNO)=2
THEN 'Count per department'
END " ",
CASE
WHEN GROUPING(JOB)=0
THEN JOB
ELSE '========='
END JOB,
CASE
WHEN GROUPING(DEPTNO)=0
THEN TO_CHAR(DEPTNO,'99999')
ELSE '======'
END DEPTNO,
COUNT(*)
FROM
EMP
GROUP BY
CUBE(JOB, DEPTNO)
HAVING
GROUPING_ID (JOB, DEPTNO) in (0,1,2);
JOB DEPTNO
COUNT(*)
-------------------- --------- ------ ----------
CLERK 10 1
CLERK 20 2
CLERK 30 1
ANALYST 20 2
MANAGER 10 1
MANAGER 20 1
MANAGER 30 1
SALESMAN 30 4
PRESIDENT 10 1
Count per job CLERK ====== 4
Count per job ANALYST ====== 2
Count per job MANAGER ====== 3
Count per job SALESMAN ====== 4
Count per job PRESIDENT ====== 1
Count per department ========= 10 3
Count per department ========= 20 5
Count per department ========= 30 6
CUBE generates counts in every possible dimension including
overall counts. The HAVING clause evaluates the GROUPING_ID function. When no
superaggregation occurs, it gets the value of 0; when the first argument group
is a subtotal, it gets 1 (20); when the second is a subtotal, it gets
2 (21). If it is a grand total for both the first and the second
arguments, it gets 3 (20+21). Only the 0, 1 and 2 grouping
ids are returned. GROUPING returns 1 when the column is summarized and
otherwise, 0.
There is one more function related to grouping that is
called GROUP_ID which is useful when there are duplicate subtotals:
SELECT
CASE
WHEN GROUPING(ENAME)=0
THEN NULL
WHEN GROUP_ID()=0
THEN 'SUM'
WHEN GROUP_ID()=1
THEN 'AVG'
END TYPE,
ENAME,
DEPTNO,
CASE
WHEN GROUPING(ENAME)=0
THEN NULL
WHEN GROUP_ID()=0
THEN SUM(SAL)
WHEN GROUP_ID()=1
THEN AVG(SAL)
END VALUE
FROM
EMP
GROUP BY
GROUPING SETS
(
(ENAME,DEPTNO),
(DEPTNO),
(DEPTNO),
(),
()
)
DEPTNO,
ENAME,
GROUP_ID();
TYP ENAME
DEPTNO VALUE
--- ---------- ---------- ----------
CLARK
10
KING 10
MILLER 10
SUM 10 8750
AVG 10 2916.66667
ADAMS
20
FORD 20
JONES 20
SCOTT 20
SMITH 20
SUM
20 10875
AVG 20 2175
ALLEN 30
BLAKE 30
JAMES 30
MARTIN 30
TURNER 30
WARD 30
SUM 30 9400
AVG 30 1566.66667
SUM 29025
AVG 2073.21429
For each department and for the whole table, one total and
one average are evaluated.