 |
|
SQL: Rows between unbounded preceding
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
Windows with the "rows between unbounded
preceding" clause
The partition clause is not the only method of limiting the
scope of an analytic function. When using a "rows between unbounded preceding" clause, rows are
ordered and a window is defined.
SELECT
ENAME,
HIREDATE,
SAL,
MAX
(
SAL
)
OVER
(
ORDER BY
HIREDATE,
ENAME
ROWS BETWEEN
UNBOUNDED PRECEDING
AND
1 PRECEDING
) MAX_BEFORE,
MAX
(
SAL
)
OVER
(
ORDER BY
HIREDATE,
ENAME
ROWS BETWEEN
1 FOLLOWING
AND
UNBOUNDED FOLLOWING
) MAX_AFTER
FROM
EMP
ORDER BY
HIREDATE,
ENAME;
ENAME HIREDATE SAL MAX_BEFORE
MAX_AFTER
---------- --------- ---------- ---------- ----------
SMITH 17-DEC-80 800 5000
ALLEN 20-FEB-81 1600 800 5000
WARD 22-FEB-81 1250 1600 5000
JONES 02-APR-81 2975 1600 5000
BLAKE 01-MAY-81 2850 2975 5000
CLARK 09-JUN-81 2450 2975 5000
TURNER 08-SEP-81 1500 2975 5000
MARTIN 28-SEP-81 1250 2975 5000
KING 17-NOV-81 5000 2975 3000
FORD 03-DEC-81 3000 5000 3000
JAMES 03-DEC-81 950 5000 3000
MILLER 23-JAN-82 1300 5000 3000
SCOTT 19-APR-87 3000 5000 1100
ADAMS 23-MAY-87 1100 5000
On each row, the highest salary before the current row and
the highest salary after are returned. The ORDER BY clause is not used here for
ranking but for specifying a window.
Summing with ORDER BY produces cumulative totals.
SELECT
ENAME,
SAL,
SUM(SAL) OVER (ORDER BY ENAME ROWS UNBOUNDED PRECEDING) CUMSUM
FROM
EMP;
ENAME
SAL CUMSUM
---------- ---------- ----------
ADAMS 1100 1100
ALLEN 1600 2700
BLAKE 2850 5550
CLARK 2450 8000
FORD 3000 11000
JAMES 950 11950
JONES 2975 14925
KING 5000 19925
MARTIN 1250 21175
MILLER 1300 22475
SCOTT 3000 25475
SMITH 800 26275
TURNER 1500 27775
WARD 1250 29025
The lowest possible bound
is UNBOUNDED PRECEDING (the first row), the current row is CURRENT ROW and the
highest possible row is UNBOUNDED FOLLOWING (the last row).
Rows preceding and following the current row are retrieved
with n PRECEDING and n FOLLOWING where n is the relative
position to the current row.
When not specifying BETWEEN, the window implicitly ends at
the CURRENT ROW.
SELECT
ENAME,
SAL,
AVG(SAL) OVER (ORDER BY SAL ROWS 1 PRECEDING) AVG
FROM
EMP;
ENAME
SAL AVG
---------- ---------- ----------
SMITH 800 800
JAMES 950 875
ADAMS 1100 1025
WARD 1250 1175
MARTIN 1250 1250
MILLER 1300 1275
TURNER 1500 1400
ALLEN 1600 1550
CLARK 2450 2025
BLAKE 2850 2650
JONES 2975 2912.5
SCOTT 3000 2987.5
FORD 3000 3000
KING 5000 4000
The average is calculated for rows between the previous and
the current row. The windows start at the position immediately preceding the
current row and the current row.
RANGE is similar to ROWS but the intervals are not a number
of rows. They are either numeric or date values.
SELECT
ENAME,
SAL,
SAL*.9 LOW,
SAL*1.1 HIGH,
COUNT(*)
OVER
(
ORDER BY
SAL
RANGE BETWEEN
SAL*.1 PRECEDING
AND
SAL*.1 FOLLOWING
) COUNT
FROM
EMP;
ENAME
SAL LOW HIGH COUNT
---------- ---------- ---------- ---------- ----------
SMITH 800 720 880 1
JAMES 950 855 1045 1
ADAMS 1100 990 1210 1
WARD 1250 1125 1375 3
MARTIN 1250 1125 1375 3
MILLER 1300 1170 1430 3
TURNER 1500 1350 1650 2
ALLEN 1600 1440 1760 2
CLARK 2450 2205 2695 1
BLAKE 2850 2565 3135 4
JONES 2975 2677.5 3272.5 4
SCOTT 3000 2700 3300 4
FORD 3000 2700 3300 4
KING 5000 4500 5500 1
The sort key is the salary. The count of employees with the
same salary, +/- 10%, is evaluated.
When using RANGE, the result is deterministic. If two rows
get the same value, they are both either included or excluded from the window.
SELECT
ENAME,
SAL,
SUM(SAL) OVER (ORDER BY SAL ROWS UNBOUNDED PRECEDING) SUMROWS,
SUM(SAL) OVER (ORDER BY SAL RANGE UNBOUNDED PRECEDING) SUMRANGE
FROM
EMP;
ENAME
SAL SUMROWS SUMRANGE
---------- ---------- ---------- ----------
SMITH 800 800 800
JAMES 950 1750 1750
ADAMS 1100 2850 2850
WARD 1250 4100 5350
MARTIN 1250 5350 5350
MILLER 1300 6650 6650
TURNER 1500 8150 8150
ALLEN 1600 9750 9750
CLARK 2450 12200 12200
BLAKE 2850 15050 15050
JONES 2975 18025 18025
SCOTT 3000 21025 24025
FORD 3000 24025 24025
KING 5000 29025 29025
For Scott and Ford, the salary is equal to 3000. The
analytic function that uses RANGE is deterministic and for both returns the same
value while ROWS will return a unique value for each row.
ROWS CURRENT ROW points to exactly one row; RANGE CURRENT
ROW points to all rows where the sort key is equal to the current row.
When using ORDER BY with no windowing clause, the implicit
window is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW for analytic
functions that support a windowing clause.
With dates and timestamps, the interval could be a number of
days, a day-to-seconds interval or a year-to-month interval.
SELECT
ENAME,
HIREDATE,
SAL,
AVG(SAL)
OVER
(
ORDER BY
TRUNC(HIREDATE,'MM')
RANGE BETWEEN
INTERVAL '1' MONTH PRECEDING
AND
INTERVAL '1' MONTH PRECEDING
) "PREVIOUS",
AVG(SAL)
OVER
(
ORDER BY
TRUNC(HIREDATE,'MM')
RANGE CURRENT ROW
) "CURRENT",
AVG(SAL)
OVER
(
ORDER BY
TRUNC(HIREDATE,'MM')
RANGE BETWEEN
INTERVAL '1' MONTH FOLLOWING
AND
INTERVAL '1' MONTH FOLLOWING
) "NEXT",
AVG(SAL)
OVER
(
ORDER BY
TRUNC(HIREDATE,'MM')
RANGE BETWEEN
INTERVAL '1' MONTH PRECEDING
AND
INTERVAL '1' MONTH FOLLOWING
) "3MONTHS"
FROM
EMP
ORDER BY
HIREDATE;
ENAME HIREDATE
SAL PREVIOUS CURRENT NEXT 3MONTHS
---------- --------- -------- -------- -------- -------- --------
SMITH 17-DEC-80 800 800 800
ALLEN 20-FEB-81 1600 1425 1425
WARD 22-FEB-81 1250 1425 1425
JONES 02-APR-81 2975 2975 2850 2913
BLAKE 01-MAY-81 2850 2975 2850 2450 2758
CLARK 09-JUN-81 2450 2850 2450 2650
TURNER 08-SEP-81 1500 1375 1375
MARTIN 28-SEP-81 1250 1375 1375
KING 17-NOV-81 5000 5000 1975 2983
JAMES 03-DEC-81 950 5000 1975 1300 2563
FORD 03-DEC-81 3000 5000 1975 1300 2563
MILLER 23-JAN-82 1300 1975 1300 1750
SCOTT 19-APR-87 3000 3000 1100 2050
ADAMS 23-MAY-87 1100 3000 1100 2050
The sort key is the month of hire date. The previous column evaluates the
average salary for the employees hired in the month before the current employee
was hired, the current column includes the average of employees hired in the
same month of the current row and the next column relates to the employees hired
in the month after the hire date of the current employee.