 |
|
for Modeling
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
SQL for Modeling
A model is a multidimensional array created with existing
and new data. The MODELclause was originally
named SQL Spreadsheetand designed to provide
an SQL model directly in the database instead of the traditional approach of
loading data in a tool like Microsoft Excel and performing calculations in a
worksheet. MODEL is part of the core database engine and is usable at no extra
cost.
The model designer describes the partitions, dimensions and
measures and writes the model code. Without the model clause, the rows are
selected from a table or view, but there are no calculations after the rows are
returned. Each row is accessed using the dimension, which could be initialized
by a column or a scalar.
Most worksheet calculations require computing results from
multiple cells; similarly, SQL MODEL provides inter-rows calculation and row
generation.
Another feature missing in SQL is iteration. Processing the
rows more than once to evaluate complex calculation is traditionally done with a
programming language like PL/SQL or Java. Model enhances SQL capabilities by
providing for loops and iteration.
Partitions, dimensions and
measures
The model query has partitions, dimensions and measures.
SELECT
*
FROM
EMP
GROUP BY
DEPTNO,
JOB,
TRUNC(HIREDATE,'Y')
MODEL
PARTITION BY
(
DEPTNO
)
DIMENSION BY
(
JOB,
TRUNC(HIREDATE,'Y') YEAR
)
MEASURES
(
COUNT(*) C
)
();
The PARTITIONclause
splits the result set into different datasets. Each partition is isolated. The
DIMENSIONclause determines the indexes of
the array. The MEASURESclause defines the
calculated cells that will be returned by the MODEL
query.
Partition is optional; both dimension and measure clauses
are required. The partition can be neither changed nor inserted. Dimensions
cannot be changed, but new values can be inserted.
SELECT
*
FROM
EMP
WHERE
DEPTNO=10
MODEL
DIMENSION BY
(EMPNO)
MEASURES
(ENAME, SAL)
();
EMPNO ENAME
SAL
---------- ---------- ----------
7782 CLARK 2450
7839 KING 5000
7934 MILLER 1300
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 51 | 3 (0)| 00:00:01 |
| 1 | SQL MODEL ORDERED FAST| | 3 | 51 | | |
|* 2 | TABLE ACCESS FULL | EMP | 3 | 51 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate
Information (identified by operation id):
---------------------------------------------------
2 -
filter("DEPTNO"=10)
The employees in department 10 are used as data input for
the model calculation. The employee number is the dimension of the array and the
name and the salary are the measures.
The optimizer has multiple access operations for MODEL. One
of them is SQL MODEL ORDERED FAST. Note that the cost and time columns are not
filled.
SELECT
*
FROM
EMP
GROUP BY
DEPTNO,
JOB
MODEL
PARTITION BY
(DEPTNO)
DIMENSION BY
(JOB)
MEASURES
(SUM(SAL) SAL)
()
ORDER BY
DEPTNO,
SAL DESC;
DEPTNO
JOB SAL
---------- --------- ----------
10 PRESIDENT 5000
10 MANAGER 2450
10 CLERK 1300
20 ANALYST 6000
20 MANAGER 2975
20 CLERK 1900
30 SALESMAN 5600
30 MANAGER 2850
30 CLERK 950
The result set of the query is partitioned by department
number. In each partition, the dimension is the job and the sum of salaries is
the measure.
|