 |
|
Advanced Oracle SQL: Full Outer, Cross and Partitioned Join
Oracle Tips by Laurent Schneider
|
Laurent
Schneider is considered one of the top Oracle SQL experts, and
he is the author of the book "Advanced SQL Programming" by Rampant TechPress. The following
is an excerpt from the book.
1. Full outer
join
A full outer join retrieves rows from both tables, whether
or not they have a matching row:
WITH
EMP
AS
(
SELECT
'JOEL' ENAME,
40 DEPTNO
FROM
DUAL
UNION ALL
SELECT
'MARY' ENAME,
50 DEPTNO
FROM
DUAL
)
SELECT
e.ENAME,
d.DNAME
FROM
EMP e
FULL JOIN
DEPT d
USING
(
DEPTNO
);
ENAM DNAME
---- --------------
JOEL OPERATIONS
MARY
SALES
RESEARCH
ACCOUNTING
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 120 | 8 (13)| 00:00:01
|
| 1 | VIEW | VW_FOJ_0 | 8 | 120 | 8 (13)| 00:00:01
|
|* 2 | HASH JOIN FULL OUTER| | 8 | 176 | 8 (13)| 00:00:01
|
| 3 | VIEW | | 2 | 18 | 4 (0)| 00:00:01
|
| 4 | UNION-ALL | | | | |
|
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01
|
| 6 | FAST DUAL | | 1 | | 2 (0)| 00:00:01
|
| 7 | TABLE ACCESS FULL | DEPT | 4 | 52 | 3 (0)| 00:00:01
|
----------------------------------------------------------------------------------
Predicate Information (identified by
operation id):
---------------------------------------------------
2 - access("E"."DEPTNO"="D"."DEPTNO")
The full outer join selects employees with a department,
employees without a department and departments without employees. In releases
prior to 11g, it does a UNION ALL of a left JOIN and a RIGHT JOIN to include all
non-matched rows.
In 11g, the full outer join is much faster than before as
the optimizer uses a new operation called HASH JOIN FULL OUTER which scans each
table only once instead of doing a union of two joins.
Compare with 10gR2:
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |
Cost
(%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 75
| 13
(24)| 00:00:01 |
| 1 | TEMP TABLE TRANSFORMATION | | |
|
| |
| 2 | LOAD AS SELECT | | |
|
| |
| 3 | UNION-ALL | | |
|
| |
| 4 | FAST DUAL | | 1 |
| 2
(0)| 00:00:01 |
| 5 | FAST DUAL | | 1 |
| 2
(0)| 00:00:01 |
| 6 | VIEW | | 5 | 75
| 9
(12)| 00:00:01 |
| 7 | UNION-ALL | | |
|
| |
|* 8 | HASH JOIN OUTER | | 2 | 44
| 5
(20)| 00:00:01 |
| 9 | VIEW | | 2 | 18
| 2
(0)| 00:00:01 |
| 10 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6601_2C2CE3 | 2 | 38
| 2
(0)| 00:00:01 |
| 11 | TABLE ACCESS FULL | DEPT | 4 | 52
| 2
(0)| 00:00:01 |
|* 12 | HASH JOIN ANTI | | 3 | 48
| 5
(20)| 00:00:01 |
| 13 | TABLE ACCESS FULL | DEPT | 4 | 52
| 2
(0)| 00:00:01 |
| 14 | VIEW | | 2 | 6
| 2
(0)| 00:00:01 |
| 15 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6601_2C2CE3 | 2 | 38
| 2
(0)| 00:00:01 |
8 -
access("E"."DEPTNO"="D"."DEPTNO"(+))
12 - access("E"."DEPTNO"="D"."DEPTNO")
2.
Cross join
A cross join computes all rows from the one table to the
rows of the other table:
SELECT
d1.DNAME,
d2.DNAME
FROM
DEPT d1,
DEPT d2;
DNAME DNAME
-------------- --------------
ACCOUNTING ACCOUNTING
ACCOUNTING OPERATIONS
ACCOUNTING RESEARCH
ACCOUNTING SALES
OPERATIONS ACCOUNTING
OPERATIONS OPERATIONS
OPERATIONS RESEARCH
OPERATIONS SALES
RESEARCH ACCOUNTING
RESEARCH OPERATIONS
RESEARCH RESEARCH
RESEARCH SALES
SALES ACCOUNTING
SALES OPERATIONS
SALES RESEARCH
SALES SALES
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16 | 320 | 5 (0)| 00:00:01 |
| 1 | MERGE JOIN CARTESIAN| | 16 | 320 | 5 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | DEPT | 4 | 40 | 2 (0)| 00:00:01 |
| 3 | BUFFER SORT | | 4 | 40 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | DEPT | 4 | 40 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------
The syntax CROSS JOIN is useful when combined with other
joins:
WITH
JOBS
AS
(
SELECT
DISTINCT
JOB
FROM
EMP
)
SELECT
JOBS.JOB,
DEPT.DEPTNO,
COUNT(EMP.EMPNO)
FROM
EMP
RIGHT JOIN
(
JOBS
CROSS JOIN
DEPT
)
ON
(
JOBS.JOB=EMP.JOB
AND
DEPT.DEPTNO=EMP.DEPTNO
)
GROUP BY
JOBS.JOB,
DEPT.DEPTNO;
JOB DEPTNO COUNT(EMP.EMPNO)
--------- ---------- ----------------
ANALYST 10 0
ANALYST 20 2
ANALYST 30 0
ANALYST 40 0
CLERK 10 1
CLERK 20 2
CLERK 30 1
CLERK 40 0
MANAGER 10 1
MANAGER 20 1
MANAGER 30 1
MANAGER 40 0
PRESIDENT 10 1
PRESIDENT 20 0
PRESIDENT 30 0
PRESIDENT 40 0
SALESMAN 10 0
SALESMAN 20 0
SALESMAN 30 4
SALESMAN 40 0
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 15 | 555 | 10 (30)|
00:00:01 |
| 1 | HASH GROUP BY | | 15 | 555 | 10 (30)|
00:00:01 |
|* 2 | HASH JOIN OUTER | | 20 | 740 | 9 (23)|
00:00:01 |
| 3 | VIEW | | 20 | 440 | 6 (17)|
00:00:01 |
| 4 | MERGE JOIN CARTESIAN | | 20 | 180 | 6 (17)|
00:00:01 |
| 5 | VIEW | | 5 | 30 | 3 (34)|
00:00:01 |
| 6 | HASH UNIQUE | | 5 | 40 | 3 (34)|
00:00:01 |
| 7 | TABLE ACCESS FULL | EMP | 14 | 112 | 2 (0)|
00:00:01 |
| 8 | BUFFER SORT | | 4 | 12 | 6 (17)|
00:00:01 |
| 9 | INDEX FAST FULL SCAN| PK_DEPT | 4 | 12 | 1 (0)|
00:00:01 |
| 10 | TABLE ACCESS FULL | EMP | 14 | 210 | 2 (0)|
00:00:01 |
-------------------------------------------------------------------------------------
2 - access("DEPT"."DEPTNO"="EMP"."DEPTNO"(+)
AND
"JOBS"."JOB"="EMP"."JOB"(+))
The cross product of jobs and departments is joined to the
table of employees.
3. Partitioned
outer join
The partitioned outer join (10g) selects the
partition key of the outer table even where there are no matching rows:
SELECT
d.DEPTNO,
e.JOB,
COUNT(e.EMPNO)
FROM
EMP e
PARTITION BY
(e.JOB)
RIGHT JOIN
DEPT d
ON
(e.DEPTNO=d.DEPTNO)
GROUP BY
d.DEPTNO,
e.JOB
ORDER BY
d.DEPTNO,
e.JOB;
DEPTNO JOB COUNT(E.EMPNO)
---------- --------- --------------
10 ANALYST 0
10 CLERK 1
10 MANAGER 1
10 PRESIDENT 1
10 SALESMAN 0
20 ANALYST 2
20 CLERK 2
20 MANAGER 1
20 PRESIDENT 0
20 SALESMAN 0
30 ANALYST 0
30 CLERK 1
30 MANAGER 1
30 PRESIDENT 0
30 SALESMAN 4
40 ANALYST 0
40 CLERK 0
40 MANAGER 0
40 PRESIDENT 0
40 SALESMAN 0
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 682 | 11 (28)|
00:00:01 |
| 1 | SORT GROUP BY | | 11 | 682 | 11 (28)|
00:00:01 |
| 2 | VIEW | | 20 | 1240 | 10 (20)|
00:00:01 |
| 3 | MERGE JOIN PARTITION OUTER| | 20 | 360 | 10 (20)|
00:00:01 |
| 4 | SORT JOIN | | 4 | 12 | 2 (50)|
00:00:01 |
| 5 | INDEX FULL SCAN | PK_DEPT | 4 | 12 | 1 (0)|
00:00:01 |
|* 6 | SORT PARTITION JOIN | | 14 | 210 | 3 (34)|
00:00:01 |
| 7 | TABLE ACCESS FULL | EMP | 14 | 210 | 2 (0)|
00:00:01 |
----------------------------------------------------------------------------------------
6 -
access("E"."DEPTNO"="D"."DEPTNO")
filter("E"."DEPTNO"="D"."DEPTNO")
For each job of EMP and for each department of DEPT, there will be a matching
row. This looks like a cross join query, but it is much more efficient here
because the employee table is selected only once. The job is a column of EMP and
is selected even when there is no matching row in EMP. Only left and right
partitioned joins are supported to date.
Note: ANSI syntax is recommended by
Oracle. It has more capabilities, like outer joining with more than
one table and with OR predicates, but it also has several
limitations, one of which is: ?cannot be used in fast refreshable
materialized views?.