Table Functions and Pipelining
Oracle Tips by Burleson Consulting
The following Tip is from the
outstanding book "Oracle
PL/SQL Tuning: Expert Secrets for High Performance Programming" by
Dr. Tim Hall, Oracle ACE of the year, 2006:
Functions that return collections of rows are
known as table functions. They can be used like database tables in the
FROM clause of a query or in the SELECT list of a query as a column
name.
A regular table function creates an entire
collection before returning it to the requesting query. But the
performance of table functions can be improved by the implementation
of pipelining and parallelization, giving the following benefits:
- Pipelining allows rows to be passed out of table functions as they
  are produced, rather than waiting for whole collections to be
  produced before the results are returned. The outcome is a
  reduction in the time taken for the first rows to be produced and a
  reduction in the total amount of memory consumed by the table
  function.
- Parallel enabling table functions allows their workload to be split
  between multiple slave processes, which may result in faster
  execution.
Like regular functions, table functions can
accept input parameters, including collection types and REF CURSORs.
Accepting these parameters allows table functions to be chained
together, with the output of one feeding the next, to form complex
transformation pipelines, or streams. These transformation pipelines
can be used as a replacement for traditional Extract, Transform, Load
(ETL) processes, removing the need for intermediate staging areas.
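As a rough illustration of chaining, the sketch below feeds the output of one pipelined function into another through a CURSOR expression. The names t_num_row, t_num_tab and double_rows are hypothetical and exist only for this example; they are not part of the scripts in this tip.

```sql
-- Hypothetical types used only for this illustration.
CREATE OR REPLACE TYPE t_num_row AS OBJECT (n NUMBER);
/
CREATE OR REPLACE TYPE t_num_tab AS TABLE OF t_num_row;
/

-- One pipeline stage: reads numbers from a REF CURSOR and pipes out
-- each value doubled.
CREATE OR REPLACE FUNCTION double_rows (p_cursor IN SYS_REFCURSOR)
  RETURN t_num_tab PIPELINED
AS
  l_n NUMBER;
BEGIN
  LOOP
    FETCH p_cursor INTO l_n;
    EXIT WHEN p_cursor%NOTFOUND;
    PIPE ROW (t_num_row(l_n * 2));
  END LOOP;
  CLOSE p_cursor;
  RETURN;
END double_rows;
/
SHOW ERRORS

-- Chain two stages: the inner call's rows become the outer call's input,
-- with no staging table in between.
SELECT *
FROM   TABLE(double_rows(CURSOR(
         SELECT n
         FROM   TABLE(double_rows(CURSOR(
                  SELECT LEVEL FROM dual CONNECT BY LEVEL <= 3))))));
```

Each stage only sees a cursor, so stages written this way can be reordered or reused without knowing where their input rows came from.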
The next section shows how table functions are
created and demonstrates the performance improvements associated with
pipelined table functions.
Pipelining Table Functions
The earlier section stated that pipelining
table functions reduces both the time taken to return the first rows
of the collection and the overall memory usage. The obvious next step
is to define some table functions and test whether these statements
hold.
The create_square_root_schema_objects.sql
script defines two database types that represent the row and table
types used by the table functions. Regular table functions require
these types to be created as database objects, while pipelined table
functions can also use PL/SQL types defined in a package
specification, provided that Oracle 9.2 or later is being used. To
make the comparison as close as possible, both table functions will
use the same database object types.
create_square_root_schema_objects.sql

    CREATE OR REPLACE TYPE t_square_root_row AS OBJECT (
      start_number  NUMBER,
      square_root   NUMBER,
      description   VARCHAR2(50)
    );
    /

    CREATE OR REPLACE TYPE t_square_root_tab AS TABLE OF t_square_root_row;
    /
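The package-specification alternative mentioned above can be sketched as follows. This is a minimal example assuming Oracle 9.2 or later; the package name ptf_types_demo and its members are hypothetical and are not used elsewhere in this tip.

```sql
-- Hypothetical package: the row and table types are PL/SQL types
-- declared in the specification rather than schema-level objects.
CREATE OR REPLACE PACKAGE ptf_types_demo AS
  TYPE t_row IS RECORD (
    start_number  NUMBER,
    square_root   NUMBER
  );
  TYPE t_tab IS TABLE OF t_row;

  FUNCTION get_roots (p_max IN NUMBER) RETURN t_tab PIPELINED;
END ptf_types_demo;
/
SHOW ERRORS

CREATE OR REPLACE PACKAGE BODY ptf_types_demo AS
  FUNCTION get_roots (p_max IN NUMBER) RETURN t_tab PIPELINED AS
    l_row t_row;
  BEGIN
    FOR i IN 1 .. p_max LOOP
      l_row.start_number := i;
      l_row.square_root  := ROUND(SQRT(i), 2);
      PIPE ROW (l_row);
    END LOOP;
    RETURN;
  END get_roots;
END ptf_types_demo;
/
SHOW ERRORS

-- Query the pipelined function through the TABLE operator.
SELECT * FROM TABLE(ptf_types_demo.get_roots(5));
```

This avoids creating separate schema-level types, at the cost of tying the types to the package that declares them.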
With the database types in place, the next
step is to define some table functions. The
create_square_root_functions.sql script defines a package with two
table functions that return the square roots of a specified range of
numbers. One of the table functions is pipelined, the other is not.
create_square_root_functions.sql

    CREATE OR REPLACE PACKAGE tf_api AS

      FUNCTION get_square_roots_tf (p_start_range  IN  NUMBER,
                                    p_end_range    IN  NUMBER,
                                    p_pause        IN  VARCHAR2 DEFAULT 'TRUE')
        RETURN t_square_root_tab;

      FUNCTION get_square_roots_ptf (p_start_range  IN  NUMBER,
                                     p_end_range    IN  NUMBER,
                                     p_pause        IN  VARCHAR2 DEFAULT 'TRUE')
        RETURN t_square_root_tab PIPELINED;

    END tf_api;
    /
    SHOW ERRORS
    CREATE OR REPLACE PACKAGE BODY tf_api AS

      FUNCTION get_square_roots_tf (p_start_range  IN  NUMBER,
                                    p_end_range    IN  NUMBER,
                                    p_pause        IN  VARCHAR2 DEFAULT 'TRUE')
        RETURN t_square_root_tab
      AS
        l_row  t_square_root_row := t_square_root_row(NULL, NULL, NULL);
        l_tab  t_square_root_tab := t_square_root_tab();
      BEGIN
        FOR i IN p_start_range .. p_end_range LOOP
          -- Perform a conditional delay.
          IF p_pause = 'TRUE' AND MOD(i, 10) = 0 THEN
            DBMS_LOCK.sleep(1);
          END IF;

          -- Build up a new row.
          l_row.start_number := i;
          l_row.square_root  := ROUND(SQRT(i), 2);
          l_row.description  := 'The square root of ' || i || ' is ' ||
                                l_row.square_root;

          -- Extend the collection and add the row.
          l_tab.extend;
          l_tab(l_tab.last) := l_row;
        END LOOP;

        -- Return the collection.
        RETURN l_tab;
      END get_square_roots_tf;

      FUNCTION get_square_roots_ptf (p_start_range  IN  NUMBER,
                                     p_end_range    IN  NUMBER,
                                     p_pause        IN  VARCHAR2 DEFAULT 'TRUE')
        RETURN t_square_root_tab PIPELINED
      AS
        l_row  t_square_root_row := t_square_root_row(NULL, NULL, NULL);
      BEGIN
        FOR i IN p_start_range .. p_end_range LOOP
          -- Perform a conditional delay.
          IF p_pause = 'TRUE' AND MOD(i, 10) = 0 THEN
            DBMS_LOCK.sleep(1);
          END IF;

          -- Build up a new row.
          l_row.start_number := i;
          l_row.square_root  := ROUND(SQRT(i), 2);
          l_row.description  := 'The square root of ' || i || ' is ' ||
                                l_row.square_root;

          -- Pipe the row out.
          PIPE ROW (l_row);
        END LOOP;

        -- Perform return.
        RETURN;
      END get_square_roots_ptf;

    END tf_api;
    /
    SHOW ERRORS
The get_square_roots_tf function is a regular
table function because it creates the entire collection before
returning it. In contrast, the get_square_roots_ptf function
pushes out each row as it is created using the PIPE ROW command and
ends with an empty return statement. Notice that both functions
contain an optional pause every 10 rows to make the query artificially
slow.
Once the table functions have been created,
the first test can then be run using the
query_square_root_functions.sql script shown below. This script
uses the TABLE function to make the output from the table functions
resemble a real table.
query_square_root_functions.sql

    -- Query the regular table function.
    SELECT *
    FROM   TABLE(tf_api.get_square_roots_tf(1, 100)) a;

    -- Query the pipelined table function.
    SELECT *
    FROM   TABLE(tf_api.get_square_roots_ptf(1, 100)) a;
Both queries
in the script return output similar to that displayed below. But how
the output is returned is the focal point, not the output itself.
    START_NUMBER SQUARE_ROOT DESCRIPTION
    ------------ ----------- ------------------------------
               1           1 The square root of 1 is 1
               2        1.41 The square root of 2 is 1.41
               3        1.73 The square root of 3 is 1.73
    .
    .
              98         9.9 The square root of 98 is 9.9
              99        9.95 The square root of 99 is 9.95
             100          10 The square root of 100 is 10

    100 rows selected.
The reason for performing this test is that it
highlights the difference in how the results are returned from the
functions. The regular table function builds the whole
collection before returning it, so a pause is seen followed by all the
results being returned in a single block. In contrast, the
pipelined table function returns rows as they are created, so results
are returned in chunks by SQL*Plus.
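The difference in time to first row can also be measured rather than observed by eye. The anonymous block below is a minimal sketch that assumes the tf_api package above has been created; the local procedure time_first_row is a hypothetical helper written for this illustration. It opens a cursor on each function, fetches a single row, and reports the elapsed time using DBMS_UTILITY.get_time, which counts in hundredths of a second.

```sql
SET SERVEROUTPUT ON
DECLARE
  l_cursor  SYS_REFCURSOR;
  l_number  NUMBER;
  l_start   NUMBER;

  -- Hypothetical helper: time how long the first fetch takes.
  PROCEDURE time_first_row (p_label IN VARCHAR2, p_query IN VARCHAR2) AS
  BEGIN
    l_start := DBMS_UTILITY.get_time;
    OPEN l_cursor FOR p_query;
    FETCH l_cursor INTO l_number;
    DBMS_OUTPUT.put_line(p_label || ' first row after ' ||
                         ((DBMS_UTILITY.get_time - l_start) / 100) ||
                         ' seconds');
    CLOSE l_cursor;
  END time_first_row;
BEGIN
  time_first_row('Regular   :',
    'SELECT start_number FROM TABLE(tf_api.get_square_roots_tf(1, 100))');
  time_first_row('Pipelined :',
    'SELECT start_number FROM TABLE(tf_api.get_square_roots_ptf(1, 100))');
END;
/
```

With the default pause enabled, the regular function has to complete all of its delays before the first fetch can return, so its figure would be expected to be noticeably larger than the pipelined one. The exact numbers depend on the system.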
In addition to the difference in the speed of
returning the first rows, the difference in memory consumption should
also be demonstrated. Imagine a situation in which the table function
is used to return 100,000 rows. The regular table function would
build up the whole collection in memory before returning the data,
while the pipelined table function only needs to hold the row
currently being piped, not the whole collection. The expected result
is that the memory usage profiles of the two methods would be vastly
different. The test_table_function_memory_usage.sql script
provides a method for testing this difference.
test_table_function_memory_usage.sql

    -- Create a function to retrieve current PGA usage.
    CREATE OR REPLACE FUNCTION get_used_memory RETURN NUMBER AS
      l_used_memory  NUMBER;
    BEGIN
      SELECT ms.value
      INTO   l_used_memory
      FROM   v$mystat ms,
             v$statname sn
      WHERE  ms.statistic# = sn.statistic#
      AND    sn.name = 'session pga memory';

      RETURN l_used_memory;
    END get_used_memory;
    /
    SHOW ERRORS

    conn test/test

    -- Test regular table function.
    SET SERVEROUTPUT ON
    DECLARE
      l_start  NUMBER;
    BEGIN
      l_start := get_used_memory;

      FOR cur_rec IN (SELECT *
                      FROM   TABLE(tf_api.get_square_roots_tf(1, 100000, 'FALSE')))
      LOOP
        NULL;
      END LOOP;

      DBMS_OUTPUT.put_line('Regular table function : ' ||
                           (get_used_memory - l_start));
    END;
    /

    conn test/test

    -- Test pipelined table function.
    SET SERVEROUTPUT ON
    DECLARE
      l_start  NUMBER;
    BEGIN
      l_start := get_used_memory;

      FOR cur_rec IN (SELECT *
                      FROM   TABLE(tf_api.get_square_roots_ptf(1, 100000, 'FALSE')))
      LOOP
        NULL;
      END LOOP;

      DBMS_OUTPUT.put_line('Pipelined table function : ' ||
                           (get_used_memory - l_start));
    END;
    /
This script defines a function that returns
the amount of PGA memory currently assigned to the session, which is
used before and after calls to the table functions defined previously,
allowing the memory consumption associated with the table function
calls to be quantified.
Each test is separated by a new connection to
make sure a clean session is being used. Notice that the
artificial pause is not needed for this test. The output from
this script is listed below and clearly demonstrates the difference in
memory consumption by the two methods.
    SQL> @test_table_function_memory_usage.sql
    Connected.

    Function created.

    No errors.
    Connected.
    Regular table function : 34734080

    PL/SQL procedure successfully completed.

    Connected.
    Pipelined table function : 65536

    PL/SQL procedure successfully completed.

    SQL>
In this example the regular table function
consumes about 530 times the PGA memory of the pipelined table
function (34,734,080 bytes compared with 65,536 bytes).
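The statistic read by get_used_memory can also be inspected directly. The query below is a small sketch, assuming the session has been granted access to the v$mystat and v$statname views. It lists the current PGA figure alongside 'session pga memory max', which records the session's peak and so is not affected by memory being released before the final reading.

```sql
-- Show the current and peak PGA memory for the current session.
SELECT sn.name,
       ms.value
FROM   v$mystat ms,
       v$statname sn
WHERE  ms.statistic# = sn.statistic#
AND    sn.name IN ('session pga memory', 'session pga memory max');
```

Running this in a fresh session immediately after each test gives a second view of the same comparison.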
These two tests clearly demonstrate the
performance improvements associated with pipelined table functions
over conventional table functions.
The next section shows the effect of
parallelizing table functions on their performance.