Call now: 252-767-6166  
Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles
New Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB  

Don Burleson Blog 


 

 

 


 

 

 
 

Creating Transformation Pipelines


Oracle Tips by Burleson Consulting

 

The following Tip is from the outstanding book "Oracle PL/SQL Tuning: Expert Secrets for High Performance Programming" by Dr. Tim Hall, Oracle ACE of the year, 2006:

Pipelined table functions are often discussed as a replacement for the traditional Extraction Transformation Load (ETL) processes.  In the past, these processes have relied on loading data from flat files into staging tables, where it is processed before being loaded into the main schema tables.  This section will create a simple example of how this whole process can be replaced with a transformation pipeline using pipelined table functions.

First a flat file containing data must be created that can be subsequently loaded.  The create_tp_testfile.sql spools out some data into a flat file.

create_tp_testfile.sql

SET PAGESIZE 0
SET FEEDBACK OFF
SET LINESIZE 1000
SET TRIMSPOOL ON
SPOOL /tmp/tp_test.txt
SELECT owner || ',' || object_name || ',' || object_type || ',' || status
FROM   all_objects;
SPOOL OFF
SET FEEDBACK ON
SET PAGESIZE 24

The resulting file contains extra lines at the start and the end.

SQL> SELECT owner || ',' || object_name || ',' || object_type || ',' || status
  2  FROM   all_objects;
..
..
..
SQL> SPOOL OFF

These extra lines will not prevent the data load, but they will be reported as bad records so they should be manually removed before proceeding.

Next the create_tp_test_schema.sql script is used to create several schema objects to support the example.

create_tp_test_schema.sql

-- Create a directory object pointing to the flat file.

CONN sys/password AS SYSDBA
CREATE OR REPLACE DIRECTORY data_load_dir AS '/tmp/';
GRANT READ, WRITE ON DIRECTORY data_load_dir TO test;

CONN test/test

-- Create an external table.

DROP TABLE tp_test_ext;
CREATE TABLE tp_test_ext (
  owner                    VARCHAR2(30),
  object_name              VARCHAR2(30),
  object_type              VARCHAR2(19),
  status                   VARCHAR2(7)
)
ORGANIZATION EXTERNAL
(
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_load_dir
  ACCESS PARAMETERS
  (
    RECORDS DELIMITED BY NEWLINE
    BADFILE data_load_dir:'tp_test_%a_%p.bad'
    LOGFILE data_load_dir:'tp_test_%a_%p.log'
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
    (
      owner                    CHAR(30),
      object_name              CHAR(30),
      object_type              CHAR(19),
      status                   CHAR(7)
    )
  )
  LOCATION ('tp_test.txt')
)
PARALLEL 10
REJECT LIMIT UNLIMITED
/

-- Create a table as the final destination for the data.

CREATE TABLE tp_test (
  owner                    VARCHAR2(30),
  object_name              VARCHAR2(30),
  object_type              VARCHAR2(19),
  status                   VARCHAR2(7),
  extra_1                  NUMBER,
  extra_2                  NUMBER
);

The script creates a directory object and an external table pointing to the flat file created previously, along with a table that represents the final destination of the transformed data.

The transformation process defined by the tp_api.sql script is split into two distinct stages (step_1 and step_2).  In addition to the two pipelined table functions that represent each stage, a procedure called load_data initiates the load process.

tp_api.sql

CREATE OR REPLACE PACKAGE tp_api AS

  TYPE t_step_1_in_rc IS REF CURSOR RETURN tp_test_ext%ROWTYPE; 

  TYPE t_step_1_out_row IS RECORD (
    owner                    VARCHAR2(30),
    object_name              VARCHAR2(30),
    object_type              VARCHAR2(19),
    status                   VARCHAR2(7),
    extra_1                  NUMBER
  ); 

  TYPE t_step_1_out_tab IS TABLE OF t_step_1_out_row;

  TYPE t_step_2_in_rc IS REF CURSOR RETURN t_step_1_out_row;

  TYPE t_step_2_out_tab IS TABLE OF tp_test%ROWTYPE;

  FUNCTION step_1 (p_cursor  IN  t_step_1_in_rc)
    RETURN t_step_1_out_tab PIPELINED
    PARALLEL_ENABLE(PARTITION p_cursor BY RANGE (owner));

  FUNCTION step_2 (p_cursor  IN  t_step_2_in_rc)
    RETURN t_step_2_out_tab PIPELINED
    PARALLEL_ENABLE(PARTITION p_cursor BY RANGE (owner));

  PROCEDURE load_data;

END tp_api;
/
SHOW ERRORS

CREATE OR REPLACE PACKAGE BODY tp_api AS

  FUNCTION step_1 (p_cursor  IN  t_step_1_in_rc)
    RETURN t_step_1_out_tab PIPELINED
    PARALLEL_ENABLE(PARTITION p_cursor BY RANGE (owner))
  IS
    l_row  t_step_1_out_row;
  BEGIN
    LOOP
      FETCH p_cursor
      INTO  l_row.owner,
            l_row.object_name,
            l_row.object_type,
            l_row.status;
      EXIT WHEN p_cursor%NOTFOUND;     

      -- Do some work here.

      l_row.extra_1 := p_cursor%ROWCOUNT;
      PIPE ROW (l_row);
    END LOOP;
    RETURN;
  END step_1;

  FUNCTION step_2 (p_cursor  IN  t_step_2_in_rc)
    RETURN t_step_2_out_tab PIPELINED
    PARALLEL_ENABLE(PARTITION p_cursor BY RANGE (owner))
  IS
    l_row  tp_test%ROWTYPE;
  BEGIN
    LOOP
      FETCH p_cursor
      INTO  l_row.owner,
            l_row.object_name,
            l_row.object_type,
            l_row.status,
            l_row.extra_1;
      EXIT WHEN p_cursor%NOTFOUND;     

      -- Do some work here.

      l_row.extra_2 := p_cursor%ROWCOUNT;
      PIPE ROW (l_row);
    END LOOP;
    RETURN;
  END step_2;

  PROCEDURE load_data IS
  BEGIN
    EXECUTE IMMEDIATE 'TRUNCATE TABLE tp_test';   

    INSERT /*+ APPEND */ INTO tp_test
    SELECT *
    FROM   TABLE(step_2(CURSOR(SELECT *
                               FROM   TABLE(step_1(CURSOR(SELECT /*+ parallel (t1, 5) */ *
                                                         FROM   tp_test_ext t1
                                                          )
                                                   )
                                            ) t2
                               )
                        )
                 ) t3;
    COMMIT;
  END load_data;

END tp_api;
/
SHOW ERRORS

The package specification contains PL/SQL types representing all the input and output parameters of the pipelined table functions.  The stage_1 and stage_2 functions do very little work, but they act as building blocks showing how the transformation pipeline is formed; the real reason for the example. 

The load_data procedure links all the pieces together to form the transformation pipeline.  This procedure is driven by an INSERT ... SELECT statement that queries the pipelined table functions to transform the data from the external data into a format that can be inserted into the destination table.  It is worth breaking this query down to see how it fits together.

First the external table is queried to get the raw data.

SELECT /*+ parallel (t1, 5) */ *
FROM   tp_test_ext t1

Next the query is converted to a cursor variable and passed as a parameter into the step_1 function.

step_1(CURSOR(SELECT /*+ parallel (t1, 5) */ *
              FROM   tp_test_ext t1
              )
       )

Next the output is queried from the step_1 function using the TABLE function.

SELECT *
FROM   TABLE(step_1(CURSOR(SELECT /*+ parallel (t1, 5) */ *
                          FROM   tp_test_ext t1
                          )
                   )
            ) t2
)

Next this query is converted into a cursor variable and passed as a parameter to the step_2 function.

step_2(CURSOR(SELECT *
                               FROM   TABLE(step_1(CURSOR(SELECT /*+ parallel (t1, 5) */ *
                                                          FROM   tp_test_ext t1
                                                          )
                                                   )
                                            ) t2
                               )
                        )

Finally the output is queried from the step_2 function using the TABLE function to create the final query.

SELECT *
FROM   TABLE(step_2(CURSOR(SELECT *
                           FROM   TABLE(step_1(CURSOR(SELECT /*+ parallel (t1, 5) */ *
                                                      FROM   tp_test_ext t1
                                                      )
                                               )
                                        ) t2
                           )
                    )
             ) t3;

This completes the "External table -> step_1 -> step_2 -> Destination table" transformation in a single statement.  Although this query looks complicated, it is generally more efficient to perform the action this way than to break it down into individual steps like:

OPEN c1 FOR
  SELECT *
  FROM   tp_test_ext;

OPEN c2 FOR
  SELECT *
  FROM   TABLE(step_1(c1));

INSERT /*+ APPEND */ INTO tp_test
SELECT *
FROM   TABLE(step_2(c2));

CLOSE c2;

CLOSE c1;

With the package in place, the load_data procedure can then be executed to prove the process works.

SQL> EXEC tp_api.load_data;

PL/SQL procedure successfully completed.

SQL> SELECT COUNT(*) FROM tp_test;

  COUNT(*)
----------
     42774

1 row selected.

 

The next section looks at the influence the DETERMINISTIC clause has on pipelined table functions.

 

This is an excerpt from the bestselling book "Oracle PL/SQL Tuning: Expert Secrets for High Performance Programming" by Dr. Tim Hall, Oracle ACE of the year, 2006.

You can buy the book for only $23.95 (30%-off) when you buy directly from the publisher, and you also get instant access to the code depot of PL/SQL tuning scripts:


 

 
��  
 
 
Oracle Training at Sea
 
 
 
 
oracle dba poster
 

 
Follow us on Twitter 
 
Oracle performance tuning software 
 
Oracle Linux poster
 
 
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.

Remote Emergency Support provided by Conversational