Home
E-mail Us
Oracle Articles
New Oracle Articles

Oracle Training
Oracle Tips
Oracle Forum
Class Catalog

Remote DBA
Oracle Tuning
Emergency 911
RAC Support
Apps Support
Analysis
Design
Implementation
Oracle Support

SQL Tuning
Security
Oracle UNIX
Oracle Linux
Monitoring
Remote support
Remote plans
Remote services
Application Server
Applications
Oracle Forms
Oracle Portal
App Upgrades
SQL Server
Oracle Concepts
Software Support
Remote Support
Development
Implementation

Consulting Staff
Consulting Prices
Help Wanted!

Oracle Posters
Oracle Books
Oracle Scripts
Ion
Excel-DB

Don Burleson Blog

Table Function

SQL Tips by Donald Burleson

Either way, the previous solution looks ridiculous to anybody with little programming background. Integers can be easily created on the fly, so why does one have to refer to some stored relation let alone a view over several tables? The idea of relations which can be manufactured with code as opposed to stored relations leads to the concept of Table Function.

A table function is a piece of procedural code that can produce a result which can be understood by the SQL engine -- that is, a relation! A table function can have a parameter, so that the output relation depends on it. For all practical purposes it looks like a Parameterized view, and it is even called this in the SQL Server world. The Integers view is naturally parameterized with an upper bound value.

The table function concept evolved further to embrace the idea of pipelining. From a logical perspective, pipelining does not change anything: the table function output is still a relation. The function, however, does not require materializing the whole relation at once; it supplies rows one by one along with the consumer's demand. This implementation difference, however, has a significant implication in our case. The integers relation is infinite! By no means can one hope to materialize it in all entirety, hence the upper bound parameter in a non-pipelined version mentioned in the previous paragraph.

The size of the pipelined table function output does not necessarily have to be controlled by purely logical means. A row's producer (table function) is in an intricate relationship with a row's consumer (the enclosing SQL query), and there is a way for the consumer to tell the producer to stop.

Let's write such table function producing an infinite list of integers and see what happens:

CREATE TYPE IntSet AS TABLE OF Integer;

CREATE or replace FUNCTION Integers
RETURN IntSet PIPELINED IS
BEGIN
    loop
       PIPE ROW(0);
    end loop;
END;
/

Each table function is required to define the shape of the output it produces as an object type. We declared IntSet as a list of integers. The function implementation is unbelievably simple: the flow of control enters an infinite loop, where it creates a new output row during each iteration. Since the function is capable of producing a list of 0s only, it is calling the SQL query's responsibility to have a pseudo column expression that assigns integers incrementally. Alternatively, we could have a slightly more verbose Integers table function with a counter variable incremented during each loop iteration which pipes that integer to the output.

How is the Integers table function used? Simply use:

select rownum from Table(Integers)

Be careful though, a typical client programmatic interface allows opening a cursor and fetching the result set row by row. Normally, the client would stop when it exhausts the result set. In this case, the result set is infinite, so the client has to decide by itself when to stop. Of course, the decision when to stop can be moved to the server side and made explicit such as:

select rownum from Table(Integers)
where rownum < 1000

When designing a pipelined Integers function a rather popular alternative was neglected. Many would find it natural for the Integers function to have an argument specifying an upper limit in the range of generated integers. For example, Integers(5) returns the list 1,2,3,4,5; and the last could be reformulated in terms of this new function without the predicate:

select rownum from Table(Integers(1000))

Which style is better? Suppose somebody unfamiliar with the Integers function implementation asks:

What is the maximum number in the list of integers produced by this query?

Predicate rownum <= 1000 gives the answer to the question immediately. While with the function parameter it might be 1000, 999, or even 21000 - it is impossible to tell for sure without examining the Integers function implementation.

Function Argument or Predicate?

In general, I favor the solution with predicates. Being at the high abstraction level, programs with predicate expressions are shorter and cleaner. In this particular case, my preference for clarity even goes as far as compromising program safety. A carelessly written query against the Integers function with no argument would not terminate.

The case when one would need raw or only a slightly cooked list of integers is relatively simple. If the integers relation is a part of a more complex query, it would require a more elaborate analysis to decide whether the query would terminate without an explicit server-side stop condition.

Cube

One more approach leverages the cube operator:

select rownum from (
    select 0
    from dual
    group by cube(1,1,1,1,1)
)

This solution had drawn the following eloquent comment from a reader named Mikito Harakiri:

Most posters here seem to have trouble seeing the difference between finite and infinite.

OK, exponentiate the number, it is still produces a finite number (of rows). One day your program with this "cool" cube solution would break just because you have been lazy enough to code a trivial pipelined function. Essentially, cube is as bad as selecting from obj$ (OK, col$ is probably bigger). Well, except that you take your fellow programmer's time, who has to understand your code. What is the purpose of this group by cube, in general, and why there are 5 parameters (and, say, not 6), in particular.

Table function method

select rownum from Table(Integers()) where rownum < 50

is much cleaner and robust.

Hierarchical Query

I have mentioned already DB2 integer generator leveraging recursive SQL. Oracle does not have recursive SQL capabilities at the time of this writing, so users have to use a non-standard hierarchical query extension. A contrast between Oracle and DB2 solutions is often enlightening. One fundamental difference between these two platforms is that Oracle seems to be able to detect loops, while DB2 does not make such a claim. Detecting loops in general is undecideable, which is the basis for DB2's position. Does Oracle's loop detection work because hierarchical extension has narrowed query expression capabilities compared to recursive SQL? Can this be challenged?

Consider the following typical hierarchical query:

select ename from emp
connect by empno = prior mgr
start with empno = 1000

First, Oracle finds all the nodes in the graph satisfying the start with condition. Then, for each batch of nodes found on a previous step, it finds a batch of extra nodes that satisfy the connect by condition. Any time the new node collides with the nodes that have been already discovered, it signals the connect by loop error. How does it identify the nodes? Should it compare all the relation attributes, or only those in the select clause or choose some other ones?

It is easy to see that the attributes in the select clause should not matter. Indeed, adding a rownum pseudo column would artificially make the next node always appear different from its predecessors. The loop, however, is a property of the graph. Graph either has a cycle or it does not no matter what node labels exist. Therefore, the only columns which should be relevant for loop detection are the ones in the predicate with the prior prefix.

What hierarchical query is written without the prior? This experiment reveals a remarkably succinct integer generator:

select rownum from dual
connect by 0=0

As an alternative to the rownum pseudo column, level could be used:

select level from dual
connect by 0=0

Both queries produce an infinite result set. It is pipelined, however, so that the execution can be stopped either on the client or explicitly on the server:

select level from dual
connect by level < 100

Now that we have an ample supply of integer generators, we can proceed with applications.

This is an excerpt from the new book SQL Design Patterns: The Expert Guide to SQL Programming by Vadim Tropashko

You can buy it direct from the publisher for 30%-off.

��