Search BC Oracle Sites

# Enumerating Sets of Integers

SQL Tips by Donald Burleson

When enumerating integer sets every integer is mapped into a set of integers. Representing an integer in a binary numbering system provides a natural way to implement such mapping. Formally, in each ai  is either 0 or 1. Then, the sequence (a0, a1, a2, a3, ?) is a characteristic vector of a set: i is an element of the set whenever ai = 1. For example, 10 = 21 + 23 maps into {1,3}. Here is an integer set enumeration for up to N=10:

Unfortunately, this output cannot be produced in SQL, at least, not yet. Later, a study of composite data types and how multiple values are aggregated into strings and collections will be presented. For now, how to build a relation, ?i is an element of set N? is shown:

The implementation is easy. Take the IntegerPairs relation and recruit the BITAND and POWER SQL functions to filter the unwanted pairs (N, i) pairs:

select y N, x i from IntegerPairs

where bitand(power(2,x),y)>0

Although the idea of pipelining was somewhat discredited in the previous section, it may be recovered for this purpose. We take an infinite stream of integer pairs, apply a filter with the bitand(power(2,x),y)>0 predicate and cut a finite segment of the result on the client. With a non-pipelined IntegerPairs implementation (via Cartesian product of Integers relations) the execution would hang in an infinite loop.

Discrete Interval Sampling

Perhaps the second most common problem that utilizes the Integers relation is unpacking the intervals, or leveraging the Signal Processing terminology Discrete Interval Sampling.

Reduced to essentials the problem is the following. Given two dates, ?1-jan-2005? and ?31-dec-2005?, for example, return the list of the days in between. This is trivial, of course, if the Dates relation has already been filled in with the dates:

select day from Dates
where day between ?1-jan-2005? and ?31-dec-2005?

If the Dates relation is not present, why not simply manufacture it? All that is required is the support of the datetime/interval arithmetic by the RDBMS of your choice. Perhaps, the easiest approach is adding numeric values to the date represented in the date data type. For example, adding 1 to today's date produces tomorrow's date. Likewise, to_date(?1-jan-2005?)+1 is to_date(?2-jan-2005?), and so on. Therefore,

select to_date(?1-jan-2005?)+num
from Integers
where to_date(?1-jan-2005?)+num <= to_date(?31-dec-2005?)

will generate a list of days spanning the whole year. This toy query is frequently encountered as a part of more challenging puzzles.

Consider the following hotel reservation application. The Bookings table stores reservations made against a set of rooms:

Users typically query what rooms are empty during a certain period. For example, querying all available rooms in the time period from 2005-05-07 to 2005-05-12 should return:

We start with two relations: the first containing the list of all the rooms, and the second containing all the dates between 2005-05-07 and 2005-05-12. It is very tempting to define the first relation as

select distinct RoomNo from Bookings

However, it is possible that a particular room is always empty, and therefore, has no records in the Bookings table. The whole Bookings table can be empty!  Let's assume that we have the list of all rooms as one more relation -- the Rooms table. We are just one step from the solution. If a Cartesian product of the two relations is generated, the unwanted results (again, with scalar subquery) can be filtered out:

select RoomNo, Day
from (
select to_date('07-May-2005')+num Day
from Integers
where to_date('07-May-2005')+num <= to_date('12-May-2005')
), Rooms r
where 0 = ( select count(*) from Bookings b
where r.RoomNo = b.RoomNo
and Day between "From" and "To"
);

Although the query has written itself effortlessly, much is left to be desired about its performance.  A seasoned SQL performance analyst would immediately spot a couple of problems with this solution. Building a Cartesian product, first, and evaluating every resulting tuple with a subquery does not sound very promising. But the Cartesian product is unavoidable: if the Bookings relation is empty, then the result must be the Cartesian product. It is an interesting research question, whether one can reduce the cost associated with subquery evaluation in this example.

An alternative approach to the problem would reformulate the query from a business perspective.

1       What is the list of rooms available during a certain interval?

2        What is the average ratio of occupied rooms to empty rooms during a certain               period?

Both of these queries can be answered without discrete interval sampling.

 This is an excerpt from the new book SQL Design Patterns: The Expert Guide to SQL Programming by Vadim TropashkoYou can buy it direct from the publisher for 30%-off.

��

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:

and include the URL for the page.

 Burleson Consulting The Oracle of Database Support Oracle Performance Tuning

Oracle ® is the registered trademark of Oracle Corporation.