Search BC Oracle Sites

# Transitive Closure

SQL Tips by Donald Burleson

There are several key graph concepts that would guide your intuition when writing queries on graphs:

1)      Reflexive closure of a graph is built by adding missing loops - edges with the same endpoints. Reflexive closure is expressed easily in SQL:

select head, tail from Edges -- original relation
union
union
select tail, tail from Edges ?- more loops

2)      Symmetric closure of a (directed) graph is built by adding an inversely oriented edge for each edge in the original graph. In SQL:

select head, tail from Edges -- original relation
union
select tail, head from Edges ?- inverse arrows

3)      Transitive closure of a (directed) graph is generated by connecting edges into paths and creating a new edge with the tail being the beginning of the path and the head being the end. Unlike the previous two cases, a transitive closure cannot be expressed with bare SQL essentials - the select, project, and join relational algebra operators.

For now, let's assume that we know how to query transitive closure, and demonstrate armed with these definitions how some problems can be approached. Here is a puzzle from the comp.databases.oracle.misc forum:

Suppose I have a table that contains Account# & RelatedAccount# (among other things).

How could I use CONNECT BY & START WITH in a query to count relationships or families.  For example, in

ACCT    REL_ACCT
Bob     Mary
Bob     Jane
Jane    Bob
Larry   Moe
Curly   Larry

there are 2 relationship sets (Bob,Mary,Jane & Larry,Moe,Curly).  If I

Curly   Jane

then there'd be only 1 larger family.  Can I use CONNECT BY & START with to detect such relationships and count them?  In my case I'd be willing to go any number of levels deep in the recursion.

By ignoring Oracle specific SQL keywords as inessential to the problem at hand and using proper graph terminology, the question can be formulated into just one line:

Find a number of connected components in a graph.

The Connected component of a graph is a set of nodes reachable from each other. A node is reachable from another node if there is an undirected path between them.

Reachability is an equivalence relation: it is reflective, symmetric, and transitive. The reachability relation is obtained by closing the Edges relation to become reflective, symmetric, and transitive as shown in Figure 6.5.

Now return to the problem of finding the number of connected components and assume that the reachability relation EquivalentNodess has already been calculated. Then, simply select the smallest node from each component. Informally this would read:

Select node(s) such that there is no node with smaller label reachable from it. Count them.

Formally:

select count(distinct tail) from EquivalentNodes e
where not exists (
select * from EquivalentNodes ee
);

Equivalence Relation and Group By (cont)

In one of the Chapter 1 sidebars we have attributed the incredible efficiency of the group by operator to its proximity to one of the most fundamental mathematical constructions - the equivalence relation. There are two ways to define an equivalence relation.

The first way is leveraging the existing equality operator on a domain of values. The second way is defining an equivalence relation explicitly, as a set of pairs. The standard group by operator is not able to understand an equivalence relation defined explicitly - this is the essence of the problem, which has just been solved.

Being able to query the number of connected components earned us an unexpected bonus: a connected graph can be redefined as a graph that has a single connected component. And, a connected graph with N nodes and N-1 edges must be a tree. Thus, counting nodes and edges together with a transitive closure is another opportunity to enforce tree constraint.

Now that some important graph closure properties have been established, everything is ready for transitive closure implementations. Unfortunately, this story has to branch here since database vendors approach the hierarchical query differently.

 This is an excerpt from the new book SQL Design Patterns: The Expert Guide to SQL Programming by Vadim TropashkoYou can buy it direct from the publisher for 30%-off.

��

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:

and include the URL for the page.

 Burleson Consulting The Oracle of Database Support Oracle Performance Tuning