Call now: 252-767-6166  
Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles
New Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB  

Don Burleson Blog 


 

 

 


 

 

 

 

 

A Case for the CASE Operator

SQL Tips by Donald Burleson

This section embarks upon a little bit longer journey. We have already mentioned SQL proximity to the English language. Consider the following query:

For each customer, show the number of calls made during the first 6 months that exceeded the average length of all calls made during the year, and the number of calls made during the second 6 months that exceeded the same average length.

True, you have to read this long sentence more than once. Upon careful examination, though, you?ll be easily convinced that the apparent complexity is illusory. The query can be translated into SQL in small increments. But first, a schema is needed to anchor the SQL query to:

table Calls (
    FromPh      integer(10),
    ToPh        integer(10),
    ConnectedAt date,
    Length      integer
);

The Calls table stores the calls placed on a telephone network over a period of one year. Each FromPh number identifies a customer.

Thus the process begins with translating

For each customer ?

Into

select FromPh, ?  
from Calls
group by FromPh

The missing part that will be gradually developed in the next steps is intentionally marked with ellipsis.

At this moment you may wonder if: 

select distinct FromPh  
from Calls
 
is the easier way to list all the customers. It certainly is, but now what? This query is complete, it answers the question partially, but it cannot be expanded to answer the remainder. The group by clause, on the other hand, is one of the most powerful SQL constructs.

DISTINCT operator is redundant

Technically, the distinct operator is a special case of group by. For any table (or view) T

select distinct x, y from T

is equivalent to

select x, y from T

group by x, y

The next clause:  

?, show the number of calls made during the first 6 months that exceeded the average length of all calls made during the year, ?

leaves us a choice. The condition can be placed into the where clause, but then some difficulty may arise in assembling the query pieces together.  A better alternative is to leverage a familiar conditional summation pattern:

select FromPh,
       sum( case when - then 1 else 0 end ),
       ?
from Calls
group by FromPh
 
Ellipsis means that the condition must be interpreted: 

? during the first 6 months that exceeded the average length of all calls made during the year ?

This is still a relatively complex sentence. You may notice that the two variables: ConnectedAt and Length are involved. The condition begins with

? during the first 6 months ?

which is easily translated into ConnectedAt < ?1-July-2005?. The next fragment

 ? the average length of all calls made during the year ?

is a little bit trickier. First, the query is ambiguous. Did the author mean the average length of all the calls in the system, or the average length for each customer? Both interpretations are perfectly reasonable. The average length of the call is

select avg(Length)
from Calls
 
while the average length of the call per each customer is: 

select FromPh,
       avg(Length)
from Calls
group by FromPh

The first interpretation is easier to implement than the second one, therefore, it is used as an exercise for the reader.

So, given the query that has been developed so far 

select FromPh,
       sum( case when - then 1 else 0 end ),
       ?
from Calls
group by FromPh
 
where does the relation

select FromPh,
       avg(Length)
from Calls
group by FromPh

fit in? The only place that admits arbitrary relations is the from clause.

Relational Closure

The SQL query block inside the from clause is called inner view. From a logical perspective there is no difference if a relation within the from clause (or anywhere in the SQL statement, for that matter) is a table or a view. It is a manifestation of the fundamental property of the Relational Model - Relational Closure. It is common to organize a query in a chain of inner views so that every step is small and easily comprehendible.

Let's nest the second query into the first as an inner view:

select c1.FromPh,
       sum( case when - then 1 else 0 end ),
       ?
from Calls /*as*/ c1, (
    select FromPh,
           avg(Length) /*as*/ av
    from Calls
    group by FromPh
) c2
group by FromPh
 

Aliases c1, c2, and av, were introduced along the way, which will be helpful for further development. The c1, in fact, is required to disambiguate the FromPh column name in the select clause.

Translation of the informal query into SQL is only a few steps away from completion. First, the relations c1 and c2 are naturally joined by the customer id, FromPh. Second, the av alias is the average length of the call per each customer that was required to complete the predicate inside the case operator. Thus:

select c1.FromPh,
       sum(case when ConnectedAt < '1-July-2005'

                
and Length < av
            then 1 else 0 end),
       ?
from Calls /*as*/ c1, (
    select FromPh,
           avg(Length) /*as*/ av
    from Calls
    group by FromPh
) c2
where c1.FromPh = c2.FromPh
group by FromPh
 
The final clause of the informal query 

?and the number of calls made during the second 6 months that exceeded the same average length.

is very similar to the clause that was just analyzed.

Let's explore a slightly different path. Instead of introducing an inner view c2, why not calculate the average length for the customer in place as a correlated scalar subquery:

select c1.FromPh,
       sum(case when ConnectedAt < '1-July-2005'

               
and Length < (
                        select avg(Length)
                        from Calls c2
                        where c1.FromPh = c2.FromPh

                 )
            then 1 else 0 end
       ),
       ?
from Calls c1
group by FromPh

Which of the two queries, the one with inner view, or the one with scalar subquery performs better? Well, they are logically equivalent, aren't they? The SQL engine reserves the right to transform a query to a logically equivalent one. A curious reader might want to check if both queries have the same execution plans on the RDBMS of his choice.

Let's pause and reflect back a little. The genesis of the solution is the case operator inside the sum aggregate. It is possible to express this query in SQL without it. Chantziantoniou et al (the authors of the article where I borrowed the problem from) followed that route and introduced a chain of named intermediate views as follows:

create view AvgCallLengthPerCust as
select FromPh, avg(Length) /*as*/ avgL
from Calls
group by FromPh
 

create view ExcAvgDuring1stHalfYear as
select C.FromPh, count(*) /*as*/ count
from Calls /*as*/ C, AvgCallLengthPerCust /*as*/ V
where C.FromPh = V.FromPh AND
Length > avgL AND Date<'1-July-2005'
group by C.FromPh  

create view ExcAvgDuring2ndHalfYear as
select C.FromPh, count(*) /*as*/ count
from Calls /*as*/ C, AvgCallLengthPerCust /*as*/ V
where C.FromPh = V.FromPh AND
Length > avgL AND Date>='1-July-2005'
group by C.FromPh  

select a1.FromPh, a1.cnt, a2.cnt
from ExcAvgDuring1stHalfYear /*as*/ a1,
     ExcAvgDuring2ndHalfYear /*as*/ a2
where a1.FromPh=a2.FromPh

Based on this example, Chantziantoniou et al proposed extending SQL language in such a way that would make writing queries involving multiple features of the same group easier. As have been shown earlier, a solution leveraging the case operator makes this argument less convincing.

 

This is an excerpt from the new book SQL Design Patterns: The Expert Guide to SQL Programming by Vadim Tropashko

You can buy it direct from the publisher for 30%-off.


 

 
��  
 
 
Oracle Training at Sea
 
 
 
 
oracle dba poster
 

 
Follow us on Twitter 
 
Oracle performance tuning software 
 
Oracle Linux poster
 
 
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2020

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.