 |
|
SQL Constraints
SQL Tips by Donald Burleson |
Constraints are
fundamental to databases and application programming. Unfortunately,
in the programming industry the discussion about constrains often
degrades to a rather shallow dilemma. Should the constraints be
enforced in the database, or in the application/middle tier?
If the reader is
still undecided about this, then it makes little sense to continue.
Constraint implementation in the database matured to a fairly
sophisticated level, not the least of which should be credited to
the wealth of the underlying language - SQL.
As the reader is
assumed to be familiar with the basics of database constraints - unique key, referential integrity, check constraint, this chapter
will venture into an obscure area of complex constraints. SQL
standard allows declaring complex constraints as ASSERTIONs, but no
database vendor supports them. For quite some time database triggers
were the only complex constraint enforcement technique. This chapter
will help the reader to broaden that view.
Certain constraints
can be implemented with the Function Based method. It might
not be as general as the materialized views based technique, but it
certainly is not short of elegance. The materialized views based
method, on the other hand, is somewhat elaborate, although much more
powerful. It is the favorite in this chapter.
Function Based Constraints
Function based
constraints require little introduction. Sooner or later every
database person comes across a unique constraint on
UPPER(person.name). The other popular example of a function based
constraint is:
alter table
Emp
ADD
CONSTRAINT namesInUppercase CHECK ( UPPER(ename)=ename )
This chapter will
show the many reasons why declarative constraints declaration is
always better than any alternative solution. I witnessed that in my
own experience. Consider the query:
select *
from emp
where ename
like 'MIL%'
When I saw the
following query execution plan, I was puzzled where did the second
conjunct UPPER(ename) like UPPER('MIL%') come from?
The plan looked as
if the optimizer rewrote the query into:
select *
from emp
where ename
like 'MIL%'
and
UPPER(ename) like 'MIL%'
The answer may be
immediate here in hindsight of the constraint that I declared
earlier, but in practice I totally forgot about the constraint which
I declared a while ago. After quick investigation, I indeed found
that there was a check constraint UPPER(ename) = ename
declared upon the Emp table. In other words, the RDBMS engine
leverages constraints when manipulating query predicates.
What is the point
of adding one more filter condition to a query that would execute
correctly anyway? Imagine that a function-based index was added upon
an UPPER(ename) pseudo column. It might be used, even though
the original query does not refer to any function within the
predicate:
select *
from emp
where ename
like 'MIL%'
A query execution
leveraging an index often means difference between life and death
from a performance perspective.
The next section
will experiment with a somewhat more elaborated function based
constraint.
Symmetric Functions
Consider an
inventory database of boxes:
table Boxes
(
length
integer,
width
integer,
height
integer
)
Box dimensions in
the real world are generally not given in any specific order. The
choice of which dimensions becomes length, width, and height is
essentially arbitrary. However, what if we want to identify the
boxes according to their dimensions? For example, the box with
length=1, width=2, and height=3 is the same box as the one with
length=3, width=1, and height=2. Furthermore, how about declaring a
unique dimensional constraint? More specifically, any two
boxes that have the same dimensions would not be allowed.
An analytical mind
would have no trouble recognizing that the heart of the problem is
the column ordering. The values of the length, width, and height
columns can be interchanged to form another legitimate record!
Therefore, why not introduce three pseudo columns, say A, B, and C
such that:
A
≤ B ≤ C
Then, a unique
constraint on A, B, C should satisfy the
requirement! It could be implemented as a function based unique
index, as long as A, B, C can be expressed
analytically in terms of length, width, height.
Piece of cake: C is the greatest of length,
width, height; A is the least of them, but
how is B expressed? Well, the answer is easy to write
although difficult to explain:
B = least
(greatest (length,width),
greatest (width,height),
greatest (height,length)
)
A mathematical
perspective, as usual, clarifies a lot. Consider the following cubic
equation:
If roots x1,
x2, x3 are known, then the cubic
polynomial could be factored as the following:
Marrying both
equations, coefficients a, b, c are expressed
in terms of roots x1, x2, x3:
The functions -x1-x2-x3,
x1x2+x2x3+x3x1,
-x1x2x3 are symmetric.
Permuting x1, x2, x3
has no effect on the values a, b, c. In other
words, the order among the roots of cubic equation is irrelevant:
formally, we speak of a set of roots, not a list of roots. This is
exactly the effect desired in the example with boxes. Symmetric
functions rewritten in terms of length, width, height are:
length+width+height
length*width+width*height+height*length
length*width*height
Those expressions
were simplified a little by leveraging the fact that the negation of
a symmetric function is also symmetric.
The last solution
is strikingly similar to the earlier one, where the greatest
operator plays the role of multiplication, while the least operator
goes as addition. It is even possible to suggest a solution, which
is a mix-in between the two:
least(length,width,height)
least(length+width,width+height,height+length)
length+width+height
A reader can check
that these three functions are again symmetric.
The last step is
recording the solution in formal SQL:
table Boxes
(
length
integer,
width
integer,
height
integer
);
create
unique index b_idx on Boxes(
length +
width + height,
length *
width + width * height + height * length,
length *
width * height
);
Symmetric functions
provide a basis for a nifty solution. In practice however, a problem
can often be solved by schema redesign. In the box inventory
database example, a schema redesign is not necessary; simply change
the practice of inserting unconstrained records (length,width,height)
and demand that:
length ≥
width ≥ height
Information on Boolean Expressions in Check Constraints is
available here:
Boolean Expressions in Check Constraints
|
|
Get the Complete
Oracle SQL Tuning Information
The landmark book
"Advanced Oracle
SQL Tuning The Definitive Reference" is
filled with valuable information on Oracle SQL Tuning.
This book includes scripts and tools to hypercharge Oracle 11g
performance and you can
buy it
for 30% off directly from the publisher.
|