The process
of removing redundancy from tables is called data
normalization, which attempts to minimize the
amount of duplication within the database design.
Although normalization was an excellent technique
during the 1980s, when disk space was very expensive,
the rules have changed in the 21st century, with disk
costs dramatically lower. Today, adding redundancy is
a very important aspect of designing high-performance
Oracle databases.
The introduction of redundancy to avoid costly table
joins can dramatically improve the speed at which
Oracle SQL queries are serviced. It is the challenge
of the Oracle design professional to choose the
appropriate database design to ensure that SQL queries
are serviced as quickly as possible. Instead of
removing redundancy, the Oracle designer controls the
introduction of redundancy using specific rules.
When to add redundancy and violate third normal form
Essentially, the decision to introduce redundancy is a
function of the size of the redundant column and the
frequency with which the column is updated. The ideal
candidates for duplication are table columns
that meet the following criteria:
- The introduction of redundancy will eliminate
the need to repeatedly join two tables together.
- The data column is small.
- The data column is static and rarely updated.
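As a minimal sketch of these criteria, assume a hypothetical customer table and orders table (neither appears in the original text). The customer name is small and rarely changes, so copying it into orders removes a join that would otherwise run on every order report:

```sql
-- Hypothetical schema, for illustration only:
--   customer(cust_id, cust_name)   orders(order_id, cust_id)

-- Normalized design: every order report must join the two tables.
SELECT o.order_id, c.cust_name
FROM   orders o
JOIN   customer c ON c.cust_id = o.cust_id;

-- Denormalized design: carry the small, static column redundantly.
ALTER TABLE orders ADD (cust_name VARCHAR2(40));

UPDATE orders o
SET    o.cust_name = (SELECT c.cust_name
                      FROM   customer c
                      WHERE  c.cust_id = o.cust_id);

-- The same report now reads a single table.
SELECT order_id, cust_name
FROM   orders;
```

The trade-off is that any change to a customer name must now be applied to both tables, which is why the column should be static.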
Planned data denormalization
Oracle was one of the first databases to introduce
tools for planned data denormalization. As hard drives
became cheaper throughout the 1990s, Oracle recognized
that significant performance improvements could be
achieved by deliberately introducing redundant data
items into the Oracle table and index structures.
Snapshots
One of Oracle's first forays into data redundancy was
the introduction of Oracle snapshots. With
Oracle's advanced replication option, copies of tables
could be made on remote database servers and refreshed
at specific intervals. This redundant duplication of
Oracle tables across widely dispersed geographical
areas ensured that users were able to retrieve
information quickly on a local server without the need
to travel across a large network.
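A snapshot of this kind might be declared roughly as follows. This is a sketch only: the customer table, the hq_link database link, and the hourly interval are assumptions made for illustration. Current Oracle releases use the CREATE MATERIALIZED VIEW syntax for what were originally called snapshots.

```sql
-- Hypothetical names: customer (remote table), hq_link (database link).
-- Keeps a local copy of the remote table and rebuilds it every hour.
CREATE MATERIALIZED VIEW customer_local
   REFRESH COMPLETE
   START WITH SYSDATE
   NEXT  SYSDATE + 1/24
AS
SELECT *
FROM   customer@hq_link;

-- Local users query the copy instead of crossing the network.
SELECT COUNT(*)
FROM   customer_local;
```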
VARRAYs
Oracle also allows the introduction of redundant
information using VARRAY table structures. In a VARRAY
table, Oracle provides for the introduction of
non-first-normal form data structures by inserting
repeating groups of values directly within a single
Oracle table row. This avoids the overhead of joining
the base table into a subordinate table to retrieve
the solution set.
Let's look at a simple VARRAY table example. Assume we
have a Student table for a university. One of the
table requirements is storing student SAT and ACT
scores. The students may take the test only three
times, and the test scores are a very small repeating
group that repeats for only a specific number of
values.
Using traditional database design structures,
we would be required to create a Test_scores table and
join the Student table with it to see both the student data and the
repeating values of their SAT and ACT test scores. Using Oracle8
VARRAY tables, we can instead create a table structure in which the
repeating groups are stored within the Student table row itself.
Figure: A non-first-normal form (0NF) Oracle VARRAY table
Large, frequently updated data columns can be very
cumbersome for Oracle VARRAY tables. By contrast, our
test scores are small and rarely change, so the VARRAY
table allows Oracle to retrieve both the student and
test information within a single disk I/O operation.
Another important VARRAY table characteristic is that
the repeating information may be stored in presorted
order. Because a VARRAY preserves the order of its
elements, values stored in sorted order come back in
that same order when the column is retrieved. This
avoids the additional overhead of re-sorting the test
scores every time a student row is retrieved.
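The Student table described above could be declared along the following lines. This is a sketch under stated assumptions: the type name test_score_list, the column names, and the sample values are all hypothetical, and the limit of three elements mirrors the three-attempt rule mentioned earlier.

```sql
-- Hypothetical type: at most three scores per test.
CREATE TYPE test_score_list AS VARRAY(3) OF NUMBER(4);
/

-- The repeating groups live inside the student row itself.
CREATE TABLE student (
   student_id    NUMBER PRIMARY KEY,
   student_name  VARCHAR2(40),
   sat_scores    test_score_list,
   act_scores    test_score_list
);

-- Scores are supplied already sorted; the VARRAY keeps that order.
INSERT INTO student VALUES (
   1, 'Jones',
   test_score_list(1180, 1250, 1310),
   test_score_list(24, 27)
);

-- One row returns the student together with all of the scores.
SELECT student_name, sat_scores, act_scores
FROM   student
WHERE  student_id = 1;

-- The repeating group can also be unnested without a second table.
SELECT s.student_name, t.column_value AS sat_score
FROM   student s, TABLE(s.sat_scores) t;
```

The slash after CREATE TYPE is the SQL*Plus terminator that executes the type definition.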
Oracle materialized views
After the popularity of snapshot replication, Oracle
recognized that complex queries could be prebuilt to
provide end users with the illusion of instantaneous
response time. Materializing a query in advance allows
five-way table joins, complex pre-summarization of
aggregation operations, and a host of other
time-consuming and I/O-expensive SQL queries to be
pre-calculated.
Basically, materialized views boil down to a
"build it now or build it later" philosophy. Using
this philosophy, you can pre-execute Oracle queries in
anticipation of the end user's query, thereby allowing
the end user to retrieve complex information on a
single disk I/O.
However, simply pre-building complex queries is only a
portion of the answer. A mechanism had to be created
to make Oracle SQL aware of a query that had been
prebuilt and to tell it to use the pre-created summary.
Oracle called this exciting new feature query
rewrite. When the Oracle parameter
query_rewrite_enabled is set to true, the optimizer
compares each incoming SQL statement against the
available materialized views. If Oracle notices that
the information has been pre-summarized, the
cost-based optimizer goes directly to the
pre-summarized information, which can save thousands
of expensive disk I/Os. For data warehouse
applications and Oracle systems requiring complex SQL
queries, materialized views can be the difference
between sub-second response times and queries that may
run for 30 minutes.
Here is a simple example of a materialized view:
create materialized view sum_sales
build immediate
refresh complete
enable query rewrite
as
-- one pre-summarized row per product
select product_nbr, sum(sales) sum_sales
from sales
group by product_nbr;
When any query summarizes sales, that query will be
dynamically rewritten to reference the summary table:
alter session set query_rewrite_enabled=true;
set autotrace on
select sum(sales)
from sales;
In the execution plan for this query, we see that the
sum_sales materialized view is referenced instead of
the sales base table:
Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=83)
1    0   SORT (AGGREGATE)
2    1     TABLE ACCESS (FULL) OF 'SUM_SALES' (Cost=1 Card=423 Bytes=5342)
Materialized views, being redundant, need to be
updated when their base tables change. Just as a
snapshot needs to specify a refresh interval, an
Oracle materialized view has to specify how often it
is refreshed when any of the information that
constitutes the materialized view has changed. Oracle
offers a wealth of options for the frequency of
rebuilding the views, ranging from a refresh on every
commit (ON COMMIT) to more sophisticated refresh
intervals that can be chosen according to the
volatility of the base data.
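Two of these refresh styles might look like the following. This is a sketch, not a definitive recipe: the view name sum_sales_fast and the added count columns are assumptions, and refresh on commit is subject to Oracle's fast-refresh restrictions for the release in use.

```sql
-- A materialized view log records changes to the base table so that
-- only the changed rows need to be applied at refresh time.
CREATE MATERIALIZED VIEW LOG ON sales
   WITH ROWID, SEQUENCE (product_nbr, sales)
   INCLUDING NEW VALUES;

-- Hypothetical view refreshed incrementally whenever a transaction
-- against sales commits. The COUNT columns are included because
-- fast refresh of an aggregate view expects them alongside SUM.
CREATE MATERIALIZED VIEW sum_sales_fast
   BUILD IMMEDIATE
   REFRESH FAST ON COMMIT
   ENABLE QUERY REWRITE
AS
SELECT product_nbr,
       SUM(sales)   AS sum_sales,
       COUNT(sales) AS cnt_sales,
       COUNT(*)     AS cnt_rows
FROM   sales
GROUP  BY product_nbr;

-- For volatile base data, a view can instead be rebuilt on demand,
-- for example from a nightly batch job ('C' = complete refresh).
BEGIN
   DBMS_MVIEW.REFRESH('SUM_SALES', 'C');
END;
/
```

Refreshing on commit keeps the summary current at the cost of extra work in every transaction, while an on-demand or scheduled refresh trades freshness for cheaper updates.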
Conclusion
Because disk prices continue to fall year after
year, Oracle professionals are very conscious of
introducing redundancy into their Oracle data models
to improve performance. A third-normal-form database
design in the 21st century may be very efficient from
a disk-storage point of view, but it can perform
poorly because everything has to be built from its
atomic pieces every time the queries are executed.
Using Oracle's denormalization tools such as
replication, VARRAY tables, and materialized views,
the Oracle database designer can deliberately
introduce redundancy into the data model, thereby
avoiding expensive table joins and large-table
full-table scan operations that are required to
recompute the information at runtime.