Oracle Row Re-sequencing
Oracle Silver Bullets by Burleson Consulting
How does row re-sequencing work?
- Click here
to listen to a talking video showing the benefits of
re-sequencing in Oracle data blocks:
If response times are lagging in your high-transaction system,
reducing disk I/O is the best way to bring about quick improvement.
And when you access tables in a transaction system exclusively
through range scans in primary-key indexes, reorganizing the tables
with the CTAS method should be one of the first strategies you use
to reduce I/O. By physically sequencing the rows in the same order
as the primary-key index, this method can considerably speed up data
queries that access common rows with a table (e.g. get all items in
order 123), unordered tables can experience huge I/O as the index
retrieves a separate data block for each row requested.
group like rows together (as measured by the clustering_factor in
dba_indexes) we can get all of the row with a single block read
because the rows are together. You can use 10g hash cluster
tables, single table clusters, or manual row re-sequencing (CTAS
with ORDER BY) to achieve this goal:
Like disk load balancing, row resequencing is easy, inexpensive, and
relatively quick. With both techniques in your DBA bag of tricks,
you'll be well equipped to shorten response times-often
dramatically-in high-I/O systems.
Experienced Oracle DBAs know that I/O is the single greatest
component of response time and regularly work to reduce I/O. Disk
I/O is expensive because when Oracle retrieves a block from a data
file on disk, the reading process must wait for the physical I/O
operation to complete. Physical reads (disk access) is at
least 100x slower than a buffer get, and anything you can do to
minimize I/O can greatly improve the performance of any Oracle database.
Oracle database administrators also know that storing data in the
same way that it is retrieved will reduce I/O. Re-sequencing
table rows to place related rows together on the same data blocks
can greatly reduce I/O because Oracle can grab more relevant data in
a single block fetch.
For a simple illustration, assume a
system where 95% of the SQL queries retrieve 50 random item rows for
a specific order number. Queries for "show me all items for
this order" would greatly benefit from reduced if we sequence all
item rows by order number, placing together on adjacent data blocks.
For example, clustering all
orders for a customer together on the same data
block allows the query "show me all orders
for this customer" to be retrieved in a
See here for more details on using hashing
to group data rows onto related data blocks.
clustering_factor metric shows how in-sync
the rows are.
With row re-sequencing,
adjacent rows are together, greatly reducing
Note that in
the most beneficial case, dba_indexes will show
this column with a low
clustering_factor, small avg_row_len and we are using
large data blocks (found in dba_segments) to minimize
When row re-sequencing does not help
For the majority of tables, physically re-sequencing
the rows is not going to make a dramatic difference in SQL response
time. Row re-sequencing is best for databases when the vast
majority of SQL queries traverse an un-clustered one-to-many data
For example, a shipping database where clerks enter customer data
and then retrieve random order data might benefit by re-sequencing the
orders onto adjacent data blocks.
When does row re-sequencing help?
Row re-sequencing can help by:
- Clustering related rows together onto the same data block
(Oracle cluster table structures).
- Pre-ordering data to avoid expensive disk sorts after
- Packing adjacent rows together on a single data block to
reduce block fetches for index row scans.
Row re-sequencing does not help queries that perform full-scans
or index unique scans, and careful attention must be given to the
nature of the queries against the target table.
Proving the benefits of row-resequencing
There are still relational purists and neophytes who don't
believe that storing related rows together will reduce query I/O.
The best way to address the concerns of Luddites is to create a
simple test case:
- Run the original query and measure consistent gets and
physical reads (TKPROF)
- Copy the table to a re-sequenced table (using CTAS with
order by), build indexes and re-run the query noting the logical
and physical reads.
Your mileage may vary
The runtime benefits of row-resequencing (as measured by a
reduction in consistent gets) depends on many factors, most
important, the overall amount that the table is accessed via the
index (as a range scan) and the average number of rows returned by
the range scan:
Client access method - The SQL*plus arraysize
parameter and the use of PL/SQL "bulk collect" can affect consistent gets
for row-ordered tables. See my
notes here on
Data structure - Rows per block
fetch is important and it depends on block size and avg_row_len.
SQL access method - The way that
your table is accessed has a great effect on the benefits of
No benefit for single block access
("index unique scan" and "fetch by ROWID") or for full-table
Best when a large majority of the SQL
does long range scans against the target table
The number of rows returned by the
range scan is a major factor. A system that displays an
average of 200 line items for an order would see a far
greater LIO reduction than a "small" index range scan that
only fetches 20 rows.
System-wide parameters - Many
parameters effect multiblock reads against the target table
The History of Row Re-sequencing
Even before commercial databases, data processing administrators
regularly used the IBM mainframe IDCAMS utility to maintain row
sequence order for ISAM (indexed sequential access method) flat
Historically, Oracle dropped-the-ball in offering the robust
physical data sequencing methods. Competing databases such as
IDMS/R and DB2 have offered the ability to pre-sequence rows and
manage new inserts.
- Since 1981 IDMS has offered the
VIA set construct that allows adjunct access for
When an owner participates in more than
one set, you can request that a specified set be given priority
and those member records will be stored first, ensuring that
they are stored on the same page as the owner.
DB2 cluster indexes, we have been able to cluster rows
together in index sequence sequence for decades:
The keyword CLUSTER, specified when the index is created,
instructs DB2 to maintain the rows on the data pages in sequence
according to the indexed column
Originally, relational theory stated that the physical placement
of rows on the database did not matter, causing consternating among
realists who have known for decades that query performance is highly
related to the placement of the rows on the data blocks.
It was not until Oracle8i that Oracle lifted this silly
restriction and the Oracle DBA was allowed to use "Create Table as
Select (CTAS)" with an ORDER BY clause. Prior to that, the
Oracle DBA had to "force" row re-sequencing with an "index" hint in
the CTAS syntax.
Inside Row re-sequencing
The degree to which resequencing improves performance depends on how
far out of sequence the rows are when you begin and how many rows
you will be accessing in sequence. You can find out how well a
table's rows match the index's sequence key by looking at the
dba_indexes and dba_tables views in the data dictionary.
In the dba_indexes view, we look at the clustering_factor
column. If the clustering factor-an integer-roughly matches the
number of blocks in the table, your table is in sequence with the
index order. However, if the clustering factor is close to the
number of rows in the table, it indicates that the rows in the table
are out of sequence with the index.
The benefits of row resequencing cannot be underestimated. In large
active tables with a large number of index scans, row resequencing
can triple the performance of queries. Once you have decided
to re-sequence the rows in a table, you can use one of the following
tools to reorganize the table.
- Copy the table using Oracle's Create Table As Select (CTAS)
- Oracle9i dbms_redefinition, an in-place table reorganization
- Migrate to a single-table or multi-table cluster.
Row re-sequencing FAQ
>> I am trying my best to make data
sequencing a performance tactic at my shop and when I speak
about it's benefits and what it is......Oracle DBA's look at me
like I'm speaking in tongues.
They must be beginners. Data
sequence on disk has been beneficial for decades before
Oracle came along. Just ask them why Oracle provides
cluster tables, specifically made for row sequencing.
>> I'm more than in agreement with
you that data sequencing is an extremely powerful SQL
performance tactic for sequential heavy queries.
Well, it's best for "large
index range scans", and it can reduce LIO by a factor of
thousands. It's easily proven with a simple test,
cloning the table into row sequence and re-running the
>> So, I'm hoping you could provide
me with a customer reference or two who uses data sequencing.
I would start with ANY DB2 or
IDMS shop, where it's easy to sequence rows into key order.
Also, search MOSC for "table cluster" questions and you
should find hundreds of shops using Oracle cluster tables.
For more details, see my new book "Oracle
Tuning: The Definitive Reference".
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts.