Home
E-mail Us
Oracle Articles
New Oracle Articles

Oracle Training
Oracle Tips
Oracle Forum
Class Catalog

Remote DBA
Oracle Tuning
Emergency 911
RAC Support
Apps Support
Analysis
Design
Implementation
Oracle Support

SQL Tuning
Security
Oracle UNIX
Oracle Linux
Monitoring
Remote support
Remote plans
Remote services
Application Server
Applications
Oracle Forms
Oracle Portal
App Upgrades
SQL Server
Oracle Concepts
Software Support
Remote Support
Development
Implementation

Consulting Staff
Consulting Prices
Help Wanted!

Oracle Posters
Oracle Books
Oracle Scripts
Ion
Excel-DB

Don Burleson Blog

Oracle Data Warehouse How to Create a Clustered Index

Oracle Data Warehouse Tips by Burleson Consulting

How to create a clustered index

For each table, there can only be one clustered index, since only one index will be stored in the same physical order as the table. Unfortunately, Oracle does not give the DBA a simple method for insuring the physical ordering of rows within a table. There are, however, some tricks that can be used to re-sequence a table to give it a different physical order. Beware, however, that the index may quickly be out of physical sequence as new rows are added to the end of the table and key values are updated to change the logical sequence in the index.

For Oracle data warehouses updates are usually done on a time-based formula, so the issues becomes one of re-sequencing the index and insuring that additions to the index are managed. The most obvious key for an Oracle data warehouse would be the date column. Since Oracle data warehouses are updated in batch mode periodically, the physical sequence of the data with the index could be maintained if we insure that the new records are pre-sorted in date order.

Note: A Date sequence for a clustered index is only beneficial if your system performs a lot of range scans by date.

Here are the options for appending new rows onto the end of a physically sequenced Oracle table:

1. Pre-sort the extract files in index key order and load with SQL*Loader.

2. Extract the data directly from an Oracle OLTP system using date predicates:

    INSERT INTO warehouse_table
            SELECT * FROM oltp_table@remote_instance
            WHERE trans_date > ?01-JAN-1998?
            ORDER BY trans_date;

Note: Oracle export/import utilities have no mechanism for changing the physical sequence of tables, and cannot be used to cluster an index.

Now that we understand the basic constructs of Oracle indexes, let?s look at a dictionary query (Listing 8.2) that will tell us the structure of our indexes. Note that this query assumes that your Oracle database is using the cost-based optimizer, and that your tables have been analyzed with the ANALYZE TABLE command. Here we see that the indexes are grouped according to the tables that they are built upon. We also see that the clustering factor for each index is computed as a percentage of the number of rows in the index.

Listing 8.2 A SQL*Plus routine to locate clustered indexes

rem idx_bad1.sql, ? 1997 by Donald K. Burleson
set pagesize 60;
set linesize 100;

column c0 heading 'Table'      format a8;
column c1 heading 'Index'      format a18;
column c2 heading 'Level'      format 999;
column c3 heading 'Clust Ftr' format 9,999,999;
column c4 heading '# rows'     format 99,999,999;
column c5 heading 'Clust Pct' format 999.9999;
column c6 heading 'dist. Keys' format 99,999,999;

spool idx_bad1.lst;

break on c0 skip 1;

select
dba_indexes.table_name                            c0,
index_name                                        c1,
blevel                                            c2,
clustering_factor                                 c3,
num_rows                                          c4,
decode(clustering_factor,0,1,clustering_factor)/
decode(num_rows,0,1,num_rows)                     c5,
distinct_keys                                     c6
from dba_indexes, dba_tables
where
dba_indexes.owner not in ('SYS','SYSTEM')
and
dba_tables.table_name = dba_indexes.table_name
and c5 < .25
order by c0, c5 desc;

spool off;

Here is a sample listing from this report:

SQL> @idx_bad1

Table    Index               Level Clust Ftr      # rows     reorg dist. Keys
-------- ------------        ----- ---------- ----------   ------- ----------
INV_LINE INV_LINE_ITEM_PK       2     62,107   1,910,034     .0325   1,912,644
         ILI_FK_INV             2    164,757   1,910,034     .0339   1,659,625
         ILI_FK_ACT             2    283,343   1,910,034     .0436          47
         ILI_EK_CCHS_ACCT       3 1,276,987   1,910,034     .1450      25,041

Now, let's inspect this listing to see the clustering status of our indexes. The indexes are listed within each table heading in descending order of their clustering factor, with the most clustered indexes at the bottom of the list. In the example above we see the INV_LINE_ITEM_PK index with a clustering factor of 62,107, indicating that the INV_LINE table has been loaded in nearly the same physical order as this index.

If the value for clustering factor approaches the number of blocks in the base table, then the index is said to be clustered. If the clustering factor is greater than the number of blocks in the base table and approaches the number of rows in the base table, then the index is un-clustered.

Determining when to rebuild indexes

One of the problems with Oracle data warehouses is their huge size. For very large database warehouses, performing a database re-organization is impractical because of the amount of time required to export the data warehouse to tape, drop the Oracle database, and re-create the warehouse using Oracle's import utility. Because of this time issue, the data warehouse manager must find alternative methods for insuring that our Oracle data warehouse remains well-tuned from a physical data perspective.

In an Oracle data warehouse, a large index may take a long amount of time to rebuild, and the prudent data warehouse administrator must carefully choose the right conditions that warrant an index rebuild. If the very large database tables have been horizontally partitioned, where the large table is split into sub-tables according to date, we see that there will be several smaller indexes to replace a single, very large index. For details on using index partitioning, see Chapter 14, Oracle8 for the Warehouse.

In general, indexes will seldom required rebuilding in an Oracle data warehouse unless there has been a high amount of update or delete activity against the index columns. SQL Insert operations that are common for loads of new warehouse data do not cause structural problems with the Oracle index structure.

So, how do we tell when an index would benefit from being rebuilt? There are two Oracle views that provide index statistics, DBA_INDEXES and INDEX_STATS. The DBA_INDEX view contains statistical information that is placed into the view when the ANALYZE INDEX xxx command is issued. Unfortunately, the DBA_INDEXES view was designed to provide information to the cost-based SQL optimizer and it does not keep statistics about the internal status of the Oracle indexes. To see the internal structure for an Oracle index, you must use the ANALYZE INDEX xxx VALIDATE STRUCTURE SQL command to validate the structure for the index. This command creates a single row in a view called INDEX_STATS.

SQL> analyze index DON.DON_FK_PLT validate structure;

Index analyzed.

SQL> select * from index_stats;

    HEIGHT     BLOCKS NAME                              LF_ROWS    LF_BLKS
---------- ---------- ------------------------------ ---------- ----------
LF_ROWS_LEN LF_BLK_LEN    BR_ROWS    BR_BLKS BR_ROWS_LEN BR_BLK_LEN DEL_LF_ROWS
----------- ---------- ---------- ---------- ----------- ---------- -----------
DEL_LF_ROWS_LEN DISTINCT_KEYS MOST_REPEATED_KEY BTREE_SPACE USED_SPACE
--------------- ------------- ----------------- ----------- ----------
PCT_USED ROWS_PER_KEY BLKS_GETS_PER_ACCESS
---------- ------------ --------------------
         3       5635 DON_FK_PART                       196103       2382
    3137648       3900       2381         18       41031       3956           7
            112           125             56220     9361008    3178679
        34     1568.824              787.912

The Oracle index_stats view will never contain more than one row. Therefore, you must perform the "analyze index xxx validate structure" command and "select * from index_stats" before issuing the next analyze index command. The script id1.sql provides a method for getting a complete report for all indexes.

Oracle indexes perform two basic operations as the index expands to hold more keys. Oracle's version of b-tree indexing uses an algorithm where each index node may contain many index keys. As new key values are added to the index, Oracle must manage the configuration of each index node. Oracle index nodes are managed with two operations; splitting and spawning.

Splitting-- This is the term used to describe what happens when an index node is filled with keys and a new index node is created at the same level as the full node. Splitting widens the b-tree horizontally.

Spawning-- This is the term used to describe the process of adding a new level to an index.

The index_stats view contains information about the internal structure of the b-tree index that can be useful when determining whether or not to rebuild the index. The following columns of index_stats are especially useful:

height - This columns refers to the maximum number of levels encountered within the index. An index may have 90% of the nodes at 3 levels, but excessive splitting and spawning in one area of the index may have caused some nodes to have move then 3 levels. Whenever the value of height is more than three, you may benefit from dropping and re-creating the index. Oracle indexing will not spawn a fourth level on a clean rebuild until more than ten million kets have been added to the index.

del_lf_rows-- This columns refers to the number of leaf rows that have been deleted from the index. This occurs when heavy index update activity occurs within the index tree, and indicated that the index will benefit from being dropped and re-created.

distinct_keys-- This indicates the number of distinct key values in the index. This is called the cardinality of the index, and values less than 20 are candidates for being re-created as bitmapped indexes.

most_repeated_key-- This column counts the number of times that the most frequent key value in a non-unique index appears in the b-tree.

Since the index_stats view will only hold one row at a time, it is not easy to create a SQL*Plus routine that will produce a index_stats report for all of the indexes on a system. The SQL in listing 8.3 will perform an analyze index xxx validate structure for each index in the schema and report of the resulting values in index_stats.

Note: Running id1.sql will invoke id2.sql through id5.sql automatically producing the unbalanced index report. Just be sure that id1.sql through id5.sql are present in a common directory when starting id1.sql.

Despite the complexity of dealing with a one-row index_stats table, it is easy to use the following script to get index_stats for all warehouse indexes. (listings 8-3 through 8-8) In operational use, the unbalanced index report would be run whenever the DBA suspects that update activity may have unbalanced the indexes.

listing 8.3 The SQL*Plus script to generate the report for index_stats.
rem id1.sql The main driver routine for reporting index_stats
Rem ? 1997 by Donald K. Burleson

REM id1.sq
set pages 9999;
set heading off;
set feedback off;
set echo off;

spool id4.sql;

select '@id2.sql' from dual;

select 'analyze index '||owner||'.'||index_name||' validate structure;',
'@id3.sql;'
from dba_indexes
where
owner not in ('SYS','SYSTEM');

spool off;

set heading on;
set feedback on;
set echo on;

@id4.sql
@id5.sql

Listing 8.4 The SQL to create a temporary table for the index_stats report.

Rem ? 1997 by Donald K. Burleson

rem id2.sql
elect
name                  ,
most_repeated_key     ,
distinct_keys         ,
del_lf_rows           ,
height                ,
blks_gets_per_access
from index_stats;

Listing 8.5 The SQL to insert the data from index_stats into the temporary table

Rem ? 1997 by Donald K. Burleson

rem id3.sql
nsert into temp_stats
(select
name                  ,
most_repeated_key     ,
distinct_keys         ,
del_lf_rows           ,
height                ,
blks_gets_per_access
from index_stats
);

Listing 8.6 This SQL is generated from running id1.sql

Rem ? 1997 by Donald K. Burleson

rem id4.sql

analyze index DON.SHL_EK_TRUCK_LINK_NUM validate structure;

@id3.sql;

analyze index DON.SHL_UK_FACT1_ID_SRC_CD_LOB validate structure;

@id3.sql;

analyze index DON.PURCH_UNIT_PK validate structure;

@id3.sql;

Listing 8.7 This SQL*Plus script generates the clustering report

Rem ? 1997 by Donald K. Burleson

rem id5.sql - This creates the unbalanced index report and the rebuild syntax
set pagesize 60;
set linesize 100;
set echo off;
set feedback off;
set heading off;

column c1 format a18;
column c2 format 9,999,999;
column c3 format 9,999,999;
column c4 format 999,999;
column c5 format 99,999;
column c6 format 9,999;

spool idx_report.lst;

prompt
prompt
prompt '                      # rep       dist.     # deleted              blk s
prompt Index                  keys        keys      leaf rows Height      per s
prompt -------------------- ------       -----     --------   ------      -----

select distinct
name                  c1,
most_repeated_key     c2,
distinct_keys         c3,
del_lf_Rows           c4,
height                c5,
blks_gets_per_access c6
from temp_stats
where
height > 3
or
del_lf_rows > 10
order by name;

spool off;

spool id6.sql;

select 'alter index '||owner||'.'||name||' rebuild tablespace '||tablespace_nam'
from temp_stats, dba_indexes
where
temp_stats.name = dba_indexes.index_name
and
(height > 3
or
del_lf_rows > 10);

select 'analyze index '||owner||'.'||name||' compute statistics;'
from temp_stats, dba_indexes
where
temp_stats.name = dba_indexes.index_name
and
(height > 3
or
del_lf_rows > 10);

spool off;

Listing 8.8 This is the completed unbalanced index report

'                      # rep       dist.     # deleted              blk gets
Index                  keys        keys      leaf rows Height      per access
-------------------- ------       -----     --------   ------      ----------DON_EK                159,450     25,420          934       4              41

DON_FK_ACT 1,009,808 542 101 3 1,705

INV_EK_INV_NUM 4 1,586,880 122 3 4

INV_FK_CAR 546,366 1,109 315 3 725

INV_FK_SRC 1,041,696 309 31 3 2,591

LOB_FACT1_PK 1 3,778,981 66,918 4 5

PAT_FK_JEN 37 2,736,262 436,880 3 4

JEN_FK_SHP 88 1,464,282 97,473 4 6

DON_FK_LEG 342,290 1,350 301 4 933

Dropping and re-creating an index has inherent problems and is not the best way to rebuild an Oracle index. The most common problem occurs when an extended index is dropped and then fails to re-create, failing with a lack of space in the index tablespace. For example, if we have an index with an initial size of 100 megabytes with 20 extents of 5 megabytes, we will drop the index, re-define the index with an initial extent of 200 megabytes, and re-create the index. Of course, we will need to insure that there is 200 megabytes of free space within the index before we try this operation. Also, we may need to coalesce free index extents in the tablespace to make more contiguous space.

If you like Oracle tuning, see the book "Oracle Tuning: The Definitive Reference", with 950 pages of tuning tips and scripts.

You can buy it direct from the publisher for 30%-off and get instant access to the code depot of Oracle tuning scripts.

��