 |
|
Index-Only Table
Oracle Database Tips by Donald Burleson |
Index-Only Table: An Example
In many cases, such as with lookup tables, we
end up with a one- or two-column table and an index for that table.
This wastes space. Oracle8, Oracle8i, and Oracle9i allow us to create
an index-only table that is stored in B-tree format based on the value
of the primary keys. Since the table is stored as an index, there are
no rowids, and all access is through the primary key. This reduces
space needs and makes access to the table much faster. This should be
used for small tables and low update/insert tables. Index-only tables
cannot contain LOB or LONG columns, and the following operations are
not allowed:
* Creating an index-only table with a subquery
clause.
* Indexing an index-only table (other than
bitmap indexes on IOTs that have a mapping table specified).
* Imposing UNIQUE constraints on values in an
index-only table.
* Composite partitioning of an index-only
table.
The following is an example of the creation of
an index-only table:
CREATE TABLE look_ups
(
lookup_code INTEGER NOT NULL,
lookup_value VARCHAR2(10),
lookup_descr VARCHAR2(80),
CONSTRAINT pk_look_ups PRIMARY KEY (lookup_code)
ORGANIZATION INDEX TABLESPACE sales_ts
PCTTHRESHOLD 20 INCLUDING lookup_value
OVERFLOW TABLESPACE sales_ts;
You can convert INDEX ONLY tables to regular
tables using the Oracle IMPORT/EXPORT utilities. To convert an INDEX
ONLY table to a regular table:
1.
Export the INDEX ONLY table data using the conventional path.
2.
Create a regular table definition with the same definition.
3.
Import the INDEX ONLY table data, making sure IGNORE = y (ensures that
?object exists? error is ignored).
After an INDEX ONLY table is converted to a
regular table, it can be exported and imported using pre-Oracle8
EXPORT/IMPORT utilities.
Sizing an Oracle9i Nonclustered Table
The procedures in this section describe how to
estimate the total number of data blocks necessary to hold data
inserted to a nonclustered table. Typically, the space required to
store a set of rows that experience updates, deletes, and inserts will
exceed this calculated value. The actual space required for complex
workloads is best determined by analyzing an existing table; it is
then scaled by the projected number of future rows in the production
table. In general, increasing amounts of concurrent activity on the
same data block results in additional overhead (for transaction
records), so it is important that you take into account such activity
when scaling empirical results. (Spreadsheets are available at the
Wiley Web site for calculating table and index size.)
Note: No allowance is made here for
changes to PCTFREE or PCTUSED, due to insert, delete, or update
activity. Thus, this reflects a best-case scenario, that is, when
users insert rows without performing deletes or updates.
Calculating space required by nonclustered
tables is a five-step process:
1.
Calculate the total block header size.
2.
Calculate the available data space per data block.
3.
Calculate the space used per row.
4.
Calculate the total number of rows that will fit in a data block.
5.
With the rows/block data, calculate the total number of data blocks
and convert to kilo- or megabytes.
A Simple Sizing Example
Let's take a more detailed look at the steps
using a simple example.
Step 1: Calculate the Total Block Header Size
The space required by the data block header is
the result of the following formula:
Space after headers (hsize)
=
DB_BLOCK_SIZE - KCBH - UB4 - KTBBH - ((INITRANS - 1) * KTBIT) - KDBH
where:
DB_BLOCK_
SIZE. The database blocksize with which the database was created.
It can be viewed in the V$PARAMETER view by selecting:
SELECT value FROM
v$parameter WHERE name = ?db_block_size?;
KCBH, UB4, KTBBH, KTBIT, KDBH.
Constants whose sizes you can obtain by selecting from entries in the
V$TYPE_SIZE view.
* KCBH is the block common header; on NT with
a 4-KB blocksize, this is 20.
* UB4 is ?either byte 4?; on NT with a 4-KB
blocksize ,this is 4.
* KTBBH is the transaction fixed-header
length; on NT with a 4-KB blocksize, this is 48.
* KTBIT is transaction variable header; on NT
with a 4-KB blocksize, this is 24.
* KDBH is the data header; on NT with a 4-KB
blocksize, this is 14.
INITRANS. The initial number of transaction
entries allocated to the table. For an NT 4.0 platform with a 4-KB
blocksize and an INITRANS value of 5, the calculation would be:
DB_BLOCK_SIZE - KCBH - UB4 - KTBBH - ((INITRANS - 1) * KTBIT) - KDBH
hsize = 4192 - 20 - 4 - 48 - ((5?1)*24) - 14 = 4192 - 182 = 4010 bytes
Step 2: Calculate the Available Data Space per
Data Block
The space reserved for data in each data
block, as specified by PCTFREE, is calculated as follows:
available data space
(availspace) =
CEIL(hsize * (1 - PCTFREE/100 )) - KDBT
where:
CEIL. A fractional result rounded to
the next-highest integer.
PCTFREE. The percentage of space
reserved for updates in the table.
KDBT. A constant corresponding to the
Table Directory Entry size, which you can obtain by selecting the
entry from the V$TYPE_SIZE view. For an NT 4.0 platform with a 4-KB
blocksize, this is 4.
TIP: If you are unable to locate the
value of KDBT, use the value of UB4 instead.
So, to carry on our example, assuming a
PCTFREE of 20 for our table:
CEIL(hsize * (1 - PCTFREE/100 )) - KDBT
CEIL(4010*(1 - 20/100)) - 4 = CEIL((4010*.8) - 4 ) = CEIL(3208 - 4) =
3204
we get an available data size of 3204 bytes
per block.
Step 3: Calculate the Space Used per Row
Calculating the amount of space used per row
is a multistep task. First, you must calculate the column size,
including byte lengths:
Column size including
byte length =
column size + (1, if column size < 250, else 3)
I suggest using estimated averages for all
variable-length fields, such as numeric, VARCHAR2, and RAW. Remember
that number datatypes are stored at a two-to-one ratio in the database
(i.e., a NUMBER(30) takes up 15 bytes of storage if each place is
filled). The maximum for a NUMBER is 21 bytes. The size for a DATE is
7 bytes. Rowid takes 10 bytes for the extended and 6 bytes for the
restricted type of rowid. CHAR always takes its full specified length;
VARCHAR2, RAW, and other variable-length fields will use only the
space they take up.
Tip: You can also determine column size
empirically, by selecting avg(vsize(colname)) for each column in the
table.
For example, I have a table TEST with a
single VARCHAR2(50) column that has eight rows of various lengths. The
return from the select 'sELECT AVG(VSIZE(TEST1)) FROM TEST;? is:
AVG(VSIZE(TEST1))
-----------------
29
The table also has a number column TEST2:
AVG(VSIZE(TEST2))
-----------------
7
Then, calculate the row size:
Rowsize =
row header (3 * UB1) + sum of column sizes including length bytes
UB1 is ?UNSIGNED BYTE 1,? which is 1 on NT 4.0
with a 4-KB blocksize, so:
Rowsize =
(3*1) + (8 + 30) = 41
Of course, if you have an example table, the
quickest way to get average row size is to just analyze it and then
select average row size from USER_TABLES:
SQL> analyze table
test1 compute statistics;
Table analyzed.
SQL> select avg_row_len from user_tables
2 where table_name='TEST1';
AVG_ROW_LEN
-----------
41
Finally, you can calculate the space used per
row:
Space used per row (rowspace)
=
MIN(UB1 * 3 + UB4 + SB2, rowsize) + SB2
where:
UB1, UB4, SB2. Constants whose size can be
obtained by selecting entries from the V$TYPE_SIZE view:
* UB1 is ?unsigned byte 1?; it is set to 1 for
NT 4.0 with a 4-KB blocksize.
* UB4 is ?unsigned byte 4?; it is set to 4 for
NT 4.0 with a 4-KB blocksize.
* SB2 is 'signed byte 2?; it is set to 2 for
NT 4.0 with a 4-KB blocksize.
So this becomes MIN((1*3) + 4 + 2, 41) + 2,
or, 41 + 2 = 43.
MIN. Calculated by taking the lesser of
either UBI *3 + UB4 + SB2 or the calculated row size value. If the
space per row exceeds the available space per data block, but is less
than the available space per data block without any space reserved for
updates (for example, available space with PCTFREE = 0), each row will
be stored in its own block.
When the space per row exceeds the available
space per data block, without any space reserved for updates, rows
inserted into the table will be chained into two or more pieces;
hence, this storage overhead will be higher.
Step 4: Calculate the Total Number of Rows
That Will Fit into a Data Block
You can calculate the total number of rows
that will fit into a data block using the following equation:
Number of rows in
block =
FLOOR(availspace / rowspace)
where:
FLOOR. A fractional result rounded to
the next lowest integer.
So for our example this becomes:
FLOOR(3204/43) = 74
Step 5: Calculate the Total Blocks Required
The next step allows calculation of total
blocks that then allows us to convert to total size required.
total blocks =
(Total Table Rows) / (Rows Per Block)
Total Kilobytes =
CEIL((total blocks * blocksize) / 1024)
Total Megabytes =
CEIL((total blocks * blocksize) / 1048576) ?(1024^2)
For our example, we estimate we will have
42,000 rows in this table over the next year. So, the calculation
becomes:
((42000/74)*4192) / 1024 = 2324 KB or 3 MB (rounding up)
This calculation must also be done for each
nested table type, and the table storage must be altered accordingly.
Remember to add the SETID$ column length of 16 to each row length
calculated for a nested table. The size of the RAW required for inline
storage of a varray can vary between 2.5 and over 6 times the size of
the combined row length times the number of elements. A nested table
reference pointer is usually a RAW(36) value. A REF value will vary
between 42 and 46 bytes of system storage.
A More Complex Sizing Example
Let's do a complex sizing example before
moving on. From our conversion example, we have the following example
of an Oracle9i complex object:
SQL> DESC CLIENTSV8i
Name Null? Type
-------------------- ------- -----------
CLIENTS_ID NOT NULL NUMBER(38)
ADDRESSES RAW(36)
CUSTOMER_NAME VARCHAR2(35)
ACTIVE_FLAG VARCHAR2(1)
FAX VARCHAR2(20)
LOOKUPS LOOKUP_T
PHONE VARCHAR2(20)
CORPORATE_NAME VARCHAR2(30)
CREATION_TS DATE
CREATION_SY_USER NUMBER(38)
SPP_RATING CHAR(2)
RATING_DATE DATE
COMPETITOR_LOSS NUMBER(38)
LAST_CONTACT_TS DATE
DELETE_STATUS CHAR(1)
NAME_SOUNDEX CHAR(4)
SALES_VOLUME VARCHAR2(15)
SALES_VOLUME_CODE CHAR(1)
TOTAL_EMPLOYEES VARCHAR2(9)
LINE_OF_BUS VARCHAR2(19)
PCT_GROWTH_SALES VARCHAR2(4)
TERRITORY_COVERED CHAR(1)
MRC VARCHAR2(4)
CEO CEO_T
SUB_INDCTR CHAR(1)
FTX_CODES RAW(676)
SUB_CODES RAW(560)
In this object we have normal attributes, named types, varrays, and a
nested table. The named types are LOOK_UPS and CEO. The nested table
is addresses, and the varrays are ftx_codes and sub_codes. The first
step is to size the nested tables and named types. The nested table is
called address_list, and consists of:
setid$
RAW(16)
addrtype INTEGER
address1 VARCHAR2(80)
address2 VARCHAR2(80)
address3 VARCHAR2(80)
address4 VARCHAR2(80)
address5 VARCHAR2(80)
address6 VARCHAR2(80)
address7 VARCHAR2(80)
address8 VARCHAR2(80)
address9 VARCHAR2(80)
address10 VARCHAR2(80)
address11 VARCHAR2(80)
address12 VARCHAR2(80)
address13 VARCHAR2(80)
address14 VARCHAR2(80)
address15 VARCHAR2(80)
The address1-15 fields will always be filled
with two records, one with 5 fields (average of 10 characters each)
plus a single integer value for addrtype and a second with up to 15
fields (average of 7 with 10 characters each) and a single integer
value for addrtype. This yields the following row lengths:
16
+ 1 + (10 * 5) + (1 * 7) = 75
16 + 1 + (10 * 7) + (1 * 9) = 96
We have a 4-KB blocksize, and this is on NT
4.0, so the following calculation from step 1 is still good assuming
we use an INITRANS of 5:
DB_BLOCK_SIZE - KCBH - UB4 - KTBBH - ((INITRANS - 1) * KTBIT) - KDBH
hsize = 4192 - 20 - 4 - 48 - ((5?1) * 24) - 14 = 4192 - 182 = 4010
bytes
The calculation from step 2 is valid here,
too:
CEIL(hsize * (1 - PCTFREE / 100 )) - KDBT
CEIL(4010 * (1 - 20 / 100)) - 4 = CEIL((4010 * .8) - 4 ) = CEIL(3208 - 4) = 3204
Step 3 becomes:
Rowsize =
row header (3 * UB1) + sum of column sizes including length bytes
3 + 75 = 78
3 + 96 = 99
At this point, let's average these to a single
row size and double the expected count (since we will have x
occurrences of two rows, if we average the row size, we will have 2x
of the average size): average = (78 + 99) / 2 = 177/2 = 89 (rounding
up).
Step 4 becomes:
Rows per
block: =
3204 / 89 = 36
For step 5 we estimate 2.5 million rows, so
let's calculate in megabytes only:
((2500000 /
36) * 4196) / 1048576) = 278 meg
This should be the amount of storage required
for our nested table store table.
Next, we do the named types. The named types
are LOOK_UPS and CEO. Here is CEO:
ceo_first_name VARCHAR2(13),
ceo_last_name VARCHAR2(15),
ceo_middle_initial VARCHAR2(1),
ceo_suffix VARCHAR2(3),
ceo_prefix VARCHAR2(10),
ceo_title VARCHAR2(30)
with a total length of 72 plus 7 length bytes,
for a grand total of 79.
Here is LOOK_UPS:
lookup_no
VARCHAR2(9) ,
lookup_parent VARCHAR2(9),
lookup_str_adrs VARCHAR2(25),
lookup_city VARCHAR2(20),
lookup_state VARCHAR2(2),
lookup_zip VARCHAR2(5),
lookup_zip_ext VARCHAR2(4),
lookup_type CHAR(2),
lookup_parent_flag CHAR(1)
with a total length of 77 plus 9 length bytes,
for a total of 86.
Now we have the data required to finish the
calculation:
3 + (35 + 17
+ 1 + 20 + 86 + 20 + 10 + 7 + 3 + 2 + 7 + 3 + 7 + 1 + 4 + 7 + 2 + 10 +
3 + 1 + 4 + 79 + 1 + 676 + 560) + (1 * 25) = 1700
Note: Some values (most of the
VARCHAR2s) were averaged. The number values all correspond to integers
that won't exceed six places (6 / 2 = 3), and DATEs are always 7
bytes.
So now we have the 3,204 available-space
calculation and the 1,700 row size, which indicates we will have one
row per block with a 4-KB blocksize. If we changed the varrays into
NESTED TABLE, this requirement would drop by 1,166 and push us down to
a row size of 534 and a rows-per-block of 3,204/534, or 6, decreasing
the storage requirements by a factor of 6 for the primary object
table. Since we are talking about a much smaller row size in the
nested table storage tables than would be required with varrays, we
would also reduce the overall storage requirements and make better use
of available resources.
Table Alteration and Truncation
The ALTER TABLE command was covered in Chapter
4, Section 4.2, on relational tables. The commands for dropping and
truncating tables are also covered in Chapter 4.
This is an excerpt from Mike Ault, bestselling author of "Oracle
10g Grid and Real Application Clusters".

|
|