In cases where the rows are out of sequence with
the primary key index, the Oracle DBA will extract and sort the
table, replace the table in row order, and rebuild the clustered
index. When any of these conditions occur, the DBA will be able
to take the offending partition offline, rebuild the object, and
reintroduce it into the data warehouse with minimal service
interruption.
Note: With table partitioning, some
automatic clustering of the data with the index will take
place automatically, since new table rows will be
directed to the partition that contains similar partition
values. However, clustering within the partition may get
out of sync when too many rows are added to the end of
the partition.
Increased performance with partitioning
The Oracle engine can take advantage of the physical
segregation of table and index partitions in several ways:
- Disk load balancingTable and index
partitioning allows the Oracle data warehouse DBA to
segregate portions of very large tables and indexes onto
separate disk devices, thereby improving disk I/O
throughput and ensuring maximum performance.
- Improved query speedThe Oracle optimizer can detect the
values within each partition and access only those partitions that are
necessary to service the query. Since each partition can be defined with its
own storage parameters, the Oracle SQL
optimizer may choose a different optimization plan for
each partition.
- Faster parallel queryThe partitioning of objects also
greatly improves the performance of parallel query. When Oracle detects that a query is going
to span several partitions, such as a full-table scan, it
can fire off parallel processes. Each of processes will
independently retrieve data from each partition. This
feature is especially important for indexes, since
parallel queries don't need to share a single index when
servicing a parallel query.
Now that we've reviewed the compelling reasons for using
partitioning with Oracle tables and indexes, let's take an in-depth look at how
partitioning is implemented within the Oracle architecture.
Creating an
Oracle partitioned table
The process of creating a partitioned table with Oracle
is very straightforward. The only real difference from previous
versions is that the STORAGE clause has been changed to specify
the partitions.
For example, if we had a table called all_facts, we would first
need to choose a partition key for the operation. A partition key
is generally a date value, and it specifies the range of rows
that will participate in each partition. For some data warehouse
applications, the partition key could be a nondate value, such as
where an employee table is partitioned according to the value of
each employee's department column. Listing 1 shows some sample SQL code
for creating an Oracle table partition:
Listing 1CREATE TABLE all_facts
(
order_date date,
order_year number(2),
order_quarter char(2),
order_month, number(2),
order_nbr number(5),
salerperson_name varchar(20),
customer_name varchar(20),
customer_city varchar(20),
customer_state varchar(2),
customer_region char(1),
item_nbr number(5),
quantity_sold number(4)
)
PARTITION BY RANGE
(order_date)
(
PARTITION
year_1998
VALUES LESS THAN '01-JAN-1999'
TABLESPACE year_1998
STORAGE (INITIAL 500M, NEXT 5M, PCTUSED 99, PCTFREE 1),
PARTITION
year_1999
VALUES LESS THAN '01-JAN-2000'
TABLESPACE year_1999
STORAGE (INITIAL 500M, NEXT 5M, PCTUSED 99, PCTFREE 1),
PARTITION
year_2000
VALUES LESS THAN (MAXVALUE)
TABLESPACE year_2000
STORAGE (INITIAL 500M, NEXT 50M, PCTUSED 60, PCTFREE 40),
)
Let's review this syntax. Each table partition is defined as
having the order_date column as the partition key. We also see
that the VALUES LESS THAN parameters determine which rows are
partitioned into which tablespace. Note that the last partition
(year_2000) specifies VALUES LESS THAN (MAXVALUE), which means
that all other rows that do not meet the selection criteria will
be placed into this partition. Also, note that even though the
selection parameters for the year_1998 partition read VALUES LESS
THAN '01-JAN-1999', the 1997 rows will not be stored into the
year_1998 partition because the value check is preceded by the
filter for the year_1998 partition.
As you can see, each partition has been created
with different tablespace storage parameters. In this example, it
appears that only the last partition will be updated, as
evidenced by the value of the PCTFREE parameter in the year_2000
tablespace. At SQL INSERT TIME, the DDL is consulted in the
Oracle data dictionary, and the value specified in order_date
will determine which partition Oracle uses to store the row
within the table.
This partitioning of tables also allows each partition to be
referenced as a unique entity, saving resources within the
database. For example, we can still query the entire table as a
whole:
SELECT SUM( quantity_sold)
FROM all_facts
WHERE
order_year = 2000
AND
customer_city = 'Kittrell';
However, it would be simpler and less resource intensive to
rework this query to specify the target partition in the query.
SELECT SUM( quantity_sold)
FROM all_facts PARTITION (year_2000)
WHERE
order_year = 2000
AND
customer_city = 'Kittrell';
In the same fashion, partitions can be used to limit update
statements to a single partition. For instance, if we have a huge
employee table partitioned by department, we could give a 10
percent raise to the MIS department:
UPDATE
all_employee
PARTITION
('MIS')
SET
salary = salary*1.1;
Migration into partitioned table
structures
Migration into partitioned tables is very simple using Oracle. If we take our sample table definition
from the listing above, we can easily migrate our data from an
old Oracle fact table into the new partitioned structure:
INSERT INTO all_fact
PARTITION (year_1995)
(
SELECT * FROM old_fact
WHERE
order_year = 05
);
INSERT INTO all_fact PARTITION (year_1996)
(
SELECT * FROM old_fact
WHERE
order_year = 06
);
INSERT INTO all_fact PARTITION (year_1997)
(
SELECT * FROM old_fact
WHERE
order_year = 07
);
Note: In the above example, the WHERE
statement clause is redundant. The partition definition
will automatically filter out the rows that do not match
the selection criteria for each partition.
Now let's take a look at how you can partition Oracle indexes. In
many ways, the ability to partition Oracle indexes has more
performance potential than table partitioning, since indexes are
a common source of contention with Oracle data warehouses.
Index partitioning with Oracle
The first partitioned index method is called a LOCAL partition. A
local partitioned index creates a one-for-one match between the
indexes and the partitions in the table. Of course, the key value
for the table partition and the value for the local index must be
identical. The second method is called GLOBAL and allows the
index to have any number of partitions.
The partitioning of the indexes is transparent to all SQL queries. The great
benefit is that the Oracle query engine will scan only the index partition that
is required to service the query, thus speeding up the query significantly. In
addition, the Oracle parallel query engine will sense
that the index is partitioned and will fire simultaneous queries
to scan the indexes.
Local partitioned indexes
Local partitioned indexes allow the DBA to take individual
partitions of a table and indexes offline for maintenance (or
reorganization) without affecting the other partitions and
indexes in the table.
In a local partitioned index, the key values and number of index
partitions will match the number of partitions in the base table.
CREATE INDEX year_idx
on all_fact (order_date)
LOCAL
(PARTITION name_idx1,
PARTITION name_idx2,
PARTITION name_idx3);
Oracle will automatically use equal partitioning of the index
based upon the number of partitions in the indexed table. For
example, in the above definition, if we created four indexes on
all_fact, the CREATE INDEX would fail since the partitions do not
match. This equal partition also makes index maintenance easier,
since a single partition can be taken offline and the index
rebuilt without affecting the other partitions in the table.
Global partitioned indexes
A global partitioned index is used for all other indexes
except for the one that is used as the table partition key.
Global indexes partition OLTP (online transaction processing)
applications where fewer index probes are required than with
local partitioned indexes. In the global index partition scheme,
the index is harder to maintain since the index may span
partitions in the base table.
For example, when a table partition is dropped as part of a
reorganization, the entire global index will be affected. When
defining a global partitioned index, the DBA has complete freedom
to specify as many partitions for the index as desired.
Now that we understand the concept, let's examine the Oracle
CREATE INDEX syntax for a globally partitioned index:
CREATE INDEX item_idx
on all_fact (item_nbr)
GLOBAL
(PARTITION city_idx1 VALUES LESS THAN (100),
PARTITION city_idx1 VALUES LESS THAN (200),
PARTITION city_idx1 VALUES LESS THAN (300),
PARTITION city_idx1 VALUES LESS THAN (400),
PARTITION city_idx1 VALUES LESS THAN (500));
Here, we see that the item index has been defined with five
partitions, each containing a subset of the index range values.
Note that it is irrelevant that the base table is in three
partitions. In fact, it is acceptable to create a global
partitioned index on a table that does not have any partitioning.
Summary on table partitioning
These new data warehouse features for Oracle continue
to underscore Oracle Corporation's commitment to supporting very
large data warehouse architectures. It will be exciting to see
how these features continue to improve in Oracle9i, when
some of the object-oriented features will be available for the
data warehouse. Even more exciting will be the support for class
hierarchies and inheritance, since these features will greatly
improve the ability of the Oracle data warehouse designer to
implement ad-hoc classifications of data attributes.