Question: I have a very large table and I
need to delete millions of rows from the table without the table
fragmenting. I also need to use the best delete method, a
best-practice way to delete large amounts of rows as fast as
possible.
What are the options for deleting large amounts of rows from
large tables?
Answer: The answer to the best way to
delete rows from an Oracle table is: It depends! In a perfect world
where you can take the table offline for maintenance, a complete
reorganization is always best because it does the delete and places
the table back into a pristine state. We will address the tools for
doing large scale deletes and the appropriate methods for each
environment.
Also see
deleting large numbers of rows quickly.
Factors and tools for massive deletes
The choice of the delete methods depends on many factors:
- Is the target table partitioned? Partitioning greatly
improves delete performance. For example, it is common to have
a large time-based table partition and deleting elderly rows
from these table can be as simple as dropping the desired
partition. See these notes on
managing partitioned tables.
- Can you reorganize the table after the delete to remove
fragmentation?
- What percentage of the table will be deleted? In cases
where you are deleting more than 30-50% of the rows in a very
large table it is faster to use CTAS to delete from a table than
to do a vanilla delete and a reorganization of the table blocks
and a rebuild of the constraints and indexes.
- Do you want to release the space consumed by the deleted
rows? If you know that the empty space will be re-used by
subsequent DML then you will want to leave the empty space
within the table. Conversely, if you want to released the space
back onto the tablespace then you will need to reorganize the
table.
There are many tools that you can use to delete from large
tables:
- dbms_metadata.get_ddl:
This procedure wil punch-off the definitions of all table
indexes and constraints.
- dbms_redefinition:
This procedure will reorganize a table while it remains
available for updating.
- Rename
table: If you copy a table when deleting rows you
can rename it back to its original name.
- COMMIT:
In cases where a delete might run for many hours, even the
largest UNDO log will not be able to hold the rollback
information and it becomes necessary to do the delete in a
PL/SQL loop, issuing a COMMIT every zillion-rows to free-up the
undo logs. This approach will be re-startable automatically
because the delete will pick-up where it left off as on your
last commit checkpoint.
If you must do the delete in a 24x7 environment you have
limited methods for deleting the rows:
- Vanilla
delete: On a super-large table, a delete statement
will required a dedicated rollback segment (UNDO log), and in
some cases, the delete is so large that it must be written in
PL/SQL with a COMMIT every million rows. Note that Oracle
parallel DML allows you to parallelize large SQL deletes. But
beware that a standard SQL delete can be used, but a SQL delete
may cause honeycomb fragmentation and also place pages onto the
freelist that have used row space. See these important notes on
tuning Oracle DELETE statements
for faster performance.
- Online
table redefinition: You can drop large numbers of
rows from a table by adding a WHERE clause predicate to
filter-out unwanted rows when you copy the table.
If you have a scheduled maintenance downtime window and you are
able to take a consistent backup of the table immediately before the
delete, you can use several methods:
- OPTION ONE: Use a vanilla delete with a
COMMITs. Note that it is common for super-large tables to
reside within their own tablespace for ease of management. In
these cases you can set-up a PL/SQL to commit every n rows or
partition the deletes by the WHERE clause values:
-- pre 9i:
set transaction use rollback_segment = 'HUGE_RBS';
delete from mytab where year = '2008';
commit;
delete from mytab
where year = '2009';
commit;
delete from mytab
where year = '2010';
commit;
- OPTION TWO: Delete into a new tablespace.
Note that it is common for super-large tables to reside within
their own tablespace for ease of management:
STEP 1
-
Punch off the index and constraint DDL with
dbms_metadata.get_ddl.
STEP 2 - Copy the table using
a WHERE clause to delete the rows:
create table
new_mytab
as
select *
from mytab where year = '2012'
tablespace
new_tablespace;
STEP 3 - rename the tables:
rename mytab
to old_mytab;
rename new_mytab to mytab
STEP 4 -
Re-add constraints and indexes from the output of step 1.
STEP 5 - delete old_mytab
- OPTION THREE: Delete and copy back into an
existing tablespace. Note that this will take longer then
option one because the indexes and constraints will be update
for each and every row copied
STEP 1 - Copy the table
using a WHERE clause to delete the rows:
create table
new_mytab
as
select * from mytab where year =
'2012'
tablespace new_tablespace;
STEP 2 -
truncate the original table:
truncate mytab;
rename new_mytab to
mytab
STEP 3 - Copy-back the rows into
the original table definition. Please note that this step may
required a dedicated rollback segment:
alter session set
rollback_segment = 'HUGE_RBS';
insert into
mytab
as
select * from
new_mytab;
STEP 4 - delete old_mytab
In sum, a create table as select (CTAS) approach can be faster
than a vanilla delete when the majority of the table rows are being
deleted. CTAS is fast because CTAS can be parallelized, and the
required full-scan can be run with parallel read processes such that
on a 64 CPU server, the CTAS will scan the table 63 times faster.
CTAS can be used with the NOLOGGING option.
|
|
Get the Complete
Oracle SQL Tuning Information
The landmark book
"Advanced Oracle
SQL Tuning The Definitive Reference" is
filled with valuable information on Oracle SQL Tuning.
This book includes scripts and tools to hypercharge Oracle 11g
performance and you can
buy it
for 30% off directly from the publisher.
|