So what about
tablespaces? What do you get when you specify the NOLOGGING
clause when creating a tablespace?
CREATE TABLESPACE dims
NOLOGGING
DATAFILE 'c:\oradata\dims01.dbf' SIZE 50M
BLOCKSIZE 8192
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 256K
SEGMENT SPACE MANAGEMENT AUTO
ONLINE;
What this actually does is ensure that the default
LOGGING|NOLOGGING attribute for objects created in this
tablespace is set to NOLOGGING. You can still subsequently
create a table as LOGGING in a tablespace you define as
NOLOGGING, all the NOLOGGING attribute does for a tablespace is
set the default for objects later created in the tablespace.
Once you've bulk loaded your data, created your table using
CREATE TABLE ... AS SELECT or rebuilt your index, the important
next step is to perform a baseline backup of the datafiles that
contain your objects; once you've done this, and until such time
as you carry out any other unrecoverable DML, you can then
recover your objects like any other. This baseline backup
can be a hot backup (as long as you're in ARCHIVELOG mode,
which in our case we are) and this backup then ensures that we
have a baseline copy of the object within our backup set if we
then need to perform a full recovery.
So where does the ARCHIVELOG setting for databases come into
things?
The general advice for setting up warehouse target databases
is to put the database into NOARCHIVELOG mode. NOARCHIVELOG
suppresses the archiving of redo log files and improves the
performance of data loads as you don't have the ARCn processes
running around in the background and holding things up as they
copy the redo log files off into your archive locations. The
obvious cost of this is that you will only have limited options
when recovering your instance - basically if you've overwritten
your redo log files and not had the contents archived off, and
you have a media failure, you can only restore your instance
back to the last full cold backup you took (as the redo required
to roll forward from this point has probably been overwritten,
and you didn't archive it off beforehand). However, for data
warehouses, this isn't usually an issue, as you can just reload
from the source files or databases, and the benefits of your
data load being quicker are usually greater than the (slim)
chance that you'll need to perform a media recovery.
What we have here then are two separate concepts - ARCHIVELOG
and NOARCHIVELOG mode, which determines whether redo logs are
archived off when they fill up, or just overwritten and the redo
then lost, and LOGGING | NOLOGGING, where we either allow all
redo to be generated as normal, or in certain circumstances (the
important ones, when we need to load lots of data) suppress the
majority of redo, at the cost of our operation being
unrecoverable and subsequently needing to reload the objects
from the source files. These two concepts interact in two ways:
- If we're looking to suppress as much redo as possible,
chances are we'd like to suppress archiving as well (and we
might as well, as the objects we're loading are unrecoverable
anyway), and
- If you specify NOARCHIVELOG mode for a database, all
indexes, tables and partitions are
automatically set to NOLOGGING, and therefore all INSERT
/*+ APPEND */ and SQL*Loader direct load operations will
automatically generate minimal redo.
However, in our instance, the DBAs want to keep the database
in ARCHIVELOG mode, so that they can manage the database in the
same way as the OLTP databases and ship the archivelogs off to
their DR site to maintain their standby databases. As I was
reading through the various articles and questions on Asktom, I
noticed
this answer which suggested that there could be some
complications when using NOLOGGING operations when you later on
wanted to ship the archivelogs to a standby database:
"when using a standby database for disaster recovery --
you cannot perform nologging operations without performing
additional, manual work (in fact, in 9i,
we can set a flag that says "ignore the request for nologging
-- log it anyway" on the database to avoid this issue
alltogether).
Please read the standby database documention -- it does
explain this in some detail:"
The answer then points through to the
Oracle 8.1.6 documentation for Standby Database which talks
about the problems you'll get when you try and apply the
(minimal) redo logs that were written for our unrecoverable
transactions to your standby database:
"In some SQL statements, the user has the option of
specifying the NOLOGGING clause, which indicates that the
database operation is not logged in the redo log file. Even
though the user specifies the NOLOGGING clause, a redo log
record is still written to the redo log. However, when the
redo log file is transferred to the standby site and applied
to the standby database, a portion of the datafile is unusable
and marked as being unrecoverable. When you either activate
the standby database, or open the standby database with the
read-only option, and attempt to read the range of blocks that
are marked as "UNRECOVERABLE," you will see error messages
similar to the following:
ORA-01578: ORACLE data block corrupted (file # 1, block # 2521)
ORA-01110: data file 1: '/vobs/oracle/dbs/stdby/tbs_1.f'
ORA-26040: Data block was loaded using the NOLOGGING option"
That doesn't sound good, and certainly I can't imagine our
DBA chuckling to himself when at 2.30 in the morning he's called
to recover our database and that error message comes up. It's
not as bad as it seems actually, as the article then goes on to
explain how you recover from this situation, but the point that
Tom was trying to make is that NOLOGGING isn't something you
should do and then tell no-one about - you should understand
what it does, talk it through with the DBA and have a plan in
mind for when you do eventually need to perform a recovery and
work around the fact that you may have some objects that are
unrecoverable - if you've done your baseline backup after your
bulk load, it shouldn't affect you, but if your data centre gets
hit by a meteorite then it may well affect the process of
starting up your standby database. Given that Standby Database
is now Data Guard with 10g, I checked the 9i Data Guard
documentation (the 10g documentation seems to be offline at the
moment) and
found this comment, which suggests that in fact we might
want to actually reconsider our use of NOLOGGING operations:
"For logical standby databases, when SQL apply
operations encounter a redo log record for an operation
performed with the NOLOGGING clause, it skips
over the record and continues applying changes from later
records. Later, if an attempt is made to access one of the
records that were updated with NOLOGGING in
effect, the following error is returned: ORA-01403 no
data found
To recover after the NOLOGGING
clause is specified, re-create one or more tables from the
primary database, as described in
Section 9.1.6.
Note:
In general, use of the
NOLOGGING clause is not recommended. Optionally, if you
know in advance that operations using the NOLOGGING
clause will be performed on certain tables in the primary
database, you might want to prevent the application of SQL
statements associated with these tables to the logical standby
database by using the DBMS_LOGSTDBY.SKIP procedure."
So, in summary, in our situation where we want to minimise
the amount of redo generated for our dimension and fact table
loads, but we want to work in ARCHIVELOG mode and ship our logs
to a standby database, I'd say the best approach would be to:
- Keep the database in ARCHIVELOG mode
- Create separate tablespaces, with their own datafiles, for
the dimensions and facts that we are going to bulk load, and
create these tablespaces as NOLOGGING
- Create the tables to implement these dimensions and facts
as normal (i.e. without the NOLOGGING clause, as this is
already the default for their tablespaces)
- Use INSERT /*+ APPEND */
to bulk load our data into these tables
- Once the loads have completed, do a baseline hot backup of
the datafiles that relate to the tables/tablespaces we've just
loaded
- At this point, our main database is completely
recoverable.
- If we need to recover the standby database, be prepared
for the fact that some manual work might be needed to activate
the standby database, and in fact be prepared for the DBA to
actually stop us from using NOLOGGING through the use of the
DBMS_LOGSTDBY.SKIP procedure.