CREATE TABLESPACE dims
DATAFILE 'c:\oradata\dims01.dbf' SIZE 50M
NOLOGGING
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 256K
SEGMENT SPACE MANAGEMENT AUTO;
What this actually does is ensure that the default LOGGING|NOLOGGING attribute for objects created in this tablespace is set to NOLOGGING. You can still subsequently create a table as LOGGING in a tablespace you've defined as NOLOGGING; all the tablespace-level NOLOGGING attribute does is set the default for objects later created in the tablespace.
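For example, the tablespace default can be overridden per object; a sketch (the table and column names here are made up for illustration):

```sql
-- Tables created in DIMS inherit NOLOGGING by default...
CREATE TABLE products_dim (
  product_id   NUMBER,
  product_name VARCHAR2(100)
) TABLESPACE dims;

-- ...but an explicit LOGGING clause overrides the tablespace default
CREATE TABLE customers_dim (
  customer_id   NUMBER,
  customer_name VARCHAR2(100)
) TABLESPACE dims LOGGING;

-- Check which attribute each table ended up with
SELECT table_name, logging
FROM   user_tables
WHERE  table_name IN ('PRODUCTS_DIM', 'CUSTOMERS_DIM');
```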
Once you've bulk loaded your data, created your table using CREATE TABLE ... AS SELECT, or rebuilt your index, the important next step is to take a baseline backup of the datafiles that contain your objects. Once you've done this, and until you carry out any further unrecoverable DML, you can recover your objects like any others. This baseline backup can be a hot backup (as long as you're in ARCHIVELOG mode, which in our case we are), and it ensures we have a baseline copy of the object within our backup set should we later need to perform a full recovery.
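One way to take that baseline backup, as a sketch assuming the DIMS tablespace from above:

```sql
-- RMAN approach: back up just the tablespace we bulk loaded
-- (run from the RMAN prompt, connected to the target database)
BACKUP TABLESPACE dims;

-- Alternatively, a user-managed hot backup from SQL*Plus:
ALTER TABLESPACE dims BEGIN BACKUP;
-- ...copy c:\oradata\dims01.dbf with an OS utility here...
ALTER TABLESPACE dims END BACKUP;
```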
So where does the ARCHIVELOG setting for databases come into things?
The general advice for setting up warehouse target databases is to put the database into NOARCHIVELOG mode. NOARCHIVELOG suppresses the archiving of redo log files and improves the performance of data loads, as you don't have the ARCn processes running in the background, holding things up while they copy the redo log files off to your archive destinations. The obvious cost is that you have only limited options when recovering your instance: if you've overwritten your redo log files without archiving their contents and you then suffer a media failure, you can only restore back to the last full cold backup you took (the redo required to roll forward from that point has probably been overwritten, and you didn't archive it off beforehand). For data warehouses, though, this isn't usually an issue, as you can just reload from the source files or databases, and the benefit of faster data loads usually outweighs the (slim) chance that you'll need to perform a media recovery.
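Checking and switching the archiving mode is straightforward; a sketch (run as SYSDBA — the database must be mounted, but not open, to change modes):

```sql
-- See which mode the database is currently in
SELECT log_mode FROM v$database;

-- Switch to NOARCHIVELOG:
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE NOARCHIVELOG;
ALTER DATABASE OPEN;
```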
What we have here, then, are two separate concepts. ARCHIVELOG and NOARCHIVELOG mode determine whether redo logs are archived off when they fill up, or simply overwritten with the redo then lost. LOGGING | NOLOGGING determines whether we allow all redo to be generated as normal or, in certain circumstances (the important ones, when we need to load lots of data), suppress the majority of it, at the cost of the operation being unrecoverable and the objects subsequently needing to be reloaded from the source files. These two concepts interact in two ways:
- If we're looking to suppress as much redo as possible, chances are we'd like to suppress archiving as well (and we might as well, as the objects we're loading are unrecoverable anyway), and
- If the database is in NOARCHIVELOG mode, direct-path operations generate only minimal redo regardless of the objects' LOGGING attribute, so all INSERT /*+ APPEND */ and SQL*Loader direct load operations will automatically generate minimal redo.
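A direct-path load into one of our NOLOGGING dimension tables might look like this (table names are made up for illustration):

```sql
-- The APPEND hint requests a direct-path insert, which writes
-- above the high-water mark and, for a NOLOGGING table,
-- generates only minimal redo
INSERT /*+ APPEND */ INTO products_dim
SELECT product_id, product_name
FROM   staging_products;

-- A direct-path insert must be committed before the session
-- can query the table again
COMMIT;
```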
However, in our case the DBAs want to keep the database in ARCHIVELOG mode, so that they can manage it in the same way as the OLTP databases and ship the archive logs off to their DR site to maintain their standby databases. As I was reading through the various articles and questions on AskTom, I noticed this answer, which suggested that there could be complications when using NOLOGGING operations if you later want to ship the archive logs to a standby database:
"when using a standby database for disaster recovery -- you cannot perform nologging operations without performing additional, manual work (in fact, in 9i,
we can set a flag that says "ignore the request for nologging -- log it anyway" on the database to avoid this issue alltogether).
Please read the standby database documention -- it does explain this in some detail:"
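The 9i "flag" Tom refers to is, as far as I can tell, the database-level FORCE LOGGING setting, which overrides any NOLOGGING request made by a tablespace or object:

```sql
-- Force redo generation for all operations, regardless of any
-- NOLOGGING attribute on tablespaces or objects
ALTER DATABASE FORCE LOGGING;

-- Check whether it's in effect
SELECT force_logging FROM v$database;

-- Turn it back off if we decide we do want minimal-redo loads
ALTER DATABASE NO FORCE LOGGING;
```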
The answer then points to the Oracle 8.1.6 Standby Database documentation, which describes the problems you'll get when you try to apply the (minimal) redo that was written for unrecoverable transactions to your standby database:
"In some SQL statements, the user has the option of specifying the NOLOGGING clause, which indicates that the database operation is not logged in the redo log file. Even though the user specifies the NOLOGGING clause, a redo log record is still written to the redo log. However, when the redo log file is transferred to the standby site and applied to the standby database, a portion of the datafile is unusable and marked as being unrecoverable. When you either activate the standby database, or open the standby database with the read-only option, and attempt to read the range of blocks that are marked as "UNRECOVERABLE," you will see error messages similar to the following:ORA-01578: ORACLE data block corrupted (file # 1, block # 2521) ORA-01110: data file 1: '/vobs/oracle/dbs/stdby/tbs_1.f' ORA-26040: Data block was loaded using the NOLOGGING option"
That doesn't sound good, and I can't imagine our DBA chuckling to himself when he's called at 2.30 in the morning to recover our database and that error message comes up. It's actually not as bad as it seems, as the article goes on to explain how you recover from this situation. The point Tom was making is that NOLOGGING isn't something you should do and then tell no-one about: you should understand what it does, talk it through with the DBA, and have a plan in mind for when you do eventually need to perform a recovery, working around the fact that you may have some objects that are unrecoverable. If you've done your baseline backup after your bulk load, it shouldn't affect you, but if your data centre gets hit by a meteorite it may well complicate starting up your standby database. Given that Standby Database is now Data Guard in 10g, I checked the 9i Data Guard documentation (the 10g documentation seems to be offline at the moment) and found this comment, which suggests that we might actually want to reconsider our use of NOLOGGING operations:
"For logical standby databases, when SQL apply operations encounter a redo log record for an operation performed with the
NOLOGGINGclause, it skips over the record and continues applying changes from later records. Later, if an attempt is made to access one of the records that were updated with
NOLOGGINGin effect, the following error is returned:
ORA-01403 no data found
To recover after the
NOLOGGINGclause is specified, re-create one or more tables from the primary database, as described in Section 9.1.6.
Note:In general, use of the
NOLOGGINGclause is not recommended. Optionally, if you know in advance that operations using the
NOLOGGINGclause will be performed on certain tables in the primary database, you might want to prevent the application of SQL statements associated with these tables to the logical standby database by using the DBMS_LOGSTDBY.SKIP procedure."
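The DBMS_LOGSTDBY.SKIP call the documentation mentions is run on the logical standby with apply stopped; a sketch (the schema and table names are made up, and you should check the exact parameter names against your release):

```sql
-- On the logical standby: stop SQL apply first
ALTER DATABASE STOP LOGICAL STANDBY APPLY;

-- Skip DML against a table we know is bulk loaded with NOLOGGING
BEGIN
  DBMS_LOGSTDBY.SKIP(
    stmt        => 'DML',
    schema_name => 'DWH',
    object_name => 'SALES_FACT');
END;
/

-- Restart apply
ALTER DATABASE START LOGICAL STANDBY APPLY;
```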
So, in summary, in our situation where we want to minimise the amount of redo generated for our dimension and fact table loads, but we want to work in ARCHIVELOG mode and ship our logs to a standby database, I'd say the best approach would be to:
- Keep the database in ARCHIVELOG mode
- Create separate tablespaces, with their own datafiles, for the dimensions and facts that we are going to bulk load, and create these tablespaces as NOLOGGING
- Create the tables to implement these dimensions and facts as normal (i.e. without the NOLOGGING clause, as this is already the default for their tablespaces)
- Use INSERT /*+ APPEND */ to bulk load our data into these tables
- Once the loads have completed, do a baseline hot backup of the datafiles that relate to the tables/tablespaces we've just loaded
- At this point, our main database is completely recoverable.
- If we need to recover the standby database, be prepared for the fact that some manual work might be needed to activate it, and indeed be prepared for the DBA to stop our NOLOGGING changes from being applied to a logical standby through the DBMS_LOGSTDBY.SKIP procedure (or to override NOLOGGING at the database level altogether).
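Pulling the steps above together, one load cycle might look something like this (all names here are illustrative):

```sql
-- 1. NOLOGGING tablespace for the bulk-loaded facts
CREATE TABLESPACE facts
DATAFILE 'c:\oradata\facts01.dbf' SIZE 500M
NOLOGGING
EXTENT MANAGEMENT LOCAL UNIFORM SIZE 256K
SEGMENT SPACE MANAGEMENT AUTO;

-- 2. A table created without a LOGGING clause inherits NOLOGGING
CREATE TABLE sales_fact (
  sale_id  NUMBER,
  amount   NUMBER
) TABLESPACE facts;

-- 3. Direct-path bulk load: minimal redo
INSERT /*+ APPEND */ INTO sales_fact
SELECT sale_id, amount FROM staging_sales;
COMMIT;

-- 4. Baseline hot backup of just this tablespace
-- RMAN> BACKUP TABLESPACE facts;

-- 5. Confirm which datafiles carry unrecoverable changes
SELECT name, unrecoverable_time
FROM   v$datafile
WHERE  unrecoverable_change# > 0;
```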