Oracle Recovery Mistakes
July 2004
Donald K. Burleson
Even Oracle Certified DBAs cringe at the thought of performing a
real-world database recovery. As disks and other hardware have become
super-stable, many Oracle DBAs have never experienced the adrenaline
rush of a full-blown Oracle recovery.
With a mission-critical database at stake, these Oracle recoveries
are often performed under great stress, especially when thousands of
employees cannot do their jobs until you have recovered their Oracle
system.
I am amazed at how many Oracle shops do not discover that their backups
are bad until they are doing a mission-critical recovery. The most
common causes of backup failure include:
Cold backups while the database is running - This is a
very common Oracle backup error. When you restore the media,
Oracle will not open the database because the system change
numbers (SCNs) in the file headers do not match.
Bad media - Many Oracle backups do not check the media
to ensure that the database has been successfully written to the
backup tape or disk. I have seen many cases where the backup
writes an empty or incomplete backup, or where parity errors
exist. Some shops re-read their backups to guard against parity
errors.
No ARCHIVELOG mode - I have seen many shops that only
discover that they cannot roll forward when they attempt a
recovery. Many DBAs have lost their jobs when they had to tell
management that many days of work had been lost forever.
Bad hot backups - There are many ways to perform a hot
backup, and many of them will not work properly. Hot backups are
tricky, and the prudent DBA will insist that the recovery from
the hot backup is tested, or at least get a CYA memo from
management if they refuse to test their Oracle recovery
capabilities. Remember, someone IS going to be fired when a
mission-critical database cannot be restored, and this memo
could save your job.
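The re-read check mentioned under bad media can be sketched in a few lines of Python. This is only an illustration, not an Oracle tool: it assumes a cold backup taken as plain file copies while the database is shut down, so the source files have not changed since the copy. The names verify_backup and file_digest, and the directory layout, are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Re-read a file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Compare every file under source_dir with its copy under backup_dir.

    Returns a list of problem descriptions; an empty list means every
    backup file was found, was non-empty, and matched its source.
    Assumes the source files are unchanged since the backup (cold backup).
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        if not copy.is_file():
            problems.append(f"missing: {rel}")
        elif copy.stat().st_size == 0 and src.stat().st_size > 0:
            problems.append(f"empty: {rel}")
        elif file_digest(copy) != file_digest(src):
            problems.append(f"mismatch: {rel}")
    return problems
```

A check like this catches missing, empty, and corrupted copies, but it does not prove that Oracle can recover from the backup. Only a test restore does that.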
There are also areas of Oracle database recovery where the DBA
could make a serious error. These mistakes are often the result of
stress and poor judgment, and even Oracle Technical Support may fail
to insist on these precautions. Here are two common Oracle recovery
issues to avoid:
Back up your failed database first
The very first task of an Oracle DBA should be to back up the
corrupt database. Sadly, many Oracle shops do not test their
recoveries, and in cases where you discover that your backups are
not recoverable, you may be glad that you have a copy of the
original corrupt database.
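As a rough sketch of this precaution, the Python below copies the failed database files into a fresh timestamped directory before any restore overwrites them. It assumes the instance is shut down and that all the files live under one directory. The function name preserve_failed_database and both path arguments are hypothetical, and a real site would also need to capture control files, redo logs, and archived logs stored elsewhere.

```python
import shutil
import time
from pathlib import Path


def preserve_failed_database(datafile_dir, safe_root):
    """Copy the failed database files to a new timestamped directory.

    Returns the path of the preserved copy. shutil.copytree refuses to
    write into an existing directory, so an earlier copy is never
    overwritten by accident.
    """
    source = Path(datafile_dir)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    target = Path(safe_root) / f"failed_db_{stamp}"
    shutil.copytree(source, target)
    return target
```

Taking this copy before touching anything else keeps the option of returning to the original failed database if the recovery goes wrong.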
Rarely force open a bad recovery
When an Oracle database recovery is corrupt, the Oracle database will
not open. This is for a good reason: it is Oracle's way of
sparing you the serious complications of having to manually repair
thousands of corrupt database blocks. When a recovered database will
not open, you have three choices:
Go to an earlier backup - It is far better to have a
longer roll-forward period than to have to repair Oracle
corruption.
Restore the initial failed database - This may have less
manual repair time than forcing open a bad recovery.
Force-open the database - It amazes me how many Oracle
DBAs will have a failed recovery and call Oracle Technical
Support asking them to force-open the database without
considering other alternatives. Forcing open a corrupt database
is a last resort and should only be done with the help of Oracle
Technical Support and when there are no other options. At this
juncture, many Oracle DBAs will realize that they are going to
be fired anyway, and walk off the job.
In sum, the Oracle DBA is the custodian of the database and must be
prudent and cautious with the mission-critical system, always
following best practices to ensure recoverability of the Oracle
system.