A fairly common event in a database's lifecycle
is that of the migration from version "older" to version "newer."
Migrating from one version to another may be as simple as
exporting the old and importing into the new, but chances are
there is a lot more involved than first meets the eye. It is not
uncommon to also incorporate other significant changes such as an
operating system change, a schema modification, and changes to
related applications. Each change has its own inherent risk, but
lumping them together in one operation flies in the face of common
sense, even more so without having tested the migration from start
to end. Amazingly, this situation occurs all too often.
From a software engineering standpoint, is it
safe or a best practice to heap so many significant changes
together in one step? Further, wouldn't it seem obvious that you
would want to one, not only practice the migration, but two, test
the changes before actually applying them to your live/production
environment?
Here is something else to consider: break a
dependency chain before it breaks you and the migration process.
Given the scenario of migrating from Oracle8i to 10g, changing the
underlying operating system to Linux from Solaris, modifying major
tables within a schema, and running newer/modified versions of
related applications, where are the places you can break the
dependency chain? Put another way, what are the
safer/well-known/"charted by many others before you" steps, and
which are the uncharted/"applies only to you" steps?
Separate the known from the
unknown (where versus how)
For non-leading edge/early adopter/early
implementers ("sure, we're more than happy to provide our
production environment as a beta testing ground for the rest of
the world") of a new version of Oracle, by the time you (and your
company) are ready to migrate from an older version of the RDBMS
software to a newer one, many others will have gone before you.
Likewise, many others have already crossed over to the dark side
by having adopted Linux as their underlying OS.
Considering the combined RDBMS/OS version
change as the known, this combination is also the "where" part of
"where versus how." Where your production database lives in terms
of version and OS is a logical place to break the dependency
chain. In an all-or-nothing do-or-die migration scenario, failure
means losing the time spent on what is perhaps the simplest part
of the scenario, namely, the hours spent on exporting and
importing. If you can separate the overall migration into at least
two distinct stages, you will have broken the dependency chain
into smaller chains. The guiding principle/lesson to be learned
here is to move from point A to D via safe, incremental steps.
Unfortunately, no one can authoritatively tell
you what the best approach is for "how." How your database
operates with respect to schema and application interaction is up
to you to determine. Until you have thoroughly test driven schema
and application changes, this part of the overall migration
process stays in the realm of the unknown. Going live and finding
out - for the first time - that the new application/database code
results in cascading triggers (thereby bringing an instance to its
knees, so to speak) is obviously a poor time to become aware of
this situation. Developers and testers using 100 records as a test
size when the production environment contains tens of millions
records is hardly a thorough test.
Export and Import via a
proactive approach
With respect to the export and import
utilities, you do not have to accept the default parameters. In
fact, you owe it to yourself to use quite a few non-default
settings, and doing so makes the process easier to perform and
saves time when it is time do it for real. Let's look at the
indexfile parameter as a start. There are (at least) four
excellent reasons to use indexfile=filename on an import.
The first is that the output documents the
storage of tables and indexes (all or some, depends on what was
included in the export dump file). Where is your source code for
schema creation? If you do not have source code, this parameter
(along with a fairly simple query that returns everything else)
goes a very long way towards providing that information. The query
part is spooling out the contents of all or user_source. Code for
packages, package bodies, procedures, functions, and triggers will
be included in the output. With very little editing such as adding
"create or replace" and cleaning up SQL*Plus artifacts (i.e.,
feedback, heading, page breaks - if these weren't suppressed to
begin with), you are left with the current source for a
significant portion of a schema.
The second is that if you are going to do any
housecleaning or rearranging of tables and indexes, now is the
time to edit the indexfile and update tablespace mappings and
storage parameters. If the logical layout is to remain the same,
then the third reason comes into play.
Separate the tables from the indexes, that is,
separate the SQL create statements (one script for tables, the
other for indexes). Do as much as you can on the target database
before it is time to do the actual migration. Part of this
includes creating the same/new tablespaces and running the create
tables script. Run the create tables script ahead of time for two
reasons: one is to validate the logical layout, the other is to
help speed up the import (concepts question: how does import work
if an object exists or does not exist?).
The fourth reason comes back to the indexes
listed in the indexfile. Performance-wise, when doing bulk
inserts, is it better to have indexes or not? What happens when a
new record is inserted? One or more indexes have to be updated
(assuming there is at least a primary key for that record).
Oracle's recommendation is that (for large databases) you should
hold off on creating indexes until after all the data has been
inserted. Again, this comes back to the importance of the
indexfile because it is the link between export using "indexes=n"
(the default is y) and your being able to re-create the indexes
after the data has been loaded.
In Closing
In the next article about migration, I will
provide a checklist/plan covering steps and procedures for the
before, during, and after phases. Even if you are forced to bundle
together four major changes at the same time, there are proactive
measures you can take to mitigate and reduce risk.