ORACLE 10G AND DATA CHANGE CAPTURE
As Oracle data warehousing grows more complex, Oracle 10g, the latest
release, adds powerful features that simplify the ETL process. In
particular, much has been done to facilitate the extraction and movement
of large volumes of data between Oracle databases. Heterogeneous
transportable tablespaces and Oracle Data Pump are two examples, and
both are described later in this paper.
From the perspective of the underlying data structures, change data
capture (CDC) can occur either at the data level or at the application
level. At the data level, a table in the target database is regarded as
a remote snapshot of a table in the origin database. At whichever level
capture and propagation take place, they have traditionally added to the
workload on the source database. Oracle 10g greatly reduces this
overhead by introducing asynchronous CDC, in which change data is
extracted from the redo logs rather than as part of the source
transaction, so the performance impact on the source database is small.
Asynchronous CDC can be described as a lightweight technology for change
extraction and propagation in a data warehousing system, in which
changes to the source tables are presented as relational data for
consumption by subscribers.
It is therefore fair to say that asynchronous CDC has greatly enhanced
parallel log file processing and data transformation.
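The idea of capturing changes from a log, rather than inside the source
transaction, can be sketched as a toy model. This is an illustration
only and not Oracle's implementation: the classes RedoEntry,
SourceDatabase and ChangeTable and all of their fields are hypothetical
names invented for this example, and a real redo log is not a list of
Python objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RedoEntry:
    """One logged change; a stand-in for a redo record, not Oracle's format."""
    scn: int                      # monotonically increasing change number
    table: str
    operation: str                # "INSERT", "UPDATE" or "DELETE"
    row: Dict[str, object]


@dataclass
class SourceDatabase:
    """The source only appends to its log; it does no extra work for capture."""
    redo_log: List[RedoEntry] = field(default_factory=list)
    next_scn: int = 1

    def apply(self, table: str, operation: str, row: Dict[str, object]) -> None:
        self.redo_log.append(RedoEntry(self.next_scn, table, operation, row))
        self.next_scn += 1


@dataclass
class ChangeTable:
    """Captured changes for one source table, exposed as plain rows."""
    source_table: str
    rows: List[Dict[str, object]] = field(default_factory=list)
    last_scn: int = 0             # high-water mark of what has been read

    def capture(self, redo_log: List[RedoEntry]) -> int:
        """Read entries newer than last_scn; return how many were captured."""
        captured = 0
        for entry in redo_log:
            if entry.scn > self.last_scn and entry.table == self.source_table:
                self.rows.append(
                    {"scn": entry.scn, "operation": entry.operation, **entry.row}
                )
                captured += 1
        if redo_log:
            # Remember how far the log has been read, even for skipped tables.
            self.last_scn = max(self.last_scn, redo_log[-1].scn)
        return captured
```

In this sketch the source database never calls the capture step; a
separate process would call capture() on its own schedule, which is the
sense in which the capture is asynchronous. Subscribers would then read
the rows of the change table like any other relational data.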
Heterogeneous Transportable Tablespaces
Transportable tablespaces were introduced in Oracle8i. Moving data with
transportable tablespaces is much faster than exporting and importing
the same data, because transporting a tablespace only involves copying
its datafiles and integrating the tablespace's schema information
(metadata) into the target data dictionary.
Transportable tablespaces have proved useful in several ways, including:
- Loading data from OLTP applications into data warehouse systems.
- Feeding data marts from central data warehouses.
- Updating data warehouses and data marts from staging systems.
Although the benefits are considerable, transportable tablespaces have
suffered from some limitations over the years. These include:
- The source and target databases must be on the same operating system
for you to be able to transport tablespaces. It is impossible, for
example, to transport a tablespace from an Oracle database on NT to an
Oracle database on HP-UX.
- Only a self-contained set of tablespaces can be transported, that is,
a set in which no object references anything outside the set.
- The source and target databases cannot use different character sets;
both databases must use the same character set.
- Transportable tablespaces do not support function-based indexes. A
function-based index is used primarily to improve query performance in
cases where the WHERE clause of a SQL statement applies operations to
the columns.
- Transportable tablespaces do not support materialized views.
- Scoped REFs are also not supported by transportable tablespaces.
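The self-containment rule in the list above can be illustrated with a
small sketch. Oracle provides its own check for this
(DBMS_TTS.TRANSPORT_SET_CHECK); the code below does not call it and is
only a toy model of the rule. The function find_violations and the
object and tablespace names in the usage note are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def find_violations(
    object_tablespace: Dict[str, str],
    references: List[Tuple[str, str]],
    transport_set: Set[str],
) -> List[Tuple[str, str]]:
    """Return references that start inside the set and point outside it.

    object_tablespace maps each object name to the tablespace holding it.
    references lists (referrer, referenced) pairs, for example an index
    and its table, or a child table and the parent of its foreign key.
    """
    violations = []
    for referrer, referenced in references:
        referrer_inside = object_tablespace[referrer] in transport_set
        target_inside = object_tablespace[referenced] in transport_set
        if referrer_inside and not target_inside:
            violations.append((referrer, referenced))
    return violations
```

For example, if a table SALES in tablespace TS_DATA has a foreign key to
CUSTOMERS in TS_REF, then the set {TS_DATA, TS_INDEX} is not
self-contained and the function reports the pair (SALES, CUSTOMERS);
adding TS_REF to the set removes the violation.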
With the advent of Oracle 10g, however, the requirement in earlier
releases that the source and target databases run on the same operating
system has been removed. Transportable tablespaces are now
cross-platform: you can transport tablespaces between databases on
different platforms. When the platforms differ in byte ordering
(endianness), the RMAN CONVERT command is used to convert the datafiles
to the byte ordering of the target platform. The conversion can be done
on the source platform before the datafiles are moved or, alternatively,
on the destination platform after the tablespace datafiles have been
transported.
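Why a byte-ordering conversion is needed at all can be seen with a short
sketch. It is not how RMAN works internally and does not read Oracle
datafiles; it only shows that the same four bytes mean different numbers
on big-endian and little-endian platforms. The function convert_word and
the sample value are assumptions made for this illustration.

```python
import struct


def convert_word(raw: bytes, source_endian: str, target_endian: str) -> bytes:
    """Re-encode one 4-byte unsigned integer from source to target byte order."""
    codes = {"big": ">I", "little": "<I"}
    (value,) = struct.unpack(codes[source_endian], raw)
    return struct.pack(codes[target_endian], value)


# A value as it would be written on a big-endian platform.
block_count = 1024
on_big_endian = struct.pack(">I", block_count)

# Read unconverted on a little-endian platform, the bytes give a wrong number.
misread = struct.unpack("<I", on_big_endian)[0]

# After conversion, the little-endian platform reads the intended value.
converted = convert_word(on_big_endian, "big", "little")
reread = struct.unpack("<I", converted)[0]

print(misread, reread)
```

A datafile contains many such multi-byte values, which is why the files
have to be converted as a whole when the source and target platforms use
different byte orderings, and why no conversion is needed when they use
the same one.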
Oracle Data Pump
One of the new features of Oracle 10g is Oracle Data Pump, a server-side
infrastructure for fast, bulk movement of data and metadata from one
Oracle database to another. Oracle Data Pump is highly flexible: it not
only allows you to use a customized data movement utility, but also lets
you monitor the status of a load and cancel, suspend and resume it. A
load can also be restarted after a failure without loss of data
integrity.
When unloading data physically, Oracle Data Pump uses the external table
as the unload mechanism: the external table Data Pump unload driver
writes the data into platform-independent, Oracle-proprietary files. The
new external table unload mechanism can be used on its own, without the
new export/import utilities (expdp/impdp), which are client-side
utilities that make application programming interface (API) calls into
Oracle Data Pump.
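As a rough sketch of how the expdp client might be driven, the helper
functions below only assemble command-line argument lists; they do not
run anything or connect to a database. DIRECTORY, DUMPFILE, SCHEMAS,
JOB_NAME and ATTACH are expdp parameters, but the function names and
every value passed to them (connect string, directory object, file and
job names) are placeholders assumed for this example, and the exact
parameter set should be checked against the utilities documentation for
your release.

```python
from typing import List


def build_expdp_args(
    connect: str, directory: str, dumpfile: str, schema: str, job_name: str
) -> List[str]:
    """Assemble arguments for a schema-level export with a named job."""
    return [
        "expdp",
        connect,                      # placeholder connect string
        f"DIRECTORY={directory}",     # database directory object for the files
        f"DUMPFILE={dumpfile}",
        f"SCHEMAS={schema}",
        f"JOB_NAME={job_name}",       # a name makes the job easy to re-attach to
    ]


def build_attach_args(connect: str, job_name: str) -> List[str]:
    """Assemble arguments for attaching a client session to an existing job."""
    return ["expdp", connect, f"ATTACH={job_name}"]


export_args = build_expdp_args("hr", "dpump_dir", "hr.dmp", "hr", "hr_export")
attach_args = build_attach_args("hr", "hr_export")
print(" ".join(export_args))
print(" ".join(attach_args))
```

Because the job runs on the server, a client that has attached to it can
issue interactive commands such as STATUS, STOP_JOB, START_JOB and
KILL_JOB, which correspond to the monitoring, suspending, resuming and
cancelling described above.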
The above text is an excerpt from Mark Rittman's Oracle Weblog:
http://www.rittman.net/archives/000901.html
About the author
Kehinde O Eseyin is a computer scientist, an Oracle DBA and an Oracle
Certified Professional working with the Nigerian Ports Authority, Delta
Ports, Warri, Nigeria.