Introducing External Tables
One of the key new features in Oracle for
business intelligence and data warehousing was the inclusion of
a number of ETL features within the database, the point of which
was to remove the requirement to purchase a separate ETL engine
such as Informatica, Genio or Datastage. The approach proposed
by the author uses these new features to shortcut the warehouse
load process.
If you remember from earlier in the article, our
load process had to carry out three steps:
- load data from a large flat file
- apply transformations to the data
- update/insert the data into a dimension table
Traditionally, you would accomplish this first step by using
SQL*Loader. SQL*Loader allows you to import a flat file into
the database, carry out basic transformations on it (change
case, reformat, reject invalid rows), and if it helps speed up
the process, load it in parallel using direct path. As you can
only insert data using SQL*Loader, and the degree to which
transformations can take place is limited, you'd generally load
the data into a staging table and then process it in a second
step.
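To make that second step concrete, here is a minimal sketch of what the processing might look like once SQL*Loader has filled a staging table. The names contracts_stg and contracts_dim, and the particular clean-up rules, are assumptions made up for illustration; they are not taken from the load process described earlier.

```sql
-- Hypothetical staging table, filled by SQL*Loader in the first step
create table contracts_stg (
  contract_id        number,
  contract_desc      varchar2(50),
  init_val_loc_curr  number
);

-- Second step: apply the transformations SQL*Loader could not easily do,
-- then load the result into the (assumed, already existing) dimension table
insert /*+ append */ into contracts_dim
  (contract_id, contract_desc, init_val_loc_curr)
select contract_id,
       upper(trim(contract_desc)),
       init_val_loc_curr
from   contracts_stg
where  contract_id is not null;

commit;
```

The point of the sketch is the extra hop: the data is written once to the staging table and then read again before it reaches the dimension.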
Oracle, however, introduced a new feature called
External Tables, which allows you to define a database table
over a flat file. Taking as our example a comma-separated
contracts file that is used to load data into a contracts
dimension, the code to create an external table would be:
create directory inp_dir as '/home/oracle/input_files';

create table contracts_file (
  contract_id        number,
  description        varchar2(50),  -- "desc" is a reserved word in Oracle, so the column is renamed
  init_val_loc_curr  number
)
organization external (
  type oracle_loader
  default directory inp_dir
  access parameters (
    fields terminated by ','
  )
  location ('contracts_file.inp')
)
parallel 10;
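In practice you may want the access parameters to spell out a little more of what a SQL*Loader control file would normally contain. The variant below is only a sketch: the table name contracts_file_ext, the bad and log file names, and the choice of an unlimited reject limit are assumptions for illustration, not part of the original example.

```sql
create table contracts_file_ext (
  contract_id        number,
  description        varchar2(50),
  init_val_loc_curr  number
)
organization external (
  type oracle_loader
  default directory inp_dir
  access parameters (
    records delimited by newline
    badfile inp_dir:'contracts_file.bad'
    logfile inp_dir:'contracts_file.log'
    fields terminated by ',' optionally enclosed by '"'
    missing field values are null
  )
  location ('contracts_file.inp')
)
reject limit unlimited;
```

Rows that fail to parse are written to the bad file rather than stopping the query, which roughly mirrors the "reject invalid rows" behaviour you would get from SQL*Loader.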
The External Table feature allows you to embed
the SQL*Loader control file into the table DDL script, and then
allows you to run SELECT statements against the flat file. You
can include the external table in joins, subqueries and so on,
but you can't use the external table to delete or update data in
the flat file. External tables came in with Oracle , and
they're available as data sources in recent versions of Oracle
Warehouse Builder.
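As a rough illustration of how this plays out, the statements below query the external table directly, join it to a regular table, and then use it as the source of a MERGE. The contracts_dim table and its columns are hypothetical, and the MERGE is only a sketch of how the update/insert step could be written, not necessarily the approach the rest of the article takes.

```sql
-- Query the flat file as if it were an ordinary table
select contract_id, init_val_loc_curr
from   contracts_file
where  init_val_loc_curr > 0;

-- Join the external table to a regular (hypothetical) dimension table
-- to find contracts whose value differs from what is already stored
select d.contract_id,
       d.init_val_loc_curr as stored_value,
       f.init_val_loc_curr as file_value
from   contracts_dim d
join   contracts_file f
on     f.contract_id = d.contract_id
where  d.init_val_loc_curr <> f.init_val_loc_curr;

-- Use the external table as the source of an update/insert in one statement
merge into contracts_dim d
using (select contract_id, init_val_loc_curr
       from   contracts_file) f
on    (d.contract_id = f.contract_id)
when matched then
  update set d.init_val_loc_curr = f.init_val_loc_curr
when not matched then
  insert (contract_id, init_val_loc_curr)
  values (f.contract_id, f.init_val_loc_curr);

-- DML aimed at the external table itself is rejected with an error,
-- because the flat file can only be read:
-- delete from contracts_file;
```

Because the file is read in place, no staging table is needed between the flat file and the dimension.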