Call now: 252-767-6166  
Oracle Training Oracle Support Development Oracle Apps

Free Oracle Tips

HTML Text

 Home
 E-mail Us
 Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB   


 

 

 


 

 

 

 

 

Oracle Data Warehouse Loading Techniques

Oracle Data Warehouse Tips by Burleson Consulting

Loading Techniques

In Oracle, the standard way to load a data warehouse involves multiple steps. The steps involved are usually:

1. Extract data from source database(s)

2. Load data into temporary work tables performing any possible transformation/clean-up and aggregation.

3. Use internal/external scripts and stored objects to transform, aggregate and load the data

The benefits of using temporary tables is that the data is loaded fairly fast into the environment where it can be cleaned, transformed and aggregated using internal procedures which can be a highly automated process.

Alternatively the data can be transformed and aggregated as it is extracted using scripts and procedures and then can be loaded into the database using SQL*Loader. SQL*Loader can be used with a direct load option which pre-builds data blocks and inserts them into the database.

If data resides in non-Oracle data sources, there are gateways which Oracle provides that should be used to make these sources available to your processes:

* Procedural Gateway to APPC – applications that use IBM's Advanced program-to program communication.

* Transparent Gateway for IBM's DB2.

* Transparent Gateway for IBM's DRDA using the DRDA standard APPC/LU 6.2 protocol

* Transparent Gateway for EDA/SQL that allows transparent access to 15 IBM MVS mainframe databases.

* Transparent Gateway to Informix

* Transparent Gateway to RMS on Digital (DEC) (Now Compaq) VAX

* Transparent Gateway to Sybase

* Transparent Gateway to Teradata

Data can be sucked across these gateways into a holding table in an Oracle database for either manipulation or extraction into a flat file depending on how you wish to load the results into the warehouse.

Alternatively flat files can be extracted from the data source and transferred to the data warehouse server. If you can have IBM files translated from EBCIDIC format prior to your having to deal with it makes things much easier.

SQL*Loader direct load options

SQL*Loader  is usually the work horse of the data warehouse loading effort.  SQL*Loader can do the following:

* Load data from disk or tape

* Support a large number of data types

* Load into one or more tables based on specified criteria

* Load fixed or variable length records.

* Generate unique derived numeric keys as the data warehouse is loaded

* Support several high performance options

* Produce detailed error reports and reload tables to allow the isolation and correction of errors quickly and easily

* Preprocess data as it is loading it into the database.

If you specify the SQL*Loader command for your platform at the command line without specifying any arguments a usage guide will be printed. Here is an example:

SQL*Loader: Release 8.1.5.0.0 - Production on Sun Aug 29 20:10:28 1999

(c) Copyright 1999 Oracle Corporation.  All rights reserved.

Usage: SQLLOAD keyword=value [,keyword=value,...]

Valid Keywords:

userid -- ORACLE username/password          

control -- Control file name                 

log -- Log file name                      

bad -- Bad file name                     

data -- Data file name                    

discard -- Discard file name                 

discardmax -- Number of discards to allow (Default all)

skip -- Number of logical records to skip (Default 0)

load -- Number of logical records to load (Default all)

errors -- Number of errors to allow (Default 50)

rows -- Number of rows in conventional path bind array or between

direct path data saves (Default: Conventional path 64, Direct path all)

bindsize -- Size of conventional path bind array in bytes (Default 65536)

silent -- Suppress messages during run (header,feedback,errors,discards,partitions)

direct -- use direct path (Default FALSE)

parfile -- parameter file: name of file that contains parameter specifications

parallel -- do parallel load (Default FALSE)

file -- File to allocate extents from     

skip_unusable_indexes -- disallow/allow unusable indexes or index partitions  (Default FALSE)

skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable  (Default FALSE)

commit_discontinued -- commit loaded rows when load is discontinued (Default FALSE)

readsize -- Size of Read buffer (Default 65535)

PLEASE NOTE: Command-line parameters may be specified either by position or by keywords.  An example of the former case is 'sqlload scott/tiger foo'; an example of the latter is 'sqlload control=foouserid=scott/tiger'.  One may specify parameters by position before but not after parameters specified by keywords.  For example,

'sqlload scott/tiger control=foo logfile=log' is allowed, but 'sqlload scott/tiger control=foo log' is not, even though the position of the parameter 'log' is correct.

One thing to notice when discussing the command line parameters for the SQL*Loader command is the caveat under the ROWS parameter. A direct path load by default commits only after the complete load is completed. What this means for a direct load without a ROWS specification is that you better have enough rollback area to handle the entire load transaction volume.

A direct path load is the fastest way to load data into an Oracle database. You specify a direct path load by use of the DIRECT=TRUE keyword pair. By splitting input record sets into multiple files and then using the PARALLEL=TRUE keyword pair you can use multiple SQL*Loader sessions to load data into the same table (each will get a data segment equal to the setting of the INITIAL parameter to load into, plan for this) allowing almost a direct multiplication of throughput with an equally linear decrease in loading time.

Another key to speeding the load process using SQL*Loader is to eliminate logging as much as possible. You can eliminate all but data dictionary action redo logging by using the NOLOGGING option against the tables to be loaded before beginning the load. To put is simply, issue the

ALTER TABLE table-name NOLOGGING;

command against the table before beginning the load. Another method is to place the keyword UNRECOVERABLE in the start of the SQL*Loader control file.

A SQL*Loader session will produce a log file and a discard file. The discard and log file will be named the same as the load file with the endings ".bad" and ".log" unless the tool is told otherwise. The arguments can be specified by position or through the use of the keyword with or without an equal sign.

To summarize, in order to minimize the time required to perform a SQL*Load operation you:

1. Set all involved tables to NOLOGGING using ALTER TABLE

2. Split the input data into multiple files

3. Place the UNRECOVERABLE keyword in the control file(s) for the load

4. Use the DIRECT=TRUE and PARALLEL=TRUE keywords

IMPORT-EXPORT

Import and export are two reflexive products, one undoes what the other does. Export extracts a logical copy of the objects and data while Import uses the data from Export to rebuild and reload objects and data. An export file taken from any recent version of Oracle can usually be loaded into any current version at least back one or two main point releases or more.

This logical unload and reload of data performed in EXPORT and IMPORT can be used to move both small and large tables from source Oracle databases to the data warehouse to load into temporary tables for further processing. Generally you will not load the data into the same user as it was extracted from so a FROMUSER-TOUSER IMPORT is generally used. Using the same method as we did with SQL*Loader (except you must specify help=y) here are quick synopsis of the IMP and EXP command sets on an NT version of Oracle:

E:\ exp help=y

Export : Release 8.1.5.0 – Production on Sun Aug 29 20:58:21 1999

(c) Copyright 1999 Oracle Corporation. All rights reserved.

You can let Export prompt you for parameters by entering the EXP

Command followed by your username/password:

     Example: EXP SCOTT/TIGER

Or, you can control how Export runs by entering the EXP command followed by various arguments. To specify parameters, you use keywords:

     Format: EXP KEYWORD=value or KEYWORD=(value2, value2,…,valueN)

     Example: EXP SCOTT/TIGER GRANTS=Y TABLES=(EMP,DEPT,MGR)

              Or TABLES=(T1:P1,TI:P2) if on Oracle8 and T1 is partitioned.

USERID must be the first parameter on the command line.

Keyword  Description (Default)     Keyword      Description(Default)

USERID   userid/password           FULL         export entire file (N) 

BUFFER   size of data buffer       OWNER        list of owner usernames

FILE     output files (EXPDAT.DMP) TABLES       list of table names

COMPRESS import into one extent    RECORDLENGTH length of IO record

GRANTS   export grants (Y)         INCTYPE      incremental export type

INDEXES  export indexes (Y)        RECORD       track incr. Export (Y)

ROWS     export data rows (Y)      PARFILE      parameter filename

CONSTRAINTS export constraints (Y) CONSISTENT   cross-table consistency

LOG      log file of screen output STATISTICS   analyze objects

DIRECT   direct path (N)                        (ESTIMATE)

FEEDBACK display progress every X rows (0)

Oracle8 additions:

POINT_IN_TIME_RECOVER Tablespace point in time recovery (N)

RECOVERY_TABLESPACES List of tablespaces to recover

Oracle8i Additions:

FILESIZE maximum size of each dumpfile

QUERY    select clause used to export a subset of a table

The following keywords only apply to transportable tablespaces (O8i)

TRANSPORT_TABLESPACE export transportable tablespace metadata (N)

TABLESPACES list of tablespaces to transport

Export terminated successfully without warnings

E:\ imp help=y

Import: Release 8.1.5.0.0 – Production on Sun Aug 29 1999 20:58:21 1999

(c) Copyright 1999 Oracle Corporation. All rights reserved.

You can let Export prompt you for parameters by entering the EXP

Command followed by your username/password:

     Example: IMP SCOTT/TIGER

Or, you can control how Import runs by entering the IMP command followed by various arguments. To specify parameters, you use keywords:

     Format:  IMP KEYWORD=value or KEYWORD=(value2, value2,…,valueN)

     Example: IMP SCOTT/TIGER IGNORE=Y TABLES=(EMP,DEPT) FULL=N

              Or TABLES=(T1:P1,TI:P2) if on Oracle8 and T1 is partitioned.

USERID must be the first parameter on the command line.

Keyword  Description (Default)      Keyword     Description(Default)

USERID   username/password          FULL        import entire file (N)

BUFFER   size of data buffer        FROMUSER    list of owner usernames

FILE     input files (EXPDAT.DMP)   TOUSER      list of usernames

SHOW     just list file contents (N) TABLES     list of table names

IGNORE   ignore create errors (N)   RECORDLENGTH length of IO record

GRANTS   import grants (Y)          INCTYPE     incremental import type

INDEXES  import indexes (Y)         COMMIT      commit array insert (N)

ROWS     import data rows (Y)       PARFILE     parameter filename

LOG      log file of screen output  CONSTRAINTS import constraints (Y)

DESTROY  overwrite tablespace data file (N)

INDEXFILE write table/index info to specified file

SKIP_UNUSABLE_INDEXES skip maintenance of unusable indexes (N)

Oracle8 additions:

CHARSET character set of export file (NLS_LANG)

POINT_IN_TIME_RECOVER tablespace point-in-time recovery (N)

ANALYZE execute ANALYZE statments in dump file (Y)

FEEDBACK display progress every x rows (0)

Oracle8i additions:

TOID_VALIDATE skip validation of specified type ids

FILESIZE maximum size of each dump file

RECALCULATE_STATISTICS recalculate statistics (N)

The following keywords only apply to transportable tablespaces

TRANSPORT_TABLESPACE import transportable tablespace metadata (N)

DATAFILES datafiles to be transported into tablespace

ITS_OWNERS users that own data in the transportable tablespace set

Import terminated successfully without warnings.

Both import and export can be run as FULL, OWNER or TABLE level activities. Notice how partitions are specified with the colon separating the table name from the partition name. This use of partition name in the export import utility and the ability to specify it in selects and other command in Oracle8 argue for not letting Oracle do default naming of partitions.

Notice the COMPRESS keyword. The COMPRESS keyword forces all data for the tables in the export to be forced into one extent, the next extent is sized the same as in the original table. If you aren't careful using COMPRESS in export and import you could exceed the size of your import tablespace data files  for large tables. Another important IMPORT keyword is IGNORE, if the tables already exist in the target database use the IGNORE keyword to ignore creation errors when objects already exist.

Another IMPORT keyword is COMMIT, use COMMIT to specify to commit after the number of rows that will fit into a single buffer are imported. By committing after a specified number of rows instead of the default which is after a full table is imported you control the size of rollback segment required. If you use commit use an initial buffer specification of around 1,000,000 to start with and move up from there. On some platforms such as SUN Solaris after a certain buffer size is reached no further increases change the number of rows committed.

 
 
 
Get the Complete
Oracle Utility Information 

The landmark book "Advanced Oracle Utilities The Definitive Reference"  contains over 600 pages of filled with valuable information on Oracle's secret utilities. This book includes scripts and tools to hypercharge Oracle 11g performance and you can buy it for 30% off directly from the publisher.
 

 


 

 
  
 

 
 
 
 
Oracle performance tuning software
 
 

 

 
 
 
Oracle performance Tuning 10g reference poster
 
 
 
Oracle training in Linux commands
 
Oracle training Excel
 
Oracle training & performance tuning books
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2012 

All rights reserved.

Oracle © is the registered trademark of Oracle Corporation.