The Master Table
At the heart of the Data Pump operation is the Master Table. This table is created at the beginning of a Data Pump operation and is dropped when the operation completes successfully. The Master Table is also dropped if the job is killed with the kill_job interactive command. If a job is stopped with the stop_job interactive command, or if the job terminates unexpectedly, the Master Table is retained. The keep_master parameter can be set to Y to retain the Master Table at the end of a successful job for debugging purposes. The Master Table has the same name as the Data Pump job and contains the following columns:
SQL> desc SYS_ESTIMATE_SCHEMA_01
 Name                   Null?    Type
 ---------------------- -------- ----------------
 PROCESS_ORDER                   NUMBER
 DUPLICATE                       NUMBER
 DUMP_FILEID                     NUMBER
 DUMP_POSITION                   NUMBER
 DUMP_LENGTH                     NUMBER
 DUMP_ALLOCATION                 NUMBER
 COMPLETED_ROWS                  NUMBER
 ERROR_COUNT                     NUMBER
 ELAPSED_TIME                    NUMBER
 OBJECT_TYPE_PATH                VARCHAR2(200)
 OBJECT_PATH_SEQNO               NUMBER
 OBJECT_TYPE                     VARCHAR2(30)
 IN_PROGRESS                     CHAR(1)
 OBJECT_NAME                     VARCHAR2(500)
 OBJECT_SCHEMA                   VARCHAR2(30)
 PARTITION_NAME                  VARCHAR2(30)
 FLAGS                           NUMBER
 COMPLETION_TIME                 DATE
 OBJECT_TABLESPACE               VARCHAR2(30)
 SIZE_ESTIMATE                   NUMBER
 OBJECT_ROW                      NUMBER
 PROCESSING_STATE                CHAR(1)
 PROCESSING_STATUS               CHAR(1)
 BASE_OBJECT_TYPE                VARCHAR2(30)
 BASE_OBJECT_NAME                VARCHAR2(30)
 BASE_OBJECT_SCHEMA              VARCHAR2(30)
 PARALLELIZATION                 NUMBER
 UNLOAD_METHOD                   NUMBER
 GRANULES                        NUMBER
 SCN                             NUMBER
 DOMAIN_INDEX                    VARCHAR2(30)
 DOMAIN_INDEX_SCHEMA             VARCHAR2(30)
 GRANTOR                         VARCHAR2(30)
 NAME                            VARCHAR2(30)
 VALUE_T                         VARCHAR2(4000)
 VALUE_N                         NUMBER
 IS_DEFAULT                      NUMBER
 FILE_TYPE                       NUMBER
 USER_DIRECTORY                  VARCHAR2(4000)
 USER_FILE_NAME                  VARCHAR2(4000)
 FILE_NAME                       VARCHAR2(4000)
 EXTEND_SIZE                     NUMBER
 FILE_MAX_SIZE                   NUMBER
 EXTEND_ACTIVE                   NUMBER
 OVERFLOW_TO                     NUMBER
 PROCESS_NAME                    VARCHAR2(30)
 LAST_UPDATE                     DATE
 WORK_ITEM                       VARCHAR2(30)
 NON_TRANSACTIONAL               CHAR(1)
 OBJECT_NUMBER                   NUMBER
 COMPLETED_BYTES                 NUMBER
 TOTAL_BYTES                     NUMBER
 METADATA_IO                     NUMBER
 DATA_IO                         NUMBER
 CUMULATIVE_TIME                 NUMBER
 OLD_VALUE                       VARCHAR2(4000)
 SEED                            NUMBER
 LAST_FILE                       NUMBER
 USER_NAME                       VARCHAR2(30)
 OPERATION                       VARCHAR2(30)
 JOB_MODE                        VARCHAR2(30)
 VERSION                         NUMBER
 DB_VERSION                      VARCHAR2(30)
 STATE                           VARCHAR2(30)
 PHASE                           NUMBER
 GUID                            RAW(16)
 START_TIME                      DATE
 BLOCK_SIZE                      NUMBER
 METADATA_BUFFER_SIZE            NUMBER
 DATA_BUFFER_SIZE                NUMBER
 DEGREE                          NUMBER
 LANGUAGE                        VARCHAR2(30)
 PLATFORM                        VARCHAR2(100)
 ABORT_STEP                      NUMBER
 INSTANCE                        VARCHAR2(16)
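The lifecycle described above can be exercised directly from the expdp command line. The following is a minimal sketch; the schema, directory object, dump file, and job names are illustrative, and the availability of the keep_master command-line parameter depends on the release. The export retains its Master Table, which can then be inspected with an ordinary query:
$ expdp scott/tiger schemas=scott directory=dpump_dir dumpfile=scott.dmp \
    job_name=scott_export keep_master=y

SQL> SELECT object_name, object_type, processing_state
       FROM scott.scott_export
      WHERE object_type = 'TABLE';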
The Master Table is used to track the detailed progress information of a Data Pump job:
* The current set of dump files.
* The current state of every object exported or imported and its location in the dump file set.
* The job's user-supplied parameters.
* The status of every worker process.
* The current job state and restart information.
The Master Table is created in the schema of the user running the Data Pump export or import, and it keeps track of a large amount of detailed information. As a result, this table can take up a significant amount of storage space.
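Because the Master Table is an ordinary table in the invoking user's schema, its storage consumption can be checked like any other segment while a job is running. A minimal sketch, assuming a job (and therefore a Master Table) named SYS_EXPORT_SCHEMA_01:
SQL> SELECT segment_name, bytes/1024/1024 AS size_mb
       FROM user_segments
      WHERE segment_name = 'SYS_EXPORT_SCHEMA_01';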
Tip: If an export or an import
job is running on a database with a large number of database
objects, make sure the schema user has sufficient tablespace quota.
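A quota check along the following lines can be run before starting a large job; the user and tablespace names are illustrative, and a max_bytes value of -1 indicates an unlimited quota:
SQL> SELECT tablespace_name,
            bytes/1024/1024     AS used_mb,
            max_bytes/1024/1024 AS quota_mb
       FROM dba_ts_quotas
      WHERE username = 'SCOTT';

SQL> ALTER USER scott QUOTA UNLIMITED ON users;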
The Master Table is the key to Data Pump's restart capability in the event of a planned or unplanned job stoppage.
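As a sketch of that restart capability, a running export can be stopped from the interactive prompt and later resumed by reattaching to the job, because the Master Table survives the stoppage. The job name scott_export is illustrative:
Export> stop_job=immediate

$ expdp scott/tiger attach=scott_export
Export> start_job
Export> continue_client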
Process Architecture
Data Pump uses a number of cooperating processes to carry out the entire data load and unload operation. Figure 17.2 shows all of the components and how they interrelate.
Figure 17.2: Process architecture
The following is a listing of some of the most important components:
* Client Process: This process,
impdp or expdp, makes calls to the Data Pump API. After a Data
Pump job is established, the client process is not required to keep
the job running. Multiple clients can attach and detach from a
job for the purpose of monitoring and control.
* Shadow Process: This is a foreground process that resides on the database server. It is created when a client logs in to the Oracle server. The shadow process creates the job, which primarily consists of creating the Master Table, creating the Advanced Queuing (AQ) queues used for communication among the various processes, and creating the master control process. Once the job is running, the shadow process's main task is to check the job status for the client process. If the client process detaches, the shadow process goes away; however, the remaining Data Pump job processes stay active. Another client process can create a new shadow process and attach to the existing job.
* Master Control Process: This process controls the execution and sequencing of a Data Pump job. It mainly sits in a work-dispatch loop during a job's execution and stays until the job completes successfully. The master control process has two main functions: to divide the loading and unloading of data and metadata into tasks and hand them to the worker processes, and to manage the information in the Master Table and record job activity in the log file. The master control process maintains job state, job description, restart, and dump file set information in the Master Table.
* Worker Process: This process handles the requests assigned by the master control process and is responsible for loading and unloading data and metadata. The number of worker processes needed can be defined by setting the parallel parameter. The worker process maintains the object rows and their current status, such as pending, completed, or failed, in the Master Table, along with information about the type of object, such as a table, index, or view, currently being worked on. This information can be used when stopping or restarting a job.
* Parallel Query Process: This process is used when Data Pump chooses the External Table API as the data access method for loading and unloading data. The worker process that uses the External Table API creates multiple parallel query processes for data movement, with the worker process acting as the query coordinator.
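To watch these processes at work, the data dictionary views that expose Data Pump jobs and their sessions can be queried while a job is running. A minimal sketch (the exact columns available vary slightly by release):
SQL> SELECT owner_name, job_name, operation, job_mode, state, degree
       FROM dba_datapump_jobs;

SQL> SELECT s.program, d.job_name
       FROM dba_datapump_sessions d, v$session s
      WHERE d.saddr = s.saddr;
The second query returns one row per Data Pump session, so the master control and worker processes appear in the program column alongside the shadow session for any attached client.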