The Master Table
Cluster Tips by Burleson Consulting
This is an excerpt from the book Oracle Grid & Real Application Clusters.
At the heart of the Data Pump
operation is the Master Table. This table is created at the
beginning of a Data Pump operation and is dropped upon the
operation's successful completion. The Master Table can
also be dropped if the job is killed using the kill_job interactive
command. If a job is stopped using the stop_job interactive command
or if the job is terminated unexpectedly, the Master Table will be
retained. The keep_master parameter can be set to Y to retain
the Master Table at the end of a successful job for debugging
purposes. The name of the Master Table is the same as the Data
Pump job name.
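As a quick sketch, the keep_master parameter can be used to preserve the Master Table after a successful export so that it can be examined. The schema, credentials, and directory object below are hypothetical:

```shell
# Hypothetical export that keeps the Master Table for inspection.
# The job name SCOTT_EXP is also the name of the Master Table
# created in the exporting user's schema.
expdp scott/tiger schemas=scott directory=DATA_PUMP_DIR \
      dumpfile=scott.dmp job_name=SCOTT_EXP keep_master=y
```

After the job completes, a table named SCOTT_EXP remains in the SCOTT schema and can be queried like any other table.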
The Master Table is used to
track the detailed progress information of a Data Pump job:
* The current set of dump files
* The current state of every object exported or imported and its location in the dump file set
* The job's user-supplied parameters
* The status of every worker process
* The current job status and restart information
The Master Table is created in
the schema of the current user running the Data Pump export or
import, and it keeps track of a great deal of detailed information.
As a result, this table can take up a significant amount of storage.
Tip: If an export or an import
job is running on a database with a large number of database
objects, make sure the schema user has sufficient tablespace quota.
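To see which Data Pump jobs currently exist in a database, the DBA_DATAPUMP_JOBS view can be queried; the credentials below are hypothetical:

```shell
# Hypothetical check of existing Data Pump jobs (requires DBA privileges).
sqlplus -s system/manager <<'EOF'
SELECT owner_name, job_name, operation, job_mode, state
FROM   dba_datapump_jobs;
EOF
```

A job whose Master Table was retained after an abnormal stop will typically appear here with a state of NOT RUNNING.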
The Master Table is the key to
Data Pump's restart capability in the event of a planned or
unplanned job stoppage.
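A restart can be sketched as follows: re-attach to the stopped job by name and resume it with the interactive start_job command. The job name and credentials are hypothetical:

```shell
# Hypothetical stop-and-restart of a job named SCOTT_EXP.
expdp system/manager attach=SCOTT_EXP
#   Export> stop_job=immediate    -- stops the job, keeps the Master Table
# ...later, re-attach; the Master Table supplies the restart state:
expdp system/manager attach=SCOTT_EXP
#   Export> start_job
```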
Data Pump uses a number of cooperating processes to carry out the
entire data load and unload operation. Figure 17.2 shows these
components and how they interrelate.
Figure 17.2: Process
The following is a listing of
some of the most important components:
* Client Process: This process,
impdp or expdp, makes calls to the Data Pump API. After a Data
Pump job is established, the client process is not required to keep
the job running. Multiple clients can attach and detach from a
job for the purpose of monitoring and control.
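For example, a second client can attach to a running job purely to watch it; the job name and credentials are hypothetical:

```shell
# Hypothetical monitoring session against an already-running job.
expdp system/manager attach=SCOTT_EXP
#   Export> status         -- show current job progress
#   Export> exit_client    -- detach; the job keeps running on the server
```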
* Shadow Process: This is a
foreground process that resides on the database server. This
process is created when a client logs into the Oracle server.
The shadow process creates a job, which primarily consists of
creating the Master Table, creating the queues in Advanced Queues (AQ)
used for communication among the various processes, and creating the
master control process. Once a job is running, the shadow
process's main task is to check the job status for the client process.
If the client process detaches, the shadow process goes away;
however, the remaining Data Pump job processes are still active.
Another client process can create a new shadow process and attach to
the existing job.
* Master Control Process: This
process controls the execution and sequencing of a Data Pump job.
It mainly sits in a work-dispatch loop during a job's execution and
remains until the job completes successfully. The master control
process has two main functions: dividing the loading and unloading
of data and metadata among the worker processes, and managing the
information in the Master Table while recording job activities
in the log file. It maintains job state, job description, restart,
and dump file set information in the Master Table.
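The server sessions belonging to a job, including the master control and worker processes, can be identified by joining the DBA_DATAPUMP_SESSIONS view to V$SESSION; the credentials are hypothetical:

```shell
# Hypothetical listing of the server processes attached to Data Pump jobs.
sqlplus -s system/manager <<'EOF'
SELECT d.job_name, s.sid, s.program
FROM   dba_datapump_sessions d, v$session s
WHERE  d.saddr = s.saddr;
EOF
```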
* Worker Process: This process
handles the requests assigned by the master control process.
The worker process is responsible for loading and unloading data and
metadata. The number of worker processes can be set by assigning
a value to the parallel parameter. Each worker process records the
object rows it handles and their current status, such as pending,
completed, or failed, in the Master Table, along with the type of
object, such as tables, indexes, or views, currently being worked on.
This status information is what allows a stopped job to be restarted.
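A sketch of the parallel parameter in use, with hypothetical file names and directory object; the %U template generates a distinct dump file for each worker:

```shell
# Hypothetical full export driven by four worker processes.
expdp system/manager full=y parallel=4 \
      dumpfile=full%U.dmp directory=DATA_PUMP_DIR job_name=FULL_EXP
```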
* Parallel Query Process: This
process is used when Data Pump chooses the External Table API as the
data access method for loading and unloading data. In that case, the
worker process creates multiple parallel query processes for data
movement and acts as the query coordinator.