Question: What is an
Oracle two phase commit? I have a distributed update and I
want to understand how Oracle prevents partial updates with the two
phase commit mechanism.
Answer: Oracle developed
the two phase commit to allow t]for the controlling SQL to ensure
that all remote sites have committed their data before issuing a
local commit.
While data
integrity is managed very effectively within a single database with
row locking, deadlock detection, and roll-back features, distributed
data integrity is far more complex. Recovery in a distributed
database environment involves ensuring that the entire transaction
has completed successfully before issuing a COMMIT to each of the
subcomponents in the overall transaction. This can often be a
cumbersome chore, and it is the idea behind the the two-phase
commit.
One popular alternative to the two-phase commit is
replicating information and relying on asynchronous replication
techniques to enforce the data integrity. Asynchronous replication
refers to Oracle snapshots and requires a master-slave type of
configuration, whereby a master database relays updates to the slave
database on a periodic basis (using Oracle snapshots to create
master-slave replication). The snapshot approach makes sense when an
overall system does not require instant referential integrity.
Managing Two-Phase Commits (2PCs) with SQL*Net
When a distributed update (or delete) has
finished processing, SQL*Net will coordinate COMMIT processing,
which means that the entire transaction will roll back if any
portion of the transaction fails.
The first phase of this process is a prepare
phase to each node, followed by the COMMIT, and then terminated by a
forget phase.
If a distributed update is in the process of
issuing the 2PC and a network connection breaks, Oracle will place
an entry in the
dba_2pc_pending table. The recovery background
process (RECO) will then roll back or commit the good node to match
the state of the disconnected node to ensure consistency. You can
activate RECO via the ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY
command.
The
dba_2pc_pending table contains an ADVISE column
that directs the database to either commit or roll back the pending
item. You can use the ALTER SESSION ADVISE syntax to direct the 2PC
mechanism. For example, to force the completion of an INSERT, you
could enter the following:
ALTER SESSION
ADVISE COMMIT;
INSERT INTO PAYROLL@LONDON . . . ;
When a 2PC transaction fails, you can query the
dba_2pc_pending table to check the STATE column. Oracle notes:
Before RECO recovers an in-doubt transaction, the transaction appears in DBA_2PC_PENDING STATE as COLLECTING, COMMITTED, or PREPARED.
If you force an in-doubt transaction using COMMIT FORCE or ROLLBACK FORCE, then the states FORCED COMMIT or FORCED ROLLBACK may appear.
Automatic recovery normally deletes entries in these states.
The only exception is when recovery discovers a forced transaction that is in a state inconsistent with other sites in the transaction. In this case, the entry can be left in the table and the MIXED column in DBA_2PC_PENDING has a value of YES.
These entries can be cleaned up with the DBMS_TRANSACTION.PURGE_MIXED procedure.
See these notes on using dbms_transaction.
If you do this, the row will disappear from
dba_2pc_pending after the transaction has been
resolved. If you force the transaction the wrong way (for example,
roll back when other nodes committed), RECO will detect the problem,
set the MIXED column to yes, and the row will remain in the
dba_2pc_pending table.
Internally, Oracle examines the init.ora
parameters to determine the rank that the commit processing will
take. The
commit_point_strength
init.ora parameter determines which of the distributed databases is
to be the commit point site. In a distributed update, the database
with the largest value of commit_point_strength will be the
commit point site.
Viewing pending two phase commit
transactions
The commit point site is the database
that must successfully complete before the transaction is updated at
the other databases. Conversely, if a transaction fails at the
commit point site, the entire transaction will be rolled back at all
of the other databases. In general, the commit point site should be
the database that contains the most critical data. Below is a script
that will identify a two-phase commit transaction that has failed to
complete.
Two Phase commit pending.sql
reports on any pending distributed transactions.
set pagesize 999;
set feedback off;
set wrap on;
column local_tran_id
format a22 heading 'Local Txn Id'
column global_tran_id format
a50 heading 'Global Txn Id'
column state
format a16 heading 'State'
column mixed
format a5 heading 'Mixed'
column advice
format a5 heading 'Advice'
select
local_tran_id,
global_tran_id,
state,mixed,advice
from
dba_2pc_pending
order
by local_tran_id;
Managing two phase commits
When a distributed update (or delete) has finished processing,
SQL*Net will coordinate commit processing so that if any portion of
the transaction fails, the entire transaction will roll back. The
first phase of this process is a "prepare phase" to each node,
followed by the commit, and then terminated by a "forget phase."
If a distributed update is in the process of issuing the 2PC and
a network connection breaks, Oracle will place an entry in the
dba_2pc_pending table, and the recovery background process,
reco, will roll back or commit the good node to match the state
of the disconnected node, thereby ensuring consistency.
You can activate reco via the alter system enable
distributed recovery command. The reco process is
applicable only when the distributed option is installed in Oracle,
and is only used to manage 2PCs. If your system does not perform
cross-database synchronization, you may want to disable reco.
The dba_2pc_pending table contains an "advise" column
that directs the database to either commit or rollback the pending
item. You can use the alter session advise syntax to direct the 2PC
mechanism. For example, to force the completion of an insert, you
can enter the following:
ALTER SESSION ADVISE COMMIT;INSERT INTO PAYROLL@LONDON . . . ;
When a 2PC transaction fails, you can query the
dba_2pc_pending table to check the "state" column. You can
enter SQL*DBA and use the "recover in-doubt transaction" dialog box
to force either a rollback or a commit of the pending transaction.
If you do this, the row will disappear from dba_2pc_pending
after the transaction has been resolved.
If you force the transaction the wrong way (for example, to roll
back when other nodes have already committed), reco will detect the
problem, set the mixed column to "yes," and the row will remain in
dba_2pc_pending table.
Two Phase commit errors
Here are common errors related to the two phase
commit:
ORA-24756: Transaction
does not exist
Cause: An invalid
transaction identifier or context was used or the transaction has
completed.
Action: Supply a valid identifier if
the transaction has not completed and retry the call
ORA-02019: "connection description for
remote database not found"
Cause: The user
attempted to connect or log in to a remote database using a
connection description that could not be found.
Action:
Specify an existing database link. Query the data dictionary
to see all existing database links. See your operating
system-specific SQL*Net documentation for valid connection
descriptors.
ORA-02058: "no prepared
transaction found with ID %s"
Cause: A COMMIT
FORCE was attempted on a transaction, but the transaction with
LOCAL_TRAN_ID or GLOBAL_TRAN_ID was not found in the DBA_2PC_INDOUBT
table in prepared state.
Action: Check the
DBA_2PC_INDOUBT table to ensure the proper transaction ID is used
and attempt the commit again.
ORA-02068: "following severe error from %s%s
Cause: A severe error (disconnect, fatal Oracle error)
received from the indicated database link. See following error
text.
Action: Contact the remote system
administrator.
ORA-02050:
"transaction %s rolled back, some remote DBs may be in-doubt"
Cause: Network or remote failure in 2PC.
Action: Notify operations; remote DBs will
automatically re-sync when the failure is repaired.
ORA-02053: Transaction string committed, some remote DBs may be in-question
Cause: Network or remote failure in 2PC. (Two Phase commit)
Action: Notify operations; remote DBs will
automatically re-sync when the failure is repaired.
The ORA-02050 is for distributed updates, over
a database link, and it's often the result of losing connectivity
during an update. The root cause of this ORA-02050 error is a
network failure, not Oracle.
First, you need to clean any
in-flight transaction that may have failed by querying
dba_2pc_pending:
select * from
dba_2pc_pending;
Since the root cause is a network failure, get
your Network Administrator to monitor the network with netstat,
looking for lost packets or other network errors.
|
|
Get the Complete
Oracle SQL Tuning Information
The landmark book
"Advanced Oracle
SQL Tuning The Definitive Reference" is
filled with valuable information on Oracle SQL Tuning.
This book includes scripts and tools to hypercharge Oracle 11g
performance and you can
buy it
for 30% off directly from the publisher.
|