Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles
New Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB  

Don Burleson Blog 


 

 

 


 

 

 

 


Tracing Parallel Execution - Part II
July 11, 2005
Mark Rittman

Lots of PX "idle" events, and SQL*Net message to/from client, but not much on what the query actually waited on. If I take a look in my BDUMP directory though, I've got another four trace files, 2 for the processes that scanned my SALES_COPY table and two that did the sorting because of the join.
C:\oracle\admin\markr10g\bdump>dir
 Volume in drive C has no label.
 Volume Serial Number is D47C-241A

 Directory of C:\oracle\admin\markr10g\bdump

26/04/2005  20:03    <DIR>          .
26/04/2005  20:03    <DIR>          ..
26/04/2005  19:42           349,515 alert_markr10g.log
26/04/2005  20:03             2,353 markr10g_p000_11560.trc
26/04/2005  20:03             2,228 markr10g_p001_6892.trc
26/04/2005  20:03             8,857 markr10g_p002_7568.trc
26/04/2005  20:03             8,312 markr10g_p003_4672.trc
               5 File(s)        371,265 bytes
               2 Dir(s)  28,870,373,376 bytes free

Now if I format the P002 trace file using TKPROF and look at the bit relating to my SELECT statement, I get the following output:

select p.prod_name, sum(s.amount_sold)
from   products p, sales_copy s
where  p.prod_id = s.prod_id
group by p.prod_name

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.99       3.03       2246       2269          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.99       3.03       2246       2269          0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  PX COORDINATOR  (cr=0 pr=0 pw=0 time=0 us)
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0  PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0  HASH JOIN  (cr=0 pr=0 pw=0 time=0 us)
      0   BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
      0    PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0  TABLE ACCESS FULL PRODUCTS (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0   PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us)
      0    TABLE ACCESS FULL SALES_COPY (cr=0 pr=0 pw=0 time=0 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PX Deq: Execution Msg                          28        0.08          0.23
  PX Deq Credit: send blkd                        3        0.00          0.00
  PX Deq: Table Q Normal                          3        0.01          0.01
  PX qref latch                                   2        0.00          0.00
  direct path read                               74        0.13          1.58

Note the "direct path read" waits. The P003 trace file had the same profile. Also, if I take a look at the P000 and P001 trace files, which presumably are doing the sorting, I get the following profile:

select p.prod_name, sum(s.amount_sold)
from   products p, sales_copy s
where  p.prod_id = s.prod_id
group by p.prod_name

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       3.06          0          0          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       3.06          0          0          0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  PX COORDINATOR  (cr=0 pr=0 pw=0 time=0 us)
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0  PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0  HASH JOIN  (cr=0 pr=0 pw=0 time=0 us)
      0   BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
      0    PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0  TABLE ACCESS FULL PRODUCTS (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0   PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us)
      0    TABLE ACCESS FULL SALES_COPY (cr=0 pr=0 pw=0 time=0 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PX Deq: Execution Msg                           2        2.00          2.92
  PX Deq: Table Q Normal                          3        0.10          0.10
  PX Deq Credit: send blkd                        2        0.00          0.00
  PX qref latch                                   2        0.00          0.01

So what do I do if I need to get the full picture of all the wait events that have slowed my SQL statement down?

I'm using Oracle 10g, and I remembered from a previous article I'd written that DBMS_MONITOR allows us to set a "client_id" that can be used, along with a new utility called TRCSESS, to "stitch together" trace files that are part of one logical session. Using DBMS_MONITOR, I should be able to produce a single trace file that contains all the trace data for my SQL statement.

The first step then is to log on again as SH and this time use DBMS_MONITOR

SQL> exec dbms_session.set_identifier('px_test2');

PL/SQL procedure successfully completed.

SQL> alter session set timed_statistics = true;

Session altered.

SQL> alter session set max_dump_file_size = unlimited;

Session altered.

SQL> alter session set tracefile_identifier = 'px_test2';

Session altered.

SQL> dbms_monitor.client_id_trace_enable(client_id=>'px_test2');
SP2-0734: unknown command beginning "dbms_monit..." - rest of line ignored.
SQL> exec dbms_monitor.client_id_trace_enable(client_id=>'px_test2');

PL/SQL procedure successfully completed.

Note the call to DBMS_SESSION that sets the client_identifier for this session. This client identifier will be used to tag my SQL statements in the SQL trace file. Also, ensure that your user has execute permission on DBMS_MONITOR for this to work.

Now, I run my SELECT statement as before;

SQL> select p.prod_subcategory, sum(s.amount_sold)
  2  from products p, sales_copy s
  3  where p.prod_id = s.prod_id
  4  group by p.prod_subcategory;

PROD_SUBCATEGORY                                   SUM(S.AMOUNT_SOLD)
-------------------------------------------------- ------------------
CD-ROM                                                        3995633
Camcorders                                                  8314815.4
Camera Batteries                                           2143810.36
Camera Media                                               1190971.74
Cameras                                                     6312268.4
Documentation                                              3721699.61
Game Consoles                                               2082330.3
Memory                                                     4948442.05
Monitors                                                   12687899.6
Portable PCs                                               15011642.5
Recordable CDs                                             2075521.27

PROD_SUBCATEGORY                                   SUM(S.AMOUNT_SOLD)
-------------------------------------------------- ------------------
Y Box Games                                                1550229.35
Accessories                                                4057301.26
Bulk Pack Diskettes                                         624333.35
Desktop PCs                                                5635963.08
Home Audio                                                 10537384.2
Modems/Fax                                                 2185350.59
Operating Systems                                          3543725.89
Printer Supplies                                           5244547.32
Recordable DVD Discs                                       1914390.17
Y Box Accessories                                           427571.78

21 rows selected.

and then switch off trace:

SQL> exec dbms_monitor.client_id_trace_disable(client_id=>'px_test2');

PL/SQL procedure successfully completed.

If I take a look at the raw trace file for my main session, the one that requested all the PX slaves, I can see the client ID within the trace entries:

Dump file c:\oracle\admin\markr10g\udump\markr10g_ora_11704_px_test2.trc
Tue Apr 26 20:27:11 2005
ORACLE V10.1.0.4.0 - Production vsnsta=0
vsnsql=13 vsnxtr=3
Oracle Database 10g Enterprise Edition Release 10.1.0.4.0 - Production
With the Partitioning, OLAP and Data Mining options
Windows XP Version V5.1 Service Pack 2
CPU             : 1 - type 586
Process Affinity: 0x00000000
Memory (A/P)    : PH:561M/2047M, PG:1392M/3945M, VA:1665M/2047M
Instance name: markr10g

Redo thread mounted by this instance: 1

Oracle process number: 17

Windows thread id: 11704, image: ORACLE.EXE (SHAD)


*** 2005-04-26 20:27:11.455
*** ACTION NAME:() 2005-04-26 20:27:11.445
*** MODULE NAME:(SQL*Plus) 2005-04-26 20:27:11.445
*** SERVICE NAME:(markr10g) 2005-04-26 20:27:11.445
*** 

CLIENT ID:(px_test2)

2005-04-26 20:27:11.445 *** SESSION ID:(140.5258) 2005-04-26 20:27:11.445 WAIT #2: nam='reliable message' ela= 556 p1=1869466216 p2=1869427312 p3=1870014168 ===================== PARSING IN CURSOR #2 len=72 dep=0 uid=61 oct=47 lid=61 tim=376623839081 hv=4221193623 ad='6db34444' BEGIN dbms_monitor.client_id_trace_enable(client_id=>'px_test2'); END; END OF STMT EXEC #2:c=10014,e=24730,p=5,cr=1,cu=7,mis=1,r=1,dep=0,og=1,tim=376623839073 WAIT #2: nam='log file sync' ela= 660 p1=15 p2=0 p3=0 WAIT #2: nam='SQL*Net message to client' ela= 4 p1=1413697536 p2=1 p3=0 *** 2005-04-26 20:31:13.192 WAIT #2: nam='SQL*Net message from client' ela= 241736360 p1=1413697536 p2=1 p3=0 ===================== ...

And I can find the same in the PX slave trace files,

STAT #1 id=12 cnt=0 pid=7 pos=2 obj=0 op='PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us)'
STAT #1 id=13 cnt=0 pid=12 pos=1 obj=73335 op='TABLE ACCESS FULL SALES_COPY (cr=0 pr=0 pw=0 time=0 us)'

*** TRACE DUMP CONTINUED FROM FILE c:\oracle\admin\markr10g\bdump\markr10g_p003_4672.trc ***


*** 2005-04-26 20:31:14.544
*** SERVICE NAME:(markr10g) 2005-04-26 20:31:14.544
*** CLIENT ID:(px_test2) 2005-04-26 20:31:14.544
*** SESSION ID:(150.6373) 2005-04-26 20:31:14.544
WAIT #0: nam='PX Deq: Msg Fragment' ela= 687 p1=268566527 p2=1 p3=0
=====================
PARSING IN CURSOR #1 len=131 dep=1 uid=61 oct=3 lid=61 tim=376866932282 hv=3562105006 ad='6db3f074'
select p.prod_subcategory, sum(s.amount_sold)
from products p, sales_copy s
where p.prod_id = s.prod_id
group by p.prod_subcategory
END OF STMT
PARSE #1:c=0,e=171,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=376866932270
WAIT #1: nam='PX qref latch' ela= 2777 p1=1 p2=1 p3=1849357532

Note that the SQL statement before the client ID was recorded was for my previous session when I didn't use a client ID.

Now, I can use the TRCSESS utility to create another trace file that contains all the SQL statements carried out by the "px_test2" client ID. Where there are common SQL statements, such as our SELECT statement above, TKPROF will amalgamate them into a single statement that has multiple executions. TRCSESS is found in the /bin directory along with TKPROF.

C:\oracle\admin\markr10g\udump>trcsess output = "markr10g_px_test2_full.trc" clientid = "px_test2" markr10g_ora_11704_px_test2.trc ../bdump/*.trc

Note here that I've included my main trace file, and all the PX slave trace files in the BDUMP directory, as all of them could contain statements tagged with my client ID.

Now, If I format the "markr10g_px_test2_full.trc" trace file with TKPROF and take a look at my SELECT statement, I get a view of all of the waits that contributed to my statement execution time.

select p.prod_subcategory, sum(s.amount_sold)
from products p, sales_copy s
where p.prod_id = s.prod_id
group by p.prod_subcategory

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        5      0.00       0.04          0          1          0           0
Execute      5      1.75      22.13       4433       4479          0           0
Fetch        3      0.02       5.58          0          4          0          21
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       13      1.77      27.75       4433       4484          0          21

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  PX COORDINATOR  (cr=0 pr=0 pw=0 time=0 us)
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0  PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0  HASH JOIN  (cr=0 pr=0 pw=0 time=0 us)
      0   BUFFER SORT (cr=0 pr=0 pw=0 time=0 us)
      0    PX RECEIVE  (cr=0 pr=0 pw=0 time=0 us)
      0  TABLE ACCESS FULL PRODUCTS (cr=0 pr=0 pw=0 time=0 us)(Orphan Entry)
      0   PX BLOCK ITERATOR (cr=0 pr=0 pw=0 time=0 us)
      0    TABLE ACCESS FULL SALES_COPY (cr=0 pr=0 pw=0 time=0 us)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  PX qref latch                                  29        0.03          0.23
  PX Deq: Execution Msg                          54        2.00         11.43
  PX Deq: Table Q Normal                          6        2.00          2.01
  PX Deq Credit: send blkd                       12        1.99          3.97
  direct path read                              145        0.23          4.27
  PX Deq: Parse Reply                             5        0.12          0.14
  SQL*Net message to client                       3        0.00          0.00
  PX Deq: Execute Reply                          55        0.33          3.14
  SQL*Net message from client                     3       89.12         89.13
  PX Deq: Signal ACK                              7        0.09          0.20
********************************************************************************

You will see that the Parse and Execute counts for the statement are now 5, reflecting the fact that the main session then called on 4 PX slaves (2 to scan the SALES_COPY table, 2 to sort the results for the join). There are also a couple of points to note with this technique:

  1. Firstly, the elapsed time figure is the sum of all the elapsed times for the main session and the PX slaves, i.e. it won't reflect the actual time that the query took to execute. For this figure, you need to use the elapsed time in the main process trace file.
  2. This is an import one. If you keep executing the same query, with the same client ID, and then use TRCSESS to amalgamate the results, the PX slave trace files will still contain entries for this client ID and this statement from previous executions. This is because the PX slave sessions sit there "permanently" and wait for parallel queries to run, and will write to the same trace file irrespective of the session ID of the calling session. What this means in reality is that an amalgamated trace file created on say the 10th execution of a statement for a particular client ID will contain all the accumulated waits, execution parse and fetch counts, and elapsed time, for all the previous PX executions of this query. The workaround for this is to ensure that your client ID is unique each execution, perhaps by just adding a number to it each time, either manually or via a sequence.

If you're looking for specific details on the waits that a particular PX slave encountered when executing a query, as opposed to the amalgamated (1 x main) + (4 x PX slaves) details, you can also run TRCSESS on a single PX slave trace file, to extract just those statements that relate to a particular client ID. To do this, run TRCSESS against the particular PX slave trace file, like this:

C:\oracle\admin\markr10g\udump>trcsess output = "markr10g_p002_7568_full.trc" clientid = "px_test2" markr10g_p002_7568.trc

I've used this a couple of times now and it seems to work for me. If I've missed anything or misinterpreted a trace feature, drop me a line.


 

 
��  
 
 
Oracle Training at Sea
 
 
 
 
oracle dba poster
 

 
Follow us on Twitter 
 
Oracle performance tuning software 
 
Oracle Linux poster
 
 
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.

Remote Emergency Support provided by Conversational