I found out some interesting things yesterday. First, while reviewing a
clients Oracle application which had been ported from SQL Server, I
noticed they were using the constructs ?WHERE 1=1? and ?WHERE 1=2? in
order to build dynamic SQL for dynamic queries. In the first case of
?WHERE 1=1? it was always followed by a group of one or more ?AND?
clauses so the optimizer would of course practice predicate elimination
and remove it. However in the second case, ?WHERE 1=2? it was being used
to merely return the column headers to avoid error handling.
It is this second case, ?WHERE 1=2? I want to talk about. Of course
since 1 never equals 2 you never get a result back, however it forces
a full table scan! In testing with 10gR1 version 10.1.0.4 and on
R2, 9.2.0.4 if there was no bitmap index on the table using the ?WHERE
1=2? construct resulted in a full table scan. I tried using a unique and
a non-unique B-Tree and they were ignored, seems only a bitmap index
would allow the optimizer to determine that 1 would never equal 2, why
this is so is a mystery since obviously the bitmap had nothing to do
with whether 1=1 or 1=2.
In SQL Server, the optimizer is smart enough to realize 1 will never
equal 2 and there, it simply returns the column list without any other
action. Oracle, you paying attention? Make a note!
What we ended up doing in this clients case was to replace the ?WHERE
1=2? with a clause that equated the primary key of the table with an
impossible value for that key, in this case the ID was being passed in
as a GUID (a hexadecimal value) so we use a ?WHERE KEY=HEX(00)? and got
a low cost unique index lookup instead of a costly full table scan.
A second item of note, one of the tables in a SQL involved in a match
against values passed into a global table temporary table had a degree
setting of 4, but the instance had a parallel min servers setting of 0
so there was no way that a parallel scan could be performed. However,
the fun part in this situation comes up when you get an explain plan.
Using the command line explain plan you get a perfectly reasonable plan
using an index scan on the permanent table and a full scan on the global
temp. However, when you check on this SQL in the V$SQL_PLAN table it is
attempted a parallel query with a single query slave doing a full table
scan on both the permanent and temporary table!
The proper plan for this non-parallel query being forced into a
non-optimal parallel plan was also shown in the EM database control
session SQL screen (since it is probably pulled from the
DBA_HIST_SQL_PLAN which is a child of the V$SQL_PLAN table, this makes
sense.) However, the developer, using the PL/SQL Developer tool, was
getting the bad explain plan and couldn?t see the problem. We changed
the DEGREE to 1 on the table and the plans then matched between the
online explain plan and the V$SQL_PLAN explain plan.
A second note for Oracle: If the DEGREE is set on a table and the
MIN_PARALLEL_SERVERS is set to 0, completely disregard the DEGREE
setting.
And a final note: The plan generated by the EXPLAIN PLAN and AUTOTRACE
commands should match the one generated into the V$SQL_PLAN table.
Further 1=2 Work
Finally having some spare time I decided to perform
a few more tests on the 1=2 situation. In actuality it seems to be
another case of Oracle explain plans not actually matching what is
happening in the database.
Observe the following:
SQL> create table test as select * from dba_objects;
Table created.
SQL> select count(*) from test;
COUNT(*)
----------
50115
SQL> create index test_idx on test(object_id);
Index created.
SQL> analyze table test compute statistics;
Table analyzed.
SQL> set autotrace on
SQL> select count(object_id) from test where 1=2;
COUNT(OBJECT_ID)
----------------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS
(Cost=23 Card=1 Bytes=4)
1 0 SORT
(AGGREGATE)
2 1
FILTER
3 2 INDEX (FAST FULL SCAN) OF 'TEST_IDX'
(INDEX) (Cost=23
Card=50115
Bytes=200460)
Statistics
----------------------------------------------------------
1 recursive
calls
0 db block
gets
0 consistent
gets
0 physical
reads
0 redo
size
310 bytes sent via SQL*Net to
client
496 bytes received via SQL*Net from
client
2 SQL*Net roundtrips to/from
client
0 sorts
(memory)
0 sorts
(disk)
1 rows processed
If we base our decision on cost, we see a cost of
23, but the true cost is in the statistics where it required only one
recursive call, no db gets, consistent gets or physical reads. Now, if
we flush the cache and shared pool we see the following:
SQL> alter system flush buffer_cache;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL> select count(object_id) from test where 1=2;
COUNT(OBJECT_ID)
----------------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS
(Cost=23 Card=1 Bytes=4)
1 0 SORT
(AGGREGATE)
2 1
FILTER
3 2 INDEX (FAST FULL SCAN) OF 'TEST_IDX'
(INDEX) (Cost=23
Card=50115
Bytes=200460)
Statistics
----------------------------------------------------------
297 recursive
calls
0 db block
gets
43 consistent
gets
7 physical
reads
0 redo
size
310 bytes sent via SQL*Net to
client
496 bytes received via SQL*Net from
client
2 SQL*Net roundtrips to/from
client
4 sorts
(memory)
0 sorts
(disk)
1 rows processed
So in adding parsing to the cost, we see that the
recursive calls, consistent gets and physical reads do play a part, but
only for the first execution. Now, dropping the index we see
SQL> drop index test_idx;
Index dropped.
SQL> select count(object_id) from test where 1=2;
COUNT(OBJECT_ID)
----------------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS
(Cost=138 Card=1 Bytes=4
)
1 0 SORT
(AGGREGATE)
2 1
FILTER
3 2 TABLE ACCESS (FULL) OF 'TEST' (TABLE)
(Cost=138 Card=5
0115
Bytes=200460)
Statistics
----------------------------------------------------------
169 recursive
calls
0 db block
gets
18 consistent gets
0 physical
reads
0 redo
size
310 bytes sent via SQL*Net to
client
496 bytes received via SQL*Net from
client
2 SQL*Net roundtrips to/from
client
3 sorts
(memory)
0 sorts
(disk)
1 rows
processed
And now with cache and pool populated:
SQL> select count(object_id) from test where 1=2;
COUNT(OBJECT_ID)
----------------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS
(Cost=138 Card=1 Bytes=4
)
1 0 SORT
(AGGREGATE)
2 1
FILTER
3 2 TABLE ACCESS (FULL) OF 'TEST' (TABLE)
(Cost=138 Card=5
0115
Bytes=200460)
Statistics
----------------------------------------------------------
1 recursive
calls
0 db block
gets
0 consistent
gets
0 physical
reads
0 redo
size
310 bytes sent via SQL*Net to
client
496 bytes received via SQL*Net from
client
2 SQL*Net roundtrips to/from
client
0 sorts (memory)
0 sorts
(disk)
1 rows processed
Again we see that while the cost figure is several
times higher than with the index, the true cost in recursive calls, db
block gets and physical reads is the same as with the index. Now
what about with a comparison to an impossible column value? Let?s see:
SQL> create index test_idx on test(object_id);
Index created.
SQL> analyze table test compute statistics;
Table analyzed.
SQL> alter system flush shared_pool;
System altered.
SQL> alter system flush buffer_cache;
System altered.
SQL> select count(object_id) from test where
object_id=0;
COUNT(OBJECT_ID)
----------------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1
Card=1 Bytes=4)
1 0 SORT
(AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'TEST_IDX'
(INDEX) (Cost=1 Card=1
Bytes=4)
Statistics
----------------------------------------------------------
294 recursive
calls
0 db block
gets
44 consistent
gets
18 physical
reads
0 redo
size
310 bytes sent via SQL*Net to
client
496 bytes received via SQL*Net from
client
2 SQL*Net roundtrips to/from
client
4 sorts
(memory)
0 sorts (disk)
1 rows processed
With parsing we see the cost in recursive calls, db
block gets and physical reads is very similar to the costs using 1=2,
but how about once the statement is loaded and the buffer cache filled?
Let?s look:
SQL> /
COUNT(OBJECT_ID)
----------------
0
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1
Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'TEST_IDX'
(INDEX) (Cost=1 Card=1
Bytes=4)
Statistics
----------------------------------------------------------
0 recursive
calls
0 db block
gets
2 consistent
gets
0 physical
reads
0 redo
size
310 bytes sent via SQL*Net to
client
496 bytes received via SQL*Net from
client
2 SQL*Net roundtrips to/from
client
0 sorts
(memory)
0 sorts
(disk)
1 rows
processed
As you can see, the stated cost is much less, but
the cost in consistent gets is 2. In timing, both statements showed a
response time average of 0.05 seconds.
Conclusions
Oracle explain plans don?t always match what the
database is doing, and costs can be next to useless in many situations.
Always look at the underlying statistics to see what is happening.
Always test in your own environment to ensure stated Oracle behavior
from one OS to another remains constant.
As far as whether using 1=2 or using a comparison
to an index column with a value that can?t exist is more efficient,
using 1=2 is marginally better with 0 consistent gets verses 2
consistent gets however in performance tests the two statements
performances where virtually indistinguishable.
Also see these notes on
SQL with
"Where 1=1".
|
|
Get the Complete
Oracle SQL Tuning Information
The landmark book
"Advanced Oracle
SQL Tuning The Definitive Reference" is
filled with valuable information on Oracle SQL Tuning.
This book includes scripts and tools to hypercharge Oracle 11g
performance and you can
buy it
for 30% off directly from the publisher.
|