Mike Ault's thoughts on various topics, Oracle related and not. Note: I reserve the right to delete comments that are not contributing to the overall theme of the BLOG or are insulting or demeaning to anyone. The posts on this blog are provided “as is” with no warranties and confer no rights. The opinions expressed on this site are mine and mine alone, and do not necessarily represent those of my employer.

Monday, March 27, 2006

Further 1=2 Work

Added 4/4/2006 - Did a bit more probing via the 10gR2 explain plan tool. What Oracle actually does with the 1=2 is replace it with the predicate NULL IS NOT NULL. Simply use this instead of the 1=2 to eliminate some parsing and confusion in your code. This also seems to indicate you can replace the 1=1 with NULL IS NULL. Anyway, it's been some interesting research.

Finally having some spare time I decided to perform a few more tests on the 1=2 situation. In actuality it seems to be another case of Oracle explain plans not actually matching what is happening in the database.

Observe the following:

SQL> create table test as select * from dba_objects;

Table created.

SQL> select count(*) from test;

COUNT(*)
----------
50115

SQL> create index test_idx on test(object_id);

Index created.

SQL> analyze table test compute statistics;

Table analyzed.

SQL> set autotrace on
SQL> select count(object_id) from test where 1=2;

COUNT(OBJECT_ID)
----------------
0


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=23 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 INDEX (FAST FULL SCAN) OF 'TEST_IDX' (INDEX) (Cost=23
Card=50115 Bytes=200460)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
310 bytes sent via SQL*Net to client
496 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

If we base our decision on cost, we see a cost of 23, but the true cost is in the statistics where it required only one recursive call, no db gets, consistent gets or physical reads. Now, if we flush the cache and shared pool we see the following:

SQL> alter system flush buffer_cache;

System altered.

SQL> alter system flush shared_pool;

System altered.

SQL> select count(object_id) from test where 1=2;

COUNT(OBJECT_ID)
----------------
0


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=23 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 INDEX (FAST FULL SCAN) OF 'TEST_IDX' (INDEX) (Cost=23
Card=50115 Bytes=200460)

Statistics
----------------------------------------------------------
297 recursive calls
0 db block gets
43 consistent gets
7 physical reads
0 redo size
310 bytes sent via SQL*Net to client
496 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed

So in adding parsing to the cost, we see that the recursive calls, consistent gets and physical reads do play a part, but only for the first execution. Now, dropping the index we see:

SQL> drop index test_idx;

Index dropped.

SQL> select count(object_id) from test where 1=2;

COUNT(OBJECT_ID)
----------------
0


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=138 Card=1 Bytes=4
)

1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'TEST' (TABLE) (Cost=138 Card=5
0115 Bytes=200460)


Statistics
----------------------------------------------------------
169 recursive calls
0 db block gets
18 consistent gets
0 physical reads
0 redo size
310 bytes sent via SQL*Net to client
496 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
1 rows processed

And now with cache and pool populated:

SQL> select count(object_id) from test where 1=2;

COUNT(OBJECT_ID)
----------------
0


Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=138 Card=1 Bytes=4
)

1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'TEST' (TABLE) (Cost=138 Card=5
0115 Bytes=200460)

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
0 consistent gets
0 physical reads
0 redo size
310 bytes sent via SQL*Net to client
496 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

Again we see that while the cost figure is several times higher than with the index, the true cost in recursive calls, db block gets and physical reads is the same as with the index.

Now what about with a comparison to an impossible column value? Let’s see:

SQL> create index test_idx on test(object_id);

Index created.

SQL> analyze table test compute statistics;

Table analyzed.

SQL> alter system flush shared_pool;

System altered.

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(object_id) from test where object_id=0;

COUNT(OBJECT_ID)
----------------
0

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'TEST_IDX' (INDEX) (Cost=1 Card=1
Bytes=4)

Statistics
----------------------------------------------------------
294 recursive calls
0 db block gets
44 consistent gets
18 physical reads
0 redo size
310 bytes sent via SQL*Net to client
496 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processed

With parsing we see the cost in recursive calls, db block gets and physical reads is very similar to the costs using 1=2, but how about once the statement is loaded and the buffer cache filled? Let’s look:

SQL> /

COUNT(OBJECT_ID)
----------------
0

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=4)
1 0 SORT (AGGREGATE)
2 1 INDEX (RANGE SCAN) OF 'TEST_IDX' (INDEX) (Cost=1 Card=1
Bytes=4)

Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
310 bytes sent via SQL*Net to client
496 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed

As you can see, the stated cost is much less, but the cost in consistent gets is 2. In timing, both statements showed a response time average of 0.05 seconds.

Conclusions

Oracle explain plans don’t always match what the database is doing, and costs can be next to useless in many situations always look at the underlying statistics to see what is happening. Always test in your own environment to ensure stated Oracle behavior from one OS to another remains constant.

As far as whether using 1=2 or using a comparison to an index column with a value that can’t exist is more efficient, using 1=2 is marginally better with 0 consistent gets verses 2 consistent gets however in performance tests the two statements performances where virtually indistinguishable.

No comments: