Optimizer curiosity in 12.1.0.2

For almost as long as I can remember, the optimizer has had a nifty little trick when you (in effect) try to combine two different usage models within a single SQL.  To explain that, I’m referring to the common scenario of: “If a bind value is provided by the caller, then use it, otherwise it shouldn’t limit the result set.  So we commonly see queries like:

select *
from MY_TABLE
where COL1 = NVL(:mybindvar, COL1)

[For ease of discussion, we’ll assume COL1 is not nullable]

Anyway, the nice little optimizer trick was to optimize the query to handle the two separate use cases, so you see a CONCATENATION step in the execution plan, and two FILTER’s, one to handle the case when the bind variable is null, and one to handle the case where it is provided.



-----------------------------------------------
| Id  | Operation                             |
-----------------------------------------------
|   0 | SELECT STATEMENT                      |
|   1 |  CONCATENATION                        |
|*  2 |   FILTER                              |
            
|*  5 |   FILTER                              |
            
-----------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(:B1 IS NULL)
   5 - filter(:B1 IS NOT NULL)

In effect, the optimizer has taken care of the old SQL tuning advice we used to give to the developers to rewrite the SQL as a UNION ALL, or even split it into two distinct SQL statements to cater for each case.  With that in mind, I picked up a change to this behaviour in 12.1.0.2 (and some additional work by Jonathan Lewis suggests 11.2.0.4 as well), where bind peeking seems to create some confusion.

Let’s look at an example.  I’ll create a table, populate it with approx 4million rows, where two cols are of interest:

  • SEQ, 2m distinct values and hence highly selective
  • PSEQ, only 2 distinct values, and hence not a great option for an index path

The primary key is the composite of these two columns, and a secondary index on PSEQ



SQL> create table T (
  2    seq int, pseq int, blah char(30),
  3    constraint T_PK primary key (seq,pseq)
  4  )
  5  /
 
Table created.
 
SQL> insert into T
  2  select trunc(rownum/2) seq,
  3         mod(rownum,2) pseq,
  4         'x' blah
  5  from
  6    ( select 1 from dual connect by level < 1000 ),
  7    ( select 1 from dual connect by level < 4000 )
  8  /
 
3995001 rows created.
 
SQL> create index T_IX2 on T ( pseq );
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats('','T',no_invalidate=>false);
 
PL/SQL procedure successfully completed.

Now we’ll execute an SQL in the form previously mentioned, and take the case where the bind variable in the NVL is null.



SQL> variable b1 number
SQL> variable b2 number
SQL> exec :b1 := null;
SQL> exec :b2 := 1706496;
 
SQL> select /*+ gather_plan_statistics */ *
  2  from T
  3  WHERE seq = :B2
  4  AND pseq = NVL(:B1 ,pseq)
  5  /
 
       SEQ       PSEQ BLAH
---------- ---------- ------------------------------
   1706496          0 x
   1706496          1 x
 
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0caq50rvfkub5, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= NVL(:B1 ,pseq)
 
Plan hash value: 3837764478
 
---------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |      1 |        |      2 |00:00:00.01 |       6 |
|   1 |  CONCATENATION                        |       |      1 |        |      2 |00:00:00.01 |       6 |
|*  2 |   FILTER                              |       |      1 |        |      2 |00:00:00.01 |       6 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T     |      1 |      2 |      2 |00:00:00.01 |       6 |
|*  4 |     INDEX RANGE SCAN                  | T_PK  |      1 |      2 |      2 |00:00:00.01 |       4 |
|*  5 |   FILTER                              |       |      1 |        |      0 |00:00:00.01 |       0 |
|*  6 |    TABLE ACCESS BY INDEX ROWID BATCHED| T     |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  7 |     INDEX RANGE SCAN                  | T_IX2 |      0 |      1 |      0 |00:00:00.01 |       0 |
---------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(:B1 IS NULL)
   4 - access("SEQ"=:B2)
       filter("PSEQ" IS NOT NULL)
   5 - filter(:B1 IS NOT NULL)
   6 - filter("SEQ"=:B2)
   7 - access("PSEQ"=:B1)
 
 
30 rows selected.

Because :B1 is null, you can see from the Predicate Information, and from the Actual Rows information, that we took the execution path in lines 3-4.  But take a peek (no pun intended) at lines 6-7.  That path, whilst not used, is proposing the use of index T_IX2, which as we know is 2 distinct keys across 4 million rows.  The problem is … we’ve now loaded that execution plan into our library cache.  So let’s see what happens when we exercise that part of the plan when we this time specify both bind variables



SQL> exec :b1 := 1;
 
SQL> select /*+ gather_plan_statistics */ *
  2  from T
  3  WHERE seq = :B2
  4  AND pseq = NVL(:B1 ,pseq)
  5  /
 
       SEQ       PSEQ BLAH
---------- ---------- ------------------------------
   1706496          1 x
 
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0caq50rvfkub5, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= NVL(:B1 ,pseq)
 
Plan hash value: 3837764478
 
------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |       |      1 |        |      1 |00:00:00.70 |   28124 |    178 |
|   1 |  CONCATENATION                        |       |      1 |        |      1 |00:00:00.70 |   28124 |    178 |
|*  2 |   FILTER                              |       |      1 |        |      0 |00:00:00.01 |       0 |      0 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T     |      0 |      2 |      0 |00:00:00.01 |       0 |      0 |
|*  4 |     INDEX RANGE SCAN                  | T_PK  |      0 |      2 |      0 |00:00:00.01 |       0 |      0 |
|*  5 |   FILTER                              |       |      1 |        |      1 |00:00:00.70 |   28124 |    178 |
|*  6 |    TABLE ACCESS BY INDEX ROWID BATCHED| T     |      1 |      1 |      1 |00:00:00.70 |   28124 |    178 |
|*  7 |     INDEX RANGE SCAN                  | T_IX2 |      1 |      1 |   1997K|00:00:00.63 |    3898 |    178 |
----------------------------------------------------------

Ouch…2000 million rows scanned in the index.  Since we provided both bind variables, the obvious access path should have been a simple primary key lookup.  But the path derived from our first execution (with :B1 being null) has left behind a “nasty legacy”.

The troubling thing about this (in terms of application stability) is that the order in which we run queries now impacts the performance of how they run.  Lets clear out the cursor by recalculating stats and then reverse the order of execution.



SQL> exec dbms_stats.gather_table_stats('','T',no_invalidate=>false);
 
PL/SQL procedure successfully completed.
 
SQL> exec :b1 := 1;
SQL> exec :b2 := 1706496;
 
PL/SQL procedure successfully completed.
 
SQL> select /*+ gather_plan_statistics */ *
  2  from T
  3  WHERE seq = :B2
  4  AND pseq = NVL(:B1 ,pseq)
  5  /
 
       SEQ       PSEQ BLAH
---------- ---------- ------------------------------
   1706496          1 x
 
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
 
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID  0caq50rvfkub5, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ * from T WHERE seq = :B2 AND pseq
= NVL(:B1 ,pseq)
 
Plan hash value: 933468988
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |      |      1 |        |      1 |00:00:00.01 |       4 |
|   1 |  CONCATENATION                        |      |      1 |        |      1 |00:00:00.01 |       4 |
|*  2 |   FILTER                              |      |      1 |        |      0 |00:00:00.01 |       0 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| T    |      0 |      2 |      0 |00:00:00.01 |       0 |
|*  4 |     INDEX RANGE SCAN                  | T_PK |      0 |      2 |      0 |00:00:00.01 |       0 |
|*  5 |   FILTER                              |      |      1 |        |      1 |00:00:00.01 |       4 |
|   6 |    TABLE ACCESS BY INDEX ROWID        | T    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  7 |     INDEX UNIQUE SCAN                 | T_PK |      1 |      1 |      1 |00:00:00.01 |       3 |
--------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter(:B1 IS NULL)
   4 - access("SEQ"=:B2)
       filter("PSEQ" IS NOT NULL)
   5 - filter(:B1 IS NOT NULL)
   7 - access("SEQ"=:B2 AND "PSEQ"=:B1)

When we optimize the query for the initial case of both bind variables provided, you can see that both sides of the CONCATENATION have yielded a sensible path.

Advertisements

3 thoughts on “Optimizer curiosity in 12.1.0.2

  1. Hi Connor.
    I hate bind peeking…
    I think it’s worth mentioning that at least Adaptive Cursor Sharing fixes this for later executions.
    So actually repeating your test WITHOUT clearing out the cursor (by recalculating stats) will yield the same results (but due to different reasons). In this case, after the first two executions we’ll have a single child cursor with IS_BIND_AWARE=’N’ and IS_SHAREABLE=’Y’. After the third execution this child will have IS_SHAREABLE=’N’ and a second child will be created (with the good plan) with IS_BIND_AWARE=’Y’ and IS_SHAREABLE=’Y’.

    Thanks,
    Oren.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s