Performing a large correlated update

We had a question on AskTom recently asking for the best way to populate a new column on a very large table.  Four in-place update options are at our disposal:

  • Serial update SQL
  • Parallel DML update SQL
  • DBMS_PARALLEL_EXECUTE
  • PL/SQL batching (as long as sensible restart-after-error logic is readily available)

all of which will achieve the required outcome, but they all carry a significant redo and undo cost.  Also, depending on the size of the new column, such an update might create chaos with row migration, because whilst unlikely, there is definitely the potential for every row to grow beyond the free space available in its block to hold it in situ.
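For reference, here is a hypothetical sketch of what the parallel DML option might look like, using the T1/T2 tables from the example that follows (the degree of parallelism shown is arbitrary):

```sql
-- enable parallel DML for this session, then do one pass over T1
-- with a correlated lookup against T2 for the new column value
alter session enable parallel dml;

update /*+ parallel(t1 8) */ t1
set    t1.full_name = ( select t2.fname
                        from   t2
                        where  t2.object_id = t1.object_id );

commit;
```

Simple enough, but every changed row still generates undo and redo, which is exactly the cost we are trying to avoid.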

So a common approach to tackling a large “update” is to recast the problem as creating a new version (copy) of the table.  This can be done with a CREATE TABLE AS SELECT (CTAS).  Using this method we get the benefit of much lower resource consumption, but two issues can get in the way of this approach:

  1. It requires significant downtime whilst the new table is populated
  2. It requires careful planning to ensure all of the dependent objects (indexes, constraints, etc) are correctly recreated on the new table
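To make the comparison concrete, a CTAS recast of the correlated update in this post might look like the following hypothetical sketch, after which every index, constraint, grant and so on would have to be recreated by hand:

```sql
-- build the new copy of the table with the extra column populated via a join
create table t1_new as
select t1.*, t2.fname full_name
from   t1
left join t2 on t2.object_id = t1.object_id;

-- swap it in (this is where the downtime comes from)
drop table t1;
rename t1_new to t1;
```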

But perhaps there is an alternative – it might not be 100% as fast as a pure CTAS, but what if it solves both the issues mentioned above?  That alternative is DBMS_REDEFINITION.

People often dismiss DBMS_REDEFINITION as being purely for changing the structure of a table without changing the underlying data (for example, partitioning a table).  But they fail to take notice of the column mapping functionality that is available – and as long as our column mapping expressions are deterministic, we can get more value out of the package, including a correlated update.

So here’s our example:

We will have a table called T1, which is a copy of ALL_OBJECTS, and is the table to which we want to add a new column.




SQL>
SQL> create table t1  as
  2  select owner, object_name, subobject_name,
  3       object_id, data_object_id, object_type,
  4       created, last_ddl_time, timestamp, status,
  5       temporary, generated, secondary
  6    from all_objects
  7    where object_id is not null;

Table created.

SQL>
SQL> alter table t1
  2    add constraint t1_pk
  3    primary key(object_id);

Table altered.

T2 is the table from which we will want to source the new data to apply to T1. In this case, T2 has a primary key of OBJECT_ID which can be used to source a new data element called FNAME. In our example, this is just owner and object name concatenated.


SQL>
SQL> create table t2 as select object_id, owner||'.'||object_name fname
  2  from all_objects
  3    where object_id is not null;

Table created.

SQL>
SQL> alter table t2
  2    add constraint t2_pk
  3    primary key(object_id);

Table altered.

So here is the requirement: T1 gets a new column called FULL_NAME, which must be populated with the FNAME column from T2 via an OBJECT_ID lookup.  We can see that T1 starts with FULL_NAME being null, and we can see a sample of the values that should come from T2.


SQL> alter table t1 add full_name varchar2(200);

Table altered.

SQL>
SQL> select object_id, owner, full_name from t1
  2  where rownum < 10;

 OBJECT_ID OWNER                          FULL_NAME
---------- ------------------------------ --------------------------------------------------
        30 SYS
        47 SYS
        32 SYS
        49 SYS
        17 SYS
         2 SYS
        29 SYS
        45 SYS
        10 SYS

9 rows selected.

SQL>
SQL> select object_id, fname from t2
  2  where rownum < 10;

 OBJECT_ID FNAME
---------- ---------------------------
        30 SYS.I_COBJ#
        47 SYS.I_USER2
        32 SYS.CCOL$
        49 SYS.I_COL2
        17 SYS.FILE$
         2 SYS.C_OBJ#
        29 SYS.C_COBJ#
        45 SYS.I_TS1
        10 SYS.C_USER#

We can’t do joins or subqueries in a DBMS_REDEFINITION column mapping, but we can use deterministic expressions.  In our case, we’ll use a deterministic PL/SQL function to mimic the join.


SQL>
SQL> create or replace
  2  function get_full(p_id int )  return varchar2 deterministic is
  3    f varchar2(200);
  4  begin
  5    select  fname
  6    into f
  7    from t2
  8    where object_id = p_id;
  9
 10    return f;
 11  end;
 12  /

Function created.

As per normal, we create an interim table to hold the converted data as we redefine the table.


SQL>
SQL> create table t_interim (
  2   owner                         varchar2(128),
  3   object_name                   varchar2(128),
  4   subobject_name                varchar2(128),
  5   object_id                     number,
  6   data_object_id                number,
  7   object_type                   varchar2(23),
  8   created                       date,
  9   last_ddl_time                 date,
 10   timestamp                     varchar2(19),
 11   status                        varchar2(7),
 12   temporary                     varchar2(1),
 13   generated                     varchar2(1),
 14   secondary                     varchar2(1),
 15   full_name                     varchar2(200)
 16  );

Table created.

And now we are ready to use DBMS_REDEFINITION. The critical part here is the COL_MAPPING parameter. We are using the call to GET_FULL to populate the new column that will ultimately end up on T1.


SQL>
SQL>
SQL> declare
  2      l_colmap varchar(512);
  3    begin
  4      l_colmap :=
  5            'OWNER
  6            ,OBJECT_NAME
  7            ,SUBOBJECT_NAME
  8            ,OBJECT_ID
  9            ,DATA_OBJECT_ID
 10            ,OBJECT_TYPE
 11            ,CREATED
 12            ,LAST_DDL_TIME
 13            ,TIMESTAMP
 14            ,STATUS
 15            ,TEMPORARY
 16            ,GENERATED
 17            ,SECONDARY
 18            ,get_full(OBJECT_ID) FULL_NAME';
 19
 20      dbms_redefinition.start_redef_table
 21      (  uname           => user,
 22         orig_table      => 'T1',
 23         int_table       => 'T_INTERIM',
 24         col_mapping   => l_colmap );
 25   end;
 26   /

PL/SQL procedure successfully completed.

And here is the nice part.  By using DBMS_REDEFINITION, we don’t need to worry about the dependencies – we can let COPY_TABLE_DEPENDENTS do the heavy lifting.


SQL>
SQL> variable nerrors number
SQL> begin
  2    dbms_redefinition.copy_table_dependents
  3      ( user, 'T1', 'T_INTERIM',
  4        copy_indexes => dbms_redefinition.cons_orig_params,
  5        num_errors => :nerrors );
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL>
SQL> print nerrors

   NERRORS
----------
         0

SQL> begin
  2    dbms_redefinition.finish_redef_table
  3      ( user, 'T1', 'T_INTERIM' );
  4  end;
  5  /

PL/SQL procedure successfully completed.

And we’re done!  A correlated update using DBMS_REDEFINITION.


SQL>
SQL> select object_id, owner, full_name from t1
  2  where rownum < 10;

 OBJECT_ID OWNER                          FULL_NAME
---------- ------------------------------ --------------------------------------------------
        30 SYS                            SYS.I_COBJ#
        47 SYS                            SYS.I_USER2
        32 SYS                            SYS.CCOL$
        49 SYS                            SYS.I_COL2
        17 SYS                            SYS.FILE$
         2 SYS                            SYS.C_OBJ#
        29 SYS                            SYS.C_COBJ#
        45 SYS                            SYS.I_TS1
        10 SYS                            SYS.C_USER#

9 rows selected.

SQL>
SQL>

You might be thinking “But this will be slow because we’re calling that PL/SQL function millions of times”.  Yes, this will be slower than a pure CTAS operation, but by using DBMS_REDEFINITION the process is online, with nearly zero service disruption.  Hence the overall execution time is not nearly as critical as it is with a CTAS, where applications are unavailable for the duration of the activity.

Code enhancements without changing code

An interesting suggestion came through on AskTom this week, which prompted the following exploration.

Let us assume you are populating a table with


INSERT INTO EMP SELECT * FROM SOURCE_EMPS

and it falls over with


ORA-12899: value too large for column

Capturing the data that caused that error is not trivial.  Of course, we could edit the code to perform DML error logging, but that means changing the source code, which means change control, testing, red tape, and so on.  Is there a better way?  Perhaps there is.

In 12c, there is the SQL translation framework, designed for applications being migrated from another database platform to Oracle.  We intercept incoming SQL statements in (say) SQL Server syntax, and recast them on the fly to an equivalent Oracle syntax.  But we can also take advantage of it to handle this data error issue.  Here’s a demo of that in action:


SQL> create user demo identified by demo quota 100m on users;

User created.

SQL> grant create session, alter session, create table, create sql translation profile, create procedure to DEMO;

Grant succeeded.

SQL>
SQL> conn demo/demo
Connected.

SQL> create table emp ( x varchar(10));

Table created.

SQL> create table source_emps as
  2  select cast(rpad('E',rownum,'E') as varchar2(20)) x_in
  3  from dual connect by level <= 15;

Table created.

SQL> select * from source_emps;

X_IN
--------------------
E
EE
EEE
EEEE
EEEEE
EEEEEE
EEEEEEE
EEEEEEEE
EEEEEEEEE
EEEEEEEEEE
EEEEEEEEEEE
EEEEEEEEEEEE
EEEEEEEEEEEEE
EEEEEEEEEEEEEE
EEEEEEEEEEEEEEE

15 rows selected.

So we have an EMP table with a 10-byte column “X”, and some source rows in SOURCE_EMPS with increasingly larger values for “X_IN”.  So once our rows get larger than 10 bytes, we are going to have a problem inserting them into EMP.


SQL> INSERT INTO EMP SELECT * FROM SOURCE_EMPS;
INSERT INTO EMP SELECT * FROM SOURCE_EMPS
                       *
ERROR at line 1:
ORA-12899: value too large for column "DEMO"."EMP"."X" (actual: 11, maximum: 10)

So let us now intercept that SQL using the translation framework.


SQL>
SQL> begin
  2    dbms_sql_translator.drop_profile(profile_name => 'DEMO_PROFILE');
  3  exception when others then null;
  4  end;
  5  /

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_sql_translator.create_profile('DEMO_PROFILE');

PL/SQL procedure successfully completed.

SQL>
SQL> BEGIN
  2      DBMS_SQL_TRANSLATOR.REGISTER_SQL_TRANSLATION(
  3        profile_name    => 'DEMO_PROFILE',
  4        sql_text        => 'INSERT INTO EMP SELECT * FROM SOURCE_EMPS',   
  5        translated_text => 'INSERT INTO EMP SELECT * FROM SOURCE_EMPS LOG ERRORS INTO ERR$_EMP (TO_CHAR(SYSTIMESTAMP)) REJECT LIMIT UNLIMITED');
  6  END;
  7  /

PL/SQL procedure successfully completed.

So we create an error logging table and activate our translation profile.  (The 10601 event below is, as far as I know, needed because SQL translation is normally applied only to sessions from foreign client drivers; setting it forces translation on for our SQL*Plus session.)


SQL> exec dbms_errlog.create_error_log ( dml_table_name => 'EMP' );

PL/SQL procedure successfully completed.

SQL> alter session set sql_translation_profile = DEMO_PROFILE;

Session altered.

SQL> alter session set events = '10601 trace name context forever, level 32';

Session altered.

SQL>
SQL> INSERT INTO EMP SELECT * FROM SOURCE_EMPS;

10 rows created.

Notice that our insert was now “successful”, because the problem rows are being captured as errors rather than aborting the statement.


SQL> select count(*) from err$_emp;

  COUNT(*)
----------
         5

SQL>
SQL> select ORA_ERR_MESG$ from err$_emp;

ORA_ERR_MESG$
---------------------------------------------------------------------------------
ORA-12899: value too large for column "DEMO"."EMP"."X" (actual: 11, maximum: 10)
ORA-12899: value too large for column "DEMO"."EMP"."X" (actual: 12, maximum: 10)
ORA-12899: value too large for column "DEMO"."EMP"."X" (actual: 13, maximum: 10)
ORA-12899: value too large for column "DEMO"."EMP"."X" (actual: 14, maximum: 10)
ORA-12899: value too large for column "DEMO"."EMP"."X" (actual: 15, maximum: 10)

Truncating a timestamp to the second

We had an interesting AskTom question recently, about why certain TRUNC commands would generate an error when applied to a TIMESTAMP value.  At first glance, TRUNC seems fine.


SQL> select trunc(localtimestamp,'MM') to_the_month from dual;

TO_THE_MO
---------
01-OCT-16

1 row selected.

SQL> select trunc(localtimestamp,'YYYY') to_the_year from dual;

TO_THE_YE
---------
01-JAN-16

1 row selected.

SQL> select trunc(localtimestamp,'DD') to_the_day from dual;

TO_THE_DA
---------
20-OCT-16

1 row selected.

But the moment you try to apply a TRUNC down to the second, things go wrong.


SQL> select trunc(localtimestamp,'SS') to_the_day from dual;
select trunc(localtimestamp,'SS') to_the_day from dual
                            *
ERROR at line 1:
ORA-01899: bad precision specifier

The clue comes from looking at the DUMP output for some of these commands.


SQL> select dump(localtimestamp) from dual;

DUMP(LOCALTIMESTAMP)
-----------------------------------------------------------------------
Typ=187 Len=20: 224,7,10,20,13,48,45,0,192,3,180,35,8,0,3,0,0,0,0,0

1 row selected.

SQL> select dump(trunc(localtimestamp,'YYYY')) dmp from dual;

DMP
-----------------------------------------------------------------------
Typ=13 Len=8: 224,7,1,1,0,0,0,0

1 row selected.

Notice that the datatype has changed.  In fact, type 13 is the same datatype as we see for SYSDATE.


SQL> select dump(sysdate) from dual;

DUMP(SYSDATE)
-------------------------------------------
Typ=13 Len=8: 224,7,10,20,13,50,1,0

If you look in the SQL documentation, you’ll see that there actually is NOT a trunc command for timestamps. So what is in fact happening is:

  • the timestamp is being silently converted to a date,
  • the trunc command is being applied to the date

and just like any date, SS is not an appropriate TRUNC mask.


SQL> select trunc(sysdate,'SS') from dual;
select trunc(sysdate,'SS') from dual
                     *
ERROR at line 1:
ORA-01899: bad precision specifier

A simple workaround is to convert the timestamp to a date, and if necessary, convert it back to a timestamp to preserve the datatype.


SQL> select localtimestamp ts, cast(localtimestamp as date) truncd_to_secs from dual;

TS                                       TRUNCD_TO_SECS
---------------------------------------- -------------------
20-OCT-16 01.54.09.991000 PM             20/10/2016 13:54:09

SQL> select dump(cast(cast(localtimestamp as date) as timestamp)) from dual;

DUMP(CAST(CAST(LOCALTIMESTAMPASDATE)ASTIMESTAMP))
-----------------------------------------------------------------------------
Typ=187 Len=20: 224,7,10,20,13,54,45,0,0,0,0,0,0,0,3,0,0,0,0,0

Taking a peek at SYS_CONTEXT

There was a Twitter discussion recently about using context variables, accessed via SYS_CONTEXT, within queries.  It’s an easy means of passing parameters into a view.  The question that got asked was – does a SYS_CONTEXT variable act exactly like a bind variable?  Let’s take a look at a simple demo to explore that.

We’ll create a table with 1 row with a value of ‘a’, and 100,000 rows with a value of ‘b’.  We’ll index that column and gather a histogram on it, so we’ve got the standard example of skewed data to work with.



SQL>
SQL> create table t ( x varchar2(10), y char(100));

Table created.

SQL>
SQL> insert into t
  2  select 'a', rownum
  3  from dual
  4  /

1 row created.

SQL>
SQL> insert into t
  2  select 'b', rownum
  3  from dual
  4  connect by level <= 100000
  5  /

100000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> create index ix on t ( x ) ;

Index created.

SQL>
SQL> exec dbms_stats.gather_table_stats('','T',method_Opt=>'for all columns size 5');

PL/SQL procedure successfully completed.

So in terms of accessing the table, it is relatively obvious to conclude that access to the rows with X=’a’, ie, a single row lookup, should be done via the index, and access to the rows with X=’b’, ie, all rows except one, should be done with a full scan of the table.  Let’s look at some executions of queries against our table using a bind variable.




SQL>
SQL> variable b1 varchar2(10)
SQL>
SQL> exec :b1 := 'a';

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = :b1;

  COUNT(Y)
----------
         1

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  7kv4f2uc7qjsy, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = :b1

Plan hash value: 2143077847

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  SORT AGGREGATE                      |      |      1 |      1 |      1 |00:00:00.01 |       3 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  3 |    INDEX RANGE SCAN                  | IX   |      1 |      1 |      1 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"=:B1)


20 rows selected.

SQL>
SQL> exec :b1 := 'b';

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = :b1;

  COUNT(Y)
----------
    100000

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  7kv4f2uc7qjsy, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = :b1

Plan hash value: 2143077847

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |      1 |        |      1 |00:00:00.03 |    1676 |
|   1 |  SORT AGGREGATE                      |      |      1 |      1 |      1 |00:00:00.03 |    1676 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |      1 |    100K|00:00:00.04 |    1676 |
|*  3 |    INDEX RANGE SCAN                  | IX   |      1 |      1 |    100K|00:00:00.01 |     183 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"=:B1)


20 rows selected.

So our first peek at the bind yielded ‘a’, and we optimized the query accordingly.  If we keep executing the query, adaptive cursor sharing kicks in and picks a better path for the case where the bind value is ‘b’.


SQL>
SQL> exec :b1 := 'a';

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = :b1;

  COUNT(Y)
----------
         1

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
SQL_ID  7kv4f2uc7qjsy, child number 1
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = :b1

Plan hash value: 2143077847

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |      1 |        |      1 |00:00:00.01 |       3 |
|   1 |  SORT AGGREGATE                      |      |      1 |      1 |      1 |00:00:00.01 |       3 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  3 |    INDEX RANGE SCAN                  | IX   |      1 |      1 |      1 |00:00:00.01 |       2 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("X"=:B1)


20 rows selected.

SQL>
SQL> exec :b1 := 'b';

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = :b1;

  COUNT(Y)
----------
    100000

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID  7kv4f2uc7qjsy, child number 2
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x = :b1

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |    1509 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |    1509 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |    100K|    100K|00:00:00.01 |    1509 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=:B1)


19 rows selected.

And we can confirm that we’ve adapted to handle the two scenarios by taking a look at V$SQL for each child.  Child 0 (our original child) has been marked not shareable, having been “replaced” by children 1 and 2 for our two skewed data queries.



SQL>
SQL> select child_number, is_bind_sensitive,is_bind_aware,is_shareable
  2  from v$sql
  3  where sql_id = '7kv4f2uc7qjsy';

CHILD_NUMBER I I I
------------ - - -
           0 Y N N
           1 Y Y Y
           2 Y Y Y

3 rows selected.

So that is all working as we would expect. Now we’ll repeat the same exercise using a context variable in place of the bind variable.



SQL> create context blah using my_proc;

Context created.

SQL>
SQL> create or replace
  2  procedure my_proc(p_val varchar2) is
  3  begin
  4    dbms_session.set_context('BLAH','ATTRIB',p_val);
  5  end;
  6  /

Procedure created.

SQL>
SQL>
SQL> exec my_proc('a');

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = sys_context('BLAH','ATTRIB');

  COUNT(Y)
----------
         1

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  8kj3bg89h2cbu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x =
sys_context('BLAH','ATTRIB')

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |    1509 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |    1509 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |      1 |00:00:00.01 |    1509 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=SYS_CONTEXT('BLAH','ATTRIB'))


20 rows selected.

SQL>
SQL> exec my_proc('b');

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = sys_context('BLAH','ATTRIB');

  COUNT(Y)
----------
    100000

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  8kj3bg89h2cbu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x =
sys_context('BLAH','ATTRIB')

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |    1509 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |    1509 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |    100K|00:00:00.02 |    1509 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=SYS_CONTEXT('BLAH','ATTRIB'))


20 rows selected.

SQL> exec my_proc('a');

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = sys_context('BLAH','ATTRIB');

  COUNT(Y)
----------
         1

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  8kj3bg89h2cbu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x =
sys_context('BLAH','ATTRIB')

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |    1509 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |    1509 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |      1 |00:00:00.01 |    1509 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=SYS_CONTEXT('BLAH','ATTRIB'))


20 rows selected.

SQL>
SQL> exec my_proc('b');

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ gather_plan_statistics */ count(y)
  2  from t
  3  where x = sys_context('BLAH','ATTRIB');

  COUNT(Y)
----------
    100000

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  8kj3bg89h2cbu, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(y) from t where x =
sys_context('BLAH','ATTRIB')

Plan hash value: 2966233522

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |      1 |00:00:00.01 |    1509 |
|   1 |  SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.01 |    1509 |
|*  2 |   TABLE ACCESS FULL| T    |      1 |  50001 |    100K|00:00:00.02 |    1509 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("X"=SYS_CONTEXT('BLAH','ATTRIB'))


20 rows selected.

No matter how many times we ran the query, the results were the same.  There is also an obvious giveaway in the estimated rows column: “50001” is the classic fallback estimate of number of rows divided by number of distinct values.  When we take a look at V$SQL, we can also see that we are still using the original, and only, child.



SQL> select child_number, is_bind_sensitive,is_bind_aware,is_shareable
  2  from v$sql
  3  where sql_id = '8kj3bg89h2cbu';

CHILD_NUMBER I I I
------------ - - -
           0 N N Y

1 row selected.

SQL>
SQL>

So SYS_CONTEXT values can be bound, and hence yield shareable cursors just like a bind variable, but we don’t peek inside them like we do with a conventional bind variable.  Whether this is a good thing or a bad thing depends on whether you are a fan of bind peeking and adaptive cursor sharing or not.
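As an aside, the “passing parameters into a view” use case mentioned at the start of this section might look like the following hypothetical sketch, reusing the BLAH context and MY_PROC procedure from the demo (the view name is illustrative):

```sql
-- a "parameterized" view: the predicate reads a context attribute
create or replace view t_filtered as
select *
from   t
where  x = sys_context('BLAH','ATTRIB');

-- the application sets the "parameter" before querying the view
exec my_proc('a')

select count(y) from t_filtered;
```

Each session sees only the slice of data matching the attribute it has set, without any change to the query text.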

Locked rows and lunch breaks? A simple fix

Even in the new world of stateless web applications, from time to time we still come across the “lunch break locking” problem.  That is, someone or something commences a transaction that locks some rows, and then the session goes idle without committing or rolling back the transaction.  The metaphor commonly used is of someone locking a row in a table and then going out for lunch, totally oblivious to the havoc they have caused until they return.

With modern applications being generally stateless (or transactionally stateless) you would think the problem would have disappeared, but if anything, whilst the frequency of the problem has dropped, diagnosing it when it occurs is harder than ever.  The anonymity of application server sessions, and the scale of web-based customers, mean that when a session does “lose track” of what it was doing and leaves a transaction active, it is very hard to detect until we have a major problem or outage on our hands.

But here’s a mitigation strategy you might want to consider.  Resource Manager can do some of the heavy lifting for you.  It can kill sessions that are idle but are blocking other sessions from completing their work.

Here’s a simple demo of how to put a resource manager plan in place to do that


SQL> begin
  2    dbms_resource_manager.create_pending_area();
  3    --
  4
  5    dbms_resource_manager.create_consumer_group(
  6      CONSUMER_GROUP=>'CG_STOP_BLOCKERS',
  7      COMMENT=>'CG for stop blocking'
  8      );
  9
 10    dbms_resource_manager.create_plan(
 11      PLAN=> 'STOP_BLOCKERS',
 12      COMMENT=>'Plan for stop blocking'
 13    );
 14
 15    dbms_resource_manager.create_plan_directive(
 16      PLAN=> 'STOP_BLOCKERS',
 17      GROUP_OR_SUBPLAN=>'CG_STOP_BLOCKERS',
 18      COMMENT=>'Directive',
 19      MAX_IDLE_BLOCKER_TIME => 60
 20    );
 21
 22
 23    dbms_resource_manager.create_plan_directive(
 24      PLAN=> 'STOP_BLOCKERS',
 25      GROUP_OR_SUBPLAN=>'OTHER_GROUPS',
 26      COMMENT=>'leave others alone'
 27    );
 28
 29    DBMS_RESOURCE_MANAGER.VALIDATE_PENDING_AREA;
 30
 31    DBMS_RESOURCE_MANAGER.SUBMIT_PENDING_AREA();
 32
 33  end;
 34  /

PL/SQL procedure successfully completed.

The key element here is the first plan directive, which sets MAX_IDLE_BLOCKER_TIME to 60 seconds.  So if an idle session blocks another session for that duration, Resource Manager will kill the blocking session.  Let’s assign the appropriate consumer group to SCOTT, enable the plan, and see it in action.


SQL> exec dbms_resource_manager_privs.grant_switch_consumer_group('SCOTT','CG_STOP_BLOCKERS',false);

PL/SQL procedure successfully completed.

SQL> exec dbms_resource_manager.set_initial_consumer_group('SCOTT','CG_STOP_BLOCKERS');

PL/SQL procedure successfully completed.

SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = STOP_BLOCKERS;

System altered.

SQL> conn scott/tiger
Connected.

SQL> create table t as select 1 x from dual;

Table created.

SQL> delete from t;

1 row deleted.

So this session now holds a lock on the row, but is idle… Now we fire up another session that will get blocked:
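As an aside, from a third session you can observe the blocking relationship yourself before Resource Manager steps in, with a quick query against V$SESSION. This is just a sketch; the exact columns you prefer may vary a little by version:


SQL> select sid, serial#, blocking_session, seconds_in_wait
  2  from   v$session
  3  where  blocking_session is not null;

Once the blocked session appears, its BLOCKING_SESSION column points back at the idle session holding the lock.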


SQL> set timing on
SQL> delete from t;

[waiting]

After approximately 60 seconds, the second session did in fact respond


SQL> set timing on
SQL> delete from t;

1 row deleted.

Elapsed: 00:00:58.09

So presumably the lock from session 1 has been released. If we return to session 1, we can see how the lock got released:


SQL> select * from dual;
select * from dual
            *
ERROR at line 1:
ORA-03113: end-of-file on communication channel
Process ID: 24708
Session ID: 593 Serial number: 59128


They got killed! So they can take their time at lunch, and come back with their double-choc muffins, chai latte and watercress salad… We don’t care, because our application is protected.
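If you want to double-check what is in place, the directive and the active plan are both visible in the dictionary. A quick sanity-check sketch (standard dictionary views; output will of course vary by database):


SQL> select plan, group_or_subplan, max_idle_blocker_time
  2  from   dba_rsrc_plan_directives
  3  where  plan = 'STOP_BLOCKERS';

SQL> select name from v$rsrc_plan;

The first query confirms the 60-second idle-blocker limit on our consumer group; the second shows which plan is currently active on the instance.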

The simple fix to date queries

We had a question at the OpenWorld panel about why queries on date columns are “always slow”.  Well… they aren’t, but here’s a common cause of that misconception.

Let’s create a table as a copy of DBA_OBJECTS, and index the CREATED column (which is a date).


SQL> create table t as
  2  select *
  3  from dba_objects;

Table created.

SQL>
SQL> create index t_ix on t ( created );

Index created.

The problems start when we query the CREATED column and get “unexpected” results:


SQL> select owner, object_name
  2  from t
  3  where created = date '2016-09-20';

no rows selected

We were expecting some rows here, but none were returned. We quickly deduce that this is because the CREATED column also contains a time component, and nothing was created at exactly midnight on September 20.
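You can see those time components easily enough with an explicit format mask; a quick sketch:


SQL> select to_char(created,'dd-mon-yyyy hh24:mi:ss') created
  2  from   t
  3  where  rownum <= 5;

Each row will show an hours/minutes/seconds portion, which is why an equality predicate against a midnight date literal finds nothing.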

So to remove the time component, the query is recast as:


SQL> set autotrace on
SQL> select owner, object_name
  2  from t
  3  where trunc(created) = date '2016-09-20';

OWNER        OBJECT_NAME
------------ ------------------------------
MCDONAC      T1


Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   951 | 38040 |   446   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |   951 | 38040 |   446   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(TRUNC(INTERNAL_FUNCTION("CREATED"))=TO_DATE(' 2016-09-20
              00:00:00', 'syyyy-mm-dd hh24:mi:ss'))


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1602  consistent gets
          0  physical reads
          0  redo size
        620  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

and ker-splat, the ability to use the index on the CREATED column disappears because we wrapped it in a TRUNC expression. We used 1602 logical I/Os to satisfy our query. But all it takes is a little tinkering of our query to leave CREATED “unwrapped”:


SQL> select owner, object_name
  2  from t
  3  where created >= date '2016-09-20'
  4  and created < date '2016-09-20' + 1;

OWNER        OBJECT_NAME
------------ ------------------------------
MCDONAC      T1


Execution Plan
----------------------------------------------------------
Plan hash value: 3343387620

--------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |   184 |  7360 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T    |   184 |  7360 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T_IX |   184 |       |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CREATED">=TO_DATE(' 2016-09-20 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "CREATED"<TO_DATE(' 2016-09-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        620  bytes sent via SQL*Net to client
        552  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

The query is functionally the same, but we got to use the index to speed it up.  This is not by any means a claim that using the index is always the best option, but by keeping the CREATED column “untarnished” by expressions, we give the optimizer more choices on how best to run the query.
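And if rewriting the predicate is not an option (for example, the SQL comes from a tool you cannot change), a function-based index is another route. This is just a sketch, and the index name below is my own invention; remember that every extra index adds maintenance cost to your DML:


SQL> create index t_trunc_ix on t ( trunc(created) );

Index created.

SQL> select owner, object_name
  2  from t
  3  where trunc(created) = date '2016-09-20';

With the expression indexed, the optimizer can match the TRUNC(CREATED) predicate directly against T_TRUNC_IX rather than being forced into a full scan.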