Presenting in Perth on September 9th, 2015

I’ll be presenting at a “Let’s Talk Oracle” event in Perth, with fellow ex-Oracle ACE Directors Richard Foote and Chris Muir. The full agenda is as follows:



8:30-9:00   Registration and coffee
9:00-10:30  Richard Part I – Database 12c New Features for DBAs (and Developers)
10:30-11:00 Break
11:00-12:30 Richard Part II – Database 12c New Features for DBAs (and Developers)
12:30-1:30  Lunch
1:30-2:30   Chris – Creating RESTful APIs with Oracle Data Services (for Developers and DBAs)
2:30-2:45   Break
2:45-4:15   Connor – Database 12c New Features for Developers (and DBAs)
4:15-5:00   Q&A with the ACES !
5:00        Wrap up

And yes… of course, the event is FREE!

It will be at the Oracle offices in Perth, so places are strictly limited. If you want to attend, send an email to dennis.ward@oracle.com.

See you there !!

Less calls…more performance (part 2)

In the previous post, I mentioned that for a programming environment fetching rows from the database, the method of

  • open a ref cursor
  • issue a fetch call
  • close the ref cursor

might not be appropriate for those situations where the result set is known to be a single row (eg a primary key lookup).

A better option might be to call a procedure and get the outputs back as parameters.

And I broke a cardinal rule… I effectively said “Here’s something that I know to be true…so you should just trust me”.

Dear oh dear oh dear. That’s not good. So without further ado, it’s time to put together a test case.

We need to come up with a way to test this across a “fast” network, the aim being that even with the best network latency we can obtain, there will still be a demonstrable benefit to using a PL/SQL procedure over the ref cursor method.

One way of doing this is to have two databases running on the same box. One database will be the “real” database server, ie, it will hold our data. The other database will actually be a client to this database, and we will access the data via a database link. So we never go out across the true physical network – we are simply getting the smallest latency we can by going (in my case) localhost to localhost.

So firstly, on our “real” database, we create a table with a primary key. This database is called “NP12” on my machine.


SQL> drop table T purge;

Table dropped.

SQL> create table T (
  2    x int primary key,
  3    y int ,
  4    z char(100));

Table created.

SQL> insert into T
  2  select rownum, rownum, rownum
  3  from dual
  4  connect by level <= 1000;

1000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats('','T');

PL/SQL procedure successfully completed.

Now I’ve created a second database, so we’ll connect to that and create some routines to access the table T on NP12.


SQL> create database link np12 using 'np12';

Database link created.
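(A side note for completeness: the ‘np12’ in the USING clause is just a TNS alias. A minimal tnsnames.ora entry pointing that alias back at the same box might look like the sketch below – the host, port and service name are assumptions based on my particular setup.)

NP12 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = np12))
  )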

These are the two procedures we will use for the test. One is our REF CURSOR procedure, which opens a cursor and passes it back to the client. The second does the whole job within the procedure, issuing the query and passing the fetched value back as a parameter.


SQL> create or replace
  2  procedure ref_cursor_open(p_i int, rc in out sys_refcursor) is
  3  begin
  4    open rc for select y from t@np12 where x = p_i;
  5  end;
  6  /

Procedure created.

SQL> create or replace
  2  procedure single_row_fetch(p_i int, p_y out int) is
  3  begin
  4    select y into p_y from t@np12 where x = p_i;
  5  end;
  6  /

Procedure created.

And now it’s time to test them for speed using a simple looping script. I’m using the INLINE pragma to reduce the overhead of the PL/SQL calls themselves, so that we have better accuracy with respect to the cost of the extra calls when using the REF CURSOR.


SQL> set timing on
SQL> declare
  2    rc sys_refcursor;
  3    res int;
  4  begin
  5    for j in 1 .. 10 loop
  6      for i in 1 .. 1000 loop
  7        pragma inline(ref_cursor_open,'YES');
  8        ref_cursor_open(i,rc);
  9        fetch rc into res;
 10        close rc;
 11      end loop;
 12    end loop;
 13  end;
 14  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:05.55

SQL> declare
  2    rc sys_refcursor;
  3    res int;
  4  begin
  5    for j in 1 .. 10 loop
  6      for i in 1 .. 1000 loop
  7        pragma inline(single_row_fetch,'YES');
  8        single_row_fetch(i,res);
  9      end loop;
 10    end loop;
 11  end;
 12  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.80
SQL>

And there you have it. Almost twice as fast. If you just trusted what I said from the first post…then that’s great. But in general, if you read something in someone’s blog, either challenge them to prove their assertion or just whip up a script to prove it for yourself.

Less calls…more performance

In various programming environments, a common pattern is to open a cursor on the database (a REF CURSOR in Oracle parlance), return that cursor handle to the calling environment, and then use that cursor to fetch or “fill” a data object, which may map to a grid on screen, or just to an array in memory.

And that’s totally fine – it’s an effective means of getting a result set from the database to the calling environment.

For example, a typical PLSQL routine might look like:

PROCEDURE my_proc(p_resultset IN OUT sys_refcursor) IS
BEGIN
   OPEN p_resultset FOR
     SELECT col1, col2, col3, ...
     FROM   my_table
     WHERE  status = 'Unprocessed';
END;  

That’s fine when you want a set of rows, and luckily, most programming environments nowadays have evolved to the point where they will perform that retrieval with an array fetch rather than single-row fetches.

But as a result, we’ve become attuned to the REF CURSOR method, and we tend to use it for all database access.

But what about when we are returning a single row?  For example, if the SQL in the example above was:

SELECT col1, col2, col3, ...
FROM   my_table
WHERE  primary_key_col = 123;

then perhaps fetching from a ref cursor is not the most appropriate course of action. 

Why ?  Because it (typically) will require the programmer to issue two calls to the database:

  • call the PL/SQL routine to open the cursor
  • issue a local “fetch” call, which will then visit the database again to fetch the results

[Depending on the environment, there may even be a third call to close the cursor to ensure cursors don’t leak and consume endless amounts of memory]

When retrieving a single row, perhaps think about the efficiency gains of wrapping the SQL within a PL/SQL procedure and passing parameters in and out accordingly. The example above might then become something like:

PROCEDURE my_proc(p_prim_key IN int, p_col1 OUT int, p_col2 OUT int, p_col3 OUT int) IS
BEGIN
   
     SELECT col1, col2, col3, ...
     INTO   p_col1, p_col2, p_col3
     FROM   my_table
      WHERE  primary_key_col = p_prim_key;
END;  

The whole operation is done with a single call, and you get the PL/SQL cursor caching benefit for free as well.
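As a quick sketch of what the caller now does (using a hypothetical key value of 123, and assuming the three OUT parameter version shown above), the whole retrieval becomes a single round trip:

declare
  l_col1 int;
  l_col2 int;
  l_col3 int;
begin
  -- one call to the database; the row comes back via the OUT parameters
  my_proc(123, l_col1, l_col2, l_col3);
end;
/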

For high volume calls, this might be a significant performance boost.  Let’s face it – nowadays there is a good chance that your database server is just insanely fast at processing data, whether it be fetching it or storing it.  The speeds and core counts just continue to boggle the mind.  But all of that will count for little if you are jumping back and forth across the network between your middle tier program and the database.  Less calls = less trips = more performance.

Avoiding public embarrassment with triggers

If you create a trigger that does not compile, any part of your application that refers to the table with the invalid trigger is basically dead until that trigger becomes valid.  Even if a DML on that table is not going to affect any rows, you still won’t be able to issue it.  For example,

SQL> delete from T where 1=0;
delete from T where 1=0
            *
ERROR at line 1:
ORA-04098: trigger 'TRG' is invalid and failed re-validation

That’s a bad, bad place for your application…and a bad, bad place for your career.
 
In 11g and above, you can avoid this by using the DISABLE keyword every time you create a trigger.
 
All triggers should be created as:
 

create or replace
trigger MY_TRIGGER
before insert on MY_TABLE   -- whatever your triggering event and table happen to be
DISABLE
begin
   ...
end;

 
If the trigger is created and compiled successfully, then you can enable it.
 

alter trigger MY_TRIGGER enable;

 
If the trigger for some unforeseen reason does not compile, it remains disabled, and hence the failed compilation will not break your application.
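If you want to check where things stand before enabling it, the data dictionary shows both pieces of state – whether the trigger is enabled, and whether it compiled. A quick sketch (using the example trigger name from above):

-- is the trigger enabled or disabled ?
select trigger_name, status
from   user_triggers
where  trigger_name = 'MY_TRIGGER';

-- did it compile successfully ? (VALID / INVALID)
select object_name, status
from   user_objects
where  object_name = 'MY_TRIGGER'
and    object_type = 'TRIGGER';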

Make it a coding standard for your database developers.

Where is my tracefile

As a developer, you might have been reading about SQL trace. After all, we all want to be able to diagnose and improve the performance of the code we write. So if you’ve googled Oracle performance, you will have no doubt stumbled upon SQL trace.

So if you’ve run "alter session set sql_trace = true", or perhaps "DBMS_MONITOR.SESSION_TRACE_ENABLE", you’re probably itching to get your hands on that trace file.  The question is – where is it ?  And that is the subject of my next quick tip…Enjoy

https://youtu.be/Xoq3lvxUTmY
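(And for the impatient – one quick way to find it from your own session, assuming you are on 11g or above and have access to V$DIAG_INFO, is a query along these lines.)

select value
from   v$diag_info
where  name = 'Default Trace File';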

Loading file data … easier than you think

So a request comes in from a colleague…

“Hi Connor.  I’ve created this new table, and I need to populate it with some data.  I’ve got it in Excel – can you help me load it?”

Now, at this point, my heart normally sinks, because whilst the Oracle tools to load flat file data are comprehensive, I know that I now have to

  • go find out the table definition, column names and data types
  • write up a SQL Loader control file, or
  • write up an external table definition script
  • repeat this 10 times until I get the syntax right

all of which is going to be a drain on my time…

What I would really like is to do this:

1) Check that my table is ready for loading


SQL> desc EMP_TAB
 Name                                      Null?    Type
 ----------------------------------------- -------- -------------
 EMPNO                                     NOT NULL NUMBER(4)
 ENAME                                              VARCHAR2(10)
 JOB                                                VARCHAR2(9)
 MGR                                                NUMBER(4)
 HIREDATE                                           DATE
 SAL                                                NUMBER(7,2)
 COMM                                               NUMBER(7,2)
 DEPTNO                                             NUMBER(2)
 
SQL> select * from EMP_TAB;

no rows selected

 

2) Have a quick squizz at the data, make sure it looks sensible

7369,SMITH,CLERK,7902,17-DEC-80,800,,20
7499,ALLEN,SALESMAN,7698,20-FEB-81,1600,300,30
7521,WARD,SALESMAN,7698,22-FEB-81,1250,500,30
7566,JONES,MANAGER,7839,02-APR-81,2975,,20
7654,MARTIN,SALESMAN,7698,28-SEP-81,1250,1400,30
7698,BLAKE,MANAGER,7839,01-MAY-81,2850,,30
7782,CLARK,MANAGER,7839,09-JUN-81,2450,,10
7788,SCOTT,ANALYST,7566,09-DEC-82,3000,,20
7839,KING,PRESIDENT,,17-NOV-81,5000,,10
7844,TURNER,SALESMAN,7698,08-SEP-81,1500,0,30
7876,ADAMS,CLERK,7788,12-JAN-83,1100,,20
7900,JAMES,CLERK,7698,03-DEC-81,950,,30
7902,FORD,ANALYST,7566,03-DEC-81,3000,,20
7934,MILLER,CLERK,7782,23-JAN-82,1300,,10

 

3) And then just load it ! 

I mean seriously…why can’t I just run sqlldr, pass the table name, and just have the thing work !

Well…maybe….just maybe….

C:\temp>sqlldr userid=scott/tiger data=emp.dat table=emp_tab

SQL*Loader: Release 12.1.0.2.0 - Production on Mon Aug 17 11:08:33 2015

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Express Mode Load, Table: EMP_TAB
Path used:      External Table, DEGREE_OF_PARALLELISM=AUTO

Table EMP_TAB:
  14 Rows successfully loaded.

Check the log files:
  emp_tab.log
  emp_tab_%p.log_xt
for more information about the load.

Yup… once you get to version 12c, there is now express mode for SQL Loader, so for all those simple loads that don’t need all the bells and whistles, you can just jump straight in… and load your data. Very cool indeed.

The log file reveals what is going on “under the covers”

creating external table "SYS_SQLLDR_X_EXT_EMP_TAB"

CREATE TABLE "SYS_SQLLDR_X_EXT_EMP_TAB" 
(
  "EMPNO" NUMBER(4),
  "ENAME" VARCHAR2(10),
  "JOB" VARCHAR2(9),
  "MGR" NUMBER(4),
  "HIREDATE" DATE,
  "SAL" NUMBER(7,2),
  "COMM" NUMBER(7,2),
  "DEPTNO" NUMBER(2)
)
ORGANIZATION external 
(
  TYPE oracle_loader
  DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000
  ACCESS PARAMETERS 
  (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
    BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'emp_%p.bad'
    LOGFILE 'emp_tab_%p.log_xt'
    READSIZE 1048576
    FIELDS TERMINATED BY "," LRTRIM 
    REJECT ROWS WITH ALL NULL FIELDS 
    (
      "EMPNO" CHAR(255),
      "ENAME" CHAR(255),
      "JOB" CHAR(255),
      "MGR" CHAR(255),
      "HIREDATE" CHAR(255)
        DATE_FORMAT DATE MASK "DD-MON-RR",
      "SAL" CHAR(255),
      "COMM" CHAR(255),
      "DEPTNO" CHAR(255)
    )
  )
  location 
  (
    'emp.dat'
  )
)REJECT LIMIT UNLIMITED

executing INSERT statement to load database table EMP_TAB

INSERT /*+ append parallel(auto) */ INTO EMP_TAB 
(
  EMPNO,
  ENAME,
  JOB,
  MGR,
  HIREDATE,
  SAL,
  COMM,
  DEPTNO
)
SELECT 
  "EMPNO",
  "ENAME",
  "JOB",
  "MGR",
  "HIREDATE",
  "SAL",
  "COMM",
  "DEPTNO"
FROM "SYS_SQLLDR_X_EXT_EMP_TAB"

dropping external table "SYS_SQLLDR_X_EXT_EMP_TAB"

Table EMP_TAB:
  14 Rows successfully loaded.

Run began on Mon Aug 17 11:08:33 2015
Run ended on Mon Aug 17 11:08:36 2015

Elapsed time was:     00:00:03.21
CPU time was:         00:00:00.09


Learn more about SQL Loader Express here
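As a side note, the express mode defaults (comma-separated fields and so on) can be overridden from the command line. The sketch below is an assumption on my part rather than something taken from the load above – it uses the express mode terminated_by parameter for a pipe-delimited file, and the exact quoting may vary with your shell:

sqlldr userid=scott/tiger data=emp.dat table=emp_tab terminated_by='|'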

Mr DISTINCT might not be your friend

Whenever you have the need to use the DISTINCT keyword, it’s worth just pausing for a second and making sure that you are not hiding a larger issue. It might actually represent either incorrect use of SQL or incorrect assumptions about the data model.

Consider the following example

SELECT DISTINCT d.dname
FROM   emp e, dept d
WHERE  e.ename = 'SMITH'
AND    e.deptno = d.deptno

The query is certainly valid, but when I see "distinct" I ask myself the following questions:

Has the DISTINCT been added in an attempt to only return a single row ? ie, is someone working under the assumption that an employee name can only refer to a single department ? Unless there is a unique constraint on the ENAME column, we can still just as easily get multiple rows back (even with the DISTINCT), so the SQL will be a "sleeping problem" in the application until the data causes it to fail.

Because the DISTINCT keyword here:

  • doesn’t give you any guarantee that you’ll only get one row, and
  • might be making the database do excessive work (get all rows, sort them and remove duplicates) every single time you run this SQL – one way of sidestepping that is sketched just below
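For what it’s worth, this particular question can also be phrased as a semi-join, which asks “which departments have an employee called SMITH?” and needs no de-duplication at all. A sketch only – it is not a drop-in replacement for every query of this shape (for example, it treats duplicate department names differently):

SELECT d.dname
FROM   dept d
WHERE  EXISTS (
         SELECT 1
         FROM   emp e
         WHERE  e.deptno = d.deptno
         AND    e.ename  = 'SMITH');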

Alternatively, perhaps there is meant to be a single department for an ENAME, ie, ENAME is unique. If that is the case, then we should change the database model/design to reflect it and not use DISTINCT to "fake" it.
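For example, if employee names really are meant to be unique, declare it and let the database enforce it. A sketch against the EMP example above (the constraint name is my own invention):

alter table emp add constraint emp_ename_uk unique (ename);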

I’m not saying DISTINCT is always a bug – but it’s always worth just asking yourself if you were correct to be using it.