The SQL Loader log file … as data

I had an interesting AskTom question recently where the poster was using SQL Loader to load in tables, but wanted to be able to analyze the resultant log file after execution.  And of course, what better way to analyze..well…anything…than with a database and some SQL.

So we need to be able to access the log file as a table, and an external table is perfect for that, so let’s start there.

Here’s a sample SQL Loader log file (with a little perturbation to preserve anonymity).  It’s quite complex because multiple tables were loaded as part of a single SQL Loader run.


Table SCOTT.T1: 222455 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1229074 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T2:
  202547 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1248982 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T3:
  952092 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  499437 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T4:
  74373 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1377156 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T5:
  62 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1451467 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Space allocated for bind array:               11920000 bytes(10000 rows)
Read   buffer bytes:60485760
Total logical records skipped:          0
Total logical records read:       1451529
Total logical records rejected:         0
Total logical records discarded:        0
Run began on Mon Jan 02 16:47:04 2017
Run ended on Mon Jan 02 16:47:42 2017
Elapsed time was:     00:00:38.17
Elapsed time was:     00:00:38.17
CPU time was:         00:00:32.61

Now obviously it’s a trivial exercise to parse that file as an external table with one row per file line.


SQL> CREATE TABLE sqlldr_log (
  2    msg varchar2(200)
  3  )
  4  ORGANIZATION EXTERNAL (
  5    TYPE ORACLE_LOADER
  6    DEFAULT DIRECTORY temp
  7    ACCESS PARAMETERS (
  8      RECORDS DELIMITED BY NEWLINE
  9      FIELDS
 10      (
 11        msg      position(1:512)
 12      )
 13    )
 14    LOCATION ('sqlldr.log')
 15  )
 16  REJECT LIMIT UNLIMITED;

Table created.

SQL> select msg from sqlldr_log;

MSG
-----------------------------------------------------------------------------------------
Table SCOTT.T1: 222455 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1229074 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T2:
  202547 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1248982 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T3:
  952092 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  499437 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T4:
  74373 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1377156 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Table SCOTT.T5:
  62 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  1451467 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.
Space allocated for bind array:               11920000 bytes(10000 rows)
Read   buffer bytes:60485760
Total logical records skipped:          0
Total logical records read:       1451529
Total logical records rejected:         0
Total logical records discarded:        0
Run began on Mon Jan 02 16:47:04 2017
Run ended on Mon Jan 02 16:47:42 2017
Elapsed time was:     00:00:38.17
Elapsed time was:     00:00:38.17
CPU time was:         00:00:32.61

35 rows selected.

SQL>

but “So what ?” I hear you say. Now I got a text file coming as a text output in a query. I don’t seem to be much better off.

But now that I have the full power of SQL, I can start to do some simple parsing of the rows to (for example), extract the row load information.

Firstly, I’ll extract just those rows that have the text “Table” or “Rows” in them, and use a CASE statement to parse each of the “type” of rows (success, failed, etc) into individual columns


SQL> select
  2     rownum r
  3    ,substr(msg,1,50)
  4    ,case when msg like 'Table%' then substr(msg,7,instr(msg,':')-7) end tname
  5    ,case when msg like '  %Rows%successfully%' then substr(msg,3,instr(msg,'Rows')-4) end loaded
  6    ,case when msg like '  %Rows%data errors%' then substr(msg,3,instr(msg,'Rows')-4) end errs_data
  7    ,case when msg like '  %Rows%all WHEN cla%' then substr(msg,3,instr(msg,'Rows')-4) end errs_when
  8    ,case when msg like '  %Rows%were null%' then substr(msg,3,instr(msg,'Rows')-4) end errs_null
  9  from sqlldr_log
 10  where msg like 'Table%' or msg like '  %Rows%';

         R SUBSTR(MSG,1,50)                                   TNAME                          LOADED       ERRS_DATA    ERRS_WHEN    ERRS_NULL
---------- -------------------------------------------------- ------------------------------ ------------ ------------ ------------ ------------
         1 Table SCOTT.T1: 222455 Rows successfully loaded.   SCOTT.T1
         2   0 Rows not loaded due to data errors.                                                        0
         3   1229074 Rows not loaded because all WHEN clauses                                                          1229074
         4   0 Rows not loaded because all fields were null.                                                                        0
         5 Table SCOTT.T2:                                    SCOTT.T2
         6   202547 Rows successfully loaded.                                                202547
         7   0 Rows not loaded due to data errors.                                                        0
         8   1248982 Rows not loaded because all WHEN clauses                                                          1248982
         9   0 Rows not loaded because all fields were null.                                                                        0
        10 Table SCOTT.T3:                                    SCOTT.T3
        11   952092 Rows successfully loaded.                                                952092
        12   0 Rows not loaded due to data errors.                                                        0
        13   499437 Rows not loaded because all WHEN clauses                                                           499437
        14   0 Rows not loaded because all fields were null.                                                                        0
        15 Table SCOTT.T4:                                    SCOTT.T4
        16   74373 Rows successfully loaded.                                                 74373
        17   0 Rows not loaded due to data errors.                                                        0
        18   1377156 Rows not loaded because all WHEN clauses                                                          1377156
        19   0 Rows not loaded because all fields were null.                                                                        0
        20 Table SCOTT.T5:                                    SCOTT.T5
        21   62 Rows successfully loaded.                                                    62
        22   0 Rows not loaded due to data errors.                                                        0
        23   1451467 Rows not loaded because all WHEN clauses                                                          1451467
        24   0 Rows not loaded because all fields were null.                                                                        0

24 rows selected.

Now with a simple analytic function, I can “fill in the blanks” with my table name so it appears in every row


SQL> with base_data as (
  2  select
  3     rownum r
  4    ,substr(msg,1,50)
  5    ,case when msg like 'Table%' then substr(msg,7,instr(msg,':')-7) end tname
  6    ,case when msg like '  %Rows%successfully%' then substr(msg,3,instr(msg,'Rows')-4) end loaded
  7    ,case when msg like '  %Rows%data errors%' then substr(msg,3,instr(msg,'Rows')-4) end errs_data
  8    ,case when msg like '  %Rows%all WHEN cla%' then substr(msg,3,instr(msg,'Rows')-4) end errs_when
  9    ,case when msg like '  %Rows%were null%' then substr(msg,3,instr(msg,'Rows')-4) end errs_null
 10  from sqlldr_log
 11  where msg like 'Table%' or msg like '  %Rows%'
 12  )
 13  select
 14    last_value(tname ignore nulls) over ( order by r ) as tname,
 15    loaded,
 16    errs_data,
 17    errs_when,
 18    errs_null
 19  from base_data;

TNAME                          LOADED       ERRS_DATA    ERRS_WHEN    ERRS_NULL
------------------------------ ------------ ------------ ------------ ------------
SCOTT.T1
SCOTT.T1                                    0
SCOTT.T1                                                 1229074
SCOTT.T1                                                              0
SCOTT.T2
SCOTT.T2                       202547
SCOTT.T2                                    0
SCOTT.T2                                                 1248982
SCOTT.T2                                                              0
SCOTT.T3
SCOTT.T3                       952092
SCOTT.T3                                    0
SCOTT.T3                                                 499437
SCOTT.T3                                                              0
SCOTT.T4
SCOTT.T4                       74373
SCOTT.T4                                    0
SCOTT.T4                                                 1377156
SCOTT.T4                                                              0
SCOTT.T5
SCOTT.T5                       62
SCOTT.T5                                    0
SCOTT.T5                                                 1451467
SCOTT.T5                                                              0

24 rows selected.

and once I’ve got that, then all I need to do put that result within a GROUP BY, and voila ! I now have a summary my SQL Loader execution in a nice usable format


SQL> with base_data as (
  2  select
  3     rownum r
  4    ,substr(msg,1,50)
  5    ,case when msg like 'Table%' then substr(msg,7,instr(msg,':')-7) end tname
  6    ,case when msg like '  %Rows%successfully%' then substr(msg,3,instr(msg,'Rows')-4) end loaded
  7    ,case when msg like '  %Rows%data errors%' then substr(msg,3,instr(msg,'Rows')-4) end errs_data
  8    ,case when msg like '  %Rows%all WHEN cla%' then substr(msg,3,instr(msg,'Rows')-4) end errs_when
  9    ,case when msg like '  %Rows%were null%' then substr(msg,3,instr(msg,'Rows')-4) end errs_null
 10  from sqlldr_log
 11  where msg like 'Table%' or msg like '  %Rows%'
 12  ), pad_table_name as
 13  (
 14  select
 15    last_value(tname ignore nulls) over ( order by r ) as tname,
 16    loaded,
 17    errs_data,
 18    errs_when,
 19    errs_null
 20  from base_data
 21  )
 22  select
 23    tname,
 24    max(loaded) loaded,
 25    max(errs_data) errs_data,
 26    max(errs_when) errs_when,
 27    max(errs_null) errs_null
 28  from   pad_table_name
 29  where  loaded is not null
 30    or errs_data  is not null
 31    or errs_when is not null
 32    or errs_null is not null
 33  group by tname
 34  order by 1;

TNAME                          LOADED       ERRS_DATA    ERRS_WHEN    ERRS_NULL
------------------------------ ------------ ------------ ------------ ------------
SCOTT.T1                                    0            1229074      0
SCOTT.T2                       202547       0            1248982      0
SCOTT.T3                       952092       0            499437       0
SCOTT.T4                       74373        0            1377156      0
SCOTT.T5                       62           0            1451467      0

5 rows selected.

 

Footnote: If the LAST_VALUE analytic function left you bamboozled, check out my video series on them here

Advertisements

One thought on “The SQL Loader log file … as data

  1. Thanks.

    But how do we confirm / proof that when we access ( select * from sqlldr_log ) – will read the data file mapped to the external table in a sequential manner? ( aka – line 1, then line 2..then line 3…etc ) – can it be possible to read the line 3 first before line -1 ?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s