A new line on NEWLINE

Recently I was doing a simple external table load using a CSV file, but was getting an interesting error. My file looked simple enough


"ID","EMAIL","TIMESTAMP","SUBJECT","STATUS","STATUS_TS"
"2012348048","john@anon.com","05/02/2000","Subject 1","5","09/04/2007"
"2412348048","mike@anon.com","05/02/2000","Subject 1","5","09/16/2002"
"348543169051","sue@anon.com","03/10/2001","Subject 1","5","03/24/2008"
"348396029762","mary@anon.com","03/10/2001","Subject 1","5","03/10/2001"
"1212348047","sam@anon.com","05/02/2000","Subject 1","5","05/02/2000"
"1612348048","vincent@anon.com","05/02/2000","Subject 1,"5","06/02/2006"
...
...

So it should have been a fairly straightforward external table definition to access it


SQL> create table T
  2        ( ID int
  3         ,EMAIL varchar2(255)
  4         ,TIMESTAMP date
  5         ,SUBJECT varchar2(512)
  6         ,STATUS int
  7         ,STATUS_TS date
  8      )
  9      organization external
 10      ( default directory TMP
 11        access parameters
 12        ( records delimited by newline 
 13          fields terminated by ',' optionally enclosed by '"'
 14          ( ID, EMAIL, "TIMESTAMP" DATE 'MM/DD/YYYY', SUBJECT, STATUS, STATUS_TS DATE 'MM/DD/YYYY'
 15          )
 16        )
 17        location ('my_report.csv')
 18    )
 19    ;

Table created.

At which point, things went downhill rapidly 🙂 When I queried the external table, I got the following error


SQL> select * from T;
select * from T;
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-29400: data cartridge error
KUP-04020: found record longer than buffer size supported,
524288, in c:\temp\my_report.csv (offset=1048576)


The clue here is that a record was seen to be longer than allowed. Since I know all my “records” are quite short, it must be the definition of what ends a record that is causing a problem. problem.  In my case, the file had come from a system with a different characterset, and hence the concept of “newline” differs between characterset/target platform. Once I included that information into the external table definition, all was fine.


SQL> create table T
  2        ( ID int
  3         ,EMAIL varchar2(255)
  4         ,TIMESTAMP date
  5         ,SUBJECT varchar2(512)
  6         ,STATUS int
  7         ,STATUS_TS date
  8      )
  9      organization external
 10      ( default directory TMP
 11        access parameters
 12        ( records delimited by '\n' characterset AL32UTF8
 13          fields terminated by ',' optionally enclosed by '"'
 14          ( ID, EMAIL, "TIMESTAMP" DATE 'MM/DD/YYYY', SUBJECT, STATUS, STATUS_TS DATE 'MM/DD/YYYY'
 15          )
 16        )
 17        location ('my_report.csv')
 18    )
 19    ;

Table created.

SQL> select * from T where rownum < 10;

...

16,308 rows selected.


Advertisements

One thought on “A new line on NEWLINE

  1. The power of the Oracle’s external table feature is just awesome. No one in my organization believed that I could take files from an IBM Mainframe (EBCDIC), binary sftp them to an HP-UX server (ASCII) and load them via an external table definition using the ORACEL_LOADER access driver and the proper characterset like they were plain text files. This was in 11R2.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s