AskTom TV

Sometimes at conferences I get asked: “How do you know all of the answers for the AskTom questions?”

I’d love to say “Because we’re super smart” :-) but the reality is that we’re just like anyone else with a passion for database technology.  We like to research things, apply our skills and use our experience to solve problems.

So I’ve realised that even though we try to give as much detail as possible when we are helping the AskTom community, it might not always be immediately apparent what thought process we followed when tackling a problem.

To help with that, we now have “AskTom TV”.  It’s not really a TV show :-) but what we’re doing is taking a sample AskTom question and talking about how we tackled the problem, rather than just jumping straight to the solution.

The first two episodes are here, and there will be more to come throughout 2017.

Enjoy.

AskTom takes another step forward

For over 16 years, AskTom has been one of the most valuable resources available to developers and database administrators working with the Oracle Database.  With over 20,000 questions tackled and answered, along with over 120,000 follow-ups to additional queries, it remains an outstanding knowledge base of Oracle assistance.

And today, AskTom just got a whole lot better!

We’re excited to announce a new member of the AskTom team…database evangelist Maria Colgan.  Many of you will know Maria for her work with the Optimizer and In-Memory, but in fact she brings decades of expertise across the entire database technology stack to assist you with your questions.  Maria blogs at sqlmaria.com, so make sure you check in there regularly as well.  With Maria on the team, AskTom keeps getting better and better in 2017!

 


Multi-table insert

An interesting question came through on AskTom recently.  The requirement was to perform a single pass through a source table and load the data into three target tables.

Now that’s trivially achieved with a multi-table insert, but there was a subtle “twist” on this requirement.  Each of the three target tables may already contain some, none or all of the rows from the source table.  Hence the requirement was to “fill in the blanks”.

So here’s a little demo of one way we could achieve this.

First, here is our source table with 10 rows (1 through 10):


SQL> create table t_source as select rownum s from dual connect by level <= 10;

Table created.

SQL>
SQL> select * from t_source;

         S
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

And here are our three target tables, T1, T2 and T3, each with a subset of the rows already:


SQL> create table t1 as select rownum x from dual connect by level <= 5;

Table created.

SQL> create table t2 as select rownum y from dual connect by level <= 3;

Table created.

SQL> create table t3 as select rownum z from dual connect by level <= 6;

Table created.

SQL> select * from t1 order by 1;

         X
----------
         1
         2
         3
         4
         5

5 rows selected.

SQL> select * from t2 order by 1;

         Y
----------
         1
         2
         3

3 rows selected.

SQL> select * from t3 order by 1;

         Z
----------
         1
         2
         3
         4
         5
         6

6 rows selected.

Now obviously we could perform a simple insert-select-where-not-exists style operation for each table, but we need to meet our poster’s requirement of a single pass through the source table. So we will take advantage of an outer join to pick up just those rows that do not already match.
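For reference, the multi-pass approach we are trying to avoid would look something like this (just a sketch, not a transcript; each statement scans T_SOURCE again):

-- one pass of T_SOURCE per target table, three passes in total
insert into t1 (x)
  select s from t_source
  where not exists (select null from t1 where t1.x = t_source.s);

insert into t2 (y)
  select s from t_source
  where not exists (select null from t2 where t2.y = t_source.s);

insert into t3 (z)
  select s from t_source
  where not exists (select null from t3 where t3.z = t_source.s);

And here is the single-pass alternative, using a multi-table INSERT ALL with outer joins: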



SQL> insert all
  2    when in_tab1 is null then
  3      into t1 (x ) values (s )
  4    when in_tab2 is null then
  5      into t2 (y ) values (s )
  6    when in_tab3 is null then
  7      into t3 (z ) values (s )
  8  select
  9    t_source.s,
 10    t1.x in_tab1,
 11    t2.y in_tab2,
 12    t3.z in_tab3
 13  from t_source, t1, t2, t3
 14  where t_source.s = t1.x(+)
 15  and t_source.s = t2.y(+)
 16  and t_source.s = t3.z(+)
 17  /

16 rows created.

SQL>
SQL> select * from t1 order by 1;

         X
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL> select * from t2 order by 1;

         Y
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL> select * from t3 order by 1;

         Z
----------
         1
         2
         3
         4
         5
         6
         7
         8
         9
        10

10 rows selected.

SQL>

And the job is done. Our poster never really elaborated on why a single pass was necessary – but let’s assume it was due to the source table being large. If we look at the execution plan, we see a swag of cascading hash joins, so whilst a single pass of the source table has been achieved, there is no guarantee that we’re not going to end up with other issues in processing all of those “concurrent” joins.
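For anyone wanting to reproduce the plan below at home, one way is EXPLAIN PLAN plus DBMS_XPLAN against the query portion of the INSERT ALL (a minimal sketch):

explain plan for
select t_source.s, t1.x in_tab1, t2.y in_tab2, t3.z in_tab3
from t_source, t1, t2, t3
where t_source.s = t1.x(+)
and t_source.s = t2.y(+)
and t_source.s = t3.z(+);

select * from table(dbms_xplan.display);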



---------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |    10 |   120 |     8   (0)| 00:00:01 |
|*  1 |  HASH JOIN OUTER     |          |    10 |   120 |     8   (0)| 00:00:01 |
|*  2 |   HASH JOIN OUTER    |          |    10 |    90 |     6   (0)| 00:00:01 |
|*  3 |    HASH JOIN OUTER   |          |    10 |    60 |     4   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| T_SOURCE |    10 |    30 |     2   (0)| 00:00:01 |
|   5 |     TABLE ACCESS FULL| T2       |     3 |     9 |     2   (0)| 00:00:01 |
|   6 |    TABLE ACCESS FULL | T1       |     5 |    15 |     2   (0)| 00:00:01 |
|   7 |   TABLE ACCESS FULL  | T3       |     6 |    18 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T_SOURCE"."S"="T3"."Z"(+))
   2 - access("T_SOURCE"."S"="T1"."X"(+))
   3 - access("T_SOURCE"."S"="T2"."Y"(+))

But that’s often life on AskTom.  We only get half the story :-)

Happy birthday to … well … us :-)

Last year on September 1, the AskTom site was resurrected under Apex 5, with Chris Saxon and me manning the fort to deliver as much value to the Oracle community as the esteemed previous custodian of the site did in the many years preceding us.

In the last year, we have:

  • answered ~2,500 questions
  • taken and followed up on ~3,000 reviews

It’s great fun and very rewarding working on the AskTom site.  We get to participate in the ever-growing Oracle community, and it is like “free training” for Chris and me – we get to learn new things every day.

So I hope you’re getting as much value out of the site as we are.  We have plans to continue the site’s evolution to yield even more value in future.

Thanks!


Technology debates

As always happens from time to time, we had the following request on AskTom today:

Could you list down 2 reasons why sql server is better than oracle?

Then 2 counter reasons as to why oracle is better than sql server

 

And I thought I’d reproduce my response here:

 

I’ll answer it slightly differently…

Q: When is SQL Server better than Oracle?
A: When you have good, intelligent, knowledgeable SQL Server people on your workforce.

Q: When is Oracle better than SQL Server?
A: When you have good, intelligent, knowledgeable Oracle people on your workforce.

I don’t really buy into all that “my Dad’s better than your Dad” stuff in the IT landscape.  Because the best product in the world will always fail if you have the wrong people building systems with it.  And conversely, good people will have the skill and confidence to say “Product X is not suited to task Y”, if that is indeed the case.

When I think back to the first Oracle system I worked on, I chuckle at how bad a job I did at it (something I know now, not then).  Conversely, I think of some recent projects and I’m incredibly proud of some of the stuff I’ve built.

Technology doesn’t drive success…quality people do.

The death of the demo script

On AskTom, one of the things we always strive to do is provide demo scripts in our answers where appropriate.  That way, people can see our line of thought, and hopefully get an understanding as to how we arrived at a solution.  And there are two very important concepts in that last sentence:

  • our line of thought
  • how we arrived

The demo script itself is a representation of that, not necessarily a cast-iron guarantee that you should be cutting and pasting it into your Production system!  If you are honestly expecting a demo script that we run to behave exactly the same way on your system, you’re perhaps being a little naive.

Why?  Because for a long time in the speaker community, there’s been a humorous theme about how the concept of demo scripts has required more and more “terms and conditions” as each version of Oracle got more sophisticated.  Namely, the evolution of what a demo script might be impacted by:

(Now before anyone takes me to task here on versions and features etc, just like demo scripts themselves, I’m using this as an example of what you need to be aware of, not a definitive or exhaustive list.)

Oracle 7:

“Here’s my script…and chances are, it will just run like this on your Oracle 7 installation.”

Oracle 8: (multiple blocksizes)

“Here’s my script…it was built on an 8k blocksize, your results may differ if you are not.”

Oracle 8i: (locally managed tablespace, and auto segment space management introduced)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, your results may differ if you are not.”

Oracle 9i: (system stats, cpu costing introduced)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with no system stats calculated or stored, your results may differ if you are not.”

Oracle 10g: (NOWORKLOAD system stats always on, dynamic sampling defaults)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with NOWORKLOAD system stats gathered, dynamic sampling set to default, your results may differ if you are not.”

Oracle 10.2: (auto histogram, ‘default’ system stats)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with installation default system stats gathered, dynamic sampling set to default, stats gathered with default parameters, and no previous queries run against your table other than the one in the demo, your results may differ if you are not.”

Oracle 11: (auto multiblock read count)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with installation default system stats gathered, dynamic sampling set to default, stats gathered with default parameters, and no previous queries run against your table other than the one in the demo, with multiblock read count explicitly set to 128, and a file system that supports 1meg physical I/O, your results may differ if you are not.”

Oracle 11.2: (serial direct read, exadata-specific system stats)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with installation default system stats gathered, dynamic sampling set to default, stats gathered with default parameters, and no previous queries run against your table other than the one in the demo, with multiblock read count explicitly set to 128, and a file system that supports 1meg physical I/O, with serial direct read enabled, and a buffer cache size of exactly 2G, and on a non-exadata system, your results may differ if you are not.”

Oracle 12: (auto-stats, clustering factor optimization, automatic extended stats)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with installation default system stats gathered, dynamic sampling set to default, stats gathered with default parameters, and no previous queries run against your table other than the one in the demo, with multiblock read count explicitly set to 128, and a file system that supports 1meg physical I/O, with serial direct read enabled, and a buffer cache size of exactly 2G, and on a non-exadata system, with clustering factor TABLE_CACHED_BLOCKS left unchanged, and no hidden columns automatically added to your table by extended stats, your results may differ if you are not.”

Oracle 12.1.0.2: (in-memory, large table caching, full database caching, on-chip functionality)

“Here’s my script… it was built on an 8k blocksize, on a locally managed tablespace, with uniform extent size 1m, in segment space auto-managed, with installation default system stats gathered, dynamic sampling set to default, stats gathered with default parameters, and no previous queries run against your table other than the one in the demo, with multiblock read count explicitly set to 128, and a file system that supports 1meg physical I/O, with serial direct read enabled, and a buffer cache size of exactly 2G, and on a non-exadata system, with clustering factor TABLE_CACHED_BLOCKS left unchanged, and no hidden columns automatically added to your table by extended stats, on an Intel chipset, without large table caching activated, and no auto capture of sql plans, your results may differ if you are not.”

And I’ve not even mentioned the myriad of platform differences (things might run slightly differently on Windows versus Linux versus AIX), and of course, the question may have come from someone running Oracle Apps, where, depending on the version being used, parameters such as those below may be set to non-default values:

  • _b_tree_bitmap_plans
  • _fast_full_scan_enabled
  • _like_with_bind_as_equality
  • _sort_elimination_cost_ratio
  • optimizer_secure_view_merging

all of which change how queries are optimized and even how they may be transformed before optimization.
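As an aside, if you are curious whether a documented parameter on your own system has been altered, V$PARAMETER makes that easy to check; a minimal sketch below.  (The underscore parameters above are hidden ones and will not show up there, so inspecting those is a conversation to have with your DBA.)

-- has this documented parameter been changed from its default?
select name, value, isdefault
from   v$parameter
where  name = 'optimizer_secure_view_merging';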

But what if we eliminate the version discrepancy? … Rest assured, even if you have the exact same version of the software as us, the same potential for difference applies.  We’re typically on our VM inside our laptop on a private network, and you are on a server with different CPU, different RAM, different parameters, different storage, different platform.  What patches have you applied?  What parameters have you set?  There are just so many variables in play here.

So are demo scripts a waste of time? … Of course not.  Just remember what they are – an addendum to a line of thought, an application of logic, a glimpse into a potentially useful process.  Your job is then to take advantage of that thought and logic, and discover for yourself the possible benefits on your system.

Happy scripting! :-)