Even Middle School Kids know better than Oracle



  • To see how rediculous Oracle is, you don't really need to know databases, programming, or even computers. But you will need to understand how an index works. 

    A popular library assignment they give middle school kids (well, at least when I was that age) was to come up with a list of books from a given author with the title, year published, category, and publishing house. The idea is to go the author-indexed card catalog, write down the Dewey Decimal numbers for the books, then go to the shelf and write the information down on the list.

    But that's not how Oracle works. If you were to sit Oracle down in the library, and ask it to find the same (non-indexed) info from all books by Garrison Keillor, it would start at the first book, look at the author, write down the information if it is the right author, and then move on to the next book, and to the next, all the way through all the library. It's to supid to realize that it can write down the Dewey numbers, then go to the book shelf to find the non-indexed data.

    For a technical example, check out this first query. <font color="#0000f0" size="2">

    SELECT</font><font size="2"> LOAN_NBR</font><font color="#0000f0" size="2">,</font><font size="2"> LOAN_RGSTRTN_TS
    </font><font color="#0000f0" size="2">FROM</font><font size="2"> WHL_LOAN_TB
    </font><font color="#0000f0" size="2">WHERE</font><font size="2"> LOANSTATUS_CD</font><font color="#0000f0" size="2">=</font><font color="#ff0000" size="2">'500'

    </font>

    It is asking for the Loan Number and the Registration Date for all loans that have a status of 500. The loan number and status columns are indexed, so what would you do? Here's what Oracle does ...

    <font face="Arial">Operation</font> <font face="Arial">Object Name</font> <font face="Arial">Rows</font> <font face="Arial">Bytes</font> <font face="Arial">Cost</font>
    <font face="Arial" color="#0000ff">SELECT STATEMENT</font> <font face="Arial" color="#0000ff"> </font> <font face="Arial" color="#0000ff">2 K</font> <font face="Arial"></font> <font face="Arial" color="#0000ff">107041</font>
    <font face="Arial" color="#ff0000">   TABLE ACCESS FULL</font> <font face="Arial">WHL_LOAN_TB</font> <font face="Arial">2 K</font> <font face="Arial">42 K</font> <font face="Arial">107041</font>

    It scans the entire table (note the cost of 107041, which works out to be like 10 minutes), completely ignoring the index on the status column because you asked for the Registration Date, which is not on the index. So, here's what you have to do to have the query executed in reasonable time (4344, which is 2464% faster), as the execution plan shows.

    <font color="#0000f0" size="2">

    SELECT</font><font size="2"> L1</font><font color="#0000f0" size="2">.</font><font size="2">LOAN_NBR</font><font color="#0000f0" size="2">,</font><font size="2"> L1</font><font color="#0000f0" size="2">.</font><font size="2">LOAN_RGSTRTN_TS
    </font><font color="#0000f0" size="2">FROM</font><font size="2"> WHL_LOAN_TB L1</font><font color="#0000f0" size="2">,</font><font size="2"> WHL_LOAN_TB L2
    </font><font color="#0000f0" size="2">WHERE</font><font size="2"> L1</font><font color="#0000f0" size="2">.</font><font size="2">LOAN_NBR </font><font color="#0000f0" size="2">=</font><font size="2"> L2</font><font color="#0000f0" size="2">.</font><font size="2">LOAN_NBR
       </font><font color="#0000f0" size="2">AND</font><font size="2"> L2</font><font color="#0000f0" size="2">.</font><font size="2">LOANSTATUS_CD</font><font color="#0000f0" size="2">=</font><font color="#ff0000" size="2">'500'

    </font>
    <font face="Arial">Operation</font> <font face="Arial">Object Name</font> <font face="Arial">Rows</font> <font face="Arial">Bytes</font> <font face="Arial">Cost</font>
    <font face="Arial" color="#0000ff">SELECT STATEMENT</font> <font face="Arial" color="#0000ff"> </font> <font face="Arial" color="#0000ff">2 K</font> <font face="Arial"></font> <font face="Arial" color="#0000ff">4344</font>
    <font face="Arial">_TABLE ACCESS BY INDEX ROWID</font> <font face="Arial">WHL_LOAN_TB</font> <font face="Arial">1</font> <font face="Arial">15</font> <font face="Arial">2</font>
    <font face="Arial">__NESTED LOOPS</font> <font face="Arial"> </font> <font face="Arial">2 K</font> <font face="Arial">60 K</font> <font face="Arial">4344</font>
    <font face="Arial">___VIEW</font> <font face="Arial">index$_join$_002</font> <font face="Arial">2 K</font> <font face="Arial">26 K</font> <font face="Arial">3857</font>
    <font face="Arial">____HASH JOIN</font> <font face="Arial"> </font> <font face="Arial">2 K</font> <font face="Arial">26 K</font> <font face="Arial"></font>
    <font face="Arial">_____INDEX FAST FULL SCAN</font> <font face="Arial">WHL_LOAN_IDX</font> <font face="Arial">2 K</font> <font face="Arial">26 K</font> <font face="Arial">515</font>
    <font face="Arial">_____INDEX FAST FULL SCAN</font> <font face="Arial">WHL_LOAN_IDX01</font> <font face="Arial">2 K</font> <font face="Arial">26 K</font> <font face="Arial">515</font>
    <font face="Arial">___INDEX RANGE SCAN</font> <font face="Arial">WHL_LOAN_IDX</font> <font face="Arial">1</font> <font face="Arial"></font> <font face="Arial">2</font>

    Even in this trivial example is harder to read and understand. Now immagine a real query. Now you know why you need expert oracle consultants to maintain and develop crap like this.



  • I know Oracle is not so smart, but what if you update statistics?



  • Uh, yeah - kinda hard to blame that one on Oracle, especially when that's the best solution you could come up with.

    - Chris



  • "4344, which is 2464% faster"

    Um, what does x% faster mean?  I take it to mean that something occurs in x% less time.  Clearly this isn't the case.  Although "x% faster" may not actually have a strict definition.  Much clearer would be to say "This shows a cost of 4344, which runs in 4% of the time as the full table scan". 

    This reminds me of a SyncSort ad back in the mainframe days, in the Computerworld newspaper, which quoted a mainframe sysadmin saying "Now that we use Syncsort, we get our daily workload done in 240% less time than before".  Ha!  I'm sure that the Syncsort company employs some mathematicians, but obviously they don't check the ad copy.



  • Um, not to pick on this too much (but after all it's a WTF), but I just realized that your division was reversed anyway; the long one takes 24.64 times as long as the short one.   That doesn't translate to the short one being 2464% faster.



  • This should not be happening. Yes, you have to run statistics occasionally so that the oracle optimiser knows what indexes to use.



  • So.... you started this hater's club because you have no idea what
    you're doing and instead of actually doing something about it, you're
    BRAGGING about it?  You sad, sad person.



    That second query is the silliest thing I have ever seen. It didn't even take me, the resident troll, to point this out -- the
    other posters second it.  This is an error between your ears, not
    Oracle.



    1.  Gather statistics, bubba.    Though even
    without stats, the rule based optimizer would know to grab the index
    for that first query, assuming it's turned on in your DB.  If
    you're forcing use of the cost based optimizeer, that would explain the
    result.



    2.  COST has nothing to do with execution time, it's an arbitrary metric, not very comparible between queries.




  • @Alex Papadimoulis said:

    But that's not how Oracle works. If
    you were to sit Oracle down in the library, and ask it to find the same
    (non-indexed) info from all books by Garrison Keillor, it would start
    at the first book, look at the author, write down the information
    if it is the right author, and then move on to the next book, and to
    the next, all the way through all the library. It's to supid to realize
    that it can write down the Dewey numbers, then go to the book shelf to
    find the non-indexed data.





    It all depends on your data and whether you have collected statistics.
    If all or most of your books have the same status, then there's little
    point going to the index first and then 'going to the shelf' - you
    might as well look at everything on the shelf and weed out the small
    number of irrelevant books as you look at them.



    The 'cost' is purely a method of comparing possible executions of the
    'SAME' query - a cost of 4000 means nothing in relation to a cost of 2
    if the underlying queries are different.



  • @Alex Papadimoulis said:

    The loan number and status columns are indexed

     

    One question for the original author, is this a single composite index containing both columns, or is it two seperate indices, each containing one of the columns?

     



  • The reason I asked the above (he said, noticing that the thread was two years old [:S]) was that if the index was a composite, I would expect exactly the described behaviour (analysed statistics or not).

     

    With a composite index on both LOAN_NBR (I assume the primary key) and LOANSTATUS_CD, the first query for LOANSTATUS_CD = '500' will not use this index, because it's not helpful, Oracle isn't being asked to access rows using both fields so won't make use of it.


    With the second query, where the L1 is being joined to L2 on LOAN_NBR and L2 has a clause on LOANSTATUS_CD, Oracle is effectively clausing both fields and uses the index L2, then L1 can be joined on becuase of the Primary Key index of L1.LOAN_NBR.

     

    If there was an index on the table that referenced only LOANSTATUS_CD, Oracle should use this first time and the query would return in the "best" time.

     

    PS: I hate Oracle too, but am made to work with it for a project, tweaking indices for optimal execution plans is a daily job [:(]



  • If Oracle does a FTS although the index exists and it estimates higher costs for the FTS than for an index scan, something is fundamentally broken. ANALYZE TABLE probably won't help, though you may give it a try.



  • because it's not helpful

    If you can write a query that returns the same set of results, uses the index, and is faster than the original query, then it is indeed helpful, whether Oracle thinks it is or not.



  • @DrPizza said:

    If you can write a query that returns the same set of results, uses the index, and is faster than the original query, then it is indeed helpful, whether Oracle thinks it is or not.



    Not necessarily.  A lot of factors go into developing robust query execution plans such as number of rows, selectivity, etc.  Many databases (including MS SQL) will do a full table scan even if there are indices when the number of rows are small (just a few thousand rows).  In these cases it is most likely more efficient to just scan the table and disregard the unmatched records than it is to read the index file and then read the main table based on those pointers.

    I'm not sure if Oracle behaves the same way as SQL Server in this regard.  If it does and the Rows column in the execution plan accurately describes the number of records, that could be why its performing a full scan.  Maybe we need to fork over $50k to get 5 minutes time from an Oracle expert and find out.

    Larry



  • @lpope187 said:

    @DrPizza said:

    If you can write a query that returns the same set of results, uses the index, and is faster than the original query, then it is indeed helpful, whether Oracle thinks it is or not.


    ...
    Maybe we need to fork over $50k to get 5 minutes time from an Oracle expert and find out.

    Larry

    Speaking as a former $50k/5minute Oracle expert, believe me that you would be better off spending $50 and a few days reading a book, especially one such as
    http://www.amazon.com/gp/product/1590595300/sr=8-1/qid=1149654223/ref=pd_bbs_1/002-3824938-9493625?_encoding=UTF8

    I think the hostility in these threads is based on a misperception: that a DBMS ought to be easy to understand. DBMSes are the most complex system software in the world, and Oracle is by far the "biggest". You're not going to understand what the CBO, for example, is doing just by looking at it - there is a lot going on under the covers. Fortunately, knowing Oracle internals is not required if you don't insist on doing stupid stuff. If you do some reading, you can create a reasonably Oracle-friendly database and get Oracle to mostly manage itself. Most of the anti-Oracle examples I've seen here, other than ones taken out of context, are not following good practices. They demonstrate nothing about Oracle.

    <parable>
    How is this for an analogy? Let's say you're a Java programmer and you've just been hired to write a utility in a language you've never used, say, C. You sit down and start coding, and a day later you're ready to compile. WTF, 753 compile errors, this language sucks! After a while you hit on the syntax for most of the problems, but you see an egregious pattern in the remaining errors - the damn compiler doesn't know what try and catch are! You decide to teach C a lesson, and you write an exception handling module that pretty much works like Java. Meanwhile you are posting to your favorite hacker blogs about the horrible new language that doesn't work right.

    Unfortunately, you spent 3 months with nothing measurable to show for it, so you end up posting to ijustgotfired.com. A new Java programmer takes over, reads a book, uses online resources, figures out the domain of the language, and is done in a month.
    </parable>

    So, to use Oracle properly, first figure out what Oracle is for. Don't try to make Oracle into Java or VB or MySQL or whatever, it is like running headfirst into a wall. Might dent the wall a little, but most definitely is not good for your head.



  • @lpope187 said:

    @DrPizza said:

    If you can write a query that returns the same set of results, uses the index, and is faster than the original query, then it is indeed helpful, whether Oracle thinks it is or not.



    Not necessarily.

    Yes necessarily.

    A lot of factors go into developing robust query execution plans such as number of rows, selectivity, etc.

    I think you miss the point.

    We have two queries here.  One which uses the index.  One which does not.  The query which uses the index is fast.  The query which does not is slow.  That means that the index is helpful.  Oracle is wrong in its evaluation.  This is not a hypothetical; we know that the index is genuinely helpful in the query, because when you use the index, the query goes faster.

     



  • @RyuO said:



    I think the hostility in these threads is based on a misperception:
    that a DBMS ought to be easy to understand. DBMSes are the most complex
    system software in the world, and Oracle is by far the "biggest".
    You're not going to understand what the CBO, for example, is doing just
    by looking at it - there is a lot going on under the covers.
    Fortunately, knowing Oracle internals is not required if you don't
    insist on doing stupid stuff. If you do some reading, you can create a
    reasonably Oracle-friendly database and get Oracle to mostly manage
    itself. Most of the anti-Oracle examples I've seen here, other than
    ones taken out of context, are not following good practices. They
    demonstrate nothing about Oracle.


    Never said that a database should be easy to understand.  I'm not a DBA, nor do I really want to be one.  My career is heading into businees process, not becoming a technical expert.  I just need to know from a high-level what is possible.  I'd rather have the experts setup and maintain the system. 

    You want my honest opinion, all database suck in there own special way.  Doesn't matter whether its Oracle, MS SQL, DB2, etc.  Oracle does get a lot of bad mouthing - some of it deserved, some of it not.  Many badmouth it just because its top dog (or at least was, I don't know what market share is like today).  Oracle definitely has its places, but I've seen it used in systems just slightly more complicated than a CD collection database which is a WTF itself. 

    As far as the high price of Oracle consultants, they are definitely not the only ones that overcharge in my opinion.  Almost every major ERP company charges similiarly.  I've seen quotes from SAP @ $2M just to implement a workorder module we already own that's going to be used for some 30 people. 

    We bitch and complain about software and companies all the time and that is a very good thing.  If we as consumers don't express our thoughts about quality, pricing, etc than the companies won't change for the better.  If you don't like it, that's your problem not mine.

    Larry




  • @DrPizza said:

    I think you miss the point.

    We have two queries here.  One which uses the index.  One which does not.  The query which uses the index is fast.  The query which does not is slow.  That means that the index is helpful.  Oracle is wrong in its evaluation.  This is not a hypothetical; we know that the index is genuinely helpful in the query, because when you use the index, the query goes faster.



    I concede that.  I missed your "and is faster" qualifier.

    Larry

Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.