Re: Unbelievable Outer Join

From: Jan <noreply.jan_at_gmail.com>
Date: Thu, 13 Aug 2009 06:34:17 -0700 (PDT)
Message-ID: <20acefb0-ff3b-4551-85f9-313f7f33f8e3_at_j21g2000yqe.googlegroups.com>



On 13 srp, 14:57, Mark D Powell <Mark.Pow..._at_eds.com> wrote:
> On Aug 12, 1:44 pm, Charles Hooper <hooperc2..._at_yahoo.com> wrote:
>
>
>
>
>
> > On Aug 12, 12:10 pm, Jan <noreply...._at_gmail.com> wrote:
>
> > > Neither of the two tables is partitioned. The statistics are I believe
> > > accurate.
>
> > > The only way I can explain it is that it is a bug. But this is such a
> > > basic functionality that I can't believe it would not be fixed yet. It
> > > just seems too silly to be a bug.
>
> > > TT203_10100000001372631 is filtered during TABLE ACCESS FULL
> > > operation, but the way I undestand the plans is that 6798 rows will be
> > > left (moved into the next part which is the nested loop) after filter
> > > is applied. No filtering is applied during NESTED LOOPS OUTER
> > > operation, it's really just "6798 rows " left outer join "1 row" the
> > > way I understand it, which should result in  "6798 rows " not "1 row"
> > > as estimated.
>
> > This is typical behavior for a nested loop join.  If you create a
> > DBMS_XPLAN for the SQL statement showing actual execution statistics
> > (with the STATISTICS_LEVEL set to ALL at the session level or a /*+
> > GATHER_PLAN_STATISTICS */ hint), you will see a couple additional
> > columns in the plan - one of those columns is the STARTS column.  If
> > 6,798 is exactly the number of rows returned from the
> > TT203_10100000001372631 table, you will see the number 6,798 in the
> > starts column of the TABLE ACCESS BY INDEX ROWID | TI238_14935572 line
> > in the plan.
>
> > For instance, a hash join for a query produces a DBMS_XPLAN output
> > like this (broken up to avoid word wrap problems):
> > Plan hash value: 2205292220
> > ---------------------------------------------------------------------------
> > | Id  | Operation                     | Name   | Starts | E-Rows | A-
> > Rows |
> > ---------------------------------------------------------------------------
> > |   0 | SELECT STATEMENT              |        |      1 |        |
> > 10000 |
> > |*  1 |  FILTER                       |        |      1 |        |
> > 10000 |
> > |*  2 |   HASH JOIN                   |        |      1 |     99 |
> > 10000 |
> > |*  3 |    TABLE ACCESS FULL          | T3     |      1 |    100 |
> > 10000 |
> > |   4 |    TABLE ACCESS BY INDEX ROWID| T4     |      1 |    100 |
> > 10000 |
> > |*  5 |     INDEX RANGE SCAN          | IND_T4 |      1 |    100 |
> > 10000 |
> > ---------------------------------------------------------------------------
>
> > ------------------------------------------------------------
> > |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
> > ------------------------------------------------------------
> > |00:00:00.01 |     532 |    355 |       |       |          |
> > |00:00:00.01 |     532 |    355 |       |       |          |
> > |00:00:00.01 |     532 |    355 |  1517K|  1517K| 1842K (0)|
> > |00:00:00.01 |     158 |    155 |       |       |          |
> > |00:00:00.01 |     374 |    200 |       |       |          |
> > |00:00:00.01 |     121 |     32 |       |       |          |
> > ------------------------------------------------------------
>
> > Predicate Information (identified by operation id):
> > ---------------------------------------------------
> >    1 - filter(:1<=:2)
> >    2 - access("T3"."C1"="T4"."C1")
> >    3 - filter(("T3"."C1"<=:2 AND "T3"."C1">=:1))
> >    5 - access("T4"."C1">=:1 AND "T4"."C1"<=:2)
>
> > The nested loop version of the plan looks like this:
> > ---------------------------------------------------------------------------
> > | Id  | Operation                     | Name   | Starts | E-Rows | A-
> > Rows |
> > ---------------------------------------------------------------------------
> > |   0 | SELECT STATEMENT              |        |      1 |        |
> > 10000 |
> > |*  1 |  FILTER                       |        |      1 |        |
> > 10000 |
> > |   2 |   NESTED LOOPS                |        |      1 |        |
> > 10000 |
> > |   3 |    NESTED LOOPS               |        |      1 |     99 |
> > 10000 |
> > |*  4 |     TABLE ACCESS FULL         | T3     |      1 |    100 |
> > 10000 |
> > |*  5 |     INDEX RANGE SCAN          | IND_T4 |  10000 |      1 |
> > 10000 |
> > |   6 |    TABLE ACCESS BY INDEX ROWID| T4     |  10000 |      1 |
> > 10000 |
> > ---------------------------------------------------------------------------
>
> > ---------------------------------
> > |   A-Time   | Buffers | Reads  |
> > ---------------------------------
> > |00:00:00.06 |     876 |    355 |
> > |00:00:00.06 |     876 |    355 |
> > |00:00:00.06 |     876 |    355 |
> > |00:00:00.06 |     623 |    187 |
> > |00:00:00.06 |     257 |    155 |
> > |00:00:00.01 |     366 |     32 |
> > |00:00:00.01 |     253 |    168 |
> > --------------------------------
>
> > Predicate Information (identified by operation id):
> > ---------------------------------------------------
> >    1 - filter(:1<=:2)
> >    4 - filter(("T3"."C1"<=:2 AND "T3"."C1">=:1))
> >    5 - access("T3"."C1"="T4"."C1")
> >        filter(("T4"."C1"<=:2 AND "T4"."C1">=:1))
>
> > Charles Hooper
> > IT Manager/Oracle DBA
> > K&M Machine-Fabricating, Inc.
>
> What is most annoying about this is I had just had a discussion with a
> developer on the same issue not very long ago.
>
> -- Mark D Powell --– Skrýt citovaný text –
>
> – Zobrazit citovaný text –

Thanks for all the great feedback. It is most helpfull, you are the best. I am not really sure how my company would react if I put SQL publicly available and I don't think I am allowed to without a permission....

  • dynamic sampling is enforced. We deleted statistics on the table and have default setting for dynamic sampling to be used in such case. I checked this is the case and that it applies.
  • I can't trace the session, I can't simulate the problem (without significant effort)
  • the index used is the not the problem
  • FILTER (ID 6) is this this predicate: TI238_14935572."T203VALUE_OID" IS NULL. this column is not nullable. Ie. query returns all rows from tt203 that do not exist in TI238. Since TT203 is joined by primary key all rows are distinct, TI238 has just one row.
  • Thanks for checking the ANSI alternative Charles, it's good to hear it would not do any difference
  • Mark, Ruud, I also saw more SQL related problems due to using ANSI than the other way around. I use ANSI in my coding and that seems to be causing problems from time to time...

Friends, I think Jonathan is right. This probably will be a CBO bug. Once I deploy workaround I will try to simulate the problem and report SR to Oracle if I can. I can't imagine another reason for this to be so.

Regards,
 Jan Rod Received on Thu Aug 13 2009 - 08:34:17 CDT

Original text of this message