Re: How does the FILTER operation work?

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Thu, 24 May 2018 15:51:30 +0000
Message-ID: <LO1P123MB09775A005EA349BDA9B2893AA56A0_at_LO1P123MB0977.GBRP123.PROD.OUTLOOK.COM>


I think the costing thing is a bug - or something that looks so much like a bug that it's hard to tell the difference.

I've modelled the query and data, and the cost of the tablescan WITHOUT the subquery happens to add an I/O component to the basic tablescan that is the number of blocks below the highwater mark of the LOB index. The query WITH the subquery adds 5% of that number to the basic I/O cost of the tablescan.

Historically the effect of a filter subquery on the cardinality estimate was 5%. So it looks as if the optimizer code for this query has decided that the tablescan will return 100% of the data, but costed the LOB predicate as if it only has to tested for 5% of the rows (i.e. as if the subquery runs first and does it old 5% thing).

Of course "lob is null" doesn't have to visit the lob segment at all, anyway - so there seem to be two bugs in the costing.

Question to OP - I am a little puzzled that your plan shows a FILTER subquery rather than an anti-join. Maybe it's related to data sizes, but do you have any optimizer parameters that are set to non-standard values ?  

Regards
Jonathan Lewis



From: Mark W. Farnham <mwf_at_rsiz.com>
Sent: 24 May 2018 16:13
To: Jonathan Lewis; 'Oracle Mailinglist' Subject: RE: How does the FILTER operation work?

Probably you want to do something like a with clause to create the list of row_ids where photo is NOT NULL, and then use that list in the rest of the query, avoiding multiple anythings on the blob. It might SEEM like two passes over the same thing, but it will work out to a single pass over the blobs and a smaller set with no blob references for the correlation.

I think. Draft quality response in a hurry. JL can set us both straight if I farbled it.

mwf

-----Original Message-----

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Jonathan Lewis
Sent: Thursday, May 24, 2018 8:54 AM
To: Oracle Mailinglist
Subject: Re: How does the FILTER operation work?

Not sure how I missed the line about photo.photo being a blob.

Hypothesis about the cost of the tablescan - and if it's right it's not self-consistent, but then again there are lots of places the optimizer isn't self-consistent.

With the subquery in place Oracle assumes the effect of the subquery is to reduce the number of times the blob is checked. Without the subquery the blob has to be checked every row. The impact of costing for access to the blob (is it declared as disable storage in row ?) may explain the difference.

Regards
Jonathan Lewis



From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on behalf of Paul Houghton <Paul.Houghton_at_uis.cam.ac.uk> Sent: 24 May 2018 13:06
To: Oracle Mailinglist
Subject: How does the FILTER operation work?

Oracle 12.1.0.2. Linux x86_64.

I was asked why the following SQL was running for so long (~6 minutes). I can see a number of problems with it, and suggested that a way to tune it would be only to select the required data. For my own interest I want to understand why the cost of the insert is so high compared to the cost of the table scans.

If I remove the "not exists" (see below), the access of the temp table disappears (as you would expect) and the insert cost goes right down, so it must be the filter that is the issue. However, I don't understand why it is so expensive. I can't find documentation about what it is doing. Surely it has to compare 21211 rows against the 1 row (Actually 0, but I understand the optimizer never says 0) returned by the not exists sub query. I would have thought this means that the cost should be more like the cost of accessing the data in the first place? Clearly the optimizer is correct as the query does take a long time to run.

Also, why is the cost of the full table scan of the photo table so different between the two queries?

The queries are below. I edited the table names, (There are two, photo and temp) and the id, hopefully consistently.

Photo.photo is a blob. Emplid and emplid_search are varchar2(11). Oprid is varchar2(30). TEMP has no indexes. Photo has an index on emplid.

Thanks in advance for your insights.

  INSERT INTO TEMP
  SELECT DISTINCT 'id',
    PHT.emplid
  FROM PHOTO PHT
  WHERE 1 =1
  AND PHT.PHOTO IS NOT NULL
  AND NOT EXISTS
    (SELECT 'x'
    FROM TEMP TMP
    WHERE PHT.EMPLID=TMP.EMPLID_SRCH
    AND TMP.OPRID = 'id'
    )
  ;



  | Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)|
Time     |


----------------------------------------------------------------------------
---------
| 0 | INSERT STATEMENT | | 21210 | 3334K| 5802K (2)| 00:03:47 | | 1 | LOAD TABLE CONVENTIONAL | TEMP | | | | | |* 2 | FILTER | | | | | | |* 3 | TABLE ACCESS FULL | PHOTO | 21211 | 3334K| 313 (1)| 00:00:01 | |* 4 | TABLE ACCESS FULL | TEMP | 1 | 17 | 380 (2)|
00:00:01 |

  Predicate Information (identified by operation id):


   2 - filter( NOT EXISTS (SELECT 0 FROM " TEMP" "TMP" WHERE

              "TMP"."EMPLID_SRCH"=:B1 AND "TMP"."OPRID"='id'))
   3 - filter("PHT"."PHOTO" IS NOT NULL)
   4 - filter("TMP"."EMPLID_SRCH"=:B1 AND "TMP"."OPRID"='id')

...

  INSERT INTO TEMP
  SELECT DISTINCT 'id',
    PHT.emplid
  FROM PHOTO PHT
  WHERE 1 =1
  AND PHT.UC_PBI_PHOTO IS NOT NULL;



  | Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)|
Time     |


----------------------------------------------------------------------------
---------
| 0 | INSERT STATEMENT | | 21211 | 3334K| 3659 (1)| 00:00:01 | | 1 | LOAD TABLE CONVENTIONAL | TEMP | | | | | |* 2 | TABLE ACCESS FULL | PHOTO | 21211 | 3334K| 3659 (1)|
00:00:01 |

  Predicate Information (identified by operation id):


     2 - filter("PHT"."PHOTO" IS NOT NULL)

--
http://www.freelists.org/webpage/oracle-l


--
http://www.freelists.org/webpage/oracle-l


--
http://www.freelists.org/webpage/oracle-l
Received on Thu May 24 2018 - 17:51:30 CEST

Original text of this message