Re: Bloom Filter Partition Pruning

From: Toon Koppelaars <toon_at_rulegen.com>
Date: Fri, 16 Mar 2018 08:17:46 +0100
Message-ID: <CAA9w=EsTpK_1JGoheG7Z+210eQvmgrOMCJua78D_qUV0fVjj3g_at_mail.gmail.com>



Normal BF usage hashes the column-values and sets bits based on these hashes in the BF.

BF partition pruning usage of BF works differently: - The column-values (of the smaller table) are first fed into the function that produces the partition-id of the partition into which this column-value would have been stored in the bigger table. - It then hashes this partition-id and uses this hash to set bits in the BF.

Then upon scanning the big table:
- Before it starts scanning a partition, it hashes the partition-id and checks whether the bit is set in the BF
- If set: continue scanning the partition. - If not set: skip this partition.

On Thu, Mar 15, 2018 at 11:35 PM, Jaromir D.B.Nemec <jaromir_at_db-nemec.com> wrote:

> Hi All,
>
> I have basic understanding of the Bloom filter and the mechanism of the
> filtering the keys
> and I was convinced, that the same is true for the Bloom filter partition
> pruning. But recently
> after investigating a query on a hash partitioned table I'm somehow
> confused
> and need some clarification.
>
> Let me illustrate it on a simple example
>
> TABLEA is hash partitioned on the join column
> TABLEB is a simple table
>
>
> The query ...
>
> SELECT A.* FROM TABLEA A
> WHERE ID IN (SELECT ID FROM TABLEB)
>
> ... leads to the following execution plan:
>
> ------------------------------------------------------------
> ----------------
> ---------------------------
> | Id | Operation | Name | Rows | Bytes | Cost
> (%CPU)|
> Time | Pstart| Pstop |
> ------------------------------------------------------------
> ----------------
> ---------------------------
> | 0 | SELECT STATEMENT | | 24242 | 1088K| 460
> (1)|
> 00:00:01 | | |
> |* 1 | HASH JOIN RIGHT SEMI | | 24242 | 1088K| 460
> (1)|
> 00:00:01 | | |
> | 2 | PART JOIN FILTER CREATE | :BF0000 | 2 | 44 | 3
> (0)|
> 00:00:01 | | |
> |* 3 | TABLE ACCESS FULL | TABLEB | 2 | 44 | 3
> (0)|
> 00:00:01 | | |
> | 4 | PARTITION HASH JOIN-FILTER| | 400K| 9375K| 456
> (1)|
> 00:00:01 |:BF0000|:BF0000|
> | 5 | TABLE ACCESS FULL | TABLEA | 400K| 9375K| 456
> (1)|
> 00:00:01 |:BF0000|:BF0000|
> ------------------------------------------------------------
> ----------------
> ---------------------------
>
>
> So in the operation 2 the BF is created after reading all keys from the
> TABLEB and this BF is used in operation
> 4 to prune the partitions of the TABLEA.
>
> This is exact the point of my doubt - this could work for LIST partitioning
> schema, where I can take all list partition keys
> (from the partition definition) and check them against the BF to see if a
> partition is relevant or not. But this will IMO never work for a HASH
> partitioning. Oracle will not scan a partition of a TABLEA and check all
> keys with the BF to see that no key matches and
> the partition can be pruned...
>
> So my question is, how is the Bloom filter partition pruning implemented
> for
> HASH and RANGE partitioning schema.
> Does Oracle additionally to BF pass a list of partitions? I see in 10128
> trace that the pruning works fine and I didn't encounter a false positive
> (i.e. not pruned) partition, so I guess there will be some simple solution,
> but a web search provided no explanation.
>
>
> Kind Regards,
>
> Jaromir D.B. Nemec
> http://www.db-nemec.com
>
>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

-- 
Toon Koppelaars
RuleGen BV
Toon.Koppelaars_at_RuleGen.com
www.RuleGen.com
TheHelsinkiDeclaration.blogspot.com

(co)Author: "Applied Mathematics for Database Professionals"
www.RuleGen.com/pls/apex/f?p=14265:13

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Mar 16 2018 - 08:17:46 CET

Original text of this message