RE: Bloom Filter Partition Pruning

From: Jaromir D.B.Nemec
Date: Fri, 16 Mar 2018 17:07:53 +0100
Message-ID: <16fa01d3bd40$f2897e50$d79c7af0$>

Hi Toon,    

OK, understood. So in theory a false positive can happen (a partition is scanned although it contains no matching key), but given the practical number of partitions this will be very rare.
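A rough way to put a number on "very rare" is the standard Bloom filter estimate. The sketch below is only an illustration: the filter size and the number of hash functions Oracle actually uses are not stated in this thread, so `filter_bits` and `hashes` are assumed values.

```python
def false_positive_rate(partitions_set: int, filter_bits: int, hashes: int = 1) -> float:
    """Standard Bloom filter estimate of the false-positive probability.

    partitions_set: distinct partition-ids inserted into the filter
    filter_bits:    size of the filter in bits (assumed, not Oracle's real value)
    hashes:         hash functions per inserted value (assumed)
    """
    # Probability that one particular bit is still 0 after all insertions.
    prob_bit_still_zero = (1.0 - 1.0 / filter_bits) ** (hashes * partitions_set)
    # A false positive needs every probed bit to be 1.
    return (1.0 - prob_bit_still_zero) ** hashes


if __name__ == "__main__":
    # Hypothetical numbers: 10 needed partitions, 1024-bit filter.
    print(false_positive_rate(10, 1024))
```

Because the filter holds partition-ids rather than column values, the number of inserted items is bounded by the partition count, which keeps the estimate small for any reasonably sized filter.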

Kind Regards,  

Jaromir Nemec


Tel +436764039288  

From: Toon Koppelaars
Sent: Friday, 16 March 2018 11:10
To: jaromir nemec
Subject: Re: Bloom Filter Partition Pruning

I cannot find this in the docs, but it is how it works.

This of course requires that the join-column is driving the partition-id in the big table.

If that's not the case, you'll never get BF partition pruning.  

On Fri, Mar 16, 2018 at 11:00 AM, jaromir nemec wrote:

Hello Toon,

thanks very much for the explanation, it makes total sense. The only additional question I have: is this behavior documented somewhere, or is it the *simplest possible explanation* that conforms with the observations?



> Normal BF usage hashes the column-values and sets bits based on these
> hashes in the BF.
> BF partition pruning usage of BF works differently:
> - The column-values (of the smaller table) are first fed into the
> function that produces the partition-id of the partition into which this
> column-value would have been stored in the bigger table.
> - It then hashes this partition-id and uses this hash to set bits in the
> BF.
> Then upon scanning the big table:
> - Before it starts scanning a partition, it hashes the partition-id and
> checks whether the bit is set in the BF
> - If set: continue scanning the partition.
> - If not set: skip this partition.
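The steps described above can be sketched as follows. This is a toy model, not Oracle's implementation: `partition_id` (a simple modulo), the SHA-256 based `bit_position`, `NUM_PARTITIONS` and `FILTER_BITS` are all hypothetical stand-ins chosen for illustration.

```python
import hashlib

NUM_PARTITIONS = 16  # assumption: big table is hash-partitioned 16 ways
FILTER_BITS = 64     # assumption: deliberately small filter


def partition_id(key: int) -> int:
    """Hypothetical partitioning function of the big table."""
    return key % NUM_PARTITIONS


def bit_position(pid: int) -> int:
    """Hash a partition-id to a bit position in the filter."""
    digest = hashlib.sha256(str(pid).encode()).digest()
    return int.from_bytes(digest[:8], "big") % FILTER_BITS


def build_filter(small_table_keys) -> int:
    """Build phase: map each join key to its partition-id, then set that id's bit."""
    bits = 0
    for key in small_table_keys:
        bits |= 1 << bit_position(partition_id(key))
    return bits


def partitions_to_scan(bits: int) -> list:
    """Probe phase: keep a partition only if its bit is set in the filter."""
    return [
        pid
        for pid in range(NUM_PARTITIONS)
        if (bits >> bit_position(pid)) & 1
    ]


if __name__ == "__main__":
    bf = build_filter([3, 19, 7])  # keys fall into partitions 3, 3 and 7
    print(partitions_to_scan(bf))  # always contains 3 and 7; extras are false positives
```

A partition that holds matching keys is never skipped, since its own bit was set during the build. A partition without matching keys is scanned only if its partition-id happens to hash to a bit that another partition-id already set.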

Received on Fri Mar 16 2018 - 17:07:53 CET
