Re: efficient compare

From: Christopher Browne <>
Date: Sun, 23 Apr 2006 08:12:22 -0400
Message-ID: <>

In the last exciting episode, Bob Badour <> wrote:
> Andersen wrote:
>> Bob Badour wrote:
>>> I don't see how you are going to get log(N).
>> log(N) message exchanges! Not bit complexity (i.e. size of the
>> messages).
> You are not going to get log(N) in messages either.
>>> Consider A & B are equal each with 65535 tuples. Insert a tuple
>>> into A at the beginning of the sort order. Insert a tuple into B at
>>> the end of the sort order.
>>> None of your subdivided sets are going to give matching checksums
>>> except in error until you get down to individual tuples.
>> Here is a different solution to that problem. Keep N bins, where N
>> is some very large constant. Hash every tuple to an integer between
>> 1-N and put it in the corresponding bin. Now run the divide and
>> conquer on this.
>> I.e, first send checksum of the value of all bins
> But you said N is a large number which means you have to send a large
> number of checksums.

I wouldn't think so.

I seem to recall Sybase having a "lock promotion" feature where they would basically conclude "Hey, you've locked so much of the table that we'll promote those page locks into a table lock. Yeah, it's a worse lock, but it's way cheaper to manage than 28,742 page locks, and any other processes trying to access this table were screwed anyways..."

This looks like a case to use a similar technique.

In effect, "Hey, there are so few matches that we might as well do a bulk copy of the whole table."

wm(X,Y):-write(X),write('_at_'),write(Y). wm('cbbrowne','').
"Sigh.  I like to  think it's just the Linux people who  want to be on
the `leading edge' so bad they walk right off the precipice."
-- Craig E. Groeschel
Received on Sun Apr 23 2006 - 14:12:22 CEST

Original text of this message