Re: efficient compare

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Sun, 23 Apr 2006 13:17:15 GMT
Message-ID: <v7L2g.64578$VV4.1222955_at_ursa-nb00s0.nbnet.nb.ca>


Christopher Browne wrote:

> In the last exciting episode, Bob Badour <bbadour_at_pei.sympatico.ca> wrote:
>

>>Andersen wrote:
>>
>>
>>>Bob Badour wrote:
>>>
>>>
>>>>I don't see how you are going to get log(N).
>>>
>>>log(N) message exchanges! Not bit complexity (i.e. size of the
>>>messages).
>>
>>You are not going to get log(N) in messages either.
>>
>>
>>
>>>>Consider A & B are equal each with 65535 tuples. Insert a tuple
>>>>into A at the beginning of the sort order. Insert a tuple into B at
>>>>the end of the sort order.
>>>>
>>>>None of your subdivided sets are going to give matching checksums
>>>>except in error until you get down to individual tuples.
>>>
>>>Here is a different solution to that problem. Keep N bins, where N
>>>is some very large constant. Hash every tuple to an integer between
>>>1-N and put it in the corresponding bin. Now run the divide and
>>>conquer on this.
>>>I.e, first send checksum of the value of all bins
>>
>>But you said N is a large number which means you have to send a large
>>number of checksums.

>
> I wouldn't think so.
>
> I seem to recall Sybase having a "lock promotion" feature where they
> would basically conclude "Hey, you've locked so much of the table that
> we'll promote those page locks into a table lock. Yeah, it's a worse
> lock, but it's way cheaper to manage than 28,742 page locks, and any
> other processes trying to access this table were screwed anyways..."
>
> This looks like a case to use a similar technique.
>
> In effect, "Hey, there are so few matches that we might as well do a
> bulk copy of the whole table."

But if you consider the example I gave previously that involved a single insert into each table, swapping logs would be trivial.

Likewise, by comparing the size of the log to the size of the relation variable, one could choose to send the relvar instead of the logs. However, one would then have trouble reconciling multiple changes to similar tuples. Received on Sun Apr 23 2006 - 15:17:15 CEST

Original text of this message