Re: efficient compare

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Sun, 23 Apr 2006 00:43:16 GMT
Message-ID: <E4A2g.64441$VV4.1212219_at_ursa-nb00s0.nbnet.nb.ca>


Andersen wrote:

> Bob Badour wrote:
>

>> If B has a very outdated database, simply copy A over B. This assumes 
>> the data is sufficiently outdated that one can discard any changes to 
>> B that were never reflected in A. If B has not been around, simply 
>> copy A over B. Otherwise, have each send the other its logs from the 
>> last sync forward.

>
>
> That is the whole problem. I do not know how up to date A and B are, and
> I want to efficiently find that out so that I can send as little data
> over as possible. Logs might be huge, how do I know which part of the
> log to send over, I do not want to send all of it, unless it is needed!
> Your example with sending the size of the log over is in line with what
> I am looking for. But I want something much more efficient.
>
> Here is a naive scheme I was thinking about:
> Send a checksum the sorted set A to c(B), and vice versa.
> If the checksums match, you are done.
> If not, split the set into two parts, take the checksum of each of the 2
> parts and let the nodes exchange. For those that it does not match,
> repeat the divide and conquer scheme until you know which parts to
> exchange. This would mean you would be done in worst case log(N) rounds.

I don't see how you are going to get log(N).

Consider A & B are equal each with 65535 tuples. Insert a tuple into A at the beginning of the sort order. Insert a tuple into B at the end of the sort order.

None of your subdivided sets are going to give matching checksums except in error until you get down to individual tuples. Received on Sun Apr 23 2006 - 02:43:16 CEST

Original text of this message