Re: efficient compare

From: Frank Hamersley <>
Date: Sun, 23 Apr 2006 08:23:59 GMT
Message-ID: <zQG2g.13583$>

Bob Badour wrote:
> Andersen wrote:

>> Bob Badour wrote:
>>> I don't see how you are going to get log(N).
>> log(N) message exchanges! Not bit complexity (i.e. size of the messages).

> You are not going to get log(N) in messages either.
>>> Consider A & B are equal each with 65535 tuples. Insert a tuple into 
>>> A at the beginning of the sort order. Insert a tuple into B at the 
>>> end of the sort order.
>>> None of your subdivided sets are going to give matching checksums 
>>> except in error until you get down to individual tuples.
>> Here is a different solution to that problem. Keep N bins, where N is 
>> some very large constant. Hash every tuple to an integer between 1-N 
>> and put it in the corresponding bin. Now run the divide and conquer on 
>> this.
>> I.e, first send checksum of the value of all bins 

> But you said N is a large number which means you have to send a large
> number of checksums.

My view is aligned with Bob, but if you want to see a case where there has been some brain power expended on the problem then have a look at rsync.

Cheers, Frank. Received on Sun Apr 23 2006 - 10:23:59 CEST

Original text of this message