Re: Replication in databases

From: Christoph Rupp <cruppstahl_at_gmail.com>
Date: Fri, 17 Oct 2008 06:15:11 -0700 (PDT)
Message-ID: <b24e8358-4cf7-4c4e-863b-0ce0a4e24793_at_y71g2000hsa.googlegroups.com>


On Oct 17, 8:21 am, David BL <davi..._at_iinet.net.au> wrote:
> On Oct 17, 5:18 am, Christoph Rupp <cruppst..._at_gmail.com> wrote:
>
> > I'm implementing my own database, and therefore i was thinking about
> > the same question. (i have not yet implemented recovery, because
> > there's so much other stuff to do).
>
> > Currently my idea is to send the log over the net, because in my
> > design i do not log physical pages, instead i just log the modified
> > key/record items.
>
> I recall your post 6 months ago :)
>
> Are your log records idempotent?

btw - regarding idempotence - this is a challenge.

in case of physical logging, a simple insert usually modifies multiple pages (especially if a btree node is split). so the replication host has to send several database pages, which is a huge overhead. and if something goes wrong (e.g. a network problem or a system crash on the replication node), the remote database will be corrupt if only some of the modified pages were applied.
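
A minimal sketch of that failure mode, assuming a toy model in which a "page" is just a dictionary entry. The page names, the split layout and the `apply_pages` helper are hypothetical and only illustrate the point, they are not taken from any real database.

```python
# Hypothetical model: the replica maps page_id -> page content.
# The root page lists its leaf pages; leaf pages list their keys.

def apply_pages(replica, page_images, fail_after=None):
    """Copy page images to the replica one by one.

    fail_after simulates a lost connection or a replica crash after
    that many pages have been applied.
    """
    for n, (page_id, image) in enumerate(page_images):
        if fail_after is not None and n >= fail_after:
            return False  # transfer interrupted, remaining pages are lost
        replica[page_id] = list(image)
    return True

# Replica state before the insert: one full leaf under the root.
replica = {"root": ["leaf1"], "leaf1": [10, 20, 30, 40]}

# Inserting key 25 on the primary splits leaf1, so three pages change.
split_update = [
    ("leaf1", [10, 20]),
    ("leaf2", [25, 30, 40]),
    ("root", ["leaf1", "leaf2"]),
]

# Only the first page arrives before the failure.
ok = apply_pages(replica, split_update, fail_after=1)

# Walk the tree from the root: keys 30 and 40 are no longer reachable.
reachable = [key for leaf in replica["root"] for key in replica[leaf]]
print(ok, reachable)
```

In this model the replica ends up with a truncated leaf and no pointer to the new one, which is the kind of corruption described above.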

in my design i avoid that problem because my index trees only have atomic operations. so if e.g. an insert operation is sent over the network, i know that it's either applied or not. there's no recovery needed, and my log records do not need to be idempotent.
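
A minimal sketch of this idea, assuming a replica that applies one logical key/record operation at a time. `LogRecord` and `Replica` are hypothetical names, and a plain dictionary stands in for the index tree, whose real structure is not described in this post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LogRecord:
    """One logical log entry: a key/record operation, not a page image."""
    op: str            # "insert" or "erase"
    key: bytes
    record: bytes = b""

class Replica:
    def __init__(self):
        self.items = {}    # stand-in for the index tree
        self.applied = 0   # number of log records applied so far

    def apply(self, rec):
        """Apply one record completely, or raise and change nothing."""
        if rec.op == "insert":
            if rec.key in self.items:
                raise KeyError("duplicate key")
            self.items[rec.key] = rec.record  # the whole change is one step
        elif rec.op == "erase":
            del self.items[rec.key]  # raises KeyError if the key is absent
        else:
            raise ValueError("unknown operation: " + rec.op)
        self.applied += 1

replica = Replica()
replica.apply(LogRecord("insert", b"k1", b"v1"))
replica.apply(LogRecord("insert", b"k2", b"v2"))
# If the connection drops here, the replica simply holds a consistent
# prefix of the log: two complete inserts and nothing half-applied.

# These records are deliberately not idempotent: applying the same
# insert a second time is rejected and leaves the state untouched.
try:
    replica.apply(LogRecord("insert", b"k1", b"v1"))
    rejected = False
except KeyError:
    rejected = True
```

Because every record is all-or-nothing, the replica never needs to repair a partially applied operation. The sketch does not address how the sender learns which records arrived.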

this atomic design of my index tree may come at the cost of performance (compared to a btree, as i implemented it for hamsterdb), but i think i will be able to avoid performance problems by moving all index write operations to a background thread.
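
One way such a background writer could look, as a rough sketch only: callers enqueue write operations and a single worker thread applies them in order. `BackgroundIndexWriter` is a hypothetical name, a dictionary again stands in for the index tree, and this is not hamsterdb code.

```python
import queue
import threading

class BackgroundIndexWriter:
    """Queue index writes and apply them on one background thread."""

    def __init__(self):
        self.index = {}             # stand-in for the index tree
        self._ops = queue.Queue()   # pending write operations, FIFO
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def insert(self, key, record):
        # Returns immediately; the write happens on the worker thread.
        self._ops.put(("insert", key, record))

    def erase(self, key):
        self._ops.put(("erase", key, None))

    def _run(self):
        while True:
            item = self._ops.get()
            try:
                if item is None:    # shutdown marker
                    return
                op, key, record = item
                if op == "insert":
                    self.index[key] = record
                elif op == "erase":
                    self.index.pop(key, None)
            finally:
                self._ops.task_done()

    def flush(self):
        """Block until every queued operation has been applied."""
        self._ops.join()

    def close(self):
        self._ops.put(None)
        self._worker.join()

writer = BackgroundIndexWriter()
writer.insert(b"a", b"1")
writer.insert(b"b", b"2")
writer.erase(b"a")
writer.flush()
snapshot = dict(writer.index)
writer.close()
```

A single worker keeps the writes in log order, so the caller only pays for the enqueue. Readers would have to call `flush()` or tolerate slightly stale data, a trade-off this sketch leaves open.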

now i just need about 12 months of vacation to implement all this :)