Re: How to flush data most efficiently from memory to disk when db checkpoint?

From: Anne & Lynn Wheeler <lynn_at_garlic.com>
Date: Sun, 01 Jul 2007 11:18:02 -0600
Message-ID: <m3bqewgi0l.fsf_at_garlic.com>


Sune <sune_ahlgren_at_hotmail.com> writes:
> I'm looking into designing an in-memory DB and I wonder:
>
> How to flush data most efficiently when I checkpoint?
>
> Say I have a page size of 8K and 1K of those have been updated in
> random places, that is, the changes may be contiguous but most likely
> they are not.
>
> Will it always be more efficient to flush the whole page instead of
> keeping track of each element and write them to disk one by one?
> Obviously, if I did this I would flush them from page offset 0 to the
> end of the page, in that order.
>
> Sorry to bother you with such elementary questions but I want to get
> things right from the beginning, and other people's experiences are
> usually very helpful.

some of this can be related to transactional memory ... there have been various past threads in comp.arch about both software & hardware transactional memory.

early 801/risc (late 70s, early 80s) had support for hardware transactional memory ... it was used for journaled filesystem (JFS) in aixv3 on RIOS (i.e. power, rs/6000) ... basically all the (unix) filesystem metadata was laid out in memory area defined for transactional memory. wiki reference
http://en.wikipedia.org/wiki/JFS_file_system

there are granularity trade-offs regarding having an explicit log API ... and having explicit references to all changes or having to scan for all the actual changes. when palo alto started looking at porting jfs to platforms w/o transactional memory ... they found that they actually had better performance with the explicit log calls ... even compared to retrofitting to aixv3 running on rs/6000.
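
a minimal sketch of that trade-off for a single 8K page, in python ... this is purely illustrative and is not how jfs or the 801 hardware did it. the names (LoggedPage, scan_for_changes) are hypothetical. one path records each update through an explicit log call; the other keeps no record and has to compare the page against its earlier image at checkpoint time to find what changed.

```python
PAGE_SIZE = 8192  # page size from the question; an assumption of this sketch


class LoggedPage:
    """Page whose writer makes an explicit log call on every update."""

    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        self.log = []  # (offset, new_bytes) records, in update order

    def write(self, offset, payload):
        self.data[offset:offset + len(payload)] = payload
        self.log.append((offset, bytes(payload)))  # the explicit log call


def scan_for_changes(before, after):
    """Find changed byte ranges by comparing a page with its earlier image."""
    ranges = []
    start = None
    for i in range(len(after)):
        if before[i] != after[i]:
            if start is None:
                start = i  # a changed run begins here
        elif start is not None:
            ranges.append((start, bytes(after[start:i])))
            start = None
    if start is not None:  # changed run reaches the end of the page
        ranges.append((start, bytes(after[start:])))
    return ranges


page = LoggedPage()
snapshot = bytes(page.data)  # image as of the previous checkpoint
page.write(100, b"abc")
page.write(4000, b"xyz")

logged = sorted(page.log)                        # cost paid per update
scanned = scan_for_changes(snapshot, page.data)  # cost paid per checkpoint
```

the logged path pays a small cost on every write and knows exactly what to flush; the scan path pays nothing per write but has to touch the whole page (and keep an earlier image) at checkpoint. which one wins depends on update rate and page size, so it is worth measuring rather than assuming.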

references to software transactional memory http://en.wikipedia.org/wiki/Software_transactional_memory

part of the transactional memory tends to also get tied up with parallelism and concurrency models

comp.arch thread (from google groups)
http://groups.google.com/group/comp.arch/browse_thread/thread/5b0cb88a6d36b309/f5ad4a01cbed0a79?lnk=st&q=&rnum=12#f5ad4a01cbed0a79

intel article related to large number of cores http://www.intel.com/technology/magazine/computing/tera-scale-0606.htm

Received on Sun Jul 01 2007 - 19:18:02 CEST
