Re: RM and abstract syntax trees

From: David BL <davidbl_at_iinet.net.au>
Date: Mon, 12 Nov 2007 19:10:34 -0800
Message-ID: <1194923434.409670.11930_at_s15g2000prm.googlegroups.com>


On Nov 10, 12:14 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> On Nov 9, 4:34 pm, David BL <davi..._at_iinet.net.au> wrote:
> > On Nov 10, 5:29 am, Marshall <marshall.spi..._at_gmail.com> wrote:
>
> > > The issue is that pointers need referencing and dereferencing
> > > operators which are not part of the relational algebra. Pointers
> > > have an associated address space. Pointers are volatile and
> > > anchored to a specific run of a specific program on a specific
> > > machine, whereas relational ids are durable. Pointers are
> > > physical and ids are logical.
>
> > You appear to draw conclusions from your assumption that the RDB is
> > durable whereas the machine process is not. What happens if you
> > delete the RDB? To what extent is a bank account identifier
> > meaningful without the associated DB?
>
> > What about pointers between objects in a POS (Persistent Object
> > Store)? Are you saying they aren't really pointers because the
> > address space is durable?
>
> > I would rather say that the pointer concept is orthogonal to volatile
> > versus durable concerns.
>
> You misunderstand me.
>
> I'll take you up on the object graph example. Suppose you
> have an object graph on one machine, with edges encoded
> as pointers. Suppose you serialize that object graph, send it
> to another machine and deserialize it there. The pointer values
> will all necessarily be different.

Not necessarily. Under restricted circumstances it is quite reasonable to send objects between two compatible von Neumann machines on the assumption that they will be written to the same memory locations, in which case the pointer values survive unchanged. Core dumps do this sort of thing.

> Suppose you take the same graph as a pair of relations for
> nodes and edges, with node ids. Now consider the
> serialize/deserialize scenario again. The key values
> are all the same.

Not necessarily. It is possible (and sometimes important) to reallocate node ids when sending a graph from one RDB to another, to avoid clashes with ids already in use at the destination.
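
To make the id reallocation concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the nodes relation is modelled as a dict from integer id to label, the edges relation as a set of (from, to) pairs, and import_graph is a hypothetical helper, not any real RDB facility.

```python
def import_graph(target_nodes, target_edges, nodes, edges):
    """Copy a graph into the target relations, reallocating node ids.

    target_nodes / nodes: dict mapping integer node id -> label (assumed shape).
    target_edges / edges: set of (from_id, to_id) pairs (assumed shape).
    Returns the old-id -> new-id mapping that was applied.
    """
    # Start allocating above the largest id already used in the target.
    next_id = max(target_nodes, default=0) + 1

    # Give every incoming node a fresh id, in a deterministic order.
    remap = {}
    for old_id in sorted(nodes):
        remap[old_id] = next_id
        next_id += 1

    # Rewrite both relations through the mapping, then merge them in.
    target_nodes.update({remap[n]: label for n, label in nodes.items()})
    target_edges.update({(remap[a], remap[b]) for a, b in edges})
    return remap


# The destination already uses ids 1 and 2, so the incoming ids clash.
dest_nodes = {1: "x", 2: "y"}
dest_edges = {(1, 2)}
remap = import_graph(dest_nodes, dest_edges, {1: "a", 2: "b"}, {(1, 2), (2, 1)})
```

The graph structure is preserved, but the key values at the destination differ from those at the source, just as pointer values usually differ after deserialization.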

> The point is, pointer values are transient, ephemeral,
> temporary, living only as long as the *process* they
> are embedded in. Whereas the key values in the
> relations are a first class part of the data, and so
> live as long as the data does.

The important thing is that pointer values are only meaningful in the address space of the process, and node identifiers are only meaningful in the "address space" of the RDB. You argue that pointer values are transient, ephemeral, temporary, but that is only because processes tend to be transient, ephemeral, temporary.
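
A toy model of what I mean, sketched in Python. The AddressSpace class and all the names are hypothetical and stand in for both a process's memory and an RDB's key space; it only illustrates that an identifier needs its context to be dereferenced.

```python
class AddressSpace:
    """A context that gives identifiers their meaning (illustrative only)."""

    def __init__(self):
        self._cells = {}
        self._next = 1

    def allocate(self, value):
        # Hand out the next unused identifier and bind it to the value.
        ident = self._next
        self._next += 1
        self._cells[ident] = value
        return ident

    def deref(self, ident):
        # An identifier can only be resolved relative to this space.
        return self._cells[ident]


process_memory = AddressSpace()   # stands in for a process's address space
database = AddressSpace()         # stands in for an RDB's key space

ptr = process_memory.allocate("node A")
key = database.allocate("account 17")

# The raw identifier values happen to coincide, yet they denote different
# things, because each one is resolved against its own space.
same_raw_value = (ptr == key)
in_process = process_memory.deref(ptr)
in_database = database.deref(ptr)
```

In this model nothing about the identifier itself is volatile or durable; how long it stays meaningful is a property of the space it belongs to.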

What would you say if those relations were declared as local variables in the stack frame of a function? Now those relations seem rather transient, ephemeral and temporary.

> The point is not the lifespan of the data; the point
> is the lifespan of the pointer values relative to the
> lifespan of the data. With keys, it's the same; with
> pointers, it's not.

What does "lifespan of the data" mean? Are you saying data exists independently of the database? Isn't that merely a question for philosophers to argue over?

> And this was only one of a number of differences
> I cited. The two aren't the same.
Received on Tue Nov 13 2007 - 04:10:34 CET