Re: RM and abstract syntax trees

From: David BL <davidbl_at_iinet.net.au>
Date: Mon, 12 Nov 2007 22:36:24 -0800
Message-ID: <1194935784.793624.325570_at_k35g2000prh.googlegroups.com>


On Nov 13, 1:54 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> On Nov 12, 7:10 pm, David BL <davi..._at_iinet.net.au> wrote:
>
> > On Nov 10, 12:14 pm, Marshall <marshall.spi..._at_gmail.com> wrote:
> > > On Nov 9, 4:34 pm, David BL <davi..._at_iinet.net.au> wrote:
> > > > On Nov 10, 5:29 am, Marshall <marshall.spi..._at_gmail.com> wrote:
>
> > > > > The issue is that pointers need referencing and dereferencing
> > > > > operators which are not part of the relational algebra. Pointers
> > > > > have an associated address space. Pointers are volatile and
> > > > > anchored to a specific run of a specific program on a specific
> > > > > machine, whereas relational ids are durable. Pointers are
> > > > > physical and ids are logical.
>
> > > > You appear to draw conclusions from your assumption that the RDB is
> > > > durable whereas the machine process is not. What happens if you
> > > > delete the RDB? To what extent is a bank account identifier
> > > > meaningful without the associated DB?
>
> > > > What about pointers between objects in a POS (Persistent Object
> > > > Store)? Are you saying they aren't really pointers because the
> > > > address space is durable?
>
> > > > I would rather say that the pointer concept is orthogonal to volatile
> > > > versus durable concerns.
>
> > > You misunderstand me.
>
> > > I'll take you up on the object graph example. Suppose you
> > > have an object graph on one machine, with edges encoded
> > > as pointers. Suppose you serialize that object graph, send it
> > > to another machine and deserialize it there. The pointer values
> > > will all necessarily be different.
>
> > Not necessarily. Under restricted circumstances it is quite
> > reasonable to send objects between two compatible Von Neumann machines
> > on the assumption that they be written to the same memory locations.
> > Core dumps do this sort of thing.
>
> What is the larger context in which you are still arguing
> with me? What is the point you are trying to prove? I have
> already made it quite clear that I see similarities between
> pointers and foreign keys. Is there more? Because I don't
> understand why this conversation keeps going on.

I had missed your post on the 10th, and when I came across it today I disagreed with it, so I responded. Sorry! I actually find the matter quite boring. You are arguing as well, so don't call the kettle black! I could just as easily ask you what you are trying to prove.

Note by the way that I'm only thinking of the case of node identifiers in an RM representation of an AST. Usually keys do not represent pointer values.

> Your core dump example is inapplicable. Yes,
> if you preserve *the entire address space* then pointer
> values will still be valid. Which says nothing about
> serializing object graphs, which is what *I* was
> talking about.

Serializing object graphs (as I interpret it) encompasses the special case of not translating pointer values.

I thought your argument was this:

    Sending object graphs between processes (always) translates pointer values.
    Sending relations between RDBs (never) translates key values.
    Therefore there is a fundamental difference.

Sorry to be blunt, but there is a flaw in that logic: neither premise holds in every case. Perhaps I misunderstand your argument.
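
To make the "never translates key values" premise concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the function name `import_graph`, the representation of a graph as a dict of nodes plus a set of edges, and the use of small integers as node ids are all assumptions, not anything from an actual RDBMS. It shows that whether node ids survive a transfer unchanged depends on the receiving side, not on the fact that they are keys.

```python
from itertools import count

def import_graph(nodes, edges, taken_ids):
    """Import a graph into a store where the ids in taken_ids are in use.

    nodes: dict mapping node id -> label (the "nodes" relation)
    edges: set of (from_id, to_id) pairs (the "edges" relation)
    Ids that clash with taken_ids are translated; all others are kept.
    """
    # Candidate replacement ids: unused by the store and by the incoming graph.
    fresh = (i for i in count() if i not in taken_ids and i not in nodes)
    mapping = {n: (next(fresh) if n in taken_ids else n) for n in nodes}
    new_nodes = {mapping[n]: label for n, label in nodes.items()}
    new_edges = {(mapping[a], mapping[b]) for a, b in edges}
    return new_nodes, new_edges, mapping

# A tiny AST for "x + y": node 1 is the operator, 2 and 3 its operands.
nodes = {1: "+", 2: "x", 3: "y"}
edges = {(1, 2), (1, 3)}

# Empty target: the key values arrive untranslated.
_, _, same = import_graph(nodes, edges, taken_ids=set())

# Target already uses ids 1 and 2: those keys are translated on the way in.
_, moved_edges, moved = import_graph(nodes, edges, taken_ids={1, 2})
```

In the first call the mapping is the identity, which corresponds to sending pointers to the same memory locations; in the second, two ids are reallocated, which corresponds to pointer fix-up on deserialization.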

> > > Suppose you take the same graph as a pair of relations for
> > > nodes and edges, with node ids. Now consider the
> > > serialize/deserialize scenario again. The key values
> > > are all the same.
>
> > Not necessarily. It is possible (and would indeed sometimes be
> > important) to reallocate node ids when sending a graph from one RDB to
> > another, to avoid clashes.
>
> If you want to merge two databases with different semantics,
> you'll have to remap some of the values in the database. You
> might have to do this with all kinds of values, not just keys.
> The remapping of keys is just a particular (and not in any
> way special) case of the fact that you have to have a new
> unified semantics for the merged databases, and remap
> the old ones into it. (Or you might keep the semantics of
> one of them and just remap the other.) If you are not
> merging two databases, then the remapping you bring
> up is inapplicable. If you *are* merging databases,
> then you have to do this remapping whether or not
> you are serializing the database as well. So this point
> says nothing about serializing databases.

> > > The point is, pointer values are transient, ephemeral,
> > > temporary, living only as long as the *process* they
> > > are embedded in. Whereas the key values in the
> > > relations are a first class part of the data, and so
> > > live as long as the data does.
>
> > The important thing is that pointer values are only meaningful in the
> > address of the process, and node identifiers are only meaningful in
> > the "address space" of the RDB. You argue that pointer values are
> > transient, ephemeral, temporary, but that is only because processes
> > tend to be transient, ephemeral, temporary.
>
> > What would you say if those relations are declared on the stack frame
> > within the implementation of a function? Now those relations seem
> > rather transient, ephemeral and temporary.
>
> Irrelevant to my point, a fact which I explained here:
>
> > > The point is not the lifespan of the data; the point
> > > is the lifespan of the pointer values relative to the
> > > lifespan of the data. With keys, it's the same; with
> > > pointers, it's not.
>
> > What does "lifespan of the data" mean?
>
> Try it this way:
>
> "The point is not the lifespan of the database; the point
> is the lifespan of the pointer values relative to the
> lifespan of the database. With keys, it's the same; with
> pointers, it's not."

Who says node identifiers have to have the same lifespan as the database? The only rule is that the lifetime of a given node identifier is tied to the lifetime of the node it identifies. This is analogous to nodes addressed using memory pointers.
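
A rough sketch of that rule, again in Python and again with hypothetical names (`NodeStore`, `reuse_ids`): an id becomes valid when its node is created and stops being valid when the node is deleted, independently of how long the store itself lives. Whether a freed id is handed out again is just an allocator policy.

```python
class NodeStore:
    """Toy node table: an id is valid exactly as long as its node exists."""

    def __init__(self, reuse_ids=False):
        self.reuse_ids = reuse_ids  # policy: may freed ids be handed out again?
        self.nodes = {}             # node id -> label
        self._free = []             # ids of deleted nodes
        self._next = 0              # next never-used id

    def create(self, label):
        if self.reuse_ids and self._free:
            node_id = self._free.pop()   # recycle, as a memory allocator would
        else:
            node_id = self._next         # allocate a never-used id
            self._next += 1
        self.nodes[node_id] = label
        return node_id

    def delete(self, node_id):
        del self.nodes[node_id]          # the id's lifetime ends with the node
        self._free.append(node_id)

recycling = NodeStore(reuse_ids=True)
a = recycling.create("x")
recycling.delete(a)
b = recycling.create("y")   # gets the same id that "x" had

monotonic = NodeStore(reuse_ids=False)
c = monotonic.create("x")
monotonic.delete(c)
d = monotonic.create("y")   # gets a new id; the old one is never reissued
```

With `reuse_ids=True` the store behaves like a heap that recycles addresses; with `reuse_ids=False` it behaves like a typical surrogate-key generator. In both cases the id of a deleted node refers to nothing.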

Physical pointers tend to be reused, because the physical space is reused by different objects over time, whereas reuse tends to be optional for logical pointers.

Received on Tue Nov 13 2007 - 07:36:24 CET
