Re: RM and abstract syntax trees

From: David BL <>
Date: Sun, 11 Nov 2007 08:35:13 -0800
Message-ID: <>

On Nov 11, 1:08 am, "David Cressey" <> wrote:
> "David BL" <> wrote in message

> > The mathematician in me axiomatizes the concept of pointer in terms of
> > its abstract properties, which (ignoring null pointers) are
> > 1) the existence of an associated address space, which is just a
> > set of objects
> > 2) a bijection between objects in the address space and pointer
> > values
> > In one direction this bijection is "address of" and in the
> > other
> > it is "dereference"
> > 3) the ability to compare pointer values
> > This encompass physical address spaces, virtual address spaces, C++
> > smart pointers, persistent OIDs and pointer swizzling etc.
> > It also encompass node identifiers of an AST if we regard the DB as
> > defining an address space of AST nodes, and an appropriate select
> > query represents a dereference that logically binds to precisely one
> > node of the AST.
> > My perspective conflicts with your statement above.
> First, the term "pointer" originates in computer science, not mathematics.
> This is unlike the term "relation" which originates in mathematics and is
> transferred to the world of computer science, where it presumably retains
> all the properties that made it important from a mathematical perspective.

Yes the term "pointer" originates in computer science and its meaning is not at all precise. Wouldn't it be nice if the software development discipline was generally treated as a branch of mathematics, and there were many standardised axiomatic definitions to give it the proper foundations.

> Second, your analysis overlooks the difference between address based
> addressing and content based addressing. If you retrieve a row by
> specifying an OID to be matched, you are still engaging in content based
> addressing. If you specify a pointer, there is no law that says that the
> object pointed to is going to contain a copy of the pointer you used for
> access.
> In most pointer based systems, the pointer is not included in the contents,
> being seen as redundant to its location.
> The distinction between content based addressing and pointer based
> addressing is fundamental to the comparison of databases built on the
> relational model and databases built on the graph model.

This is an interesting and important distinction. I assume you are suggesting pointers by definition never use content based addressing. However I regard the following as a counter example...

In the literature on pointer swizzling, persistent OIDs are classified as being of two different types : physical or logical. This is exactly your distinction above with different terminology. A physical OID directly specifies the location of the object on disk. For example, in the commercial product ObjectStore, the low 32 bits of the OID represents a seek position within a "cluster". By contrast a logical OID uses an indexing data structure allowing for objects to be relocated. A Log Structured Store for example will typically use this latter approach. Despite such an important and useful distinction, it is only relevant to the underlying physical implementation of the Persistent Object Store (POS). It has no impact whatsoever on the fact that clients of a POS like to think of OIDs as persistent pointers between objects. Note further that this is not merely conceptual - these products actually provide explicit pointer syntax and semantics for the application programmer - perhaps even to the extent of using hardware support for page faulting when dereferencing virtual memory pointers to perform pointer swizzling.

> It is possible, by suitable abstraction and by poor design choices, to
> create a database that offers none of the advantages claimed for the
> relational model, while appearing to conform to the relational model
> superficially. Sad to say, thousands of such databases have been
> constructed over the years, and have led many uneducated database newbies
> to believe that the relational model doesn't really offer the advantages
> that its proponents cliam.

My programming background is image processing and systems programming, and I've rarely seen a need for the RM in my work. 18 months ago I could easily have been one of those newbies that doesn't realise that RM is fundamental to data management.

Now I think the RM is great, and have been studying it as a hobby. Received on Sun Nov 11 2007 - 17:35:13 CET

Original text of this message