Re: what are keys and surrogates?
Date: Tue, 8 Jan 2008 18:17:59 -0800 (PST)
Message-ID: <d1328096-476c-4beb-a17b-59c856341266_at_e6g2000prf.googlegroups.com>
On Jan 8, 9:26 am, JOG <j..._at_cs.nott.ac.uk> wrote:
> We have some bits of paper with numbers written on (in pencil). We are
> storing info about these bits of paper in a database using the schema:
> {paperID, Value}. The key, PaperID, is a unique database generated
> hidden surrogate.
>
> We have an enumeration:
> { (paperID:1, Value:X), (paperID:2, Value:Y), (paperID:3, Value:Z) }
>
> Someone comes to you the DB admin, with 3 bits of paper and says, ok
> the boss has changed the values on some of the bits of paper. What I
> have here is one bit of paper with an A on, one with a B and one
> with a Z. Please update the database accordingly.
Yes, in this example you can't afford to have hidden identifiers.
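To make the ambiguity concrete, here is a minimal Python sketch of the
quoted scenario. The names (stored, observed, candidate_updates) are my
own and purely illustrative, and it assumes the bits of paper carry
nothing but their values:

```python
from itertools import permutations

# Stored relation {paperID, Value}; paperID is the hidden surrogate.
stored = {1: "X", 2: "Y", 3: "Z"}

# What the DB admin is handed: values only, no identifiers on the paper.
observed = ["A", "B", "Z"]

def candidate_updates(stored, observed):
    """Enumerate every way of assigning the observed values to the ids."""
    ids = sorted(stored)
    return [dict(zip(ids, perm)) for perm in permutations(observed)]

candidates = candidate_updates(stored, observed)

# Even if we guess that the paper showing Z is the old paper 3,
# more than one assignment remains for the other two.
keeps_z = [c for c in candidates if c[3] == "Z"]

print(len(candidates), len(keeps_z))
```

Nothing in the observed values tells us which candidate is the right
update, which is the point of the example.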
In November I started a thread called "RM and abstract syntax trees"
in which I suggested that RM was poorly suited for the representation,
never mind the manipulation, of ASTs. The problem is that the only
reasonable way to represent the structure is to introduce meaningless
node identifiers. An important principle in the RM is that a tuple
should always represent a proposition that makes sense to the problem
domain expert, so I agree with you that we cannot allow hidden
identifiers. Therefore the RM cannot help but expose the node
identifiers for all to see.
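As a rough illustration (not the schema from that thread), here is a
Python sketch of the expression (1 + 2) * 3 stored as flat tuples. The
column layout (node_id, label, parent_id, position) and the helper
names are assumptions of mine:

```python
# Each tuple: (node_id, label, parent_id, position among siblings).
# The node_id values carry no meaning in the problem domain.
nodes = [
    (1, "*", None, 0),
    (2, "+", 1, 0),
    (3, "3", 1, 1),
    (4, "1", 2, 0),
    (5, "2", 2, 1),
]

def children(nodes, parent_id):
    """Return the child tuples of a node, in sibling order."""
    rows = [n for n in nodes if n[2] == parent_id]
    return sorted(rows, key=lambda n: n[3])

def render(nodes, node_id):
    """Rebuild the expression text by following the identifiers."""
    label = next(n[1] for n in nodes if n[0] == node_id)
    kids = children(nodes, node_id)
    if not kids:
        return label
    left, right = (render(nodes, k[0]) for k in kids)
    return f"({left} {label} {right})"

root_id = next(n[0] for n in nodes if n[2] is None)
text = render(nodes, root_id)

# Renumbering every node leaves the expression unchanged, which shows
# that the identifiers say nothing about the expression itself.
renumbered = [
    (i + 100, label, None if p is None else p + 100, pos)
    for (i, label, p, pos) in nodes
]
same_text = render(renumbered, root_id + 100)

print(text, same_text)
```

Every tuple here is only meaningful relative to the others through the
identifiers, which is why they cannot stay hidden.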
Prolog is able to parse string expressions entered by users and build
and manipulate ASTs. Behind the scenes, nested functor expressions
are usually implemented using dynamically allocated nodes wired up
with pointers. However, as far as the programmer is concerned, only
unification is available to decompose the structure. It seems to me
that Prolog has more general support for data modeling than is
available in the RM, to the extent that nested functor expressions
avoid the need to introduce lots of meaningless identifiers.
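For contrast, here is a sketch of the same expression as a nested term,
written in Python rather than Prolog. Nested tuples stand in for
functor expressions such as times(plus(1, 2), 3), and tuple unpacking
is only a loose, one-way analogue of unification. The functor names and
helper functions are hypothetical:

```python
# A term is either a number or a tuple (functor, left, right).
expr = ("times", ("plus", 1, 2), 3)

def evaluate(term):
    """Decompose a term structurally; no node identifiers are involved."""
    if not isinstance(term, tuple):
        return term
    functor, left, right = term
    if functor == "plus":
        return evaluate(left) + evaluate(right)
    if functor == "times":
        return evaluate(left) * evaluate(right)
    raise ValueError(f"unknown functor: {functor}")

def swap_operands(term):
    """Build a new term with the operands of every functor swapped."""
    if not isinstance(term, tuple):
        return term
    functor, left, right = term
    return (functor, swap_operands(right), swap_operands(left))

result = evaluate(expr)
mirrored = swap_operands(expr)

print(result, mirrored)
```

The structure is addressed only by its shape, so there is nothing
playing the role of a node identifier for the programmer to see.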
You could well argue that the hidden identifiers are not intrinsic to
the data, because after all an expression has a string representation
over an appropriate grammar, and in that form there is no concept of
nodes or node identifiers. However, the tree representation is often
more suitable for manipulation by a computer. I can imagine
applications with very large amounts of such persistent data, and we
could hardly expect all that data to persist as strings that need to be
parsed every time they are brought into memory.
I'm interested in 3D scene graphs that support complex interactions
between the parts, requiring nested expressions in the data model. RM
seems to have limitations for such an application. To some extent I
like to think of a program as data (as a Lisp programmer does), and I
think there will be exciting applications in the future that blur that
distinction. The humble spreadsheet is a revealing example of how one
can go beyond tables of raw values and put formulae in cells to
represent information (at some higher level, one might say). Of course
I'm not suggesting that spreadsheets are generally suitable for data
management!
Received on Wed Jan 09 2008 - 03:17:59 CET