Re: Object-oriented thinking in SQL context?

From: rpost <rpost_at_pcwin518.campus.tue.nl>
Date: Tue, 4 Aug 2009 22:13:43 +0000 (UTC)
Message-ID: <h5abqn$132b$1_at_mud.stack.nl>


(This is old but I find it interesting ...)

David BL wrote:

>> >> Finally, your motivation is that database-valued attributes
>> >> give us the same addition in expressive power that surrogate
>> >> identifiers do, when compared to flat, surrogate-less databases.
>> >> How do you know? What do you mean by it?
>> >> (I have an idea, but this posting is growing too long.)
>>
>> >Actually I suspect that is not the case. i.e. one cannot always
>> >eliminate abstract identifiers.

I think you're right, but only in a theoretical sense; the kinds of data representations for which a hierarchy doesn't suffice are prohibitively expensive to query and update, as far as I can see. For instance, a representation of graphs such that two graphs have the same representation if and only if they are isomorphic.

>> But in that case, what is the attraction of your idea?
>
>I think all things being equal it's a good idea to minimise the number
>of abstract identifiers in the model. DVAs help to achieve that aim.

At the expense of introducing nesting.

>> >A Triangulated Irregular Network
>> >seems to be an example of that. How can a set of triangles reference
>> >shared vertices without introducing identifiers for them?
>>
>> Represent the edges as a relation of two points (or 6 coordinates)
>> or the triangles as a relation of three points (9 coordinates).
>> Vertices will occur multiple times. Updates are a little awkward
>> (single-tuple updates will be invalid) but it works.
>
>I agree it works, but I'm not comfortable with the idea.
>
>> >The only
>> >alternative seems to be to use the value (i.e. x,y,z) of a vertex to
>> >identify it and then use integrity constraints to somehow reduce the
>> >degrees of freedom in the model.
>>
>> Yes, constraints are unavoidable, but is this a problem?
>
>In theory no. In practise yes. There is a constraint that no two
>triangles overlap. A naive enforcement of this constraint has
>quadratic complexity.

Hmmm ... good point. Some additional structure would need to be maintained to express the graph's planarity (I assume it's planar), a spanning tree or something like that. But that is essentially optimization, it doesn't belong in the conceptual model.

>This constraint implies that no triangle contains (i.e. strictly
>inside) a vertex from another triangle. The converse is false, so
>the DBMS cannot use that observation as the only basis for more
>efficient enforcement of the constraint. Despite this the idea of
>testing whether a vertex is being moved into an adjacent triangle is
>useful.
>
>In fact, efficient enforcement of the constraint has a lot to do with
>what update is being applied. For example, if conceptually a shared
>vertex is being moved, there are constraints on how far it can move
>which is fairly easy to calculate.

This might even be imposed as a transition constraint on all movement.

>The DBMS would need to have access to this sort of geometrical
>understanding (i.e. artificial intelligence) in order to derive an
>appropriate implementation that makes enforcement of the integrity
>constraints practical.

Geometrical data models and query languages exist, but that's all I know.

>Other important updates would be removal of a vertex (and associated
>triangles), or introduction of a vertex (with creation of appropriate
>triangles).
>
>I think this is an example where faith in "physical independence" is
>pi in the sky. I think a TIN is too complex a domain type to expect
>the DBMS to implement it for us based on general purpose techniques
>centred on the RM.

Perhaps. But to what extent the problems are due to static mismatches (inadequate data structures) and to what extent they are with dynamic aspects isn't clear to me.
>
>> >It's not clear to me whether it
>> >could be practical. For example, when a user selects and moves a
>> >vertex around in a GUI, it would be necessary to update multiple
>> >appearances of the same (x,y,z) in the model. Failing to do so would
>> >typically cause an integrity constraint violation.
>>
>> I don't see how introducing OIDs (surrogates) would help here.
>> The discomfort is in that tuples aren't "entities" in the sense that
>> single-tuple updates are invalid, but adding OIDs won't change that.
>
>I'm wondering whether it helps because it makes the conceptual idea of
>moving a shared vertex more explicit in the model.

I don't see any advantage. IDs associated with vertices may be useful if, e.g., the maps are populated with objects that hop from vertex to vertex.

>> >A TIN may not be a very good example because in practise it is usually
>> >based on a Delaunay triangulation. In other words the triangles are /
>> >calculated/ from the vertices. Therefore it is sufficient to only
>> >model the vertices, and so we just have a set of (x,y,z) and no need
>> >to introduce abstract identifiers.
>>
>> Yes, except that SQL probably isn't good at Delaunay triangulation
>> (I haven't tried). I suppose materialized views can help here.
>
>I vaguely recall Bob Badour saying he implemented Delaunay
>triangulation (or was it minimum spanning tree?) using a relational
>approach. I can't remember.

I don't think FOL can give you a spanning tree, not even all spanning trees.

>> But you could take other vector-based data as an example.
>>
>> For motivating OIDs I'd look at modelling problems for which
>> a fixed relational schema with natural keys is hard or impossible.
>
>I agree there are much better examples.

It's sort of an anti-example: it demonstrates that not all concepts are best expressed as entities.

-- 
Reinier
Received on Wed Aug 05 2009 - 00:13:43 CEST

Original text of this message