mAsterdam <mAsterdam_at_vrijdag.org> wrote:
>Mark Johnson wrote:
> > But there are important considerations
> > addressed by database builders having to
> > do with integrity (floating, loose records), and simply the rate at
> > which queries can be performed. That was the reason for many looking
> > to an explict/full/materialized path for such tables which otherwise
> > violate the basics of the RM. The idea was that tables derived from
> > the RM would not be ordered. But these hierarchical tables are
> > explicitly ordered and by self-referential links. It's even worse than
> > mere 'storage order'. And if one were to suggest - but wait, it's
> > basically a bunch of normalized tables all overlayed on the same
> > logical table for convenience, and because it lends itself easily to
> > automated programmatic control - they'd probably be having none of
> > that, particularly because one is not constrained to view any
> > particular list, but could pick any tree, or the table as a whole,
> > intentionally or by mistake.
>I guess that if I spend some time on this, I may extract some sensible
>statements out of this.
Are you saying that you find the above, confusing? Let me break it
down:
"there are important considerations addressed by database builders
having to do with integrity (floating, loose records), and simply the
rate at which queries can be performed."
It is recommended that SQL be used to perform the basic operations on
the dataset. Some object model coding, instead, might be used with
these hierarchical, self-referential table schemes, which was the
subject, here. That could introduce error. One sort of error might
have to do with integrity, which typically refers to a loss of
reference among particular elements; something called 'dangling
records', or such. And more importantly, in order to reconstruct the
structure, an access path must be 'chased' which can take time,
depending on the speed of the routine which performs what would be
this frequent, routine function. Oracle, for example, provides such an
extension called, Connect By, and related extensions.
"That was the reason for many looking to an explict/full/materialized
path for such tables which otherwise violate the basics of the RM."
In order to avoid the use of such 'chasing' functions, many have
suggested using an explicit path. One way to preserve storage is to
minimally represent that path as the concatenation of sort values. If
a list is a super-tree, a tree above or superior to another, contained
four items in order, with an attribute to record that ordering, then
the first position on the path would be that number, taken from that
attribute. And that path itself would be stored in another attribute,
which could also serve as a primary key. A node on its subtree, also
in a particular order, would take the number just found for the
superior node, add a dot or slash or whatever, some separator, and
then it's own sort value, for its primary key, to continue describing
the explict path. As for the basics of the RM, the very idea of
sorting the relation, or relationship, could well not correspond to
either the notion of a relation or relationship. For while the row
might in some sense correspond to an ordered tuple, and as represented
in a grid it would be placed in one column of the same domain, and not
another, the tuples or elements themselves are not supposed to be in
any particular order. That's the point of contention.
> > Proponents of the RM against such schemes would certainly argue that
> > they only have in mind the reduction of confusion, of error, and an
> > increase in the speed and simplicity of retrieval of the data by
> > whatever ordering. And if Connect By is burdensome or slow, even just
> > in that they have a point, never mind that it is proprietary. Perhaps
> > if it were as easy to transparently create tables without overhead or
> > explicit design, as it is to add subtrees and leaves to a hierarchies,
> > it might be less of an issue. But even in a well-known organization,
> > with few anticipated changes in structure and order, surely some
> > things come up, if just forced on them by changes made by large
> > clients or vendors. I think that's why Oracle does have a Connect By,
> > in the first place. And I'm sure many companies get a lot of use out
> > of it.
>Idem.
You don't understand? Let me break it down for you:
"Proponents of the RM against such schemes would certainly argue that
they only have in mind the reduction of confusion, of error, and an
increase in the speed and simplicity of retrieval of the data by
whatever ordering."
If that's not self-explanatory, I don't know what to add.
"And if Connect By is burdensome or slow, even just in that they have
a point, never mind that it is proprietary."
Again, this is found as a proprietary feature of Oracle. A) if it
proved to be slow in certain cases, in chasing down those paths, then
if another design proved faster, that would generally be considered an
advantage - a good thing. B) it is proprietary, and some consider that
unnecessarily limiting and objectionable.
"Perhaps if it were as easy to transparently create tables without
overhead or explicit design, as it is to add subtrees and leaves to a
hierarchies, it might be less of an issue."
That is, there is a design effort, a rework effort, required to modify
an existing scheme of tables and foreign keys as the structure may
need to be rearranged. But to rearrange one of these self-referencing
tables, built on a database supposedly inspired by the RM, there
exists the row, uniquely identified, but in a structure easily changed
by simply adding, deleting, relocating or resorting on a tree display,
and which would intuitively likely correspond closely to the
organization or structure of the data being retained.
"But even in a well-known organization, with few anticipated changes
in structure and order, surely some things come up, if just forced on
them by changes made by large clients or vendors. I think that's why
Oracle does have a Connect By, in the first place. And I'm sure many
companies get a lot of use out of it."
That is, as a practical matter, as with whether Connect By is slow, or
suitable, many things cannot be anticipated. Ideally, they could be
foreseen. But as a practical matter, not. The vendors may completely
revamp their catalog. The client may require a restructuring of data,
depending on what you are doing for them. And perhaps that is why some
would turn to the easier, but problematic, self-referencing tables, at
some point, or just in part. That is, there's a reason Oracle included
this. And I stated my opinion that I suspect it gets a lot of use, in
companies and projects of all sorts. Don't read into it that I think
it's a good thing or a bad thing. It just is.
Look, if you think this isn't helpful, then fine. I tried to make
suggestions for your glossary, as well. If you think it isn't helpful,
then so be it. But I don't really see how you were that confused by
what I had written above. It just wasn't that complicated.
Nonetheless, there you have it.
Received on Sat Feb 25 2006 - 13:48:51 CET