Re: In an RDBMS, what does "Data" mean?

From: Marshall Spight <mspight_at_dnai.com>
Date: Sat, 26 Jun 2004 18:08:50 GMT
Message-ID: <SYiDc.100564$Hg2.62685_at_attbi_s04>


"Anthony W. Youngman" <wol_at_thewolery.demon.co.uk> wrote in message news:6Y95teCqQz1AFwlM_at_thewolery.demon.co.uk...
> But let's go
> back to the "list or bag" thing. If both the MV and the relational
> database contain the *same* data, then the MV version is richer because
> it has retained any order that was there.

The question is whether the order conveyed any meaning in the first place. There are two cases to consider: either it did or it didn't.

If it did, and the relational version of the data does not retain the order information, then your comparison is invalid, because you have specified that the relational version discards information that the MV version retains. If it did, and the relational version does retain the order information, then the two versions are informationally equivalent.

If it didn't, then having the relational version not retain the meaningless order information is actually advantageous, because it means the DBMS is free to reorder the data if it wants (for efficiency or whatever other reason), whereas the MV-MS (or whatever you call it) does not have this freedom.

> If the app wants a bag, it can ignore the order.

True for both models.

> But if the app wants the original list, not only does
> the relational version have to store more data, but it has to do more
> with it - it has to sort it before handing it back to the app.

Not true. First of all, it is by no means "more data", because we are talking about the same data in either case. What you probably meant to refer to is how the data is stored. It is true that if the storage format is a non-meaning-carrying sequence of (position, element) pairs, then the same data will take up more space (*not* that there is more data), and there is a sort step. But the DBMS is free to choose any storage format it wants. It could choose a meaning-carrying sequence of elements, in which case the two models have identical underlying storage.

This is the "physical independence" thing that many of us are talking about all the time.
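To make the storage point concrete, here is a minimal sketch in Python. The class names `PairStore` and `SequenceStore` are invented for illustration and do not correspond to any real DBMS; the only claim is that two different physical layouts can hand back the same logical list.

```python
from typing import List, Sequence, Tuple


class PairStore:
    """Hypothetical layout: (position, element) pairs kept in no particular order."""

    def __init__(self, items: Sequence[str]) -> None:
        pairs: List[Tuple[int, str]] = list(enumerate(items))
        # Reverse the pairs on purpose: the physical order carries no meaning here.
        self._pairs = pairs[::-1]

    def as_list(self) -> List[str]:
        # The sort step: order is recovered from the stored positions.
        return [elem for _, elem in sorted(self._pairs, key=lambda p: p[0])]


class SequenceStore:
    """Hypothetical layout: elements kept in a meaning-carrying sequence."""

    def __init__(self, items: Sequence[str]) -> None:
        self._seq = list(items)

    def as_list(self) -> List[str]:
        # No sort step: the physical order already is the logical order.
        return list(self._seq)


original = ["c", "a", "b"]
from_pairs = PairStore(original).as_list()
from_sequence = SequenceStore(original).as_list()
```

Both stores return the same list from `as_list()`, so a caller cannot tell which layout sits underneath. The layout is a cost trade-off for the implementation, not a difference in the data.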

> The app
> needs to know that it's supposed to be a list, and also has to know how
> to convert the set back to an ordered list.

Not at all. RDBMSs are declarative; the app declares what it wants, and that's what it gets. You are also assuming that a dataset can have only one possible ordering, which is also not the case.
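As a rough illustration of the declarative point, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The `track` table, its columns, and the sample rows are all made up for the example; the app simply states which ordering it wants, and the same rows can be returned in more than one order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: the order that carries meaning is stored as ordinary data.
conn.execute(
    "CREATE TABLE track ("
    " album TEXT, position INTEGER, title TEXT,"
    " PRIMARY KEY (album, position))"
)
conn.executemany(
    "INSERT INTO track VALUES (?, ?, ?)",
    [("demo", 2, "Beta"), ("demo", 1, "Gamma"), ("demo", 3, "Alpha")],
)

# The app asks for the original list order...
by_position = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM track WHERE album = ? ORDER BY position", ("demo",)
    )
]

# ...or for a different ordering of the same data...
by_title = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM track WHERE album = ? ORDER BY title", ("demo",)
    )
]

# ...or for no ordering at all, leaving the DBMS free to return rows as it likes.
as_bag = {
    row[0]
    for row in conn.execute("SELECT title FROM track WHERE album = ?", ("demo",))
}
conn.close()
```

Nowhere does the app convert a set back into a list by hand. The `ORDER BY` clause is the whole of what it has to "know".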

> That's what I'm trying to express - a lot of stuff is implicit in the MV
> approach, which you can ignore if you want. By explicitly forcing this
> metadata into data, a relational app needs to "know" a lot more to get
> the same result.

If you have the option to either ignore the order or not, then the MV-MS does not have that option; it has to preserve the order regardless of what you do, since it can't predict your future actions. This constrains implementations unnecessarily.

Marshall

Received on Sat Jun 26 2004 - 20:08:50 CEST
