Re: In an RDBMS, what does "Data" mean?

From: Eric Kaun <ekaun_at_yahoo.com>
Date: Wed, 09 Jun 2004 20:13:32 GMT
Message-ID: <MbKxc.395$Pt.155_at_newssvr19.news.prodigy.com>


"Anthony W. Youngman" <wol_at_thewolery.demon.co.uk> wrote in message news:bXlBuYMZQPxAFwBs_at_thewolery.demon.co.uk...
> When I analyse our MV system at work, I think in terms of physical
> objects. I then decompose each physical object into normal form. I DON'T
> NEED to think about other objects while I'm decomposing the one in front
> of me.

Physical objects - how quaint. :-)

In my experience, even entities like invoices and orders become unmanageable as "physical objects" - specifically when someone placing an order needs to know some fairly complex interrelationships between the parts in that order, the warehouses and their parts, and parts previously ordered by the same customer. And in a paint formula database, the "physical objects" are both far from clear, and far different than even the users suppose they are.

> We're in the progress of porting to MS SQL-Server. The data diagram is
> an ABSOLUTE NIGHTMARE! When I look at the table diagram it's an absolute
> spaghetti of links EVERYWHERE! I don't have a clue which tables model
> which physical object, the meaning of links isn't intuitive.

Several points:
1. Diagrams can be messy, and they can be nicely-organized 2. If you were to draw a diagram that captures the functions you present to users in the dictionary, the files you use, and the files split for efficiency, you might see something similarly ugly 3. The meaning of links should be fairly straightforward if you're thinking in terms of predicates. Then again, if the database designer didn't think that way, you could be in trouble. Surely you've seen bad Pick data models?

> Trying to juggle hundreds of tables is far harder than keeping track of
> several tens of physical objects to which I can relate, even if each of
> those objects is then broken down logically into normal form.

Agreed - good modeling is difficult, and it's certainly very useful to think of things in groups. ER/win always did a good job for me - I could keep a 100-table logical data model segregated into domains [sic]. I never looked at the entire thing all at once.

> The MV database layout imposes a grouping which helps me grasp the
> system complexity.

That's a very good thing. What's not a good thing is selecting a grouping that makes sense to Function X to push its way into the definition of the data, since Function Y might have a very different idea what groupings should be imposed.

> While I can easily view the MV structure as equal to
> the relational structure, a true relational database does not give me
> the MV structure which appears much less complicated by virtue of
> appealing to the way I naturally view the world.

Again, if you only ever need to "view the objects" in one way, I envy you. Experience hasn't been that kind to me.

> Take the invoice. From the MV point of view, I see it as a SINGLE
> object. It is *TRIVIAL* for the database itself to decompose that and
> present it, via ODBC, to a relational programmer who wouldn't even
> realise that the MV back-end viewed it as a single object.

I think you're backwards on this:

  1. Seeing it as a single objects implies some degree of encapsulation
  2. Many queries need to cross encapsulation boundaries - hence the need for the Pick/MV query mechanism to directly support list attributes and sub-attributes and sub-sub-attributes... if it were really one object, you could only "get at" those pieces via the Invoice's defined operations. Encapsulation is a red herring (direct quote from Date) - every "persistence mapping" tool violates it.
  3. It's not at all trivial. Say your Invoice has 4 attributes: customer, parts, ship dates, and payments. Part[N] must have a corresponding Date[N] which is the date on which that part was received; Payments is completely separate. How would the mapping "know" that that correspondence exists? Certainly there is "meaning" there? The aggregation has traded one general form of meaning for a very app-specific form; in examples other than the somewhat-hierarchical Invoice/Order, the value of the trade diminishes even further.
  4. Given that you do various queries, and that the database answers lots of questions, what value is it for the database to "know" it's a single objects? You don't see the "lowest level" of a data model being something "egalitatian", which can support several views? It seems to me much more powerful and general to supply predicates as the foundation, and layer a "path expression" on top which can present any arbitrary view of the data. Example: atop a relational model, I can present the Invoice, as well as a Warehouse which "contains" the parts it shipped and which customers bought them? Reporting and GUI generation, here I come...
    • erk
Received on Wed Jun 09 2004 - 22:13:32 CEST

Original text of this message