Re: In an RDBMS, what does "Data" mean?

From: Eric Kaun <ekaun_at_yahoo.com>
Date: Fri, 11 Jun 2004 21:54:53 GMT
Message-ID: <NSpyc.2728$pK3.2253_at_newssvr15.news.prodigy.com>


"Anthony W. Youngman" <wol_at_thewolery.demon.co.uk> wrote in message
news:rDtNrmL7f6xAFw2a_at_thewolery.demon.co.uk...

> In message <qJJxc.385$Pt.272_at_newssvr19.news.prodigy.com>, Eric Kaun
> <ekaun_at_yahoo.com> writes
> >"Anthony W. Youngman" <wol_at_thewolery.demon.co.uk> wrote in message
> >news:9EJ4q$GofOxAFw1i_at_thewolery.demon.co.uk...
> >> >1. files
> >>
> >> This represents a "physical" object. A house. A car. A company. A
> >> building. An invoice.
> >
> >What about relationships between any of those things - e.g. cars and
houses
> >owned by companies, and the invoices for their purchases? At what point
does
> >something move from being a file to an attribute or vice versa?
>

> Just because I own a car, doesn't make the car part of me ... think
> language, and think nouns and adjectives (and gerunds).

What about sentences?

> >So if an invoice has a Parts attribute, it then needs additional
attributes
> >corresponding to each "attribute" of its use of those parts? At what
point
> >does the relationship between the two acquire enough "meaning" or enough
> >attributes of its own to warrant being in its own file? Imagine line
items
> >on an invoice became very complex, with shipment information and payment
> >information... is there a point at which you say "Enough!" and stop
adding
> >those things as sub-attributes to the Parts attribute of the Invoice?

>

> In theory, no. In practice, you might choose to split the invoice data
> across two FILES, where you've promoted sub-attributes of INVOICE to be
> primary attributes of the secondary file.

Why would you do the split, when the intent seems to be to keep things whole? Is this purely for performance optimization? I have a hard time keeping up with shifts between logical and physical, and the reasons for the splits. I understand you CAN do these things, but why and when? What are your heuristics?

> >> >4. sub-sub-attributes
> >>
> >> Simply nest sub-attributes one level deeper. :-)
> >
> >Ah, hierarchical induction. I'll just have one File in my app. :-)
> >
> Nah! FILE = noun :-)

Okay; I name my file "MyApplication." :-)

> You completely missed the point here. Where and why would you use a
> cascading delete? THINK! Be *practical*. What *works* in *reality*
> (rather than theory, which can think up a thousand impossible scenarios
> before breakfast (with apologies to "Alice in Wonderland")).

Seldom, due to business desires, but to answer the question you're getting at: when there's a foreign-key dependency, and there's one relation that is deemed "important" enough to trigger the cascade. There could be multiple, though that's rare...

> >> And then you use common sense to say "I use these fields all the time,
> >> and these fields only rarely" so you split A into two physical FILEs,
> >> and make all the colums of A-rarely into virtual columns of
> >> A-all-the-time, and vice versa. So for retrievals the user notices
> >> nothing (apart from the speed-up), although it does cost a bit extra
> >> logic when updating.
> >
> >A logical cost is a big cost. Every updating app needs to know that,
right?

>

> Yep ... but relational theory, which imposes mandatory separation of the
> logical from the physical, imposes that cost on EVERY app, not just
> those that update the data.

And it enables EVERY app with EVERY optimization. (not close, but you get my drift)

You've just successfully argued against code sharing, by the way, since if something is coded badly (either slowly, or laden with defects), then every app has to suffer, so you're better off recoding it in each app, right?

> Furthermore, by actively hindering the programmer from providing hints
> to the database, relational forces the programmer to rely on the
> database's artificial intelligence, which is quite likely to guess wrong

And you've also just argued against compilers, since they're so likely to guess wrong about the intention of your code, and therefore will produce badly-optimized machine code.

> Yup. Let's assume that the Pick database has been designed properly, and
> that within the FILEs the data has been normalised. I can now present my
> apps with a *closed* relational view!

What do you mean by that?

> My Pick application has also FORCED, by DEFAULT, my database to store
> related data close to itself (what relational calls clustering, I
> believe). It's fairly easy to prove, statistically, that this will
> optimise data retrieval from disk.

For that one access path.

> Sod AI optimisation, Pick doesn't
> have a choice and it works, which is why in any system lacking
> sufficient ram a Pick app will kick the equivalent relational app's
> butt!

I can think of several faster alternatives. Using ROWID and stashing hierarchies in Oracle tables would at least close some of the gap. Performance isn't the only point, but oh well...

> Basically, by not hiding the physical implementation from the user, Pick
> makes it easy to prove there just IS NO room for improvement.

hahahahaha

Oh - you were serious. My bad.

> By hiding
> the physical from the user, relational forces you to rely on the AI and
> you have no way of knowing whether it is efficient or not.

AI? Yes, I'd hate to rely on something like a "computer" or some other fancy "automaton" that does "logic" or some such liberal nonsense... :-)

  • erk
Received on Fri Jun 11 2004 - 23:54:53 CEST

Original text of this message