RE: Useful Oracle books - C.J. Date theory vs. practicality

From: Lex de Haan <lex.de.haan_at_naturaljoin.nl>
Date: Fri, 28 May 2004 20:00:49 +0200
Message-ID: <JFEEIGBIDOCCDALDIPLNOEOMCDAA.lex.de.haan@naturaljoin.nl>

thanks for the debate Dan -- same trick again ... (nice quoting exercise)

Cheers,
Lex.

visit my website at http://www.naturaljoin.nl

DST: So, how do we handle Spouse_ID for a single person in a Persons table?
DST: I am looking for a theory
DST: that applies well to SQL as actually implemented, and that is directly
DST: useful to developers and DBAs stuck with that implementation.

LDH> I prefer indeed to turn things around -- there is only one NULL, LDH> so overloading that NULL with multiple meaning in bound to cause trouble.

LDH> This should be the most important guidance during data modeling;
LDH> You should try to solve these issues in the data model, where the fact
LDH> that someone is married with someone else becomes an entity of its own.
LDH> Or, to quote Hugh Darwen: each table corresponds with exactly one

predicate,

LDH> and the rows are the corresponding propositions evaluating to true.
LDH> in a PERSONS table you store facts about the existence of persons;
LDH> in a MARRIAGES table you store intersection information about persons

with persons.

<snip>

DST: Yes, but what should we do in the meantime, lacking domains? Isn't using
DST: NULL best, then?

LDH> I can only agree here, unfortunately :-)
LDH> Using nulls is indeed the best thing -- but I repeat my strong opinion:
LDH> as soon as your nulls can have different meanings, you are in trouble.
LDH> try to avoid all nulls of type "inapplicable" with proper data

modeling.

> but wait, sometimes there *is* no value that could never be real for a
given
> column, so we'll have to add whole extra Boolean columns for each such
> no-longer-nullable column to handle specifying whether we really ought to
be
> ignoring the value in that other column over there ...

> LDH> why? can't you represent this by a special value too?
> LDH> (maybe I am missing your point here)
> LDH> anyway, additional logic should NOT be built into the applications,
> LDH> but rather in the methods that are associated with the domains.

DST: Yes, but what should we do in the meantime, lacking domains?

LDH> you got me here ... the only solution I can bring up (again)
LDH> is to change the data model. So far I have not seen a single example
LDH> where I was unable to model all pseudo-nulls away, so the only

remaining
LDH> meaning becomes "unknown". by the way, before you think I am too fanatic:
LDH> in case you don't care about the reason why information is missing, LDH> nulls are OK in all cases; but then you should not try to interpret them!

> but wait, what about the column that starts out seemingly safe for
> designating, say, "-1" as the special "does not apply" value, but ends up
in
> some special case needing "-1" for a *real* value - WHOOPS ...

> LDH> sounds like a serious design mistake to me;
> LDH> don't get me started about serious RDBMS design mistakes :-)

DST: In the real world, with the current number of near-perfect database DST: designers, can design mistakes be made rare enough that we needn't worry
DST: about what we'll do when we discover them?

LDH> I assume this is a rhetorical question, right?
LDH> but that's still not an excuse ;-) and certainly not a reaso
LDH> to blame the relational model ot the three-valued logic ;-)

> Imagine that you want data from bar-code readers at a supermarket to go
> straight into a database.
>
> LDH> the "famous" cat food example ;-)
> LDH> I think this is already the wrong way to start;
> LDH> why insist that this data should go straight into the database?
> LDH> this looks like a transaction design mistake to me.
> LDH> what if the customer wants to cancel (forgot his wallet)?

DST: What's wrong with
DST: ROLLBACK;
DST:
DST: ? Seriously, though, for performance reasons, I agree with clumping all
DST: the inserts until they're ready to happen all at once, at the end of

the
DST: transaction, but that still doesn't mean it's necessarily worth the DST: coding hassle to presort and pre-aggregate (now at the application layer)
DST: to avoid duplicates, when duplicates are rare enough that the extra rows

DST: are virtually cost-free, and the rows are never read except as
DST: aggregations. I run into a similar problem when I snapshot performance
DST: data into my own tables, with simple SQL and shell scripts - the data
DST: rarely produces a duplicate row (which really means that the snapshot

saw

DST: the something twice, placing double weight on the event), and it'd be a
DST: hassle (and for what benefit?!) to eliminate the duplicates with such
DST: simple scripting tools, unless I added a purely artificial, useless
DST: Data_Row_ID column, where the data are only ever queried in aggregate.

LDH> duplicates are the biggest mistake ever made in SQL.
LDH> this is why tables are no sets -- so we have to talk about multisets,

LDH> this is why all kind of sound principles suddenly don't hold anymore, LDH> this is why certain query transformations cannot be performed by the optimizer,
LDH> this is why we end up with all those exceptions and special case handling...

LDH> I agree they are virtually cost-free in the context you describe,
LDH> but they can become very expensive in other contexts.
LDH> and again, you are asking for trouble if duplicate rows "sometimes"

happen
LDH> to mean something. if they really mean something, there is an attribute missing;
LDH> and if they don't mean anything, it is useless and dangerous to store any row twice.

A very good book about all this (indeed talking about possible future implementations)
is "the third manifesto" by Chris Date and Hugh Darwen, ISBN 0-201-70928-7. the first attempts to implement their "tutorial D" language are in the works.

By the way, I guess we should take further discussion offline, before guys like Mladen start intervening ;-)

Please see the official ORACLE-L FAQ: http://www.orafaq.com

To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line.

-- Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html -----------------------------------------------------------------

---------------------------------------------------------------- Please see the official ORACLE-L FAQ: http://www.orafaq.com ---------------------------------------------------------------- To unsubscribe send email to: oracle-l-request_at_freelists.org put 'unsubscribe' in the subject line. -- Archives are at http://www.freelists.org/archives/oracle-l/ FAQ is at http://www.freelists.org/help/fom-serve/cache/1.html -----------------------------------------------------------------

Received on Fri May 28 2004 - 12:58:08 CDT