Re: In an RDBMS, what does "Data" mean?

From: Dawn M. Wolthuis <dwolt_at_tincat-group.com>
Date: Wed, 9 Jun 2004 18:40:08 -0500
Message-ID: <ca874t$g9t$1_at_news.netins.net>


"Eric Kaun" <ekaun_at_yahoo.com> wrote in message news:5k2xc.6205$4b2.1710_at_newssvr32.news.prodigy.com...
> "Bill H" <wphaskett_at_THISISMUNGEDatt.net> wrote in message
> news:awmwc.12018$%F2.11413_at_attbi_s04...
> > [SNIP]
> > I've noticed that many people aren't interested in a better proposal, or
> > even a different proposal. Dogma rules. :-)
>
> A fun movie... :-)

indeed
[I'm sure I've missed a bunch, since my ISP first had NNTP down and then seemed to reinitialize the database (is that the right term?), but I'll read a bit before a long weekend away from news again.]

> > The main reason others use different data models is that they allow a
> > much closer interaction between the language of dbms and applications and the
> > environment they're designed to operate in (mostly the business community).
> >
> > Because of this, the cost of development, maintenance, and
> > administration is significantly lower than those models having additional expertise and
> > liaison requirements.
>
> I am all for lowering this cost - decreasing the "impedance mismatch", so
> to speak. However, I think my ideas move in the opposite direction - making
> application languages more relational, rather than DBMSs more procedural
> (or OO, if you like).

And the likelihood of that is ... NIL (choosing not to use that NULL set designation). Why? Because people tend to choose solutions that work. If there were overwhelmingly good evidence that you get a better bang for the buck by using relational theory, that would be a different story. I'd strongly suggest we nudge relational databases toward pragmatism ;-)

> > Now, this advantage may not be what you are looking for. It may not be,
> > for that matter, what the CIO of a large company is looking for. However,
> > in the world of small to medium sized businesses (SMBs) this cost advantage
> > means something.
>
> Agreed - however, while my experience comes from a large company, it's
> work done for a relatively small business unit. I was the only developer on
> several of the projects, and my user base was fairly small. I was DBA,
> developer, customer support, etc. And I still found the relational
> metaphor (even though I had to use SQL) much easier than XML.

Didn't some of that have to do with having to perform conversions to and from XML, which might not have been necessary if the data were stored in the way it was sent? Or was it the loosey-gooseyness of it, where there are not as many texts with rules for "how to"?

> I've never used Pick -
> sounds like their environment gives them a lot of power, and while that's
> nice, I'd still never think of thinking of an invoice as a single
> proposition or "object". It's not.

Perhaps you've never seen one? ;-)

> It's a fairly complex series of them.

That too, but how many portals would you want to have to go through to collect all of them? This has to do with how the "user" (application developer or dba, for example) should view the data.

> Just like an "order", an invoice is a fairly complex confluence of
> phenomena, and not even a static one (modifications / confirmations to
> various invoice "pieces" was common in my world, as an invoice was often
> correlated with multiple shipments and warehouses).
>
> > I can't speak for Anthony and Dawn, but I place more value not on the
> > original inputs but the original concept. An invoice _is_ something
> > that usually has multiple items ordered.

Yes, and I'm trying to narrow that down a bit while trying to tap into just how I do database design, given that I don't start with 1NF. It has to do with people, places and things, and with entities that are not functionally dependent on any other entities in the system. What is that top level of nodes after ENTITY in a system, such as PEOPLE, PLACES, THINGS?

> And I disagree. An invoice is many somethings. If your questions deal only
> with the set (e.g. presenting an invoice on a screen), then great - treat
> it as one. But when you're attempting to analyze the distribution of parts
> across warehouses and across time, "viewing" the invoice as a number of
> components is far, far more useful.

I see where you are coming from. No, an invoice is just one of these things, but the data from the invoice is also available through other data portals (for lack of a better word -- don't make me use the word "view"!) such as warehouses and parts. I can see that one difference is that the same data from my perspective is available as an invoice and as parts-invoiced. These are different entities with the same or similar data accessed. Each portal can see everything you can "get to" from there (via declared links as one might have in a join statement).
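A rough sketch of that idea, purely as illustration and not how any particular MV product does it: the invoice is stored once as a whole, and a second "portal" onto the same data is derived one row per invoiced part. The field names, sample values, and the `parts_invoiced` and `qty_by_warehouse` helpers are all made up for the example.

```python
# Hypothetical data: each invoice is stored as a whole, with a
# multivalued list of line items nested inside it.
invoices = [
    {
        "invoice_id": "INV-1",
        "customer": "Acme",
        "lines": [
            {"part": "bolt", "warehouse": "W1", "qty": 100},
            {"part": "nut", "warehouse": "W2", "qty": 250},
        ],
    },
    {
        "invoice_id": "INV-2",
        "customer": "Zenith",
        "lines": [{"part": "bolt", "warehouse": "W2", "qty": 40}],
    },
]


def parts_invoiced(invoices):
    """Second portal: the same data seen as one row per invoiced part."""
    return [
        {"invoice_id": inv["invoice_id"], "customer": inv["customer"], **line}
        for inv in invoices
        for line in inv["lines"]
    ]


def qty_by_warehouse(invoices):
    """An analysis question answered through the parts-invoiced portal."""
    totals = {}
    for row in parts_invoiced(invoices):
        totals[row["warehouse"]] = totals.get(row["warehouse"], 0) + row["qty"]
    return totals
```

Showing an invoice reads an entry of `invoices` directly, while the warehouse question goes through `parts_invoiced`; both see the same stored data.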

> So it depends on your needs, but I'd far
> rather place my bet on something that allows me to scale my queries and
> reports to more detailed questions than one that restricts me. And I still
> think having to correlate multiple line-item attributes across multiple MV
> attributes in a single File is nonsensical and error-prone.

I'll grant that there are pros and cons and not everyone designs an invoice identically no matter what the database, but when you add in the virtual fields (derived data or data found elsewhere), the INVOICE vocabulary for everyone has what it needs to show an invoice.

> > It is an object in and of itself that
> > needs no "chopping up", so to speak.
>
> Yes, it does. "Analysis" means chopping up. We gain power in chopping up.

and putting back together

> Our problems are solvable when they're chopped; our solutions are scalable
> and provable when they're chopped.

Again, I think you are confusing something here -- perhaps physical and logical (although I think I've ascertained that would not be like you), but perhaps it is your notion that data can only be accessed through one place: its base relation. Remove that obstacle -- free yourself. Yes, we still divide it all up, but into wholes, not pieces.

> Domains are intellectually tractable when
> they're separated. Holism may be fine in medicine (???) where human
> psychology is involved, but any translation of a "real world" domain to an
> automated system involves "chopping up." You can either acknowledge it and
> chop in a rational way, or pay the price later on.

yes, there is some chopping up and the functional dependency thing takes you quite far for that, even if you allow for both scalar values and compound ones (such as lists).

> While I'm not dogmatic about 1NF (believe it or not), or even relational,
> I do believe based on experience that the balance point for using relational
> is far, far sooner than critics would believe.

Someday grasshopper ...

> > This is where simpler means don't destroy the properties of the invoice
> > in order to make the data fit into an arbitrary data model with
> > tautological axioms and theorems.
>
> Tautological? Arbitrary? Any logical model is arbitrary; an invoice has no
> shape, or at least none beyond that of a piece of paper, and as I've said,
> if all they want to do is store the invoice, let's scan the thing into a
> JPG and be done with it.

No, the data needs to be available to other entities as well, as you pointed out.

> "Making the data fit" is also nonsense; whatever physical and logical
> model you choose, you're pushing the data into something. You can either push it
> into something with maximum power or a lesser degree of power. Perhaps you
> gain short-term efficiency; in my experience with XML, you gain squat.
>
> > Keep the business objects as close to what they are.
>
> So forgetting an invoice for a moment, what "is" a paint color? A paint
> formula? A carmaker code? A digital certificate store? What's their
> "natural form"?

It is relational folks who become democratic about this and start thinking about understanding the nature of any particular noun outside of its use in "this" context. Define it based on its use and if a new use comes up, redefine it if necessary, otherwise add qualifiers to it.

> There is none. What we do is unnatural. (<insert unnatural-act joke here>)

OK, and it's funny, but never mind.

> > A data model that can do this has many advantages.
>
> That can do what - model arbitrary data in its "natural form", whatever
> that means? I agree. If you show that to me, I'll use it.

as entities. Still working on how to show it.

> > > I suspect there are simply different expectations; I'd rather
> > > stretch the computer to avoid stretching humans in ways they're not
> > > good at (e.g. repetitive symbolic manipulation).
> >
> > I think you're right here. I've been in business for many years. I would
> > like development to be easy for me. We can watch the pendulum swinging
> > towards making software development easier for those of us using the software.
> > .NET, for better or worse, is attempting to make development easier (if
> > it wasn't for the bizarre data typing and variable scoping it would be a
> > lot easier). Hopefully dbms theory will contribute to this too.
>
> I hope so - that would be nice. I think XPath and XQuery, while
> convoluted, are reasonable enough operators over an XML type / type generator. I just
> see far more benefit from the structures and declarative constraints of
> relational.

Have you found that when you map from XML to relational, you don't need to add anything to the information in your source, but when you go the other direction, you need to add data (such as ordering)?
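To make the question concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The two sample documents and the `to_relation` and `to_xml` helpers are invented for illustration. It assumes the relation is a plain set of tuples, and it does not settle whether the original element order carried meaning.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: same items, different document order.
doc_a = "<invoice id='INV-1'><item part='nut'/><item part='bolt'/></invoice>"
doc_b = "<invoice id='INV-1'><item part='bolt'/><item part='nut'/></invoice>"


def to_relation(xml_text):
    """XML -> relation: a set of (invoice_id, part) tuples.

    Nothing is added here, but the document order of the items is dropped.
    """
    root = ET.fromstring(xml_text)
    return {(root.get("id"), item.get("part")) for item in root.findall("item")}


def to_xml(relation, order_key):
    """Relation -> XML: the caller must supply an ordering, since a set has none.

    Assumes a non-empty relation holding the rows of a single invoice.
    """
    rows = sorted(relation, key=order_key)
    root = ET.Element("invoice", {"id": rows[0][0]})
    for _, part in rows:
        ET.SubElement(root, "item", {"part": part})
    return ET.tostring(root, encoding="unicode")
```

Both documents map to the same relation, and `to_xml` cannot run without an `order_key` such as `lambda row: row[1]`; keeping the original order would mean storing an extra position attribute on the relational side.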

> - erk

Cheers! --dawn

Received on Thu Jun 10 2004 - 01:40:08 CEST
