Re: In an RDBMS, what does "Data" mean?

From: Dawn M. Wolthuis <dwolt_at_tincat-group.com>
Date: Sat, 12 Jun 2004 22:30:07 -0500
Message-ID: <cagho5$mto$1_at_news.netins.net>


"Eric Kaun" <ekaun_at_yahoo.com> wrote in message news:UV2yc.2595$tp6.675_at_newssvr15.news.prodigy.com...
> "Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote in message
> news:ca874t$g9t$1_at_news.netins.net...
> > "Eric Kaun" <ekaun_at_yahoo.com> wrote in message
> > news:5k2xc.6205$4b2.1710_at_newssvr32.news.prodigy.com...
<snip>
> My wife handles the finances, since she's damn good at it. And all of the
> invoices I "see" are so... flat. Two-dimensional. Surely ripe for
> relational? :-)

doubt it (not doubting the wife's skills) -- I'm guessing at least some have header info and multiple line items ... ?
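
To make the "header info and multiple line items" shape concrete, here is a minimal Python sketch. The field names (invoice_no, customer, line_items and so on) are invented for illustration and are not taken from any real system. It holds one invoice as a single nested record, then splits it into the flat header and line rows that a relational design would store.

```python
# Hypothetical invoice: one header plus a list of line items.
# All field names and values here are illustrative assumptions.
invoice = {
    "invoice_no": 1001,
    "customer": "ACME",
    "line_items": [
        {"part": "bolt", "qty": 10, "unit_price": 0.25},
        {"part": "nut", "qty": 10, "unit_price": 0.10},
    ],
}


def flatten(inv):
    """Split a nested invoice into a header row and a list of line rows."""
    # Header: every top-level attribute except the nested list.
    header = {k: v for k, v in inv.items() if k != "line_items"}
    # Lines: each item repeats the invoice key and gets a line number.
    lines = [
        {"invoice_no": inv["invoice_no"], "line_no": n, **item}
        for n, item in enumerate(inv["line_items"], start=1)
    ]
    return header, lines


header_row, line_rows = flatten(invoice)
```

Either form carries the same information; the disagreement in this thread is about which one the developer should be asked to think in.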

> > > It's a fairly complex series of them.
> >
> > That too, but through how many portals would you want to have to go to
> > collect all such? This has to do with how the "user" (application
> developer
> > or dba, for example) should view the data.

> Oh, I agree the user should have few portals. But application developers
> want to see the messy back-room (to extend the department-store metaphor).
> Or more accurately, developers are like the store managers who map many
> different suppliers' products into departments, clusters, and shelves.

Sure, the developers need to know the information required to set up the store into nice tidy departments.

> > > Just like an "order", an invoice is a fairly complex confluence of
> > > phenomena, and not even a static one (modifications / confirmations to
> > > various invoice "pieces" was common in my world, as an invoice was
often
> > > correlated with multiple shipments and warehouses).
> > >
> > > > I can't speak for Anthony and Dawn, but I place more value not on
the
> > > > original inputs but the original concept. An invoice _is_ something
> > that
> > > > usually has multiple items ordered.
> >
> > Yes and I'm trying to narrow that down a bit while trying to tap into
just
> > how I do database design given that I don't start with 1NF. It has to
do
> > with people, places and things and entities that are not functionally
> > dependent on any other entities in the system. What is that top level
of
> > nodes after ENTITY in a system, such as PEOPLE PLACES THINGS.
>
> Ah, I see. Yes, I agree that those drive UIs, reports, etc. - at least for
a
> while. I focus on those technologies that will make that part easy, AND
give
> me some assurance in their consistency and that I can drive more complex
> requirements easily. And those complex ones always arise quickly, I've
> found... if I've oversimplified early (and I've done the entity/object
style
> of design before), I usually regret it. Sometimes that's warranted, if
> time-to-market is the critical success factor.

I very much agree, but seem to arrive at a different conclusion on how best to set up for handling these sudden changes to requirements.

> > > And I disagree. An invoice is many somethings. If your questions deal
> only
> > > with the set (e.g. presenting an invoice on a screen), then great -
> treat
> > it
> > > as one. But when you're attempting to analyze the distribution of
parts
> > > across warehouses and across time, "viewing" the invoice as a number
of
> > > components is far, far more useful.
> >
> > I see where you are coming from. No, an invoice is just one of these
> > things, but the data from the invoice is also available through other
data
> > portals (for lack of a better word -- don't make me use the word
"view"!)
> > such as warehouses and parts. I can see that one difference is that the
> > same data from my perspective is available as an invoice and as
> > parts-invoiced. These are different entities with the same or similar
> data
> > accessed. Each portal can see everything you can "get to" from there
(via
> > declared links as one might have in a join statement).

> I think we're on the same page - I just think (based on comparison with
> other things) that relational makes the best logical support for a
> multi-portal system. And if you think about it, those portals can be small
> and nested... UIs inside other UIs, etc. - whatever the user needs to get
> the job done. Since those portals start to look like mini-apps, that makes
> their common logical foundation all the more important.

More mind meld here -- similar thought processes drawing different conclusions.
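
As a rough illustration of the "same data, different portals" idea quoted above (the same line-item data reachable as an invoice and as parts-invoiced), here is a small Python sketch. The rows, the warehouse names, and the portal() helper are all hypothetical; this is only one way to picture it, not how any particular product does it.

```python
from collections import defaultdict

# Hypothetical line-item rows; invoice numbers, parts and warehouses are made up.
LINE_ROWS = [
    {"invoice_no": 1001, "part": "bolt", "warehouse": "east", "qty": 10},
    {"invoice_no": 1001, "part": "nut", "warehouse": "west", "qty": 10},
    {"invoice_no": 1002, "part": "bolt", "warehouse": "west", "qty": 5},
]


def portal(rows, key):
    """Group the same rows under a chosen entry point (illustrative helper)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return dict(grouped)


# The "invoice" portal and the "parts-invoiced" portal over identical data.
by_invoice = portal(LINE_ROWS, "invoice_no")
by_part = portal(LINE_ROWS, "part")
```

Nothing is copied or reshaped permanently here: each portal is just another grouping of the same rows, which is the point both sides seem to agree on.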

> > > So it depends on your needs, but I'd far
> > > rather place my bet on something that allows me to scale my queries
and
> > > reports to more detailed questions than one that restricts me. And I
> still
> > > think having to correlate multiple line-item attributes across
multiple
> MV
> > > attributes in a single File is nonsensical and error-prone.
> >
> > I'll grant that there are pros and cons and not everyone designs an
> invoice
> > identically no matter what the database, but when you add in the virtual
> > fields (derived data or data found elsewhere), the INVOICE vocabulary
for
> > everyone has what it needs to show an invoice.

> And I think I'm seeing more and more value to a path-like / hierarchical
> expression as a user tool. I see it as best layered atop relational, since
I
> anticipate more views (if my data is useful, and I'm trying to help the
> business's departments interoperate) but I think we agree philosophically
> with the notion of packaging for the user.

OK, now read what the purpose of the relational model is (somewhere towards the front of Date's latest edition of the textbook). If "the user" (whether a s/w developer or an end-user) can work with data thinking entirely in this walk-our-way-through-the-vocabulary fashion for queries of any sort, then what, again, was the need for the relational model in this? You had a point, however, when you asked somewhere whether one can update through these portals -- not really, but it works for managers & high-level designers, making anything more an implementation detail ;-)

> > > > It is an object in and of itself that
> > > > needs no "chopping up", so to speak.
> > >
> > > Yes, it does. "Analysis" means chopping up. We gain power in chopping
> up.
> >
> > and putting back together
> >
> > > Our problems are solvable when they're chopped; our solutions are
> scalable
> > > and provable when they're chopped.
> >
> > again, I think you are confusing something here -- perhaps physical and
> > logical (although I think I've ascertained that would not be like you)
but
> > perhaps it is your notion that data can only be accessed through one
> place -
> > its base relation. Remove that obstacle -- free yourself. Yes, we
still
> > divide it all up, but into wholes, not pieces.

> I agree, and didn't mean to give the impression that data should only be
> accessed through base relations. Far from it. Relations are a necessary
(to
> me) but not sufficient condition for good application design.
>

> > > Domains are intellectually tractable when
> > > they're separated. Holism may be fine in medicine (???) where human
> > > psychology is involved, but any translation of a "real world" domain
to
> an
> > > automated system involves "chopping up." You can either acknowledge it
> and
> > > chop in a rational way, or pay the price later on.
> >
> > yes, there is some chopping up and the functional dependency thing takes
> you
> > quite far for that, even if you allow for both scalar values and
compound
> > ones (such as lists).
>

> For users, yes, lists are useful (I'd argue that sets are more often, and
> that relations are even better, but I'll lighten up on that). The other
> linchpin of relational, of course, is types. I distrust technologies with
> weak typing, but that's a different discussion; suffice it to say that
> having a LINE_ITEMS attribute in a file would make me far less queasy if
the
> elements of that list were real objects, with real operations defined over
> them.

How and where rules/constraints are applied to the data is one of those topics where my understanding of the various options, and of how they influence agility/maintainability, is not yet where I want it to be. So, I can sympathize, but I can't get too upset about descriptions of the data that go further than the constraints that are applied to it (that might not have made sense to anyone but me, so ignore it if it didn't).

> > > > This is where simpler means don't destroy the properties of the
> invoice
> > in
> > > > order to make the data fit into an arbitrary data model with
> > tautological
> > > > axioms and theorems.
> > >
> > > Tautological? Arbitrary? Any logical model is arbitrary; an invoice
has
> no
> > > shape, or at least none beyond that of a piece of paper, and as I've
> said,
> > > if all they want to do is store the invoice, let's scan the thing into
a
> > JPG
> > > and be done with it.
> >
> > No, the data needs to be available to other entities as well, as you
> pointed
> > out.

> Sure, I was being facetious - so there are 2 questions:
> 1. What is the nature of the "other entities" that will need to use the
> data?

We will know in time.

> 2. In what form does the data need to be to provide those entities with
easy
> access; and even to make those entities easy to develop?

We will know in time.

But we definitely should think about what are the most likely changes on the horizon and what our strategy would be for each of those. I'm not completely in the XP camp where we only think about the requirements (stories) for this iteration of development and worry about tomorrow, tomorrow.

> I see those entities as applications (including GUIs and reports and batch
> processes), and contend that relational is the best answer for #2. But
> hierarchies are useful for #1. There is an impedance mismatch, though it is much more
> tractable at this level than object-relational mappings.

I hope to eventually agree with you on the best approach to #2. That is not the same statement as saying that I hope to eventually agree with your current opinion on the matter.

> > > "Making the data fit" is also nonsense; whatever physical and logical
> > model
> > > you choose, you're pushing the data into something. You can either
push
> it
> > > into something with maximum power or a lesser degree of power. Perhaps
> you
> > > gain short-term efficiency; in my experience with XML, you gain squat.
> > >
> > > > Keep the business objects as close to what they are.
> > >
> > > So forgetting an invoice for a moment, what "is" a paint color? A
paint
> > > formula? A carmaker code? A digital certificate store? What's their
> > "natural
> > > form"?
> >
> > It is relational folks who become democratic about this and start
thinking
> > about understanding the nature of any particular noun outside of its use
> in
> > "this" context. Define it based on its use and if a new use comes up,
> > redefine it if necessary, otherwise add qualifiers to it.

> Hmmm. Okay, I'm all for agility where it makes sense - still, I think a
> little extra work up front goes a long way. But if you've got your
> DB-upgrade and redeployment processes automated, and unit tests and all,
> this can work...

Yes, I agree that identifying such potential risks is a good idea, but not in terms of the semantics of the data required for this round, but rather the likelihood of various possible new requirements. <snip>

Cheers! --dawn

Received on Sun Jun 13 2004 - 05:30:07 CEST