Re: O'Reilly interview with Date

From: erk <eric.kaun_at_gmail.com>
Date: 9 Aug 2005 04:14:54 -0700
Message-ID: <1123586093.973270.231710_at_g14g2000cwa.googlegroups.com>


dawn wrote:
> While I don't really know WHAT it means, the term "semi-structured"
> indicates that the data are not "structured".

How can "data" not be structured in some way? Given your comments, I have no clue what "semi-structured" actually means - although I probably didn't before either. It seems to mean nothing more when used in XML than defining a CDATA node with arbitrary content, something relational would support as well (though like driving a truck with your feet, it's still not a good idea).

> I'm willing to grant that they are, or could be seen as, the same
> logical model. But from what I've learned on this forum and what I
> have read, the legitimate issues with databases based on this model way
> back when were implementation issues -- physical pointers, for example.
> I have seen nothing that explains why modeling propositions as
> relations is better than modeling propositions as trees.

Trees of what? Propositions? Or is the tree itself a proposition? Is every non-leaf node then a "composite proposition"?

The difference is that working with independent propositions, and deriving conclusions based solely on values, is simpler. Relations can be used to define "links," should you need them. Using such a "link-based data structure" for all data, though, adds complexity and ambiguity.
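
To make that concrete, here is a minimal Java sketch (not taken from any real DBMS). The PARENT_OF relation, the ParentOf class, and the sample names are all hypothetical, made up for illustration. Each tuple is an independent proposition, and "grandparent" is derived purely by comparing values, with no links to follow:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ParentRelation {
    // One tuple of the hypothetical relation PARENT_OF(parent, child).
    // Each tuple asserts: "<parent> is a parent of <child>".
    static final class ParentOf {
        final String parent;
        final String child;

        ParentOf(String parent, String child) {
            this.parent = parent;
            this.child = child;
        }
    }

    // Join the relation with itself on values only:
    // a.child must equal b.parent, and b.child must be the person asked about.
    static List<String> grandparentsOf(List<ParentOf> rel, String person) {
        List<String> result = new ArrayList<>();
        for (ParentOf a : rel) {
            for (ParentOf b : rel) {
                if (a.child.equals(b.parent) && b.child.equals(person)) {
                    result.add(a.parent);
                }
            }
        }
        return result;
    }

    // Hypothetical sample data.
    static List<ParentOf> sample() {
        return Arrays.asList(
            new ParentOf("Ann", "Bob"),
            new ParentOf("Bob", "Cid"),
            new ParentOf("Dee", "Cid"),
            new ParentOf("Eve", "Dee"));
    }

    public static void main(String[] args) {
        System.out.println(grandparentsOf(sample(), "Cid")); // prints [Ann, Eve]
    }
}
```

The same relation could be joined in the other direction to get grandchildren, without changing how the data is stored. A nested loop is used only to keep the sketch short; it says nothing about how a DBMS would evaluate the join.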

> > Can you give a mathematical argument for tossing aside any approach at
> > all in software?
>
> No, but I don't claim that there are mathematical reasons for the RM
> being superior to any other data model, where I have read statements
> that either state or imply such.

Then forget the assertion of "mathematical reasons." Clearly none of the other data models can claim more of such than RM anyway. RM is closer to set theory and predicate logic than the others, and those things are, at least as far as I can determine, more tractable than the more complex logics of trees and graphs.

> > Perhaps you could define what you mean by
> > "mathematical" in this context. The primary "soft" argument I'd use is
> > that relations reduce complexity
>
> over trees? That might be true for mapping some parts of a problem
> domain to a data model, but surely it is easier (for a human being) to
> conceptualize of a family tree as a tree, right?

Developers regularly use and conceive of things that boggle the mind of users, so that's a non-argument - even a tree-based system will use other data structures and algorithms which fall beyond the pale of "common sense." A family tree might indeed be better expressed as a tree, sure (having never had to manage such data, I don't know). File systems, org charts, bills of materials, and so on are all more manageable as relations, at least if you're doing more than just editing data items (e.g. relations are more amenable to different "human views" of the data).
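
As a rough illustration of the "different views" point, here is a small Java sketch of a bill of materials kept as a relation. The CONTAINS relation, the Contains class, and the part names are hypothetical. A tree rooted at the assembly favors the first question; the relation answers both with the same kind of scan:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BomViews {
    // One tuple of a hypothetical relation CONTAINS(assembly, part, quantity).
    static final class Contains {
        final String assembly;
        final String part;
        final int quantity;

        Contains(String assembly, String part, int quantity) {
            this.assembly = assembly;
            this.part = part;
            this.quantity = quantity;
        }
    }

    // View 1, "parts explosion": parts that go directly into an assembly.
    static List<String> partsOf(List<Contains> rel, String assembly) {
        List<String> result = new ArrayList<>();
        for (Contains c : rel) {
            if (c.assembly.equals(assembly)) {
                result.add(c.part);
            }
        }
        return result;
    }

    // View 2, "where used": assemblies that directly contain a part.
    static List<String> whereUsed(List<Contains> rel, String part) {
        List<String> result = new ArrayList<>();
        for (Contains c : rel) {
            if (c.part.equals(part)) {
                result.add(c.assembly);
            }
        }
        return result;
    }

    // Hypothetical sample data.
    static List<Contains> sample() {
        return Arrays.asList(
            new Contains("bike", "wheel", 2),
            new Contains("bike", "frame", 1),
            new Contains("wheel", "spoke", 32),
            new Contains("cart", "wheel", 4));
    }

    public static void main(String[] args) {
        System.out.println(partsOf(sample(), "bike"));    // prints [wheel, frame]
        System.out.println(whereUsed(sample(), "wheel")); // prints [bike, cart]
    }
}
```

Note that "wheel" appears under two assemblies, which a strict tree would have to duplicate or link to; in the relation it is just two tuples.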

> For whom does a relation reduce complexity?

Everyone. The end-user is irrelevant - developers will provide (hopefully!) appropriate interfaces for them anyway. Having seen user experiences with both XML (even in context-sensitive XML editors) and SQL DBMSs (using queries in MS Access to write and modify reports), I'll choose something relation-scented any day.

And having seen my own face trying to grok the "data model" of XML (in the absence of a more sensible "standard") and its manipulation (e.g. in Java), I'll again choose sorta-relational.

> > and enhance flexibility,
>
> Is there proof of this?

No. I'd wager none is possible, and even "evidence" is scant or nonexistent.

> > I'd call it something of a cousin of Occam's Razor, though perhaps
> > there's a more formal argument a logician might use.
>
> Because simplicity is not measurable in how to model data, it is not
> useful to say "my way is mathematically simpler, therefore better".

That's not true at all. Striving for simplicity is always useful, for even when we can't define simplicity, we typically notice its absence and can at least triangulate as we spiral in a "Heisenbergerish fashion" around the target.

> When I try to explain to my mom that it would be simpler to model the
> family tree as relations, ...

With all due respect, I don't care what your mom thinks of it (or mine, for that matter). The end user, in this choice (as in that of algorithm), is irrelevant.

> If we are looking for the simplest
> mathematics that meets all of the requirements, then we need to lay out
> the requirements, not just look for a simple theory.

The choice of data structure goes beyond the immediate needs of a single project; again, relational was originally designed for "shared data banks," and that is indeed where it shines. If I have a single application that only presents data in a single way, I need do nothing more than 'new ObjectOutputStream(new FileOutputStream("blah")).writeObject(myWholeApp);'
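
For the curious, a runnable version of that one-liner might look like the sketch below. It is only an illustration: the WholeAppDump class and its method names are hypothetical, and it writes to an in-memory byte array instead of the file "blah" so that it leaves nothing on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

public class WholeAppDump {
    // Serialize the entire object graph reachable from 'app' in one call.
    static byte[] dump(Serializable app) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(app);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the object graph back; the caller must know what type to expect.
    static Object restore(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for "myWholeApp": any Serializable object graph.
        ArrayList<String> myWholeApp = new ArrayList<>(Arrays.asList("a", "b"));
        Object copy = restore(dump(myWholeApp));
        System.out.println(copy.equals(myWholeApp)); // prints true
    }
}
```

This works precisely because there is one application and one shape for the data. The stored bytes are tied to the Java classes that wrote them, so a second application wanting a different view has nothing to query.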

> In the absence of an ability to use only axioms and logic to
> prove a point, some strong empirical data would be helpful. I know of
> no experiments in this area, but if you do, please let me know.

Nor do I. Where do we go now? "Common sense" routes?

- Eric
Received on Tue Aug 09 2005 - 13:14:54 CEST
