Re: O'Reilly interview with Date

From: dawn <dawnwolthuis_at_gmail.com>
Date: 5 Aug 2005 09:51:39 -0700
Message-ID: <1123260699.650475.271780_at_g14g2000cwa.googlegroups.com>


erk wrote:
> dawn wrote:
> > I just read the interview at
> > http://www.oreillynet.com/pub/a/network/2005/07/29/cjdate.html
> >
> > My response:
> > 1. He groups together an XML model and a semi-structured model. There
> > is no way I would lump these two together. The mapping from the
> > problem domain (subject area) to the model is very different between a
> > "structured" and "semi-structured" approach. When we are talking about
> > databases and dbms products, I would think that we are talking about
> > structured data, not semi-structured. I'm aware of "XML databases"
> > that could be used instead of a SQL-DBMS, while I'm not aware of
> > semi-structured databases for the same purpose. Such systems would be
> > more along the lines of a document management system, I would think.
>
> I'm confused -

good deal, I get to talk to erk :-)

> you say you wouldn't lump them together, but the
> semi-structured approach you describe sounds like mainstream use of
> XML.
And there are plenty of database implementations based on the RM (however loosely) that I would not want to confuse with the RM itself. An implementation that uses XML should not be confused with the XML data model.

While I don't really know WHAT it means, the term "semi-structured" indicates that the data are not "structured". Data values in an XML data model are structured. When using the XML data model as a basis for a database implementation, we are far from "unstructured". If modeling data for an XML repository, one would want to consider functional dependencies and many other typical concerns when modeling structured data. I don't know what unstructured data would have to do with that.
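
To make the functional-dependency point concrete, here is a rough sketch (not anything from the interview) of checking a dependency over XML data with Python's standard xml.etree.ElementTree. The document, the element names (customer, zip, city) and the helper fd_holds are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: element names and values are invented for this sketch.
DOC = """
<customers>
  <customer id="1"><zip>Z1</zip><city>Springfield</city></customer>
  <customer id="2"><zip>Z2</zip><city>Springfield</city></customer>
  <customer id="3"><zip>Z1</zip><city>Springfield</city></customer>
</customers>
""".strip()

def fd_holds(root, path, lhs, rhs):
    """Return True if, across the elements matched by `path`,
    equal `lhs` child text always implies equal `rhs` child text."""
    seen = {}
    for elem in root.findall(path):
        key = elem.findtext(lhs)
        value = elem.findtext(rhs)
        # setdefault stores the first value seen for this key and
        # returns it, so a later different value shows up as a mismatch.
        if seen.setdefault(key, value) != value:
            return False
    return True

root = ET.fromstring(DOC)
zip_determines_city = fd_holds(root, "customer", "zip", "city")
city_determines_zip = fd_holds(root, "customer", "city", "zip")
```

In this made-up data zip determines city but city does not determine zip, which is the same kind of question one would ask when designing tables.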

> How is XML different than the "semi-structured model"?

It provides a structured model for structured data.

> > Is there some other more common understanding of all of these terms
> > that would prompt Date to lump an XML database into the same category as
> > the Semantic Web, for example?
>
> I'm not sure that

100% agree ;-)

> > Is it, perhaps, an effort to marry the already ensconced XML
> > data model used at least for data exchange, to the not-exactly-stellar
> > record of the semistructured efforts to date, hoping to disparage the
> > xml model with this pairing?
>
> Can you elaborate? I don't understand how "marrying" XML and
> semistructured requires much effort.

It is like marrying a SQL-database of scanned images with the RM. The one can use the other, but that doesn't make these the same or even close to the same.

> > 2. The way that these alternatives are tossed aside is by marrying the
> > models to hierarchical and network and then dismissing those as having
> > failed already. I realize you cannot put everything in an interview, but
> > I have read quite a bit of Date's writings and haven't seen much more
> > rationale than what is given here.
>
> What are the major differences between XML and hierarchical, then?

I'm willing to grant that they are, or could be seen as, the same logical model. But from what I've learned on this forum and what I have read, the legitimate issues with databases based on this model way back when were implementation issues -- physical pointers, for example.  I have seen nothing that explains why modeling propositions as relations is better than modeling propositions as trees.

> > Is there a mathematical argument that shows why these approaches should
> > be tossed aside as having nothing to offer?
>
> Can you give a mathematical argument for tossing aside any approach at
> all in software?

No, but I don't claim that there are mathematical reasons for the RM being superior to any other data model, whereas I have read statements that either state or imply as much.

> Perhaps you could define what you mean by
> "mathematical" in this context. The primary "soft" argument I'd use is
> that relations reduce complexity

over trees? That might be true for mapping some parts of a problem domain to a data model, but surely it is easier (for a human being) to conceptualize a family tree as a tree, right? For whom does a relation reduce complexity?
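
For anyone who wants something concrete to argue over, here is a minimal sketch of the same family data held both ways: as a nested tree and as a parent/child relation (a set of tuples). The names and the helper functions are hypothetical, and this is only one of many ways to write either form.

```python
# Hypothetical family; names are invented for illustration.
# Tree form: each person maps to a nested dict of their children.
family_tree = {
    "Ada": {
        "Ben": {"Dee": {}, "Eli": {}},
        "Cal": {},
    }
}

def tree_to_relation(tree, parent=None):
    """Flatten the nested tree into a set of (parent, child) tuples."""
    rows = set()
    for name, children in tree.items():
        if parent is not None:
            rows.add((parent, name))
        rows |= tree_to_relation(children, name)
    return rows

def _all_names(tree):
    """Collect every name that appears anywhere in a (sub)tree."""
    names = set()
    for name, children in tree.items():
        names.add(name)
        names |= _all_names(children)
    return names

def descendants_in_tree(tree, person):
    """Walk the tree to `person`, then list everyone beneath them.
    Returns None if `person` is not in the tree."""
    for name, children in tree.items():
        if name == person:
            return sorted(_all_names(children))
        found = descendants_in_tree(children, person)
        if found is not None:
            return found
    return None

def descendants_in_relation(parent_of, person):
    """Repeatedly follow (parent, child) tuples until no new names appear."""
    result, frontier = set(), {person}
    while frontier:
        frontier = {c for (p, c) in parent_of if p in frontier} - result
        result |= frontier
    return sorted(result)

parent_of = tree_to_relation(family_tree)
```

Both descendants_in_tree(family_tree, "Ada") and descendants_in_relation(parent_of, "Ada") return the same list of names; which of the two representations reads as less complex, and to whom, is exactly the question.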

> and enhance flexibility,

Is there proof of this?

> while
> mirroring logic inherent in requirements ("business rules") more
> directly than prematurely and over-constrained O-O and hierarchical
> structures.

It sounds like he could just as easily have married these two then, eh? Although I don't see them as the same, the XML and OO data models are more similar to each other than either is to whatever "semi-structured data" would be.

> > Otherwise, this type of
> > comment, similar to what is in most college database textbooks, is all
> > I've got. The argument goes like this: We can model data based on
> > language propositions using mathematical relations. There is no
> > mathematically simpler way to model such data. Therefore, at the level
> > of the logical data model, as well as the API to a dbms, there should
> > only be mathematical relations and relational operators. This is
> > clearly not a mathematical argument.
>
> I'd call it something of a cousin of Occam's Razor, though perhaps
> there's a more formal argument a logician might use.

Because simplicity is not measurable in how to model data, it is not useful to say "my way is mathematically simpler, therefore better". When I try to explain to my mom that it would be simpler to model the family tree as relations, ... If we are looking for the simplest mathematics that meets all of the requirements, then we need to lay out the requirements, not just look for a simple theory.

> > Is there a better argument than that or is that it?
>
> What would be?

I dunno. In the absence of an ability to use only axioms and logic to prove a point, some strong empirical data would be helpful. I know of no experiments in this area, but if you do, please let me know.

Cheers! --dawn

Received on Fri Aug 05 2005 - 18:51:39 CEST
