Re: Does Codd's view of a relational database differ from that of Date&Darwin? [M.Gittens]

From: Jan Hidders <>
Date: Mon, 27 Jun 2005 21:59:05 GMT
Message-ID: <JE_ve.131764$>

Marshall Spight wrote:
> Jan Hidders wrote:

>>seriously, who claimed that XML would be a better way to model your
>>data? I certainly wouldn't. But it's an interesting and important data
>>exchange format, and its widespread use will inevitably lead to the need
>>to manage data that is in that format. That's all. 

> What kind of things make it interesting to you?

Me personally? It raises all kinds of new and intricate research questions that seem mathematically challenging. That doesn't mean I think XML isn't practically useful; on the contrary. But those questions are what primarily fascinates me about it.

>>>What is the difference between an element and an attribute?
>>Elements are ordered, attributes are not. Elements can contain complex
>>content, attributes only strings.

> How is this a useful distinction? It seems that everything
> you can do with attributes you can do with elements. How
> does the distinction between elements and attributes help
> me solve my data management problems? (Or is this question
> not appropriate?)

It's appropriate. No, this distinction is not going to help you in any way if your data isn't already XML data.
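For readers who have not worked with XML, here is a minimal sketch of the distinction, using Python's standard `xml.etree.ElementTree`. The sample document and its tag names are invented for illustration and come from nowhere in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
doc = """<book id="b1" lang="en">
  <author>Codd</author>
  <author>Date</author>
  <title>Notes <em>on</em> relations</title>
</book>"""

book = ET.fromstring(doc)

# Attributes: an unordered mapping from names to plain strings.
attributes = dict(book.attrib)

# Child elements: an ordered sequence in which a tag may repeat.
child_tags = [child.tag for child in book]

# Elements may contain further elements (complex content);
# an attribute value can never do that.
title = book.find("title")
nested_tags = [child.tag for child in title]

print(attributes)
print(child_tags)
print(nested_tags)
```

The two `author` children keep their document order and can repeat, while `id` and `lang` are just name/string pairs.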

>>>Nowhere on the page did I find a single mention of a type system.
>>>How are we going to do data management without a type system?!
>>Sometimes you will have a schema for your XML documents, sometimes you
>>won't. The DBMS should be able to deal with both situations. 

> I'm not clear what it would mean for a DBMS to work with data
> without a schema. Just store one big blob? That doesn't seem
> like it meets the "M" part of "DBMS." How do you manage
> integrity with no schema? How do you get any semantics?

It wouldn't be a real blob, because you could see its structure and there would be labels that gave you a hint about the meaning of its components. Perhaps it's data that needs to be integrated with the rest but you are not exactly sure what its exact meaning is. So in the meantime you will store it anyway because you still need to examine it.
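As a rough sketch of what "seeing the structure" of schema-less data could look like, the following walks a document and counts every label path that occurs in it. The sample data and the helper `label_paths` are hypothetical, made up for this illustration; no particular DBMS works this way as far as this post claims.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical schema-less input, invented for illustration.
doc = """<orders>
  <order><customer>Ann</customer><item qty="2">bolt</item></order>
  <order><customer>Bob</customer><item qty="1">nut</item><note>rush</note></order>
</orders>"""

def label_paths(element, prefix=""):
    """Yield the path of tags leading to each element and attribute."""
    path = f"{prefix}/{element.tag}"
    yield path
    for name in element.attrib:
        yield f"{path}/@{name}"
    for child in element:
        yield from label_paths(child, path)

# Count how often each path occurs: a crude, inferred structural summary.
summary = Counter(label_paths(ET.fromstring(doc)))
for path, count in sorted(summary.items()):
    print(count, path)
```

Even without a schema, the summary shows that `note` appears in only one of the two orders, which is exactly the kind of hint you would want while still examining the data.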

>>>[...] To start with, I would expect a
>>>type system, some relationship mechanism beyond simple nesting,
>>>and a standardized way to specify schema, expressed in the metamodel
>>>itself (aka a data dictionary.)
>>Yep. XML Schema does all that. It's a bit, er, bulky, though.

> So XML Schema is represented as XML, then? Is there an XML
> schema for XML schema? Does it let you specify, say, ints
> and strings and floats? What about product or sum types?
> Does it allow the specification of foreign keys or some
> other way of doing many-to-many relationships?

Oh yes. Everything you mention above is in there, including key constraints. The type system is also very elaborate: it even allows you to define certain sets of strings, described by regular expressions, as types, and it is user-extensible.
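To make the regular-expression point concrete, here is a small sketch. The schema fragment and the type name `PartCode` are invented for illustration. Python's standard library has no XML Schema validator, so the sketch only reads the schema as ordinary XML (which also shows that a schema is itself an XML document) and checks the pattern with `re.fullmatch` as a stand-in, on the assumption that this simple pattern means the same thing in both regex dialects. XML Schema patterns must match the whole value, which is why `fullmatch` is used.

```python
import re
import xml.etree.ElementTree as ET

NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema fragment: a string type restricted by a pattern.
schema_text = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="PartCode">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{2}-[0-9]{4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""

# The schema is plain XML, so an ordinary XML parser can read it.
schema = ET.fromstring(schema_text)
simple_type = schema.find("xs:simpleType", NS)
pattern = simple_type.find("xs:restriction/xs:pattern", NS).get("value")

def matches_part_code(value):
    """Stand-in check: does the whole string match the declared pattern?"""
    return re.fullmatch(pattern, value) is not None

print(simple_type.get("name"), pattern)
print(matches_part_code("AB-1234"), matches_part_code("ab-1234"))
```

A real validator would of course also handle the base type, other facets and key constraints; this only illustrates the idea of a named subset of strings.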

>>>I'd be interested to hear what sort of specific things you'd like
>>>from a next-generation data model. [...]
>>I think I would first concentrate on the global structure of the data
>>model, the associated languages depend on that. Here I have two big wishes:
>>1. I want a data model that is at an abstraction level that is
>>comparable to the Entity-Relationship Model.

> Does this mean that you'd like to be able to specify, say, two
> different entities and separately the cardinality relationship
> between them? Has anyone attempted this? It seems an intriguing
> approach.

Sure. Coming up with a formal semantics for ER-like models is simple; in fact it has already been done a few times in the past. Such constraints are also a no-brainer and have already been researched for a while in other contexts such as description logics. It's really all lying on the shelf, waiting to be used.
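As a toy illustration of a cardinality constraint stated separately from the entities themselves, consider a relationship as a set of pairs plus a check that one side participates at most once. The relationship `works_in`, its contents and the helper name are all made up for this sketch; it is not taken from any of the formalisations mentioned above.

```python
from collections import Counter

# Hypothetical relationship between employees and departments.
works_in = {("ann", "sales"), ("bob", "sales"), ("cy", "ops")}

def violates_at_most_one(pairs, position=0):
    """Return the entities at `position` that occur in more than one pair."""
    counts = Counter(pair[position] for pair in pairs)
    return sorted(key for key, n in counts.items() if n > 1)

# "Each employee works in at most one department" holds here.
print(violates_at_most_one(works_in))

# "Each department has at most one employee" does not.
print(violates_at_most_one(works_in, position=1))
```

The entities and the relationship stay as they are; only the constraint changes, which is the separation the question was asking about.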

>>2. I want a data model that is completely unbiased wrt. how data is
>>nested. I don't want a distinction like between the relational engine
>>and the domain/type engine. When the domains get big and complex and
>>queries combine data from different levels, then such a separation will
>>make query optimization unnecessarily difficult. 

> Is this only an implementation issue, or are there logical/interface
> issues here as well? Offhand it seems that allowing nested relations
> would meet this requirement, at least the logical part.

Yes, although nested relations would still be a bit biased towards a certain way of storing things, namely nested. Of course, you could argue that this is only superficially so, and you would be right. Nevertheless, if we can avoid it, I think we should. And we can.
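A small sketch of why the bias is only superficial: the same flat set of pairs can be nested in either direction, and unnesting gives the flat form back. The data and the helper names `nest` and `unnest` are invented for this illustration.

```python
# Hypothetical flat relation of (department, employee) pairs.
flat = [
    ("sales", "ann"),
    ("sales", "bob"),
    ("ops", "cy"),
]

def nest(rows):
    """Group pairs on their first component into a nested relation."""
    nested = {}
    for outer, inner in rows:
        nested.setdefault(outer, set()).add(inner)
    return nested

def unnest(nested):
    """Flatten a nested relation back into sorted pairs."""
    return sorted((outer, inner) for outer, inners in nested.items() for inner in inners)

# Two nestings of the same information, each favouring one access path.
by_dept = nest(flat)
by_emp = nest((emp, dept) for dept, emp in flat)

print(by_dept)
print(by_emp)
print(unnest(by_dept) == sorted(flat))
```

Choosing `by_dept` or `by_emp` commits to one grouping up front, even though both carry exactly the information in `flat`. A model that is unbiased about nesting would not force that choice at the logical level.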

  • Jan Hidders
Received on Mon Jun 27 2005 - 23:59:05 CEST
