DTDs, specialized DTDs and XML Schema

From: Jan Hidders <jan.hidders_at_REMOVETHIS.pandora.be>
Date: Wed, 23 Feb 2005 20:28:18 GMT
Message-ID: <CH5Td.19357$yB6.2573402_at_phobos.telenet-ops.be>

Dawn M. Wolthuis wrote:
> "Jan Hidders" <jan.hidders_at_REMOVETHIS.pandora.be> wrote in message
> news:n0OSd.18586$jx1.2312308_at_phobos.telenet-ops.be...

>>
>>To give a small example, it is well known that specialized DTDs (a clean 
>>formal version of XML Schema) are equivalent with MSO. This means that if 
>>you can describe in MSO what your documents look like, then there is a 
>>specialized DTD that exactly allows those documents, and vice versa.

>
> Would you recommend use of .dtd instead of .xsd files for schema definition?
> It seems xml schema files have become the standard, haven't they. You
> indicate that you think dtds are cleaner -- in what way is that the case?

I'm afraid the terminology is terribly misleading here. Specialized DTDs are closer to XML Schema than to DTDs. Let me try to explain this a bit more. An example of (the formal simplified version of) a DTD would be the following:

cars -> used new
used -> car*
new -> car*
car -> (year model) | model

So you have a set of tag names and each of them is mapped to a regular expression that is made up of tag names and the usual operators. The regular expression describes the content of elements with the name on the left-hand side of the rule.
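
As a minimal sketch of how such a formal DTD can be checked, assuming a document is represented as nested (tag, children) tuples and assuming that "year" and "model" have empty content (the rules above do not say), one could write something like the following in Python. The names DTD and is_valid are only illustrative.

```python
import re

# One regular expression per tag name. A sequence of child tags is encoded
# as a string in which every tag name is followed by a single space, so the
# children [year, model] become "year model ".
DTD = {
    "cars": r"used new ",
    "used": r"(car )*",
    "new": r"(car )*",
    "car": r"(year model |model )",
    # Assumption: leaf elements have no element content.
    "year": r"",
    "model": r"",
}


def is_valid(node, dtd=DTD):
    """Check a (tag, children) tree against a tag -> regex mapping."""
    tag, children = node
    if tag not in dtd:
        return False
    # Encode the child tag names and match them against the rule for `tag`.
    word = "".join(child[0] + " " for child in children)
    if re.fullmatch(dtd[tag], word) is None:
        return False
    return all(is_valid(child, dtd) for child in children)


car_with_year = ("car", [("year", []), ("model", [])])
car_without_year = ("car", [("model", [])])

doc = ("cars", [("used", [car_without_year]), ("new", [car_with_year])])
wrong_order = ("cars", [("new", []), ("used", [])])

print(is_valid(doc))
print(is_valid(wrong_order))
```

Note that the single rule for "car" is looked up by tag name alone, so it applies to every car element regardless of its parent.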

Now let's assume that new cars may or may not contain a year element, but used cars must always have one. You clearly cannot specify that in such a DTD, because there is only one regular expression that describes the content of all "car" elements, wherever in the document they may be.

You can solve this by introducing a special annotation on the tag names, such as car[used] and car[new], that allows us to distinguish two types of element that have the same tag name. You are allowed to use such annotated names everywhere you could use tag names in the old formalism, so you can write the following:

cars -> used new
used -> car[used]*
new -> car[new]*
car[used] -> year model
car[new] -> (year model) | model

So you see that this allows you to express that car elements can have different contents depending on where they are nested. It is this formalism that is called specialized DTDs and which is equivalent to MSO (monadic second-order logic).
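
The annotated rules can be checked in much the same way as a plain DTD. The Python sketch below is not a general algorithm for specialized DTDs: it assumes that within one rule every tag name occurs with only one annotation, which holds for the example, so the type of each child follows from its parent's type and its own tag. It also assumes (tag, children) tuples for documents and empty content for "year" and "model". The names RULES, tag_of and is_valid are hypothetical.

```python
import re

# Each type maps to (regex over child tag names, child tag -> child type).
# Child sequences are encoded with one space after every tag name.
RULES = {
    "cars": (r"used new ", {"used": "used", "new": "new"}),
    "used": (r"(car )*", {"car": "car[used]"}),
    "new": (r"(car )*", {"car": "car[new]"}),
    "car[used]": (r"year model ", {"year": "year", "model": "model"}),
    "car[new]": (r"(year model |model )", {"year": "year", "model": "model"}),
    # Assumption: leaf elements have no element content.
    "year": (r"", {}),
    "model": (r"", {}),
}


def tag_of(type_name):
    """Strip the annotation: 'car[used]' -> 'car'."""
    return type_name.split("[")[0]


def is_valid(node, type_name, rules=RULES):
    """Check a (tag, children) tree against the rule for `type_name`."""
    tag, children = node
    if type_name not in rules or tag != tag_of(type_name):
        return False
    pattern, child_types = rules[type_name]
    word = "".join(child[0] + " " for child in children)
    if re.fullmatch(pattern, word) is None:
        return False
    # The parent's rule decides which type each child is checked against.
    return all(is_valid(c, child_types[c[0]], rules) for c in children)


car_with_year = ("car", [("year", []), ("model", [])])
car_without_year = ("car", [("model", [])])

ok = ("cars", [("used", [car_with_year]), ("new", [car_without_year])])
bad = ("cars", [("used", [car_without_year]), ("new", [])])

print(is_valid(ok, "cars"))
print(is_valid(bad, "cars"))
```

The same car element without a year is accepted under "new" and rejected under "used", because the two occurrences are checked against different types.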

So what does this have to do with XML Schema? One of the big differences between XML Schema and DTDs is that you can use type names, and that accomplishes exactly the same thing: type names can be used to simulate the annotated tag names, and vice versa.

> I'm beginning to think that you know everything that I want to know on the
> subject of data models -- sorry for continuing the questions, but your
> answers are very helpful.

I'm glad I could help.

Jan Hidders

Received on Wed Feb 23 2005 - 21:28:18 CET
