Re: What is the logic of storing XML in a Database?

From: Marshall <marshall.spight_at_gmail.com>
Date: 30 Mar 2007 08:27:07 -0700
Message-ID: <1175268427.659601.282710_at_p77g2000hsh.googlegroups.com>


On Mar 29, 1:30 pm, Bernard Peek <b..._at_alpha.shrdlu.com> wrote:
> On 2007-03-29, Marshall <marshall.spi..._at_gmail.com> wrote:
> > On Mar 28, 11:23 pm, Bernard Peek <b..._at_alpha.shrdlu.com> wrote:
>
> >> Can I turn this around and ask you what you think XML would look like if it
> >> was more elegant?
>
> > Binary.
>
> Yes, if I had my way the data would be binary.
>
> > Simpler: no distinction between entities and attributes, for example.
>
> XML does allow multiple ways of representing attributes of entities
> but I think the distinction is necessary.

Um, for what? I can't recall ever coming across such a distinction outside of the XML world.

> > It would have a type system.
>
> XML has a type system. In fact it *is* a type system.

Since I have a good technical understanding of what a type system is, in a formal sense, I am confident that XML is not one. Can you rephrase?

> I would be happy with a system that had that option, but for my purposes the
> external schema is a requirement. I need to know that the data in the file
> has been validated, as far as possible, against one specific instance of the
> schema. Being validated against its own internal schema may be useful but
> is not sufficient.

Well, okay, kinda. But when you get data from any external or untrusted source, you can't assume it's valid even if it says it is. (I suppose that argues both sides of this question, though.)
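
A minimal sketch of that point in Python, using only the standard library. The `<order>` document, its `validated` attribute, and the `check_order` helper are all invented for illustration; the idea is just that the receiver runs its own checks no matter what the sender's data claims about itself.

```python
import xml.etree.ElementTree as ET

def check_order(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "order":
        problems.append("root element is not <order>")
    qty = root.findtext("quantity")
    # The receiver's own rule (an assumption for this example):
    # quantity must be a positive integer.
    if qty is None or not qty.isdecimal() or int(qty) < 1:
        problems.append("quantity must be a positive integer")
    return problems

# The sender asserts validity, but the payload breaks the receiver's rule.
claimed_valid = '<order validated="true"><quantity>-3</quantity></order>'
problems = check_order(claimed_valid)
```

Here the `validated="true"` attribute is never consulted; the claim carries no weight unless the receiver has some separate reason to trust the sender.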

> > Relational support
>
> One of the advantages of XML is that it can encode data from multiple
> tables, and the relationships between them. That is the reason why it is
> possible to use an XML file as the storage format for a multiple table
> database.

I wouldn't particularly call the existence of an encoding of relations as hierarchies "support." I am imagining a good deal more than that. (For some years now, the first question I ask of any system is "does it support a join operator?")

This reminds me of another point. We often say around here that data management is about "structure, integrity, manipulation." The classic neophyte mistake is to focus entirely on the first one and ignore the other two. It's easy to see how that happens because the first one is easy and the other two are hard. I note that XML is 100% about structure, and has no integrity or manipulation aspect whatsoever. These were tackled by later add-on standards, but I have yet to see any evidence that those add-ons are any good, or have any kind of simplicity or elegance or any particularly interesting capabilities.

> But a hypothetical data transfer format should also be capable of encoding
> data using object-oriented and hierarchical data too.

I'm skeptical. I'm unclear what "object oriented data" would be, since the essence of what an object is includes the ability to be updated, and when you transfer data over the wire, you necessarily transfer values and not variables. Perhaps in this context that simply means the same thing as hierarchical.

I note that hierarchical data is not required if you have relations; however, I agree that it is a definite nice-to-have.

> > That's just off the top of my head.
>
> It's a good starting point.

Thanks.

Oh, I thought of another one: a principled approach to handling both ordered and unordered data.

Marshall

Received on Fri Mar 30 2007 - 17:27:07 CEST
