Re: What is the logic of storing XML in a Database?

From: Daniel <danielaparker_at_gmail.com>
Date: 28 Mar 2007 08:45:06 -0700
Message-ID: <1175096706.706903.282710_at_b75g2000hsg.googlegroups.com>


On Mar 28, 10:51 am, "Cimode" <cim..._at_hotmail.com> wrote:
>
> I should have asked the question differently. In what does XML allow
> validation? I thought XML was supposed to be used for transport?
>
Fair enough, reasonable question. If it were purely a transport format, the only thing that would matter is that a message arrived at its destination endpoint intact, and a CRC in a header would suffice. So it's a little more than purely a transport format. The messages' life begins a little earlier, and lasts a little longer, than their entry and exit at the endpoints. They are produced and consumed by applications that are typically very loosely coupled, often, as in the case of ecommerce, belonging to completely independent firms. The validation referred to applies to the production and consumption of the messages, or some place in the middle.
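To make the CRC point concrete, here is a minimal Python sketch. The framing (an 8-digit hex CRC-32 on a header line) and the function names are my own assumptions for illustration, not any real protocol. It shows that an integrity check passes a message whose content is nonsense, as long as the bytes arrive unchanged.

```python
import zlib

def frame(payload: bytes) -> bytes:
    """Prefix the payload with its CRC-32 as an 8-digit hex header line (assumed framing)."""
    return b"%08x\n" % zlib.crc32(payload) + payload

def arrived_intact(message: bytes) -> bool:
    """Recompute the CRC-32 of the body and compare it with the header value."""
    header, _, payload = message.partition(b"\n")
    return int(header.decode("ascii"), 16) == zlib.crc32(payload)

# "ACT/364" is not a supported day count, yet the transport check is satisfied.
message = frame(b"<dayCountFraction>ACT/364</dayCountFraction>")
print(arrived_intact(message))
```

The check only says the bytes were not damaged in transit; it says nothing about whether the consumer can make sense of them.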

> > But nobody can write and sell (or give away as open source) a CSV
> > validator that validates an arbitrary CSV file against a standard
> > schema describing that CSV file, for the simple reason that no such
> > standard schema exists. No tools exist in this category, in contrast,
> > many such tools exist in the XML space.
>
> Tools to do what again?

As an example, an XML Schema may declare that an element named dayCountFraction is of type DayCountFraction.

<xsd:element name="dayCountFraction" type="DayCountFraction"/>

The type DayCountFraction may restrict the values that the field dayCountFraction can take to a specific list, e.g. "ACT/ACT", "30/360", "30/365".

A schema validator tool will take an XML document, apply the schema, and will reject the message if the value is other than one in the list.
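As a rough illustration of that one rule, the Python sketch below reads the allowed values out of a schema fragment and checks a message against them. The simpleType definition is my assumption, built from the element declaration and the value list above, and the function names are hypothetical. This is not a schema validator, only a hand-rolled imitation of a single enumeration check using the standard library; a real validator handles the whole schema.

```python
import xml.etree.ElementTree as ET

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: the simpleType below is assumed, not quoted
# from any real industry schema.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="dayCountFraction" type="DayCountFraction"/>
  <xsd:simpleType name="DayCountFraction">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="ACT/ACT"/>
      <xsd:enumeration value="30/360"/>
      <xsd:enumeration value="30/365"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
""".strip()

def allowed_values(schema_text, type_name):
    """Collect the enumeration values declared for one named simpleType."""
    root = ET.fromstring(schema_text)
    for simple_type in root.iter(XSD_NS + "simpleType"):
        if simple_type.get("name") == type_name:
            return {e.get("value") for e in simple_type.iter(XSD_NS + "enumeration")}
    raise KeyError(type_name)

def check_message(message_text, schema_text=SCHEMA):
    """Return the dayCountFraction values that are not in the declared list."""
    allowed = allowed_values(schema_text, "DayCountFraction")
    root = ET.fromstring(message_text)
    return [e.text for e in root.iter("dayCountFraction") if e.text not in allowed]

# An empty list means every dayCountFraction in the message is acceptable.
print(check_message("<trade><dayCountFraction>ACT/364</dayCountFraction></trade>"))
```

The point is that the list of legal values lives in the schema, not in the consuming application's code.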
>
> The question should have been: what else does XML schema bring as
> oppose to a header?

Well-formedness. If the CSV file departs from strict name-value pairs, e.g. uses tags to distinguish different types of rows, a single header line no longer suffices to verify that the number of values matches the number of names. But that's a fairly trivial test anyway. An XML Schema will validate that a datetime in a date element conforms to a valid ISO datetime, that a code in a dayCountFraction falls into a supported set, that the number of leg subtrees in the message is precisely two, etc.

If you doubt the value of this, I'll just point out that a vast amount of code in applications consuming CSV files is devoted to checking precisely these things; they never completely trust the producer. XML Schemas provide a standard way of expressing these rules declaratively, once.
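For comparison, this is roughly the kind of checking code a CSV consumer ends up writing by hand. It is a sketch under assumptions: the column names tradeDate and dayCountFraction, the supported set, and the function name are all made up for illustration, and each consuming application would have to repeat something like it.

```python
import csv
import io
from datetime import datetime

# Assumed rules, chosen to mirror the checks mentioned above.
SUPPORTED_DAY_COUNTS = {"ACT/ACT", "30/360", "30/365"}

def validate_rows(csv_text):
    """Return a list of (line_number, problem) tuples for a CSV of trades."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    for line_number, row in enumerate(reader, start=2):
        # The "trivial" test: number of values must match number of names.
        if len(row) != len(header):
            problems.append((line_number, "wrong number of fields"))
            continue
        record = dict(zip(header, row))
        # The date must parse as an ISO date or datetime.
        try:
            datetime.fromisoformat(record["tradeDate"])
        except ValueError:
            problems.append((line_number, "tradeDate is not an ISO datetime"))
        # The code must fall into the supported set.
        if record["dayCountFraction"] not in SUPPORTED_DAY_COUNTS:
            problems.append((line_number, "unsupported dayCountFraction"))
    return problems

sample = "tradeDate,dayCountFraction\n2007-03-28T08:45:06,30/360\n28/03/2007,ACT/364\n"
for line_number, problem in validate_rows(sample):
    print(line_number, problem)
```

Here the rules are buried in procedural code, so producer and consumer cannot share them the way they can share a schema.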
>
> > Do you know about XML Schema? Do you know about domains like life
> > insurance that standardize on a schema such as ACCORD, so that they
> > have a standard way of representing data?
>
> That does not answer the question. This is a specific example of a
> company that uses XML. I know many clients who totally ignore XML.
>
Not a company, rather an entire industry.

> So you are saying that XML is more verbose than CSV right?
>

Yes. For example, I have a CSV file of trades (with many duplicate fields per row) that occupies 9,884KB, and an XML file generated from that CSV file, in a standard industry format, that occupies 34,012KB, i.e. roughly 3.4 times the size.

Regards,
Daniel Parker

Received on Wed Mar 28 2007 - 17:45:06 CEST