Re: What is the logic of storing XML in a Database?

From: Cimode <cimode_at_hotmail.com>
Date: 29 Mar 2007 05:10:32 -0700
Message-ID: <1175170232.312631.277390_at_d57g2000hsg.googlegroups.com>


> > What is the interest of using an XML file instead of directly sharing
> > a view
> > with the destination user...
>
> That would suggest at a minimum of sharing DBMS connections with the
> customer, and likely a broker or exchange as well, which is a pretty
> strong form of coupling.
What exactly do you mean by a strong form of coupling?

> Consider one particular ecommerce
> situation: there is you, and another company that you're not
> communicating with directly, and a third party broker that is
> mediating the exchange. Messages containing trade details are flowing
> between you and the other company, through the broker, managed
> electronically, until agreement is achieved. Every day you could be
> dealing with different customers, with whom you haven't had a prior
> business relationship. Experience suggests that in today's world,
> loose coupling works better. I don't mind qualifying that by
> suggesting that in a future world, there might be superior solutions.
I don't quite understand the point you are trying to make. Are you saying that establishing a direct TCP/IP connection between a sender and a receiver is utopian? How do the network prerequisites differ between a solution that sends an XML file via FTP on port 21 and a solution that opens port 1433 or 1521 on a db server? Why exactly would you consider the latter something for the future? I know several companies that simply share views on the same data and have no need for XML at all.

> > > As an example, an XML Schema may declare that an element named
> > > dayCountFraction is of type DayCountFraction.
>
> > > <xsd:element name="dayCountFraction" type="DayCountFraction">
>
> > > The type DayCountFraction may restrict the values that the field
> > > dayCountFraction can take to a specific list, e.g. "ACT/ACT",
> > > "30/360", "30/365".
>
> > Ah OK...What process does it use to do the filtering...Does not the
> > navigation consume extra resources?
>
> Some piece of middleware or some application at the endpoints parses
> the message, and validates the message against the rules in the schema
> document. Sure, it consumes extra resources. But in many cases,
> that's not a problem. Sometimes it is, if the volumes are very high.
Let me summarize this. Are you saying that XML:

--> consumes extra resources to *compile* the schema structure (validation)
--> consumes extra resources to fill the schema with data
--> consumes extra resources to store it on disk because it is more verbose
--> consumes extra bandwidth between the sender and the receiver because it is bigger (unless compressed, in which case extra CPU resources are consumed)
--> can be replaced by a CSV with a header, provided the schema is accepted by both parties (sender and receiver)

Am I forgetting something? Are there other reasons why we should use XML instead of, let's say, direct view sharing?
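
A rough sketch of the size argument in the summary above: the same records serialized once as CSV with a header and once as element-per-field XML, using only the Python standard library. The field names, the record values and the `trades`/`trade` element names are made up for illustration and are not any industry format, so the ratio it prints depends entirely on those choices.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical trade records; names and values are illustrative only.
FIELDS = ["tradeId", "notional", "currency", "dayCountFraction"]
ROWS = [[str(i), "1000000", "USD", "ACT/ACT"] for i in range(1000)]


def to_csv(fields, rows):
    """Serialize the rows as CSV with a single header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(fields)
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def to_xml(fields, rows):
    """Serialize the rows as XML, one child element per field."""
    root = ET.Element("trades")
    for row in rows:
        trade = ET.SubElement(root, "trade")
        for name, value in zip(fields, row):
            ET.SubElement(trade, name).text = value
    return ET.tostring(root, encoding="utf-8")


csv_bytes = to_csv(FIELDS, ROWS)
xml_bytes = to_xml(FIELDS, ROWS)
print("CSV bytes:", len(csv_bytes))
print("XML bytes:", len(xml_bytes))
print("XML/CSV ratio:", round(len(xml_bytes) / len(csv_bytes), 1))
```

The XML repeats every field name twice per record, while the CSV states each name once in the header; that repetition is where the extra size comes from.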

> > *Well formedness*. What does it add to validate for instance to a
> > constraint definition in a db?
>
> DB constraint definitions don't apply to messages because messages
> aren't structured like DB relations. Even if you believe that the DB
> constraints guarantee the correctness of the source data, they don't
> guarantee the correctness of the mapping rules that populate the
> messages.
Could you elaborate on what you mean by *the mapping rules that populate the messages*? What *rules* are you referring to? Also, should we consider data exchanges to be message sharing (like email, for instance)?

> The messages have to be validated independently.
Would you care to explain what they should be validated against? I am really having trouble understanding this one.
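
For reference, this is roughly what I picture when a message is said to be validated independently of the database: a receiver-side check that the message is well formed and that `dayCountFraction` carries one of the listed values. It is a hand-written stand-in, not a real XML Schema validator (the Python standard library does not include one), and the `validate_message` function and the `trade` wrapper element are my own assumptions.

```python
import xml.etree.ElementTree as ET

# Allowed values taken from the dayCountFraction example quoted earlier.
ALLOWED_DAY_COUNT_FRACTIONS = {"ACT/ACT", "30/360", "30/365"}


def validate_message(xml_text):
    """Return a list of error strings; an empty list means the message passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Well-formedness failure: the text is not parseable XML at all.
        return [f"not well-formed: {exc}"]
    errors = []
    # Validity check: every dayCountFraction must be in the allowed list.
    for elem in root.iter("dayCountFraction"):
        if elem.text not in ALLOWED_DAY_COUNT_FRACTIONS:
            errors.append(f"invalid dayCountFraction: {elem.text!r}")
    return errors


good = "<trade><dayCountFraction>30/360</dayCountFraction></trade>"
bad = "<trade><dayCountFraction>31/360</dayCountFraction></trade>"
print(validate_message(good))
print(validate_message(bad))
```

The check runs on the message itself, so it would also catch a value that was valid in the source database but mangled by whatever code built the message.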

> > > Not a company, rather an entire industry.
>
> > So you are saying that some industry implement it and some not.
>
> Communication between companies is becoming increasingly important,
> and communication requires a common vocabulary. Industry specific
> vocabularies tend to be driven by industry consortiums of the leading
> companies in an industry. While inter-company and ecommerce
> considerations tend to drive the standardization process, the
> standards tend to become adopted in new initiatives within companies,
> simply because they have to use something for internal messaging, and
> adopting an industry standard can prove to be simpler than making up
> something themselves, or evolving their legacy formats.
Mmm... I see. If I understand correctly, there is an important standardization process going on to make all EDIs follow common guidelines for structuring schemas and exchanges.

What I do not understand is:
--> why such a standard should really be XML (is it because of the reasons stated above?)
--> why EDI should follow messaging-oriented guidelines for structure. After all, an email serves a different purpose and has different characteristics than an order detail file. The former is sent from one friend to another and contains a few lines of text, while the latter may contain several million lines of text and is sent from a retailer to a producer (for instance). The former is meant for saying *hello world* while the latter is meant for saying *please produce this*... Why do you think these different problems *should* and *could* be treated the same way?

> > > I have a CSV file of trades (with many duplicate
> > > fields per row) that occupies 9,884KB, and an XML file generated from
> > > that CSV file, in a standard industry format, that occupies
> > > 34,012KB.
>
> > But Bernard Peek suggests the opposite...
>
> :-)
>
> In my experience, in finance, XML files tend to be 5-10 times larger
> than their CSV or other legacy counterparts. Whether that's a problem
> or not depends on volumes. In certain very high volume businesses,
> like exchange traded equity, the larger sizes can be a big problem.
> But also note that when these files are sent over a wire, they
> compress a lot, the middleware generally supports compression.
Do you mean that it is better to use compression when using XML? Doesn't that constitute extra resource consumption (CPU)? Do you use another machine...
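
To make the compression question measurable, here is a small sketch that compresses a payload with `zlib` from the Python standard library and reports the size before and after together with the CPU-side wall time. The payloads are synthetic, highly repetitive strings I made up, so real trade files will compress less well, and the `compress_report` helper is hypothetical.

```python
import time
import zlib


def compress_report(payload: bytes, level: int = 6):
    """Compress payload and return raw size, compressed size and elapsed seconds."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return {"raw": len(payload), "compressed": len(compressed), "seconds": elapsed}


# Synthetic payloads (assumption): the same two fields repeated 10,000 times.
xml_payload = (
    b"<trade><currency>USD</currency>"
    b"<dayCountFraction>ACT/ACT</dayCountFraction></trade>"
) * 10000
csv_payload = b"USD,ACT/ACT\n" * 10000

for name, payload in (("xml", xml_payload), ("csv", csv_payload)):
    print(name, compress_report(payload))
```

The timing is only indicative; whether that CPU cost matters depends on the volumes involved and on which machine does the compressing.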

Regards...

Received on Thu Mar 29 2007 - 14:10:32 CEST
