Re: Amazon's "Simple" Database

From: Sampo Syreeni <decoy_at_iki.fi>
Date: Wed, 19 Dec 2007 16:38:56 +0200
Message-ID: <Pine.SOL.4.62.0712191529050.13900_at_kruuna.helsinki.fi>


On 2007-12-18, Roy Hann wrote:

>> The thing that makes XML attractive to some people is not that it
>> would be a good basis on which to build a database, but that it seems
>> convenient for data exchange.
>
> Is XML actually attractive to significant numbers of people?

I'd divide the answer into two. For the use that SGML/XML and related markup languages (like HTML) were originally meant for, i.e. markup of running text compatible with plain text editors, they still hold quite a lot of value.

The match for data is much weaker. The only real value I can find in XML there is that currently we don't really have a commonly accepted and implemented format in which to serialize datasets. At least serializing in-transit ones into a metaformat that all the other people use gives us some minimal sharing of tools, like parsers, browsers and APIs. Sometimes we even get to the point of serializing to a proper, shared, concrete format (say vCal or RSS), so that the compatibility runs deeper.

But of course that is not how most people use XML; they use it for everything. Then Bad Things Happen: using XML implies inserting an extra layer of complexity into your system. Its textual encoding entails bloat. The rich hierarchical structure which serves text so well invites all of the usual data modelling mistakes. People worry too much about the metaformat to lay the proper groundwork for the actual formats. Data ends up being locked up inside a serialization, when we'd really like random access. Tying applications to a serialization, i.e. a physical data format, makes data independence impossible to achieve. And of course the rest of the big picture, like data language design, transactional guarantees and integrity, still hasn't been solved, so that everybody ends up cooking their own and more often than not getting it horribly wrong. The list goes on.
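
To make the bloat and lock-in points a little more concrete, here is a minimal Python sketch. The sample records and the helper names (to_xml, to_csv, city_of) are invented for illustration and do not come from any real system; it only assumes the standard library's xml.etree.ElementTree and csv modules. It serializes the same small dataset both ways, and then shows that fetching one value from the XML form means parsing the whole document first.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented purely for illustration.
rows = [
    {"id": "1", "name": "Alice", "city": "Helsinki"},
    {"id": "2", "name": "Bob", "city": "Espoo"},
    {"id": "3", "name": "Carol", "city": "Tampere"},
]


def to_xml(records):
    """Serialize records as one element per row, one child element per field."""
    root = ET.Element("people")
    for rec in records:
        person = ET.SubElement(root, "person")
        for key, value in rec.items():
            ET.SubElement(person, key).text = value
    return ET.tostring(root, encoding="utf-8")


def to_csv(records):
    """Serialize the same records as CSV with a single header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue().encode("utf-8")


def city_of(xml_data, wanted_id):
    """Look up one field. The whole document is parsed before any lookup."""
    root = ET.fromstring(xml_data)
    for person in root.iter("person"):
        if person.findtext("id") == wanted_id:
            return person.findtext("city")
    return None


xml_bytes = to_xml(rows)
csv_bytes = to_csv(rows)

# Every value carries its own opening and closing tag, so the XML is larger.
print(len(xml_bytes), len(csv_bytes))
print(city_of(xml_bytes, "2"))
```

The comparison is deliberately crude: CSV has its own problems and is not being proposed as the answer. The point is only that the tag-per-value encoding repeats the field names for every row, and that a lookup against a serialization is a parse plus a linear scan rather than random access.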

XML has its uses. Unfortunately the hype causes people to misapply it.

-- 
Sampo Syreeni, aka decoy - mailto:decoy_at_iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Wed Dec 19 2007 - 15:38:56 CET
