Re: XML: The good, the bad, and the ugly

From: Lemming <thiswillbounce_at_bumblbee.demon.co.uk>
Date: Mon, 18 Oct 2004 01:09:44 +0100
Message-ID: <7mv5n0hes3aan2dkg426ec76a91dhgcknf_at_4ax.com>


On Sun, 17 Oct 2004 23:03:01 GMT, "Marshall Spight" <mspight_at_dnai.com> wrote:

>"Lemming" <thiswillbounce_at_bumblbee.demon.co.uk> wrote in message news:fup5n0p6omq1gq05ib0jpo6tnejl0dpv5p@4ax.com...
>> On Tue, 5 Oct 2004 16:32:59 -0400, "Laconic2" <laconic2_at_comcast.net>
>> wrote:
>> >
>> >Why do we all use binary?
>>
>> Binary is something we use because we are stuck with it. It's only
>> there because of the hardware; because the devices we use are based on
>> a model which uses a system of on/off switches to represent data.
>> And most of the time we don't even need to know that those devices are
>> working in binary.
>
>"It's only there because of the hardware" applies to every file format
>ever, including XML. As a statement, it's a no-op.

I'm not sure I agree, although I am finding it difficult to put my reasons into words.

Binary is not a file format. I'm aware that a chip designer may well see things differently, but from the perspective of software the on/off switches which make up the computer's memory are the bricks and mortar from which we build everything else. Binary representation is an unavoidable limitation imposed upon us by this hardware. To use a different system, we would need new hardware.

XML, or any other file format, is not constrained by the hardware (beyond the fact that it is stored upon the hardware). XML is a conceptual solution to a logical problem. We are not constrained to use XML other than by design decisions (or, as in the case in which I find myself, by political decisions made by men in suits who have approximately the same level of understanding of computer systems as I have of motor mechanics). A file format is a description of a way of logically structuring data. Binary is (a mental model of) the hardware's means of physically storing the data.

This post was written on a digital computer. It is stored in binary on this computer, will be transmitted in binary form around the network, and stored on various other machines again in binary. But the *file format* is plain text, written in what I hope is a reasonable imitation of the English language. That file format could equally well be represented on paper, in speech, or whatever. The medium is not the message.
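
A minimal sketch of that distinction, in Python (chosen purely for illustration; nothing in the thread specifies a language, and all variable names are hypothetical). It takes one plain-text message, writes it out as bits and as hexadecimal digits, and reads both back to the same text:

```python
# Sketch: one logical message, two physical representations.
message = "The medium is not the message."

# Physical storage: the bytes the hardware holds, written out as on/off bits.
stored = message.encode("ascii")
bits = " ".join(f"{byte:08b}" for byte in stored)

# A different "medium" for the same bytes: hexadecimal digits.
as_hex = stored.hex()

# Reading either representation back recovers the same plain-text message.
from_bits = bytes(int(chunk, 2) for chunk in bits.split()).decode("ascii")
from_hex = bytes.fromhex(as_hex).decode("ascii")

print(from_bits == message and from_hex == message)
```

The message is the same in each case; only the way it is physically written down changes.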

>> XML, it seems to me, is a format designed for general data transfer.
>
>It's a format designed to markup text with presentation, which
>is part of why it's so *bad* for general data transfer.
>
>
>> It is a Jack of All Trades.
>
>Yes, it's bad at everything (except possibly marking up text with presentation.)

Semantics, surely? The presentation of the data comes elsewhere.

>> It is strong at providing data in a
>> standard format readable by a wide variety of different systems.
>
>It's no better at this than any other format. Just because many
>computers already have 500k XML parsers installed doesn't
>mean the format is easy to parse.

I've never tried to write an XML parser, but I've thought about it. It seems to me that it should be reasonably easy to write a non-validating parser as long as the language supports recursion. A validating parser would seem to be slightly more complex, but not significantly more. It seems to me that the hard part is, as with anything, knowing what to do with the data once you've finished parsing!
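
As a rough sketch of why recursion is the natural fit, here is a toy non-validating parser in Python. It is an illustration under heavy assumptions, not a conforming XML parser: it handles only elements, double-quoted attributes, text and self-closing tags, and ignores comments, CDATA, entities, processing instructions, DOCTYPE and whitespace inside closing tags. The names `parse_element`, `TAG_NAME` and `ATTRIBUTE` are made up for this example.

```python
import re

# Toy subset only: names and double-quoted attribute values.
TAG_NAME = re.compile(r"[A-Za-z_][\w.-]*")
ATTRIBUTE = re.compile(r'\s+([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')


def parse_element(text, pos=0):
    """Parse one element starting at text[pos]; return (node, next_pos)."""
    if not text.startswith("<", pos):
        raise ValueError(f"expected '<' at offset {pos}")
    name_match = TAG_NAME.match(text, pos + 1)
    if name_match is None:
        raise ValueError(f"expected a tag name at offset {pos + 1}")
    name = name_match.group()
    pos = name_match.end()

    # Collect name="value" pairs until none are left.
    attributes = {}
    while (attr := ATTRIBUTE.match(text, pos)) is not None:
        attributes[attr.group(1)] = attr.group(2)
        pos = attr.end()

    node = {"name": name, "attributes": attributes, "children": []}

    # Skip whitespace before the end of the opening tag.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if text.startswith("/>", pos):  # self-closing element, no children
        return node, pos + 2
    if not text.startswith(">", pos):
        raise ValueError(f"malformed opening tag <{name} at offset {pos}")
    pos += 1

    closing = f"</{name}>"
    while True:
        next_tag = text.find("<", pos)
        if next_tag == -1:
            raise ValueError(f"missing {closing}")
        if next_tag > pos:
            chunk = text[pos:next_tag]
            if chunk.strip():  # keep text, drop whitespace-only runs
                node["children"].append(chunk)
            pos = next_tag
        if text.startswith(closing, pos):
            return node, pos + len(closing)
        if text.startswith("</", pos):
            raise ValueError(f"mismatched closing tag at offset {pos}")
        # A nested element: this is where the recursion happens.
        child, pos = parse_element(text, pos)
        node["children"].append(child)


doc = '<post author="Lemming"><subject>XML</subject><br/>Hello</post>'
tree, end = parse_element(doc)
```

Each nested element is handled by the same function calling itself, so the nesting depth of the document maps directly onto the call stack. Everything a real parser has to add on top of this (entities, namespaces, encodings, error recovery) is left out here.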

>> Where it falls down is as a data transfer medium for large volumes of
>> data between specific systems. Just like Jack, it knows a little
>> about many things, but does not know very much about any of them
>> individually.
>
>XML is an appalling disaster. It's what comes of thinking that
>because something is good for one purpose, it must be good
>for every purpose.

Which was where this part of the discussion began - with me asking why we had to have a single one-size-fits-all format, which led to Laconic's Zen question about binary.

>It isn't typed, for crying out loud! It doesn't
>have schemata!

This statement goes over my head. I thought the point of XML Schema was to provide type information as well as structure.

Lemming

-- 
Curiosity *may* have killed Schrodinger's cat.
Received on Mon Oct 18 2004 - 02:09:44 CEST
