Re: Resiliency To New Data Requirements

From: dawn <dawnwolthuis_at_gmail.com>
Date: 16 Aug 2006 09:59:43 -0700
Message-ID: <1155747583.831952.130030_at_75g2000cwc.googlegroups.com>


Marshall wrote:
> dawn wrote:
> >
> > Maybe a little, but the www is still a very large distributed database
> > of sorts.
>
> It totally isn't. What makes up a database? Structure, integrity,
> manipulation to start with.

I didn't say that it was a DBMS. Your definition of "database" does not seem to be the norm, but I went to the cdt glossary and found this:

"[Database]
  "A logically coherent collection of related real-world data assembled for a specific purpose." -- rephrased from
  "Fundamentals of Database Systems", Elmasri & Navathe.

  1. Deluxe filesystem
  2. Shared databank (E. Codd)"

I think the web fits within these definitions.

> HTTP+HTML has *none* of those,
> let alone more advanced things we might consider part of
> a dbms.
>
>
> > It has structured data (in spite of what others might call
> > it),
>
> No it doesn't. It has markup.

You see no structure in the marked-up data?

> Markup is not structure; there is
> no schema.

If we limit it to the XHTML pages, would you then say it is structured data?

> If HTML is structured data, then troff is structured
> data. No schema: no structure. The level of things you can
> do with HTML are: put this word in bold.
>
> There's no DML. There isn't even a query language. GET is
> not a query language. There are no integrity rules.

Perhaps there is a theoretical definition of the word "structure" that you are using to conclude that the web does not have structured data. My take is that a single page can be a node/attribute whose value is the HTML, for example, with directed paths to the other nodes that the page links to. Structure, no?
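
To make that reading concrete, here is a minimal sketch in Python, not anything from the thread itself. It assumes a tiny in-memory set of pages (the `PAGES` dictionary, its keys, and its markup are all hypothetical) and uses the standard library's `html.parser` to turn each page into a node whose value is its HTML and whose outgoing edges are the links it contains.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href targets of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical pages: key (a URL path) -> value (the HTML itself).
PAGES = {
    "/index": '<p>See <a href="/about">about</a> and <a href="/faq">faq</a>.</p>',
    "/about": '<p>Back to <a href="/index">home</a>.</p>',
    "/faq": "<p>No links here.</p>",
}


def build_digraph(pages):
    """Map each page key to the list of keys its HTML links to."""
    graph = {}
    for key, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        collector.close()
        graph[key] = collector.links
    return graph


GRAPH = build_digraph(PAGES)
```

Whether a directed graph like `GRAPH` counts as "structure" in the sense under dispute is exactly the open question; the sketch only shows that the links can be read as edges between keyed values, and it says nothing about schema or integrity rules.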

> HTTP+HTML doesn't even remotely qualify as a data
> management system.

Agreed. It doesn't fit my def of a DBMS.

> It's a distributed document retrieval
> system. They are not the same thing. I'm not even sure
> on what basis one could claim they were related.

Every attribute value is a document, of sorts, however small.

> > persisted on secondary storage devices, accessed by people.
>
> If this is your definition, then 3x5 cards is data management.

Not DBMS, but, yes, it would be a database by my definition (and the def of many others as I understand it).

>
> > There isn't a great query langauge, I'll grant.
>
> There isn't *any* query language. Retrieving a document by
> a key isn't a query language. It's a cheapo function call.

Fine. Again, perhaps the industry has done something with the English word "query" so that putting a word into Google and retrieving a list of "keys" from which to choose would not count as a query. It seems like a query to me, though surely not like an SQL query.

> > The requirements are not
> > identical to those of a DBMS, but the model for the data ought to be
> > taken seriously and moved forward accordingly.
>
> No, it shouldn't. There is no data model to take seriously.

Perhaps not until "it" (a di-graph of tree nodes, perhaps?) replaces that which currently is considered the only possible data model, eh? Since the model of data that I typically work with (in what most would call a DBMS) could be seen as a di-graph of trees, we know it is possible to do data management with such collections, even if we don't want to call the abstraction of this approach a data model. I'll grant it is not the same. But I will stand my ground that it should be taken seriously by those researching data models.

> Efforts to retrofit one have been embarrassing. If we want
> to do data management, and we are studying HTML+HTTP,
> then we should consider it a negative example.

For some things, yes, e.g. 404s.

Cheers! --dawn

Received on Wed Aug 16 2006 - 18:59:43 CEST
