Re: Two examples of semi structured data.
Date: Mon, 23 Aug 2004 06:35:45 -0400
"mAsterdam" <mAsterdam_at_vrijdag.org> wrote in message news:41270edb$0$21106$e4fe514c_at_news.xs4all.nl...
> You may recall from the recent thread
> "It don't mean a thing if it ain't got ..."
> how I think about the widespread but futile
> thinking of data as potentially meaningless.
> Data capture is tough. It is one of the most essential
> steps in getting from text to information.
> It compares nicely to the capture step in audio-visual
> production - think cameras, microphones,
> synthesizers, filters, signal levels, delays
> and recording.
My favorite example is the capture of data inside the black boxes that airliners carry.
Most of that data is worthless, if not meaningless. None of it helps the airliner get to its destination. But data from earlier black boxes does, indirectly.
> Now to the document, "Querying Semi-Structured Data".
> When I read texts like: "Some of this data is _raw_
> data, e.g., images or sound." I infer that the author
> talks about potentially meaningless _signs_, not about
> I don't have to wait very long to verify that the damage of
> this non-choice is done. "We call here _semi-structured
> data_ this data that is (from a particular viewpoint) neither
> raw nor strictly typed, i.e. not table-oriented as in a relational
> model or sorted-graph as in object databases." Well (please
> keep in mind I am making statements of taste, I am *not* refuting
> the author's argument): by lumping together sorted-graphs and tables
> in one category "strictly typed" suddenly all structure _inherent_
> in the data is out of focus.
It tastes different to me. I view the entire project of "databases", starting with IDS in the early 1960s, as a continuing attempt to communicate with the future reader. That relies on the explicit data recorded, and on common conventions between the writer and the reader.
If those conventions are lost, and the reader is forced to infer them, I think we're into archaeology rather than history.
> "To completely structure the data often remains an elusive goal"
> I am so out of here - where is the door!
But it's true!
Received on Mon Aug 23 2004 - 12:35:45 CEST