Re: Hierarchical databases and semistructured data

From: Jan Hidders <hidders_at_uia.ua.ac.be>
Date: Wed, 20 Feb 2002 18:55:17 +0100
Message-ID: <3c73e2a5$1_at_news.uia.ac.be>


"Morten" <morten_at_kikobu.com> wrote in message news:3C73C407.1090404_at_kikobu.com...

> Hi. I'm trying to find out when semistructured data became a
> research topic. The oldest articles I've run into, are on
> TSIMMIS from 1994.

That is correct. The first two main projects were TSIMMIS, which introduced the OEM (Object Exchange Model) data model, and the related Lore project, both at Stanford. The subject really took off in 1997, when the first tutorials and overviews appeared and presented it as a major topic.

> Am I correct when I assume that this is a relatively new topic
> that, despite similarities in model, have nothing to do with
> hierarchical databases.

Not completely. Yes, it is a relatively new topic, but there are certainly issues that are similar to those in hierarchical (or object-oriented) databases. At the same time, there are some special features that sometimes make it a different cup of tea altogether: heterogeneous sets, the possible lack of a schema, large text fields, arbitrary nesting depth, ordered sets, et cetera.
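
To make a few of those features concrete, here is a minimal sketch in Python. The data and the helper `depth` are made up for illustration and are not taken from TSIMMIS or Lore; the point is only that the members of one collection need not share a shape, and that nesting is not bounded by any schema.

```python
# Hypothetical, OEM-flavoured data: one collection whose members differ in shape.
people = [
    {"name": "Ann", "email": "ann@example.org"},           # flat record
    {"name": {"first": "Bob", "last": "Lee"},              # "name" is nested here
     "phones": ["555-0100", "555-0101"]},                  # ordered list of values
    "anonymous",                                            # not even a record
]


def depth(value):
    """Return the nesting depth of a value; atomic values have depth 0."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


# No schema says which fields exist, so a query has to cope with every shape.
names = [p["name"] for p in people if isinstance(p, dict) and "name" in p]
```

In a hierarchical database the second and third members would be rejected, or would force a schema change, because every record of a given type must have the same fields.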

> I assume this as the latter has a strict
> notion of schemas.

Well, there are things such as DTDs and XML Schemas, so schemas are not completely absent. But the attitude is really different. In hierarchical DBs you have the schema, and the data has to fit it or else you cannot store it in the database. In semistructured DBs (in principle at least) you can store anything as long as it is well-formed, and the validity of the data with respect to a certain schema is more of an afterthought. Also, the schemas sometimes leave certain things open, e.g., they specify that a certain field may be there but leave the type of its contents open.
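
The difference between the two attitudes can be sketched as follows, using Python's standard `xml.etree.ElementTree` parser. The functions `is_well_formed`, `is_valid_book` and `store_document`, and the toy "schema" itself (a `book` needs a `title`; a `note` may be present with any content), are assumptions made up for this illustration, not a real DTD or XML Schema validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if the text parses as XML at all (tags balanced, one root)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def is_valid_book(text):
    """Toy schema check: root must be <book> and must contain a <title>.

    An optional <note> child is allowed and its content is left open.
    """
    root = ET.fromstring(text)
    return root.tag == "book" and root.find("title") is not None


store = []


def store_document(text):
    """Semistructured attitude: accept anything that is well-formed."""
    if not is_well_formed(text):
        raise ValueError("not well-formed XML")
    store.append(text)


store_document("<book><title>Data on the Web</title><note><x/>free</note></book>")
store_document("<memo>no schema describes this</memo>")  # stored anyway

# Validity is checked afterwards, and only if somebody cares.
valid_flags = [is_valid_book(doc) for doc in store]
```

A schema-first system would do the `is_valid_book` check inside `store_document` and refuse the memo outright.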

> If I'm incorrect, pointers to tutorials on
> the subject are greatly appreciated.

The book I like the most is "Data on the Web" by Abiteboul, Buneman and Suciu. All three are top researchers in the field of database theory. For tutorials see:

- S. Abiteboul, "Querying semistructured data", ICDT 1997.
- P. Buneman, "Semistructured data", PODS 1997.
- D. Suciu, "An overview of semistructured data", SIGACT News, 29(4):28-38, December 1998.

-- Jan Hidders
Received on Wed Feb 20 2002 - 18:55:17 CET