Re: DB schema -> graph, good strategies

From: Jan Hidders <hidders_at_REMOVE.THIS.uia.ua.ac.be>
Date: 26 Oct 2002 01:23:45 +0200
Message-ID: <3db9d281$1_at_news.uia.ac.be>


Martin Christensen wrote:
>
>DISCOVER is perhaps the best example, and definitely the one that I know
>best of the mentioned systems. [...]

I see. And the basic idea is rather simple, isn't it; if you are looking for a set of key words then they don't all have to be in the same column or the same tuple, but they can also be in related tuples where relatedness is measured in the number of foreign-key jumps. Looks very much like taking proximity into account, only generalized to more than two dimensions. I wonder if you could use the same techniques.

>>> The papers I've read are at best superficial on why they use the
>>> strategy that they do for building schema graphs. If
>>> primary/foreign key relationships map directly to edges, in which
>>> direction should they go, if they're directed at all?
>Jan> The direction matters of course, but whether you want that
>Jan> information depends of course on what you want to do with the
>Jan> graph, so, er, what *do* you want to do with the graph?

Now that I know that I would say that in general the direction doesn't matter. If two tuples are related then it wouldn't make sense to say that the first is related to the second but the second not to the first. Having said that I could imagine that for certain relationships where you know what they mean (e.g. "contains", or "links to") there could be some asymmetry, but that would be a semantic issue and not something you could tell from looking at the schema and certainly not the direction of the foreign key reference. The same would probably be true for the importance of certain relationships (some might be "small jumps" and others could be "big jumps") but also that is a semantic issue.

>Imagine we have some ER model that represents our data as we prefer to
>think of it. Anyone with basic knowledge of databases know that when
>transfering this model to a relational database we need to do data
>normalisation.

I'm not so sure of that. It is my experience that if you make a good ORM / ER diagram you hardly ever need to normalize.

>[...] However, since there are several levels or normalisation and many
>equivalent ways of normalising the same ER model, translating a schema into
>the original ER model is by no means a straight-forward task. Nevertheless,
>I am convinced that to the degree it can be done, it will be helpful. That
>is what I propose to do with the graph.

To the extent that it can be done that's actually quite simple. What you want is to find out whether a table corresponds to an entity type, a relationship type or (if you allow those) a mixed type. You can scratch possibilities using the rules that hold for the ER model you use, until no more possibilities can be scratched. For example it is true in almost all models that foreign keys only arrive in entity types and mixed types but never in relationship types. This gets more difficult/ambiguous if you have rules that say something like you cannot have cycles with at least on entity type in them.

  • Jan Hidders
Received on Sat Oct 26 2002 - 01:23:45 CEST

Original text of this message