Re: DB schema -> graph, good strategies

From: Jan Hidders <hidders_at_REMOVE.THIS.uia.ua.ac.be>
Date: 29 Oct 2002 10:21:45 +0100
Message-ID: <3dbe5329$1_at_news.uia.ac.be>


Martin Christensen wrote:
>>>>>> "Jan" == Jan Hidders <hidders_at_REMOVE.THIS.uia.ua.ac.be> writes:
>
>[Directionality of schema graphs]
>Jan> Now that I know that I would say that in general the direction
>Jan> doesn't matter. If two tuples are related then it wouldn't make
>Jan> sense to say that the first is related to the second but the
>Jan> second not to the first. Having said that I could imagine that
>Jan> for certain relationships where you know what they mean
>Jan> (e.g. "contains", or "links to") there could be some asymmetry,
>Jan> but that would be a semantic issue and not something you could
>Jan> tell from looking at the schema and certainly not the direction
>Jan> of the foreign key reference.
>
>This is exactly the core issue I'd like to explore further, although
>I haven't been able to find an obviously good place to start. Given
>the directionality fk -> pk, a graph
>
>A <- B -> C
>
>with a schema like A: (foo, bar, id1), B: (id1, id2), C: (baz, quux,
>id2) would obviously very likely be a plain many-to-many relationship
>between A and C.

Would it? Remember that doing science is about critical thinking. Is it really "obviously very likely"? Perhaps B was en entity type with a 1:M relationship with A and another 1:M relationship with C.

>In most cases it would make more sense for the relationship between A and C
>in this case to be closer than if the graph had been A -> B -> C.

Really? Why?

>>> Imagine we have some ER model that represents our data as we prefer
>>> to think of it. Anyone with basic knowledge of databases know that
>>> when transfering this model to a relational database we need to do
>>> data normalisation.
>Jan> I'm not so sure of that. It is my experience that if you make a
>Jan> good ORM / ER diagram you hardly ever need to normalize.
>
>That depends on who makes them. If you get a DBA to make them, then
>they'll likely need less normalisation than if you get non-techie
>business folks to do it.

No, that is not my experience. But this is perhaps besides the point.

>Jan> To the extent that it can be done that's actually quite
>Jan> simple. What you want is to find out whether a table corresponds
>Jan> to an entity type, a relationship type or (if you allow those) a
>Jan> mixed type. You can scratch possibilities using the rules that
>Jan> hold for the ER model you use, until no more possibilities can be
>Jan> scratched. For example it is true in almost all models that
>Jan> foreign keys only arrive in entity types and mixed types but
>Jan> never in relationship types.
>
>This assumes that we have the original model, which is an assumption
>I'm not overly fond of.

No, it doesn't. All it starts from is the tables and the foreign keys. When I speak about "the rules that hold for the ER model you use" I mean the rules of the meta model.

>But I caught myself not asking a very fundamental question: assuming
>we could recreate the original model, what then? Is it just good for
>ranking and/or crawling?

My guess would be: no. It doesn't really give you any more information about relatedness than you already had. But it might give you a model that is easier to understand for the user so he or she can indicate the relatedness properties of the relationships.

>Jan> This gets more difficult/ambiguous if you have rules that say
>Jan> something like you cannot have cycles with at least on entity
>Jan> type in them.
>
>I'm afraid I don't understand what you mean.

The previous rule could be implemented using a fixpoint algorithm that scratches impossibilities. That often gives you a poynomial algorithm. But the latter rule does not allow that and doesn't even define a unique result, so finding an efficient algorithm is going to be hard. This latter rule is not uncommon in meta models of extended ER models.

  • Jan Hidders

>
>Martin
>
>- --
>Homepage: http://www.cs.auc.dk/~factotum/
>GPG public key: http://www.cs.auc.dk/~factotum/gpgkey.txt
>-----BEGIN PGP SIGNATURE-----
>Version: GnuPG v1.0.6 (GNU/Linux)
>Comment: Using Mailcrypt+GnuPG <http://www.gnupg.org>
>
>iEYEARECAAYFAj29i00ACgkQYu1fMmOQldU0FQCeNzgkNDW+nuMYQS94axBiU0kq
>Cj4AoOVgUC6BoOG02XUZ5OKd54anskoh
>=IoSe
>-----END PGP SIGNATURE-----
Received on Tue Oct 29 2002 - 10:21:45 CET

Original text of this message