Re: Parts explosion with repeated subtrees

From: Costin Cozianu <c_cozianu_at_hotmail.com>
Date: Sat, 14 Dec 2002 11:46:45 -0800
Message-ID: <atg1nv$1376ao$1_at_ID-152540.news.dfncis.de>


Damjan S. Vujnovic wrote:
> Costin Cozianu wrote:
> : It looks to me that it's a badly designed schema to begin with.
> :
> : (Food_Item -> Item_Type) and food_item is not candidate key.
> :
> : Technically what you have is a DAG (directed acyclic graph) and not a
> : forest. What it boils down to is the transitive closure. What's about
> : this problem that is different from any other parts explosion problem ?
> :
> : Transitive closure should have been part of SQL and SQL based products
> : ages ago, and probaby it's not yet, but can be solved relatively painful
> : using Db2 recursive queries, Oracle CONNECT BY operator, or using hand
> : made "materialized view" tables updated via triggers. I don't know
> : what's the problem with "trees". Vendors don't pay attention to the
> : users (they're busy implementing other UFOs like XML, "objects" and
> : object pointers), the SQL committee doesn't pay attention to anyone. I
> : mean recursive queries might be a cool trick in theory but it can't
> : address the basic needs of the user: a safe and simple way to resolve
> : the transitive closure.
> :
> : Joe, you have to confess that you orchestrated this whole thing, so that
> : you can publish a whole book full of twisted and mind boggling Joe Celko
> : style SQL, solving a problem that in a normal world would be absolutely
> : trivial :)
> :
> : How about SQL committee specify a TRANSITIVE CLOSURE JOIN, and vendors
> : implement it correctly, would you still publish a book on trees in SQL ?
>
> How about performance issues? Optimization? Even if something like
> TRANSITIVE CLOSURE JOIN (TCL from now on) becomes a standard, the alternate
> hierarchy representations would still make sense (of course, depending on
> what do you wanna do with your data). Can you imagine the horror of
> (ab)using? SQL (as is) is "adapted" (sorry for my English) to squeeze the
> best performance from relational model. The goal is to design our tables in
> such a manner that we need simple SELECT on one table, and not monsters like
> TCJ to make the things done.
>

Performance issues ? Optimization ?

I guess you should bring the separation between physical and logical models into discussion first. One you distort the logical model you present to the user, to make room for an "optimization" you are more than likely to suffer much more than you gain.

Once a TCLOSE operator is in place, encoded trees and the whole Celko's book are an absolute no-no.

How do you achieve performance with TCLOSE ?

   CREATE INDEX I ON "Relation" (Parent_id, Child_id)

	OPTIMIZE FOR TRANSITIVE CLOSURE
	// ... eventually with other options

Or CREATE MATERIALIZED VIEW AS TCLOSE("Relation")

> :
> : best regards,
> : Costin Cozianu
> :
> : Is it not proven that transitive closure is impossible in SQL ?
>
> Yes, but just because of that, you change the way you are representing your
> "relation" (the one you want to find it's transitive closure) and all of a
> sudden simple old SELECT is quite good enough.
>

Create a transitive closure view and a simple old SELECT is good enough.

> Don't misunderstand me, I'm not against something like TCL (but I'll stay
> away from it if I can), but IMO even after introducing TCL into SQL,
> alternate representation would still make sense.
>

No they wouldn't make any sense. It is abusing and damaging the logical schema, it is a lot of duplication of code and effort by the end user for a thing that should be solved elementarily by the vendor.

> regards,
> Damjan S. Vujnovic
>
> University of Belgrade
> School of Electrical Engineering
> Department of Computer Engineering & Informatics
> Belgrade, Yugoslavia
>
> http://galeb.etf.bg.ac.yu/~damjan/
>
> P.S. I haven't give it a second thought, but I think Oracle's CONNECT BY
> operator won't save us here because one node can have more than one
> "parent"...
>

I need to check for that. I am sure it will fail on cycles however. But probably neither recursive views in DB2 or SQL 99 are not good for the general case because a cycle can blow them away. One can limit the "LEVEL" say where LEVEL < 100 and ... pray for the best :) Very far from what we can call decent software engineering.

Computing transitive closure is an assignment to first year students on Algorithms and Data Structures 101. It is ridiculous that multi billion dollar vendors abdicate from their responsibility to give their users what they need. It is also a big disaster in economical terms. A vendor could probably solve this problem easily once and for all (don't know maybe 1 million - 10 million dollars effort ?). Users instead suffer great losses in terms of lost time, brittle software, bugs, data corruption, etc, etc and they have to reinvent the same broken wheel time and time again. No user can do this right because the only place where this can be done right is at the DBMS engine level.

Best regards,
Costin Cozianu Received on Sat Dec 14 2002 - 20:46:45 CET

Original text of this message