Re: Real world issue: How to "split" queries in presence of record replication and replication sensitive aggregate functions ?

From: <pamelafluente_at_libero.it>
Date: 17 Sep 2006 08:21:40 -0700
Message-ID: <1158506500.093548.23650_at_b28g2000cwb.googlegroups.com>


Alexandr Savinov ha scritto:

> > David has 1 fact and Marshall 2 facts.
> >
> > "questionable" select:
> >
> > SELECT
> > One.[Name] AS [Name],
> > count(Trans.Fact) AS [FactCount],
> > count(SubTrans.SubFact) AS [SubFactCount]
> > FROM
> > One,
> > Trans,
> > SubTrans
> > WHERE
> > One.[Name] = Trans.[Name] AND
> > Trans.Item = SubTrans.Item
> > GROUP BY
> > One.[Name];
>
> Here is a possible problem formulation and solution.
>
> Given a sequence of n tables T1,T2,...,TN where each next table
> references the previous table in its column ci, i=1,2,...,N. Find all
> records from the last table TN which indirectly reference records from
> the first table T1.
>
> Query for 2 tables:
>
> Deproject2 =
> select T2.id
> from T2, T1
> where T2.c2 = T1.id // Or, alternatively, via JOIN
>
> Solution for N tables is defined recursively:
>
> DeprojectN =
> select TN.id
> from TN, (DeprojectN-1) as TN-1
> where TN.cN = TN-1.id
>
> Notes:
>
> 1. A column for aggregation should be selected in the last table instead
> of the id column.
>
> 2. Additional conditions for selecting records from the first table can
> be added to the query.
>
> 3. This query is used only to demonstrate the main logic (actually, it
> demonstrate how bad is SQL for solving such problems).
>
> 4. In order to use groupby you need to add the corresponding column and
> may be some other options.

Thanks!! Alexandr

finally here we go! This is just the direction I meant. Just working with formal properties of the design!

It's true that somehow we are coping with SQL limits, but it is also true that it is expected (from programs) that they give the best answer within these limits.

Your recursive procedure (which assume a sort of "cascade" pattern) and proceeds backwards in direction N-1 is fine to me. Actually, so fine that I have in mind (at an intuition stage) a generalized version of it, to produce the subqueries, which (I hope) could work with any "pattern" of tables/relationships and any choice of functions.

The general idea is to do a "double scan": one forward in direction 1-N to cut the design into parts which "cannot safely stay togheter" (could give wrong result on functions due to duplication), and another backward in direction N-1 to build the union of subqueries...

Would you like me to try to explain it, so that you theorists can help me to fix possible flaws and suggest corrections ? If it works I will write the corresponding program...

-P
>
> --
> http://conceptoriented.com
Received on Sun Sep 17 2006 - 17:21:40 CEST

Original text of this message