Re: Real world issue: How to "split" queries in presence of record replication and replication sensitive aggregate functions ?
Date: 15 Sep 2006 03:00:05 -0700
Message-ID: <1158314405.126558.316070_at_b28g2000cwb.googlegroups.com>
Alexandr Savinov ha scritto:
> pamelafluente_at_libero.it schrieb:
This is why to have a meaningful use case is so important.
Thanks Alexandr,
I like your suggestion and I want to work on this direction.
[About the meaningful part I still believe that is irrelevant. My purpose is to implement something that gives the best output given a certain input. It's up then to the user to judge if this makes sense or not. But the program must be anyway work. It cannot be involved in semantics.]
>
> > The input of the problem are:
> >
> > - Tables (in abstract sense, as a list of field) and a list of their
> > field including datatype
> > - Relationships
> > - Structure (list of fields) of the report with properties of involved
> > functions:
> > - Function is aggregate: Yes/No
> > - Function is sensitive to replication (eg. sum): Yes/no
> >
> > That's it. Simple to state.
>
> I like your problem formulation. But in this form it hardly can be
> solved. And the main reason is that providing "a list of fields" as a
> report specification is in most cases ambiguous, i.e., it cannot be
> translated into one query unambiguously.
>
> A solution to your problem could be as follows:
>
> 1. Choose one *primary* table with entities listed in the report.
>
> Notice 1.1 No fields are chosen yet from this and other tables.
>
> Notice 1.2 If this table does not exist then it has to be somehow
> defined, e.g., as a view or via query.
I have begun doing some attempts. I think the first step is to attain a
"partioning" of the relationships so that we are able to identify the
tables which will go in each subqueries.
The final subqueries clearly might have tables in common.
I was thinking to proceed as follows:
that are, say, on side 1 of a 1-N relationship
But what I cannot believe is that this problem has not been systematically settled up by someone and we must be reinventing this wheel. As it seems to me that is the most basic issue one finds out when extracting data from a database. Isn't this a known problem in theory with some name attached to it ? There must be literature references somewhere ...
-P
>
> 2. Define properties of the primary entities by indicating fields from
> other tables.
>
> 2.1 Properties may have 3 major types: from supertables via dimension
> (column) names, from subtables via inverse dimensions (columns taken in
> the opposite direction), zigzag properties which combine the previous
> two types.
>
> Notice 2.2 Sometimes specifying a field is not enough and it is
> necessary to provide a hint for choosing an appropriate access path.
>
> Notice 2.3 An aggregation function can be applied to properties
> returning a collection (see 2.1.2 and 2.1.3)
>
> This approach can be more or less easily implemented but here I would
> like to warn you that the main problem is not here. It appears when you
> start imposing restrictions in your report (say, via WHERE and HAVING
> clause in SQL). You may easily get cyclic dependencies which need to be
> somehow resolved. But it is already another story...
>
> >> -P
> >
>
> --
> http://conceptoriented.com
Received on Fri Sep 15 2006 - 12:00:05 CEST