Re: Extending my question. Was: The relational model and relational algebra - why did SQL become the industry standard?

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Mon, 3 Mar 2003 18:18:28 -0000
Message-ID: <b406dp$faa$2_at_sp15at20.hursley.ibm.com>


"Lauri Pietarinen" <lauri.pietarinen_at_atbusiness.com> wrote in message news:3E631F99.8010404_at_atbusiness.com...
> Steve Kass wrote:
[snip]
> > 2. [You said] Why can't it transform this query
> >
> >
> > SELECT DISTINCT P#, PNAME
> > FROM V_S_SP_P;
> >
> > [assuming you admit DISTINCT is ok]
>
>
> yes yes...
>
> > into
> >
> > SELECT P#, PNAME
> > FROM P
> > WHERE EXISTS
> > (SELECT * FROM SP
> > WHERE P.P# = SP.P#)?
> >
> >
> >
> > 2. [I'm hearing] You can perhaps live with the requirement
> > that DISTINCT be added, but you want SQL Server to
> > figure out from the metadata that because of FK constraints
> > and NOT NULL constraints and correspondence of types,
> > the optimization should happen. You think that if the query
> > without DISTINCT "eliminated duplicates", then it would
> > be easier to generate an efficient plan.
> >
> > I'm fairly sure but not absolutely sure, that the faster EXISTS
> > query is identical to the DISTINCT query,
>
> It is, assuming the obvious constraints are in place.

FYI. DB2 does do some optimisation (I.e. does not join to S), but does not go the whole hog in it's rewritten SQL (see below) However it does set it's 'Early Out Flag' in the access plan on the join between P and SP, which would AKAIK have the same effect as the EXISTS check in the 'more fully' optimised SQL suggest above.

Now, if it were possible to have a SQL constraint such that every Part has to be supplied by at least one supplier, then the query could be optimized to

SELECT P#, PNAME
FROM P
;

But, without such a constraints being definable in [DB2] SQL, I think DB2's optimisiation is OK.

SET CURRENT SCHEMA DB2
;
CREATE TABLE S
( S# INTEGER NOT NULL UNIQUE
, SNAME CHAR(20) NOT NULL UNIQUE
)
;
CREATE TABLE P
( P# INTEGER NOT NULL UNIQUE
, PNAME CHAR(20) NOT NULL UNIQUE
)
;
CREATE TABLE SP

( S# INTEGER NOT NULL REFERENCES S(S#)
, P# INTEGER NOT NULL REFERENCES P(P#)
, QTY INTEGER NOT NULL CHECK (QTY > 0)

, UNIQUE(S#, P#)
)
;
CREATE VIEW V_S_SP_P
( S#, SNAME, QTY, P#, PNAME)
AS
SELECT S.S#, S.SNAME, SP.QTY, P.P#, P.PNAME FROM S, SP, P
WHERE S.S# = SP.S#
AND SP.P# = P.P#
;
explain plan for SELECT DISTINCT P#, PNAME FROM V_S_SP_P ;
db2exfmt -d sample -1 -o explain.out
;

Original Statement:



SELECT DISTINCT P#, PNAME
FROM V_S_SP_P Optimized Statement:

SELECT DISTINCT Q2.P# AS "P#", Q2.PNAME AS "PNAME" FROM DB2.SP AS Q1, DB2.P AS Q2
WHERE (Q1.P# = Q2.P#) Access Plan:
        Total Cost:             25.338
        Query Degree:           1

                Rows
               RETURN
               (   1)
                Cost
                 I/O
                 |
                 88
               HSJOIN
               (   2)
               25.338
                  1
           /------+-----\
        88                137
      TBSCAN            IXSCAN
      (   3)            (   4)
      25.1328          0.182339
         1                 0
        |                 |
        88                137
  TABLE: DB2        INDEX: SYSIBM
 P                 SQL0303031750371



[snip]

        2) HSJOIN: (Hash Join)
                Cumulative Total Cost:          25.338
                Cumulative CPU Cost:            511088
                Cumulative I/O Cost:            1
                Cumulative Re-Total Cost:       25.338
                Cumulative Re-CPU Cost:         511088
                Cumulative Re-I/O Cost:         1
                Cumulative First Row Cost:      25.338
                Estimated Bufferpool Buffers:   2

                Arguments:
                ---------
                BITFLTR : (Hash Join Bit Filter used)
                        FALSE
                EARLYOUT: (Early Out flag)
                        TRUE
                TEMPSIZE: (Temporary Table Page Size)
                        4096

                Predicates:
                ----------
                2) Predicate used in Join
                        Relational Operator:            Equal (=)
                        Subquery Input Required:        Yes
                        Filter Factor:                  0.0113636

                        Predicate Text:
                        --------------
                        (Q1.P# = Q2.P#)


Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Mon Mar 03 2003 - 19:18:28 CET

Original text of this message