Re: Newbie Query Question

From: Douglas Hawthorne <douglashawthorne_at_yahoo.com.au>
Date: 20 Apr 2004 18:33:07 -0700
Message-ID: <cf15dee4.0404201733.7b03bd85@posting.google.com>

"JES" <jeswoff_at_sbcglobal.net> wrote in message news:%Ojfc.16669$xd3.4562_at_newssvr22.news.prodigy.com...
> Thanks Douglas,
>
> I have much to learn. I am definitely going to give the pivot table a
try.
> I just wanted to note that in my original post I was posting more of a
> pseudo code to give the reader an idea of what I was doing; sorry, I was
so
> sloppy with my syntax. Of the two methods originally posted, I also
wanted
> to point out that the method that currently works the quickest for me is
> Method 2. If it makes any difference, I did leave off a set of WHERE
> statements that I believe are -
>
> WHERE tbl1.DATE = tbl1_tmp.DATE AND
> tbl1.DATE = tbl2_tmp.DATE AND
> tbl1.DATE = tbl3_tmp.DATE
>
> Maybe I am just getting lucky that I am getting data out of this query?
> Essentially the jist of Method 2 in my newbie mind is that instead of
having
> a select statement in the SELECT block I put the select statement in the
> FROM block to make three tables where tbl1_tmp only has DATANAME1 data,
> tbl2_tmp only has DATANAME2 data, etc...
>
> Another question...
> Would the analysis change significantly if tbl1 and its associated _tmp
> objects are actually a view of a raw data table? I have made a temporary
> table to hold the data from the view and then referenced the table instead
> but did not see significant time savings.
>
> Thanks,
>
> JES
JES, It may be fun to speculate on what the CBO might or might do. I would argue that idle speculation without experimentation is a waste of time. You will need to learn to look at the various tracing options starting with AUTOTRACE and PLAN TABLES before moving onto SQL_TRACE and 10046 traces.

The summary of the analysis so far is from the TKPROF output:

Test Run           elapsed  disk      query  current
--------           -------  ----  ---------  -------
Method #1: No PK    348.28     0  6,062,169        0
Method #1: PK         1.49   177     81,711        0
Method #1: IOT        0.67     0     42,609        0
Method #2:            0.20     0      1,361        0
Pivot: Heap           0.34   110        155        2
Pivot: IOT            0.17     0      1,013        0

You will note that these results were obtained on a lightly loaded system (single-user running one (1) query at a time) and the data was inserted in order of the primary key (so the clustering factor is at the minimum).

These results do not include the overhead of populating the TBL1 table from the TBL1_TMP table. If this is taken into account, then the speed advantage of the Pivot method is even more pronounced.

Method #2 Analysis

I looked at your second method. Even with the supplied WHERE clause, I still got syntax errors. I rewrote the query as follows:

SELECT /* Method 2: without primary key */

      tbl1.datadate,
      tbl1_tmp1.datavalue AS dataname1,
      tbl1_tmp2.datavalue AS dataname2,
      tbl1_tmp3.datavalue AS dataname3
   FROM
      (
         SELECT
               datadate,
               datavalue
            FROM
               tbl1_tmp
            WHERE
               dataname = 'DATANAME1'
      ) tbl1_tmp1,
      (
         SELECT
               datadate,
               datavalue
            FROM
               tbl1_tmp
            WHERE
               dataname = 'DATANAME2'
      ) tbl1_tmp2,
      (
         SELECT
               datadate,
               datavalue
            FROM
               tbl1_tmp
            WHERE
               dataname = 'DATANAME3'
      ) tbl1_tmp3,
      tbl1
   WHERE
         tbl1.datadate = tbl1_tmp1.datadate
      AND
         tbl1.datadate = tbl1_tmp2.datadate
      AND
         tbl1.datadate = tbl1_tmp3.datadate

;

The TKPROF output is as follows:

call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ------

Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 870 0.17 0.19 0 1361 0 13033
------- ------ -------- ---------- ---------- ---------- ---------- ------

total 872 0.18 0.20 0 1361 0 13033

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 73

Rows Row Source Operation

-------  ---------------------------------------------------

13033 HASH JOIN (cr=1361 pr=0 pw=0 time=362954 us) 13034 TABLE ACCESS FULL TBL1_TMP (cr=155 pr=0 pw=0 time=26252 us) 13033 HASH JOIN (cr=1206 pr=0 pw=0 time=253282 us) 13034 TABLE ACCESS FULL TBL1_TMP (cr=155 pr=0 pw=0 time=26223 us) 13034 HASH JOIN (cr=1051 pr=0 pw=0 time=154423 us)

  13035     TABLE ACCESS FULL TBL1 (cr=27 pr=0 pw=0 time=13185 us)
  13034     TABLE ACCESS FULL TBL1_TMP (cr=1024 pr=0 pw=0 time=52315 us)

This shows the CBO prefering a HASH JOIN to construct the data. Here TBL1 was deemed to be small enough to be put into memory. However, please note that we are three (3) passes over the data in TBL1_TMP.

I have not shown the trace output for the execution of Method #2 with the primary key enabled because it is the same. The CBO ignored the PK in this case.

Douglas Hawthorne Received on Tue Apr 20 2004 - 20:33:07 CDT