Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Help with sql-query

Re: Help with sql-query

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Fri, 22 Jun 2007 03:54:05 -0700
Message-ID: <1182509645.772859.175230@k79g2000hse.googlegroups.com>


On Jun 21, 11:02 am, Matt T <puskas.d..._at_gmail.com> wrote:
> Ugly code alert - and wouldn't want to debug that.
>
> Surely the elegant (well designed!) solution is a calendar_dates
> table.....
>
> SELECT * FROM test_tab
>
> 1 21/06/2007 24/06/2007
> 2 23/06/2007 30/06/2007
>
> SELECT *
> FROM calendar,
> test_tab
> WHERE cad_code = 'TESTCAL'
> AND cad_date BETWEEN b AND c
>
> TESTCAL 21/06/2007 1 21/06/2007 24/06/2007
> TESTCAL 22/06/2007 1 21/06/2007 24/06/2007
> TESTCAL 23/06/2007 1 21/06/2007 24/06/2007
> TESTCAL 24/06/2007 1 21/06/2007 24/06/2007
> TESTCAL 23/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 24/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 25/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 26/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 27/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 28/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 29/06/2007 2 23/06/2007 30/06/2007
> TESTCAL 30/06/2007 2 23/06/2007 30/06/2007
Note: original response was sent eight hours ago, apparently never posted. What appears below strongly agrees with Sybrand's observerations.

That is an interesting analysis. Five years ago, I might have made the same analysis about what is elegant and well designed. However, since that time I have had to deal with the aftermath of three, four... seven years of data collecting in such tables. Any idea what happens to the performsnce of the proposed CALENDAR table as three, four... seven years of dates are added, and for more than one CAL_NAME? Let's try a little experiment. Just to keep the numbers clean, we will insert 1000 dates into the CALENDAR table, which will carry us through a bit less than three years, even if one of those years happens to be a leap year.

CREATE TABLE CALENDAR(CAL_NAME VARCHAR2(255),CAL_DATE DATE); INSERT INTO
  CALENDAR
SELECT
  'TESTCAL',
  TRUNC(SYSDATE,'YEAR') + ROWNUM -1
FROM
  ALL_OBJECTS
WHERE
  ROWNUM<=1000;

And now a test with the CALENDAR table using my T1 table containing the dates that was defined in a previous post: SELECT
  *
FROM
  CALENDAR,
  T1
WHERE
  CAL_NAME='TESTCAL'
  AND CAL_DATE BETWEEN C2 AND C3; Looks rather harmless. The DBMS Xplan for the above:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A- Time | Buffers |

| 1 | NESTED LOOPS | | 1 | 1 | 8 | 00:00:00.02 | 7007 |
|*  2 |   TABLE ACCESS FULL| CALENDAR |      1 |      1 |   1000 |
00:00:00.01 |       7 |
|*  3 |   TABLE ACCESS FULL| T1       |   1000 |      1 |      8 |
00:00:00.01 | 7000 |

Predicate Information (identified by operation id):


   2 - filter("CAL_NAME"='TESTCAL')
   3 - filter(("CAL_DATE">="C2" AND "CAL_DATE"<="C3"))

Notice the two TABLE ACCESS FULL statements - that was probably expected. Now, notice the 1000 Starts for the T1 table - that means for each row in the CALENDAR table, Oracle performed a full tablescan of the T1 table. Can this be helped with an index? I think that you may have trouble with that.

Now, compare the above with the DBMS Xplan for the second suggestion that I provided:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
|   1 |  SORT ORDER BY      |             |      1 |    100 |      8 |
00:00:00.01 |      13 |  2048 |  2048 | 2048  (0)|
|   2 |   NESTED LOOPS      |             |      1 |    100 |      8 |
00:00:00.01 |      13 |       |       |          |
|   3 |    TABLE ACCESS FULL| T1          |      1 |      2 |      2 |
00:00:00.01 |       7 |       |       |          |
|*  4 |    INDEX RANGE SCAN | SYS_C008067 |      2 |     50 |      8 |
00:00:00.01 |       6 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   4 - access("RC"."RN"<="T1"."C3"-"T1"."C2")

Take a close look at the above and see how it compares with the plan for the SQL that you provided.

Let's try again. Assume that the system has been in use for 12.5 years, and you now have historical data back to 1995. What would happen to the performance of the two methods? Let's delete the contents of the T2 and CALENDAR table and start over.

ALTER SYSTEM FLUSH SHARED_POOL; INSERT INTO
  CALENDAR
SELECT
  'TESTCAL',
  TRUNC(SYSDATE-12*365,'YEAR') + ROWNUM -1 FROM
  DUAL
CONNECT BY
  ROWNUM<=12.5*365;

4562 rows created.

INSERT INTO
  T2
SELECT
  ROWNUM-1 RN
FROM
  DUAL
CONNECT BY
  LEVEL<=12.5*365;

4562 rows created.

COMMIT; Now, gather the table and index statistics: EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'table_owner_here',TABNAME=>'T1',CASCADE=>TRUE); EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'table_owner_here',TABNAME=>'T2',CASCADE=>TRUE); EXEC
DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'table_owner_here',TABNAME=>'CALENDAR',CASCADE=>TRUE);

Here is the plan for the method that you suggested, note that the plan has changed a bit:



| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
|   1 |  MERGE JOIN          |          |      1 |     23 |      8 |
00:00:00.02 |      30 |       |       |          |
|   2 |   SORT JOIN          |          |      1 |      2 |      2 |
00:00:00.01 |       7 |  2048 |  2048 | 2048  (0)|
|   3 |    TABLE ACCESS FULL | T1       |      1 |      2 |      2 |
00:00:00.01 |       7 |       |       |          |
|*  4 |   FILTER             |          |      2 |        |      8 |
00:00:00.02 |      23 |       |       |          |
|*  5 |    SORT JOIN         |          |      2 |   4562 |     15 |
00:00:00.02 |      23 |   178K|   178K|  158K (0)|
|*  6 |     TABLE ACCESS FULL| CALENDAR |      1 |   4562 |   4562 |
00:00:00.01 |      23 |       |       |          |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   4 - filter("CAL_DATE"<="C3")
   5 - access("CAL_DATE">="C2")
       filter("CAL_DATE">="C2")
   6 - filter("CAL_NAME"='TESTCAL')

Here is the plan for the second method that I suggested:
| Id  | Operation           | Name        | Starts | E-Rows | A-Rows
| A-Time | Buffers | OMem | 1Mem | Used-Mem |
|   1 |  SORT ORDER BY      |             |      1 |    456 |      8 |
00:00:00.01 |      11 |  2048 |  2048 | 2048  (0)|
|   2 |   NESTED LOOPS      |             |      1 |    456 |      8 |
00:00:00.01 |      11 |       |       |          |
|   3 |    TABLE ACCESS FULL| T1          |      1 |      2 |      2 |
00:00:00.01 |       7 |       |       |          |
|*  4 |    INDEX RANGE SCAN | SYS_C008067 |      2 |    228 |      8 |
00:00:00.01 |       4 |       |       |          |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):


   4 - access("RC"."RN"<="T1"."C3"-"T1"."C2")

The plan has not changed, and the actual number of rows has also remained constant.

The ultimate question is, do you expect the SQL code that you generate today to be efficient once the system is in use? I suspect that others in the group can more effectively relate the performance problems related to developers not thinking about long term performance of applications.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc. Received on Fri Jun 22 2007 - 05:54:05 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US