Re: Can anyone suggest a better way to write this query?

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Thu, 19 Nov 2009 13:33:38 -0800 (PST)
Message-ID: <4c8b6372-d531-4657-a387-c8c81d171e08_at_l13g2000yqb.googlegroups.com>



On Nov 19, 3:10 pm, C M <cmcmzz..._at_gmail.com> wrote:
> Table dir_size stores the mbytes of storage used in a given directory.
> Table directories stores various directory names which may or may not
> exist in table dir_size.
> For every directory in table directories, report the cumulative
> storage in that directory and all its subdirectories.
> This solution uses a cartesian join.  I imagine it will not scale
> well.
>
> Thanks in advance!
>
> create table dir_size (
>   dir_name     varchar2(40),
>   mbytes       number
>   );
>
> create table directories (
>   dir_name    varchar2(40)
>   );
>
> insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100);
> insert into dir_size values ('c:\aaa\bbb\ccc', 100);
> insert into dir_size values ('c:\aaa\bbb', 100);
> insert into dir_size values ('c:\aaa', 100);
> insert into dir_size values ('c:\', 100);
>
> insert into directories values ('c:\aaa\bbb\ccc\ddd');
> insert into directories values ('c:\aaa\bbb\ccc');
> insert into directories values ('c:\aaa\bbb');
> insert into directories values ('c:\aaa');
> insert into directories values ('c:\');
> insert into directories values ('c:\xxx\yyy\zzz');
>
> commit;
>
> select dir_name, sum(mbytes) from (
>   select directories.dir_name,
>     instr(dir_size.dir_name, directories.dir_name) INSTR,
>     mbytes
>   from directories, dir_size
> )
> where INSTR = 1
> group by dir_name
> order by 1;
>
> DIR_NAME                                 SUM(MBYTES)
> ---------------------------------------- -----------
> c:\                                              500
> c:\aaa                                           400
> c:\aaa\bbb                                       300
> c:\aaa\bbb\ccc                                   200
> c:\aaa\bbb\ccc\ddd                               100

This appears to be a hard problem. To avoid headaches, make certain that each of the DIR_NAMES ends with \

Let's start here:
SELECT
  'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
  100 MBYTES
FROM
  DUAL;

DIR_NAME                 MBYTES
-------------------- ----------
c:\aaa\bbb\ccc\ddd\         100

In your example, you would like to put 100MB into the following directories based on the above:

c:\
c:\aaa\
c:\aaa\bbb\
c:\aaa\bbb\ccc\
c:\aaa\bbb\ccc\ddd\

You somehow need to be able to break that one row into 5 rows. The following might help
SELECT
  LEVEL L
FROM
  DUAL
CONNECT BY
  LEVEL<=20;

  L

---
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20

If we join those two row sources together we might be able to create 5
rows from the one row:
SELECT
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2,
  MBYTES
FROM
  (SELECT
    'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
    100 MBYTES
  FROM
    DUAL) DIR_SIZE,
  (SELECT
    LEVEL L
  FROM
    DUAL
  CONNECT BY
    LEVEL<=20) C
WHERE
  SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL;

DIR_NAME2                MBYTES
-------------------- ----------
c:\                         100
c:\aaa\                     100
c:\aaa\bbb\                 100
c:\aaa\bbb\ccc\             100
c:\aaa\bbb\ccc\ddd\         100

Now, if we performed the same process for all of the rows in the
DIR_SIZE table, grouping on DIR_NAME2, we might be able to find the
SUM of the MBYTES column.

I will leave the rest for your experimentation.  You probably will not
have 20 \ characters in the DIR_NAME column, so you could optimize the
SQL statement a bit.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
Received on Thu Nov 19 2009 - 15:33:38 CST

Original text of this message