Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: SQL challenge: can you avoid this self-join?!?!

Re: SQL challenge: can you avoid this self-join?!?!

From: <ctcgag_at_hotmail.com>
Date: 08 Aug 2003 07:01:45 GMT
Message-ID: <20030808030145.440$qO@newsreader.com>


google_at_philgra.org (Phil) wrote:
> Here's something of a challenging SQL problem. I'm trying to only
> scan a table once, yet the SQL that I know won't allow this. Does
> Oracle (or for that matter the SQL standard) have some way to help me?
>
> Consider a table ATTR with columns PARENT_ID, ATTR_ID, ATTR_VAL.
> Unique key is PARENT_ID, ATTR_ID. Assume any other indexes that you
> want, but be aware that ATTR_VAL is modestly dynamic.
>
> I can easily look for a PARENT_ID for any one ATTR_ID, ATTR_VAL pair:
>
> SELECT parent_id FROM attr
> WHERE attr_id = 1 AND attr_val = :b1
>
> I can also easily do this looking at multiple attributes when I only
> need one condition to be met with an OR, as:
>
> SELECT DISTINCT parent_id FROM attr
> WHERE (attr_id = 1 AND attr_val = :b1)
> OR (attr_id = 31 AND attr_val = :b2)
>
> But how to handle the condition where I want to have the two ATTR_ID,
> ATTR_VAL pairs "and-ed" together? I know that I can do this:
>
> SELECT DISTINCT parent_id FROM
> (SELECT parent_id FROM attr WHERE attr_id = 1 AND attr_val = :b1)
> UNION ALL
> (SELECT parent_id FROM attr WHERE attr_id = 31 AND attr_val = :b2)

I think you mean INTERSECT here, not UNION ALL. UNION ALL will cause this to do the same as the OR query, which is clearly not what you wanted.

>
> But this will necessitate looking at ATTR twice. This is maybe okay
> if there are only two conditions, but what about when there might be
> 10 or even 50? At some point this technique becomes unacceptable.

Yes, this is kind of a tough nut to crack. It seems that logically, this is identical to a table whose PK is parent_id and where each possible attr_id is a column. If implemented that way, the multi-pass problem would go away.

This might be better, as the second pass might be able to visit only a small number of rows, depending on the selectivity of the each criteria:

select parent_id from attr where attr_id=31 and attr_val=:b2  and parent_id in
   (select parent_id from attr WHERE attr_id = 1 AND attr_val = :b1)

But that would be pretty heinous for 10 or 50 criteria, also.

>
> Clearly:
>
> SELECT DISTINCT parent_id FROM attr
> WHERE (attr_id = 1 AND attr_val = :b1)
> AND (attr_id = 31 AND attr_val = :b2)
>
> won't work (each row has but one ATTR_ID). This will end up doing the
> same basic thing as the UNION-ALL (only without the sort caused by the
> DISTINCT):
>
> SELECT parent_id FROM attr a1, attr a2
> WHERE a1.parent_id = a2.parent_id
> AND a1.attr_id = 1 AND a1.attr_val = :b1
> AND a2.attr_id = 31 AND a2.attr_val = :b2
>
> but the fundamental problem of scanning ATTR twice remains.

By scanning do you mean Full Table Scan? I think this might optimize to be the same or similar to my in-list subselect above, and might use indices to make the second access pretty efficient.

> What cleverness can I apply here to only scan ATTR once?

Perhaps:

select parent_id, count(*) from attr where (blah1) or (blah2) or (blah3)....
  group by parent_id having count(*)=:number_of_blahs

Although clever wouldn't be my word for it....

Xho

-- 
-------------------- http://NewsReader.Com/ --------------------
Usenet Newsgroup Service              New Rate! $9.95/Month 50GB
Received on Fri Aug 08 2003 - 02:01:45 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US