Re: SELECT DISTINCT slow, how can I speed up

From: <fitzjarrell_at_cox.net>
Date: Thu, 14 Aug 2008 12:45:10 -0700 (PDT)
Message-ID: <12a7061b-9d7a-4832-8946-23b6f346464f@c58g2000hsc.googlegroups.com>


On Aug 14, 1:18 pm, Barry Bulsara <bbulsar..._at_hotmail.com> wrote:
> Hello. I have a question about how I can structure a SQL query to make
> the results come back faster. I have experimented a bit and include
> some results here.
>
> We have read-only SELECT access to tables for a payroll and absence
> system running Oracle 10.0.1 on (I think) a dedicated Windows 2003
> Server in the IT department on the same site. We do not have
> privileges to UPDATE, DELETE, or create anymore indexes and if I did,
> it would probably invalidate our software support contract.
>
> We have staff using Access 2003 as a front end to the Oracle tables
> for the purposes of ad hoc queries. These staff regularly enter
> queries across 5-6 tables with various bits added to the SQL to limit
> the number of rows returned and match criteria. All the rows starts to
> come back within a second of the SQL starting to run.
>
> When the SQL is like this (I have anonymised the table names).
>
> SELECT a1,a2,a3,a5,a10
>  FROM table1, table2, table3, table4
>   WHERE table1.empID=table2.empID
>      AND table2.empID=table3.empID
>      AND table3.empID=table4.empID
>      AND table4.absenceT IN (15,19)
>      AND table4.year=2008;
>
> The rows start to come back immediately. Around 1700-1800 rows
> typically come back and there is a lot of network IO. If I type
>
> SELECT COUNT(*) FROM
> (
> SELECT a1,a2,a3,a5,a10
>  FROM table1, table2, table3, table4
>   WHERE table1.empID=table2.empID
>      AND table2.empID=table3.empID
>      AND table3.empID=table4.empID
>      AND table4.absenceT IN (15,19)
>      AND table4.year=2008
> )
>
> it returns in 2.14seconds saying it found 1792 rows.
>
> If I type SELECT DISTINCT a1,a2,a3,a5,a10
>
> only 20-50 rows come back and the query takes 28 seconds and there is
> a lot less network IO.
>
> I only want the 20-50 rows but I don't want to wait 28 seconds. As
> Oracle seems to find all 1792 rows in 2.14seconds, why does it take 28
> seconds only to return the unique ones.
>
> These are ad hoc queries so I am not trying to optimize a single
> query. My question is Is there a more quick way of generally returning
> a DISTINCT set of rows other than using DISTINCT in the SELECT
> statement.
>
> Thank you
> Barry

You might try this, although I don't know how much faster, if at all, it will be:

SELECT a1,a2,a3,a5,a10
 FROM table1, table2, table3, table4
  WHERE table1.empID=table2.empID

     AND table2.empID=table3.empID
     AND table3.empID=table4.empID
     AND table4.absenceT IN (15,19)
     AND table4.year=2008

group by a1, a2, a3, a5, a10;

Possibly you should ask about increasing the hash_area_size parameter value as 10g uses hashing algorithms to process such requests, rather than the old-style sort operations.

David Fitzjarrell Received on Thu Aug 14 2008 - 14:45:10 CDT

Original text of this message