Re: SELECT DISTINCT slow, how can I speed up

From: William Robertson <williamr2019_at_googlemail.com>
Date: Sat, 16 Aug 2008 15:32:29 -0700 (PDT)
Message-ID: <31744231-86f3-4439-98d7-1dfbb512f219@r66g2000hsg.googlegroups.com>


On Aug 14, 7:18 pm, Barry Bulsara <bbulsar..._at_hotmail.com> wrote:
> Hello. I have a question about how I can structure a SQL query to make
> the results come back faster. I have experimented a bit and include
> some results here.
>
> We have read-only SELECT access to tables for a payroll and absence
> system running Oracle 10.0.1 on (I think) a dedicated Windows 2003
> Server in the IT department on the same site. We do not have
> privileges to UPDATE, DELETE, or create anymore indexes and if I did,
> it would probably invalidate our software support contract.
>
> We have staff using Access 2003 as a front end to the Oracle tables
> for the purposes of ad hoc queries. These staff regularly enter
> queries across 5-6 tables with various bits added to the SQL to limit
> the number of rows returned and match criteria. All the rows starts to
> come back within a second of the SQL starting to run.
>
> When the SQL is like this (I have anonymised the table names).
>
> SELECT a1,a2,a3,a5,a10
>  FROM table1, table2, table3, table4
>   WHERE table1.empID=table2.empID
>      AND table2.empID=table3.empID
>      AND table3.empID=table4.empID
>      AND table4.absenceT IN (15,19)
>      AND table4.year=2008;
>
> The rows start to come back immediately. Around 1700-1800 rows
> typically come back and there is a lot of network IO. If I type
>
> SELECT COUNT(*) FROM
> (
> SELECT a1,a2,a3,a5,a10
>  FROM table1, table2, table3, table4
>   WHERE table1.empID=table2.empID
>      AND table2.empID=table3.empID
>      AND table3.empID=table4.empID
>      AND table4.absenceT IN (15,19)
>      AND table4.year=2008
> )
>
> it returns in 2.14seconds saying it found 1792 rows.
>
> If I type SELECT DISTINCT a1,a2,a3,a5,a10
>
> only 20-50 rows come back and the query takes 28 seconds and there is
> a lot less network IO.
>
> I only want the 20-50 rows but I don't want to wait 28 seconds. As
> Oracle seems to find all 1792 rows in 2.14seconds, why does it take 28
> seconds only to return the unique ones.
>
> These are ad hoc queries so I am not trying to optimize a single
> query. My question is Is there a more quick way of generally returning
> a DISTINCT set of rows other than using DISTINCT in the SELECT
> statement.
>
> Thank you
> Barry

A COUNT query can often use a very different execution plan to the corresponding SELECT <actual columns>, because without the need to access the columns it may be able to get the required result entirely from indexes, avoiding some table accesses. Received on Sat Aug 16 2008 - 17:32:29 CDT

Original text of this message