Re: SQLServer beats 8i finding top-10 duplicates

From: Connor McDonald <connor_mcdonald_at_yahoo.com>
Date: Sat, 08 Jan 2000 12:07:46 +0800
Message-ID: <3876B811.383@yahoo.com>

Connor McDonald wrote:
>
> Scott Narveson wrote:
> >
> > Brian Peasland <peasland_at_edcmail.cr.usgs.gov> wrote:
> > : To me, this is comparing apples to oranges. In your Oracle query, you
> > : are doing two select statments whereas the SQLServer query only has one.
> > : The SQLServer query utilitizes a function that is not available in
> > : Oracle. If I create a system that has functionality not found in other
> > : systems and then take advantage of that functionality, is it surprising
> > : that this system will perform better than the systems that lack that
> > : functionality? I think not. I could have asked the question, "which is
> > : more stable, Oracle on Unix or SQLServer on NT"?
> > :
> > : HTH,
> > : Brian
> >
> > Clearly the two particulars of the two queries are very different.
> >
> > However, I am working on an application which must run against both
> > SQLServer and Oracle, and which would happily use a decently-
> > performing implementation of the basic query posed in my original
> > note. Vendor-specific tricks in each case are perfectly acceptable.
> > SQLServer has a one that works nicely. I was hoping that some Oracle
> > wizard could suggest Oracle-specific tweaks that could get the basic
> > question answered more quickly for Oracle.
> >
> > So far, no takers... :-)
> >
> > - Scott
> >
> > : Scott Narveson wrote:
> > :>
> > :> Background:
> > :>
> > :> Given the following table:
> > :>
> > :> create table testtable (num number(9));
> > :>
> > :> Note: no index present.
> > :>
> > :> Problem:
> > :>
> > :> Insert 1 million random numbers with value from 1 to 100,000.
> > :> Write a query that finds the top 10 most duplicated numbers.
> > :>
> > :> Oracle 8i:
> > :>
> > :> select * from (
> > :> select count(num) from testtable
> > :> group by num order by count(num) desc)
> > :> where rownum < 10;
> > :>
> > :> Time: ~ 3 minutes
> > :>
> > :> SQLServer 7:
> > :>
> > :> [After creating a similar table in its syntax...]
> > :>
> > :> select top 10 count(num) from testtable
> > :> group by num order by count(num)
> > :>
> > :> Time: 9 seconds
> > :>
> > :> Is this sort of operation just a sweet spot for SQLServer? Can I get
> > :> Oracle to match SQLServer performance (adding indexes is ok, raw
> > :> performance is the hope)? Any stupid mistakes in my Oracle
> > :> attempt...?
> > :>
> > :> Thanks for any insight anyone can offer.
> > :>
> > :> - Scott Narveson
>
> I think this functionality for Oracle will be available in 8i.2 (or
> 8.1.6)...
>
> Other options you could explore:
>
> - indexing num to allow index full scan instead of table scan
> - bitmap indexing num (if cardinality allows) to attempt the same as
> above with a smaller sized index
> - parallelise the query
>
> HTH
>
> --
> ===========================================
> Connor McDonald
> "These views mine, no-one elses etc etc"
> connor_mcdonald_at_yahoo.com
>
> "Some days you're the pigeon, and some days you're the statue."

I've just run the same on my (tiny) laptop and I think there may be something amiss on your Oracle setup somewhere... My results were:

SQL> desc testab

 Name                        Null?    Type
 --------------------------- -------- ------------
 NUM                                  NUMBER

SQL> select * from v$version;

BANNER

Oracle8i Enterprise Edition Release 8.1.5.0.0 - Production PL/SQL Release 8.1.5.0.0 - Production
CORE Version 8.1.5.0.0 - Production
TNS for 32-bit Windows: Version 8.1.5.0.0 - Production NLSRTL Version 3.4.0.0.0 - Production

real: 240

SQL> select count(*) from testab
2 /

COUNT(*)

999999

real: 671

SQL> select * from
  2 ( select count(num) from testab group by num order by count(num) desc )
  3 where rownum < 10
  4
SQL>
SQL> / COUNT(NUM)

9 rows selected.

real: 20860

ie 20 seconds. Still not as good as 7 seconds, but this was run a single CPU (300Mhz) laptop with 128M RAM and a solitary 10G IDE disk - hardly what you would describe as a good configuration for running Oracle (or any database for that matter)...

HTH
--

Connor McDonald
"These views mine, no-one elses etc etc" connor_mcdonald_at_yahoo.com

"Some days you're the pigeon, and some days you're the statue." Received on Fri Jan 07 2000 - 22:07:46 CST