Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.misc -> Re: Simple SQL Puzzle
First of all
You should remove the distinct keyword from your subquery
in () is denoting a set, and all elements of a set are unique by *formal
definition*.
In your current situation Oracle performs the same operation two times, the
optimizer is too stupid to recognize your distinct is redundant.
It will perform a sort unique for you, and your in will also perform a sort
unique for you.
That said, assuming you use 7.3.3 or higher, if you have 'myfield' indexed
and the tables analyzed, the optimizer will run automatically run a index
fast full scan for your subquery. This operation just sequentially scans the
index.
Currently you most likely are using a full table scan which results in
disastrous performance
A different shortcut, usually saving much time, although it may not apply in your case, see the recent Thomas Kyte post on this subject, is using minus (in conjuction with the index earlier mentioned) instead of not in.
But first retry the operation without the redundant distinct
Hth
-- Sybrand Bakker Senior Oracle DBA to reply remove '-verwijderdit' from my e-mail address "Chris Brady" <chrisjbrady_at_yahoo.com> wrote in message news:51604466.0110300417.6f1f05d9_at_posting.google.com...Received on Tue Oct 30 2001 - 06:30:28 CST
> Niall - many thanks for responding so quickly. In fact the reason I
> asked is that I need to extract data from one table (table1) that is
> not in table2. Both tables have identical structure but the contents
> of table2 is a subset of table1.
>
> However I had a suspicion that this didn't work as expected, hence the
> original question:
>
> SQL> select count(distinct(i.myfield)) from table1 i, table2 w
> where i.myfield <> w.myfield;
>
> Instead I've used:
>
> SQL> select distinct(i.myfield) from table1 i
> where i.myfield not in
> (select distinct(w.myfield) from table2 w);
>
> The only problem with half a million records in table1 and a few
> thousand in table 2 is that this take three hours to run!! I think
> that this does the second select for every record from the first
> select!!
>
> By-the-way I'm a newbie to Oracle and SQL as you might have realised!!
>
> Chris B.
>
> "Niall Litchfield" <n-litchfield_at_audit-commission.gov.uk> wrote in message
news:<3bdd3db5$0$225$ed9e5944_at_reading.news.pipex.net>...
> > Comments below
> > "Chris Brady" <chrisjbrady_at_yahoo.com> wrote in message
> > news:51604466.0110290314.226d2546_at_posting.google.com...
> > > Please can someone explain this? Its the same table compared with
> > > itself. The first case I can understand but why is the second case
> > > giving the same total? I would have thought that it would be zero.
> > <snip>
> > > SQL> select count(distinct(i.myfield)) from temp_data i, temp_data w
> > > where i.myfield <> w.myfield;
> > >
> > > COUNT(DISTINCT(I.MYFIELD))
> > > --------------------------
> > > 47548
> >
> > Your table has 47548 distinct values of myfield. Each of those is
clearly
> > not equal to the other 47547 values in the table. Thus each value of
myfield
> > will satisfy your where clause at least 47547 times. Assuming myfield is
not
> > in fact unique try taking the distinct keyword out of the two pieces of
sql
> > to see the results more clearly.