Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: optimal sql
"Christoph Seidel" <chris666.seidel_at_gmx.de> wrote:
> how can i find all records by pk in table 1 which are not in table 2
It's going to depend on a lot of things, the only way to know for real is to test it on real(ish) data. Are the tables about the same size and almost completely overlapping (my test case), or nearly disjoint, or drastically different sizes? (In my actual test case, the "pk" column in t2 is not primary or even unique, it's nearly unique. This could also matter). Do you have enough memory to fit a hash of all the keys?
> is this efficient:?
>
> select * from t1 where pk not in (select pk from t2)
For my test case (I used count(*) rather than *, which may make a big a difference for some things) this was pretty good, 6 seconds, and used an anti-hash.
The 'not exist' surprisingly took 26 seconds, stepping through both columns in order and applying a filter.
The 'minus' (with an outer select to do the counting) took 5.8 seconds. Its explain plan seemed to be only trivially differnt from 'not exist', so I don't know why it's so much faster.
My favorite, the left join,
select count(*) from t1,t2 where t1.pk=t2.pk(+) and t2.pk is null;
took 6.4 seconds. I does a hash join, then single-source filter
on the results.
I was surprised that none of these was optimized to be the same as another. Especially that the hash-then-filter series of the left join didn't just reduce to the anti-hash.
Xho
-- -------------------- http://NewsReader.Com/ -------------------- Usenet Newsgroup Service New Rate! $9.95/Month 50GBReceived on Tue Dec 10 2002 - 23:52:37 CST