Re: 1 Billion 11 Byte Words... Need to Check Uniqueness with Oracle

From: Pablo Sanchez <pablo_at_dev.null>
Date: Fri, 8 Feb 2002 22:40:16 -0700
Message-ID: <5D298.225$6I1.143644_at_news.uswest.net>


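Is the set predetermined, or will it be trickled into the instance? If the former, I wouldn't bother with Oracle (or SQL Server :) for the check itself. I'd use Unix 'sort -u', which sorts the file and uniquifies it too, so comparing the line count before and after tells you whether there were any duplicates. Then I'd slam the result into the instance using SQL*Loader.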
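
For anyone who wants to see what a sort-based uniqueness check boils down to, here is a rough Python sketch. It is not how 'sort' itself is implemented, just the same general idea under some assumptions: one word per line, a data set too big for memory (a billion 11-byte words plus newlines is roughly 12 GB), and sorted runs spilled to temp files and then merged. The names find_duplicates and run_size are made up for illustration.

```python
import heapq
import itertools
import os
import tempfile


def _sorted_runs(words, run_size):
    """Sort the input in chunks of run_size words, spilling each run to a temp file."""
    paths = []
    it = iter(words)
    while True:
        chunk = sorted(itertools.islice(it, run_size))
        if not chunk:
            break
        fd, path = tempfile.mkstemp(text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(w + "\n" for w in chunk)
        paths.append(path)
    return paths


def _read_words(f):
    """Yield the lines of an open file with the trailing newline removed."""
    for line in f:
        yield line.rstrip("\n")


def find_duplicates(words, run_size=1_000_000):
    """Yield each word that occurs more than once (hypothetical helper).

    After the merge, equal words are adjacent, so one comparison with the
    previous word is enough to spot a duplicate.
    """
    paths = _sorted_runs((w.rstrip("\n") for w in words), run_size)
    files = [open(p) for p in paths]
    try:
        prev = None
        reported = None
        for word in heapq.merge(*[_read_words(f) for f in files]):
            if word == prev and word != reported:
                reported = word
                yield word
            prev = word
    finally:
        # Clean up the temporary run files whether or not we finished.
        for f in files:
            f.close()
        for p in paths:
            os.remove(p)
```

To use it, pass an open file (say a hypothetical words.txt) to find_duplicates; if next(find_duplicates(f), None) comes back as None, the set is unique. In practice 'sort -u' will do the same job with far less fuss, which is the point of the suggestion above.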

At that point you can show how much faster Oracle loads the data. Make sure to have enough disk drives configured for Oracle and not enough for SQL Server :)

--
Pablo Sanchez, High-Performance Database Engineering
www.hpdbe.com
Available for short-term and long-term contracts

"Hooty and the Blowfish" <bob_at_bob.com> wrote in message
news:v3s76uciii2742r4sgm19gruvtn5bcja9p_at_4ax.com...

>
> I have a group that is currently using Oracle but is considering
> moving to SQL Server to save some money. One of the business cases
> they're working with is the testing of 1 billion 11 character words
> for uniqueness. Apparently they've been sold on the idea that SQL
> Server will rock their monkey.
>
> I tend to disagree and believe that Oracle will handle this task much
> more elegantly.
>
> So that I can give this group some guidance: How would you go about
> testing uniqueness on 1 Billion 11 character words using Oracle?
>
> Ideas to get you started:
> 1) Write small clients to do database inserts and distribute them
> across the network on small desktops. Run Oracle with one table, one
> column and one constraint (that it be a primary and unique key).
> Start the inserts and wait for an exception. You could play with the
> number of inserts per commit and number of boxes submitting
> connections to the database.
>
> 2) Install Oracle on 10 boxes and split the 1 Billion words into 10
> segments. Insert the numbers into the database and check for
> uniqueness. This won't prove uniqueness across the entire set so
> you'd then have to bulk insert or import all data from each Oracle
> database into one master database that checks uniqueness. Maybe this
> would be faster than checking uniqueness on every insert.
>
> 3) ??? Any other ideas?
Received on Sat Feb 09 2002 - 06:40:16 CET
