Oracle FAQ Your Portal to the Oracle Knowledge Grid


Re: 1 Billion 11 Byte Words... Need to Check Uniqueness Using Oracle

From: Bryan W. Taylor <bryan_w_taylor_at_yahoo.com>
Date: 9 Feb 2002 11:51:38 -0800
Message-ID: <11d78c87.0202091151.3256e0f@posting.google.com>


"Keith Boulton" <kboulton_at_ntlworld.com> wrote in message news:<YH498.7607$YA2.1485257_at_news11-gui.server.ntli.net>...

> > Surely some sort of file based sort program would be a lot cheaper if you
> > don't.
> >
> And very, very much faster!

Done correctly it would be faster, but not by as much as you think. You are underestimating the capabilities of Oracle's multithreaded IO. It would not be a trivial programming task if you want to be competitive.

The key is to do no more disk IO than is essential during the sort. Since you likely have more data than memory, you have to store it to disk and eventually read it back. This will be by far the slowest operation. IO management on a multi-disk SMP machine is not trivial. Oracle has multithreaded IO built in; you'll have to write your own. If your program isn't making multiple disks read and write simultaneously, you'll lose.
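
To give a rough idea of what "making multiple disks read simultaneously" could look like, here is a minimal Python sketch that reads several files at once, one thread per file. The function names (read_file, read_all) and the chunk size are my own assumptions for illustration, and the sketch only pays off if the paths really do live on different physical disks. It is nowhere near what a tuned IO layer does.

```python
from concurrent.futures import ThreadPoolExecutor

def read_file(path, chunk_size=1 << 20):
    """Read one file sequentially in large chunks; return the bytes read."""
    total = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            total += len(chunk)  # a real program would hand the chunk to the sorter
    return total

def read_all(paths):
    """Read every path concurrently, one worker thread per file.

    Threads block in the OS during reads, so files on separate disks
    can be serviced at the same time.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        return list(pool.map(read_file, paths))
```

Calling read_all with one path per disk returns the byte counts in the same order as the paths. Writes can be overlapped the same way.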

The method would essentially parallel the one I outlined for Oracle to use: split the data into separate files of manageable size based on a partial-ordering hash. Then sort each file in memory and scan it for repeats.
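
The split, sort, and scan steps can be sketched in Python as follows. Using the leading byte(s) of each word as the partitioning key is my assumption here; any function that sends equal words to the same file would do. The function names and the ".bucket" file naming are likewise made up for illustration. The sketch is single-threaded and assumes each bucket fits in memory.

```python
import os
import tempfile

WORD_LEN = 11  # fixed-width records, as in the thread

def partition(words, out_dir, prefix_len=1):
    """Append each word to a bucket file chosen by its leading bytes."""
    handles = {}
    try:
        for w in words:
            key = w[:prefix_len]
            fh = handles.get(key)
            if fh is None:
                fh = open(os.path.join(out_dir, key.hex() + ".bucket"), "wb")
                handles[key] = fh
            fh.write(w)
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)  # bucket keys in ascending order

def duplicates_in_bucket(path):
    """Load one bucket, sort it in memory, and scan neighbours for repeats."""
    with open(path, "rb") as fh:
        data = fh.read()
    words = sorted(data[i:i + WORD_LEN] for i in range(0, len(data), WORD_LEN))
    dups = []
    for prev, cur in zip(words, words[1:]):
        if prev == cur and (not dups or dups[-1] != cur):
            dups.append(cur)  # report each repeated word once
    return dups

def find_duplicates(words, prefix_len=1):
    """Partition to temporary files, then check each bucket independently."""
    with tempfile.TemporaryDirectory() as tmp:
        found = []
        for key in partition(words, tmp, prefix_len):
            path = os.path.join(tmp, key.hex() + ".bucket")
            found.extend(duplicates_in_bucket(path))
        return found
```

Because equal words always share a prefix, a repeat can never straddle two buckets, so each file can be checked on its own (and on its own disk or CPU). A longer prefix_len gives more, smaller buckets, at the cost of more open files.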

The only advantages you'll have over Oracle are 1) Oracle will put the data into blocks, which makes it expand and creates more IO, and 2) your in-memory executable code will probably be smaller, thereby allowing more memory for the sort and potentially allowing you to use fewer pieces. Along these lines, if the data is printable characters, compression of the 11-byte words will probably help substantially.

Received on Sat Feb 09 2002 - 13:51:38 CST
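
As a rough illustration of how much compression a restricted character set could buy, the Python sketch below packs each word as a base-N number, where N is the alphabet size. The 36-character alphabet (digits plus uppercase letters) is purely an assumption for the example; the thread only says "printable characters", and the real alphabet determines the savings. The function names are hypothetical.

```python
import string

def packed_size(alphabet_size, word_len=11):
    """Bytes needed to store word_len symbols from an alphabet of this size."""
    bits = (alphabet_size ** word_len - 1).bit_length()
    return (bits + 7) // 8

def pack(word, alphabet):
    """Encode a word as a fixed-width big-endian base-len(alphabet) number."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = 0
    for ch in word:
        n = n * len(alphabet) + index[ch]
    return n.to_bytes(packed_size(len(alphabet), len(word)), "big")

def unpack(blob, alphabet, word_len=11):
    """Invert pack(): recover the original word from its packed bytes."""
    n = int.from_bytes(blob, "big")
    chars = []
    for _ in range(word_len):
        n, r = divmod(n, len(alphabet))
        chars.append(alphabet[r])
    return "".join(reversed(chars))

# Assumed alphabet for the example: 0-9 followed by A-Z, in sorted order.
ALNUM = string.digits + string.ascii_uppercase
```

With the full 95 printable ASCII characters, packed_size gives 10 bytes per word instead of 11; with the assumed 36-character alphabet it gives 8, which fits a 64-bit integer. If the alphabet is listed in sorted order, the packed values sort the same way as the original words, so the sort and duplicate scan can run on the packed form directly.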
