Re: 1 Billion 11 Byte Words... Need to Check Uniqueness Using Oracle

From: Bryan W. Taylor <bryan_w_taylor_at_yahoo.com>
Date: 12 Feb 2002 16:53:15 -0800
Message-ID: <11d78c87.0202121653.50409bc0@posting.google.com>

"Keith Boulton" <kboulton_at_ntlworld.com> wrote

> I must admit, it's many years since I did this, but a dedicated sort program
> on a mainframe handled this sort of task more than 100x faster than Oracle.

That is almost certainly the result of non-existent performance tuning by a DBA.

Someone was kind enough to point me to one of the leading sorting utilities, which is called SyncSort. Their claims are much more modest than those found in this thread, and much more in line with what I was claiming:

"Bulk loading for high-performance relational databases like Oracle and Sybase can be accelerated. Presorting data with SyncSort prior to bulk loading can cut 50% or more from the average load time." http://www.syncsort.com/sort/infosu.htm
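For anyone following the thread who wants to see what the sort-based route looks like in the small, here is a minimal sketch of a chunked external sort followed by an adjacent-duplicate scan. It is not SyncSort's algorithm and not what Oracle does internally; the function names and the `run_size` parameter are made up for illustration, and it assumes the words contain no newline characters.

```python
import heapq
import os
import tempfile
from itertools import islice


def sorted_runs(words, run_size):
    """Sort the input in bounded-memory chunks and spill each sorted run to a temp file."""
    paths = []
    it = iter(words)
    while True:
        chunk = sorted(islice(it, run_size))
        if not chunk:
            break
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(w + "\n" for w in chunk)
        paths.append(path)
    return paths


def first_duplicate(words, run_size=100_000):
    """Return the smallest duplicated word, or None if every word is unique.

    run_size is a hypothetical tuning knob: the number of words sorted in memory per run.
    """
    paths = sorted_runs(words, run_size)
    files = [open(p) for p in paths]
    try:
        # Strip the newline so the merge compares the words themselves.
        streams = [(line.rstrip("\n") for line in f) for f in files]
        previous = None
        # heapq.merge reads the sorted runs lazily, so equal words come out adjacent.
        for word in heapq.merge(*streams):
            if word == previous:
                return word
            previous = word
        return None
    finally:
        for f in files:
            f.close()
        for p in paths:
            os.remove(p)
```

Note that this opens every run at once and merges on a single CPU from a single temp directory. At a billion rows that means thousands of open files and one very busy disk, which is exactly the multi-pass, multi-CPU, multi-disk engineering that separates a toy like this from a real sort utility.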

Now mind you, SyncSort is a very sophisticated program: "SyncSort's speed and efficiency are the result of years of specialized and continually advancing research and development ..."

They clearly understand the multi-CPU, multi-disk issues that I've been discussing, and have worked very hard to get a 2X improvement. I'm willing to bet that with one day of preparation a decent DBA can beat a decent C programmer at this task every time on single use hardware. If the C programmer wants to start developing a pretty sophisticated tool, eventually he probably wins, but it flat out is not as easy as people in this thread are suggesting, and 10X is a pipe dream.

Received on Tue Feb 12 2002 - 18:53:15 CST