Re: 500 million rows every (8 hour) day
Date: Wed, 25 Nov 2009 04:45:09 -0800 (PST)
On Nov 25, 5:53 am, Noons <wizofo..._at_yahoo.com.au> wrote:
> Richard Last wrote,on my timestamp of 24/11/2009 1:48 PM:
> > I fully recognise I'm not providing sufficient detail for anyone to
> > provide very accurate recommendations. What I'm curious to know is
> > whether it's too tall an order for Oracle, or whether it will require
> > Hadoop or similar technology. I've done quite a bit of SQL Server
> > work, but nothing near this scale.
> Here are my not very accurate recommendations, for the reasons you pointed out.
> Partition by day. Make the day's tables separate, manipulate them for the daily
> reporting as needed. Add them to partitioned tables at end of day and run
> reports across days then. At end of 6 months, drop partitions off the start of
> the table.
> Make sure no one comes up with some nonsense of storing LOBs at those volumes!
> No matter what database technology is used, you'll need some serious disk
> capacity. As well as memory and CPU. Look at compression at database layer.
> You're definitely looking at 11gr1 or even r2: there are improvements to
> partitioning in those releases that would definitely help.
> At these sorts of volumes, the secret is always the same: the storage
> architecture must avoid traversing data that is not immediately needed.
> Whatever the db.
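For what it's worth, the partition-by-day-and-drop scheme described above can be sketched roughly as 11g DDL. This is only an illustration: the table and column names are invented, and interval partitioning is the 11g feature that creates each day's partition automatically as rows arrive.

```sql
-- Hypothetical table, one partition per day via 11g INTERVAL
-- partitioning (all names here are made up for illustration).
CREATE TABLE feed_rows (
  load_ts    DATE          NOT NULL,
  source_id  NUMBER(2)     NOT NULL,
  payload    VARCHAR2(200)
)
PARTITION BY RANGE (load_ts)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
(
  PARTITION p_initial VALUES LESS THAN (DATE '2009-12-01')
);

-- At the 6-month mark, drop partitions off the start of the table:
ALTER TABLE feed_rows DROP PARTITION p_initial UPDATE GLOBAL INDEXES;
```

Dropping a partition is a dictionary operation, which is the whole point at these volumes: ageing out a day of data never touches the rows themselves.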
In a previous life we did the same type of load; we were getting about
the same number of rows in a 6-hour period.

We partitioned by day and subpartitioned by 'loader'. We had 12 programs
loading data, and each had a number associated with it that made up the
subpartitioning key. Obviously that didn't help with queries, but the
queries were generally over an entire day anyway. These programs were
not using any type of direct-path loading; they were just batching up a
couple of hundred rows and inserting them.
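A loader doing batched, conventional (non-direct-path) inserts like that might look roughly like the following PL/SQL sketch. The table name `feed_rows` and the batch size are hypothetical; the point is collecting a few hundred rows and sending them in one round trip with a bulk bind.

```sql
-- Hypothetical sketch of one loader process: buffer a couple of
-- hundred rows, then insert them in one call with FORALL.
-- These are ordinary conventional-path inserts, not direct path.
DECLARE
  TYPE row_tab IS TABLE OF feed_rows%ROWTYPE INDEX BY PLS_INTEGER;
  batch row_tab;
BEGIN
  -- ... fill batch(1) .. batch(200) from the incoming feed ...
  FORALL i IN 1 .. batch.COUNT
    INSERT INTO feed_rows VALUES batch(i);
  COMMIT;
END;
/
```

Twelve such loaders, each tagged with its own subpartition-key number, would spread the insert activity across subpartitions without fighting over the same segment.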