Re: 500 million rows every (8 hour) day

From: Jack <none_at_INVALIDmail.com>
Date: Thu, 26 Nov 2009 09:25:22 +0200
Message-ID: <D3qPm.44530$La7.4476_at_uutiset.elisa.fi>


"scottgamble_at_comcast.net" <zifnabiom_at_gmail.com> wrote in message news:83513236-dd64-4273-a89c-814ed56fbeb8_at_m16g2000yqc.googlegroups.com... On Nov 25, 5:53 am, Noons <wizofo..._at_yahoo.com.au> wrote:
> Richard Last wrote, on my timestamp of 24/11/2009 1:48 PM:
>
> > I fully recognise I'm not providing sufficient detail for anyone to
> > provide very accurate recommendations. What I'm curious to know is
> > whether it's too tall an order for Oracle, or if it will require
> > Hadoop or similar technology. I've done quite a bit of SQL Server
> > work, but not nearly to this scale.
>
> Here are my not very accurate recommendations, for the reasons you pointed
> out.
>
> Partition by day. Make the day's tables separate, manipulate them for
> the daily reporting as needed. Add them to partitioned tables at end
> of day and run reports across days then. At end of 6 months, drop
> partitions off the start of the table.
> Make sure no one comes up with some nonsense of storing LOBs at those
> volumes!
>
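For what it's worth, a rough sketch of that scheme (table, column and
partition names here are made up for illustration) could look like:

  CREATE TABLE events (
      event_ts   DATE           NOT NULL,   -- partition key
      payload    VARCHAR2(200)
  )
  PARTITION BY RANGE (event_ts) (
      PARTITION p20091125 VALUES LESS THAN (DATE '2009-11-26'),
      PARTITION p20091126 VALUES LESS THAN (DATE '2009-11-27')
      -- ... one partition per day ...
  );

  -- the standalone daily table is loaded and reported on during the day,
  -- then swapped into the partitioned table at end of day
  ALTER TABLE events
      EXCHANGE PARTITION p20091126 WITH TABLE events_20091126
      INCLUDING INDEXES WITHOUT VALIDATION;

  -- six months later, ageing a day out is a quick dictionary operation
  ALTER TABLE events DROP PARTITION p20091125;
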
> No matter what database technology is used, you'll need some serious
> disk capacity. As well as memory and CPU. Look at compression at the
> database layer.
> You're definitely looking at 11gR1 or even R2: some improvements in
> partitioning in those releases would definitely help.
>
> At these sorts of volumes, the secret is always the same: the storage
> architecture must avoid having to traverse data not immediately
> needed. Whatever the db.
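
In other words, as long as the queries carry a predicate on the
partition key, the optimizer prunes down to just the partitions it
needs. E.g. against the made-up events table above:

  -- only the 2009-11-26 partition is read; the other ~180 days of data
  -- are never touched
  SELECT COUNT(*), MIN(event_ts), MAX(event_ts)
  FROM   events
  WHERE  event_ts >= DATE '2009-11-26'
  AND    event_ts <  DATE '2009-11-27';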

In a previous life we did the same type of load. We were getting about
the same number of rows in a 6-hour period.

We partitioned by day and subpartitioned by 'loader'. We had 12 programs
loading data, and each had a number associated with it that made up the
subpartitioning key. Obviously that didn't help with queries, but the
queries were generally over an entire day anyway. These programs were not
using any type of direct path loading; they were just batching up a couple
of hundred rows and inserting them.
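
Roughly, such a table would have the following shape (names invented
here, and I'm assuming list subpartitions keyed on the loader number;
hash subpartitioning on the same column would behave much the same):

  CREATE TABLE feed_rows (
      load_ts    DATE           NOT NULL,
      loader_id  NUMBER(2)      NOT NULL,   -- 1..12, one value per loader
      payload    VARCHAR2(200)
  )
  PARTITION BY RANGE (load_ts)
  SUBPARTITION BY LIST (loader_id)
  SUBPARTITION TEMPLATE (
      SUBPARTITION sp01 VALUES (1),  SUBPARTITION sp02 VALUES (2),
      SUBPARTITION sp03 VALUES (3),  SUBPARTITION sp04 VALUES (4),
      SUBPARTITION sp05 VALUES (5),  SUBPARTITION sp06 VALUES (6),
      SUBPARTITION sp07 VALUES (7),  SUBPARTITION sp08 VALUES (8),
      SUBPARTITION sp09 VALUES (9),  SUBPARTITION sp10 VALUES (10),
      SUBPARTITION sp11 VALUES (11), SUBPARTITION sp12 VALUES (12)
  )
  (
      PARTITION p20091126 VALUES LESS THAN (DATE '2009-11-27')
  );

Each loader then only ever writes into its own subpartition segments,
which keeps the 12 concurrent insert streams out of each other's way.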

Nowadays it would be a couple of (tens of) thousands of rows per batch.
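
That is, array inserts rather than row-by-row. A minimal PL/SQL sketch
of that kind of batching, against the made-up feed_rows table above:

  DECLARE
      TYPE t_rows IS TABLE OF feed_rows%ROWTYPE;
      l_batch t_rows := t_rows();
  BEGIN
      -- in a real loader the batch would be filled from the feed;
      -- here it is just populated with dummy rows
      FOR i IN 1 .. 10000 LOOP
          l_batch.EXTEND;
          l_batch(l_batch.LAST).load_ts   := SYSDATE;
          l_batch(l_batch.LAST).loader_id := 7;         -- this loader's number
          l_batch(l_batch.LAST).payload   := 'row ' || i;
      END LOOP;

      -- one execution for the whole batch instead of 10000 single inserts
      FORALL i IN 1 .. l_batch.COUNT
          INSERT INTO feed_rows VALUES l_batch(i);
      COMMIT;
  END;
  /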

Advanced Compression will save some TB of storage:
http://www.oracle.com/database/advanced-compression.html
Also, disk I/O will be faster.
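
If you're on 11gR2 the syntax is COMPRESS FOR OLTP (11gR1 spelled it
COMPRESS FOR ALL OPERATIONS). For example, against the made-up table
above:

  -- compress a finished day's partition after loading is done
  ALTER TABLE feed_rows MOVE PARTITION p20091126 COMPRESS FOR OLTP;

  -- or make compression the default for partitions created from now on
  ALTER TABLE feed_rows MODIFY DEFAULT ATTRIBUTES COMPRESS FOR OLTP;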
