Help Designing a Billion rows System
From: dpafu <dpafumi_at_gmail.com>
Date: 4 Aug 2006 09:26:55 -0700
Message-ID: <1154708815.346279.191950_at_m73g2000cwd.googlegroups.com>
[Quoted] Hello, I need to build a new system that will have a table that will keep data for 60 days (aprox. 1 billion rows). Once the data is there, more data will be loaded (aprox. 10M rows/day loading around 40 files with 40k rows each file).
The system needs to reject and report this new data whenever a duplicate value is found (90% of data is suppose to be valid; the other 10% is suppose to be duplicated). I also need to maintain this table with the latest 60 days of data.
What approach do you recommend?
[Quoted] I was planning to have a partitioned table per day (is it possible to have 60 partitions?), use a PK on those unique values, use external table (or SQL*Loader) to load the data and use the HINT APPEND to insert new data. Then use another process to drop old partitions and create a new one for today's data.
[Quoted] I'm not very familiar with partitioned tables, so any example and help is appreciated.
[Quoted] What do you think about this approach?
Any other ideas?
Thanks,
Diego Received on Fri Aug 04 2006 - 18:26:55 CEST
Date: 4 Aug 2006 09:26:55 -0700
Message-ID: <1154708815.346279.191950_at_m73g2000cwd.googlegroups.com>
[Quoted] Hello, I need to build a new system that will have a table that will keep data for 60 days (aprox. 1 billion rows). Once the data is there, more data will be loaded (aprox. 10M rows/day loading around 40 files with 40k rows each file).
The system needs to reject and report this new data whenever a duplicate value is found (90% of data is suppose to be valid; the other 10% is suppose to be duplicated). I also need to maintain this table with the latest 60 days of data.
What approach do you recommend?
[Quoted] I was planning to have a partitioned table per day (is it possible to have 60 partitions?), use a PK on those unique values, use external table (or SQL*Loader) to load the data and use the HINT APPEND to insert new data. Then use another process to drop old partitions and create a new one for today's data.
[Quoted] I'm not very familiar with partitioned tables, so any example and help is appreciated.
[Quoted] What do you think about this approach?
Any other ideas?
Thanks,
Diego Received on Fri Aug 04 2006 - 18:26:55 CEST