Design question
Date: Mon, 16 Feb 2009 20:34:59 +0100
Message-ID: <4999bf6f$0$2847$ba620e4c_at_news.skynet.be>
Hello group,
The Oracle version is 10g; 11g might be possible too.
I'm currently doing the pre-analysis for a new project I'll be working on.
We will get historical environmental data (temperature, rainfall, ...) for
about 70,000 different locations, covering a period of 30 years, with one
measurement for each day of the year. So the total number of records will be
roughly 766 million (70,000 x 365 x 30). This sounds pretty much like a
data warehouse, no ? ;)
The data model is very simple: 1 table for the locations, 1 table for the
measurements.
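To make that concrete, the schema I have in mind looks roughly like this
(table and column names are purely illustrative, nothing is final yet):

create table locations (
  location_id  number         primary key,
  name         varchar2(100),
  latitude     number,
  longitude    number
);

create table measurements (
  location_id       number not null references locations,
  measurement_date  date   not null,
  temperature       number,
  rainfall          number,
  constraint measurements_pk primary key (location_id, measurement_date)
);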
Now the difficulty is that the application should be able to query the data
in a very dynamic way. For example: "give me all the locations with an
average temperature of 25 degrees for the month of July during the last 10
years". Or: "give me all the locations where the minimum temperature didn't
drop below 15 degrees during the summer months of the last 5 years".
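To give an idea, the first example would translate into something like this
(only a sketch against the illustrative column names above):

select m.location_id
from   measurements m
where  extract(month from m.measurement_date) = 7
and    m.measurement_date >= add_months(trunc(sysdate, 'YYYY'), -120)  -- last 10 years
group by m.location_id
having avg(m.temperature) >= 25;  -- or a band around 25, depending on how exact it has to be

The date range, the month(s) and the aggregate function can all vary from
query to query.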
The data is very static and doesn't change; new data will only be added once
a year.
I have already tried some queries on about 120 million records in a
non-partitioned heap table, but performance wasn't that great ... One query
using the AVG function ran for about 15 minutes before I terminated it.
Parallel query didn't help much. The server is rather small: RHEL with two
Intel dual-core processors, 8 GB of RAM and the Oracle datafiles on two SCSI
disks in RAID-1.
I have a bit of experience with partitioning and pre-aggregation of data
through materialized views, but I find it hard to come up with a good design.
I could range-partition on the measurement date, but queries spanning multiple
years will hit several (if not all) partitions. Materialized views would be
possible, but it would be impossible to cover all the different query
combinations and functions with them.
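To illustrate the kind of pre-aggregation I mean, a monthly summary could look
like this (again just a sketch, same illustrative columns as above):

create materialized view measurements_monthly
  build immediate
  refresh complete on demand
  enable query rewrite
as
select location_id,
       trunc(measurement_date, 'MM') as month_start,
       avg(temperature)              as avg_temperature,
       min(temperature)              as min_temperature,
       max(temperature)              as max_temperature,
       sum(rainfall)                 as total_rainfall,
       count(*)                      as nr_of_measurements
from   measurements
group by location_id, trunc(measurement_date, 'MM');

Something like that helps for month-level questions, but as soon as somebody
asks for a different period, granularity or function it no longer covers the
query.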
Any ideas ? OLAP cubes ? (but: expensive licenses ...) Put everything in an
index-organized table ?
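The index-organized variant I'm thinking of would be something like this
(illustrative again, range-partitioned per year):

create table measurements_iot (
  location_id       number not null,
  measurement_date  date   not null,
  temperature       number,
  rainfall          number,
  constraint measurements_iot_pk primary key (location_id, measurement_date)
)
organization index
partition by range (measurement_date) (
  partition p1980 values less than (to_date('01-01-1981', 'DD-MM-YYYY')),
  partition p1981 values less than (to_date('01-01-1982', 'DD-MM-YYYY')),
  -- ... one partition per year ...
  partition pmax  values less than (maxvalue)
);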
Matthias