Re: Design question

From: Noons <>
Date: Wed, 18 Feb 2009 00:02:25 +1100
Message-ID: <gnec7l$10d$> wrote,on my timestamp of 17/02/2009 7:57 PM:
>>>> Hello group,
>>>> Oracle version is 10g. 11g might be possible too. I'm currently involved
>>>> with the pre-analysis of a new project I'll be working on.
>>>> We will get historical environmental data (temperature, rainfall, ...)
>>>> for about 70.000 different locations, and this for a period of 30 years,
>>>> with 1 measurement for each day of the year. So the total amount of
>>>> records will be around 750 million (70.000 x 365 x 30). This sounds
>>>> pretty much like a datawarehouse, no ? ;)
>>> Matthias, this looks pretty much like a classic star schema to me.
>>> Locations will, of course, be the fact table and measurements will be
>>> dimensions. You will need one table for each type of measure.
>> Shouldn't that be measurement as a fact and location as a dimension?
>> GM

> Would this be ok:
> - dimension table "Location'
> - dimension table "Time"
> - dimension table "Temperature"
> - fact table "Measurements": location_id, time_id, maxtemperature_id,
> mintemperature_id etc ...

Reads good to me.

> Some questions:
> - is it better to put the Time as a separate dimension, or can we put
> it as Date column in the Measurements table ?

Go with measurements table. It's really not a category to associate with other data, it's an attribute of the data.

> - what about the primary key on the fact table ? Does it have to be a
> natural PK or can it be a surrogate key ?

I'd go with a surrogate key for fact tables: usually too hard to get a natural PK.

> - in a DWH, why are Time dimensions often split-up into different
> columns as day, week, year (number datatype) instead of just one Date
> column ?

Because many databases don't have the equivalent of function-based indexes as Oracle does, or were designed for versions of Oracle without that feature.

Queries would need predicates on week, year or day and that would mean using a date function to compare to if data stored in DATE type column. That would mean forfeiting the use of a conventional index in the DATE column. Nowadays, you can use a FBI to get around that problem. Received on Tue Feb 17 2009 - 07:02:25 CST

Original text of this message