Re: Design question
Date: Wed, 18 Feb 2009 00:02:25 +1100
Message-ID: <gnec7l$10d$1_at_news.motzarella.org>
matthias.hoys_at_gmail.com wrote,on my timestamp of 17/02/2009 7:57 PM:
>>>> Hello group, >>>> Oracle version is 10g. 11g might be possible too. I'm currently involved >>>> with the pre-analysis of a new project I'll be working on. >>>> We will get historical environmental data (temperature, rainfall, ...) >>>> for about 70.000 different locations, and this for a period of 30 years, >>>> with 1 measurement for each day of the year. So the total amount of >>>> records will be around 750 million (70.000 x 365 x 30). This sounds >>>> pretty much like a datawarehouse, no ? ;) >>> Matthias, this looks pretty much like a classic star schema to me. >>> Locations will, of course, be the fact table and measurements will be >>> dimensions. You will need one table for each type of measure. >> Shouldn't that be measurement as a fact and location as a dimension? >> >> GM
>
> Would this be ok:
> - dimension table "Location'
> - dimension table "Time"
> - dimension table "Temperature"
> - fact table "Measurements": location_id, time_id, maxtemperature_id,
> mintemperature_id etc ...
Reads good to me.
>
> Some questions:
> - is it better to put the Time as a separate dimension, or can we put
> it as Date column in the Measurements table ?
Go with measurements table. It's really not a category to associate with other data, it's an attribute of the data.
> - what about the primary key on the fact table ? Does it have to be a
> natural PK or can it be a surrogate key ?
I'd go with a surrogate key for fact tables: usually too hard to get a natural PK.
> - in a DWH, why are Time dimensions often split-up into different
> columns as day, week, year (number datatype) instead of just one Date
> column ?
Because many databases don't have the equivalent of function-based indexes as Oracle does, or were designed for versions of Oracle without that feature.
Queries would need predicates on week, year or day and that would mean using a date function to compare to if data stored in DATE type column. That would mean forfeiting the use of a conventional index in the DATE column. Nowadays, you can use a FBI to get around that problem. Received on Tue Feb 17 2009 - 07:02:25 CST