Re: 2 million row database from hell! help a newbie in over his head.

From: Carl <fakeham_at_home.com>
Date: Wed, 22 Aug 2001 05:13:53 GMT
Message-ID: <lcHg7.182793$%a.7691942_at_news1.rdc1.sdca.home.com>


Don't be concerned with the 2 million rows. If the table is designed appropriately (data types, keys, index(es)) then it won't be too much of an issue.

Take your best shot at it, then hit it with some typical queries to show where the hotspots are.

If you have a very "popular" query, such as average rainfall for last year per region, create a table for this. This way the highest percentage of end users get the good performance. You can still query the larger table for more historical data, at a slight performance penalty.

Hope this helps,

--
Carl T. Demelo
Senior DBA
Transcore


"dave rau" <dave_at_projectfx.com> wrote in message
news:11c09d6a.0108210954.347c9b16_at_posting.google.com...

> I must preempt my message with the fact that I am a total newbie to
> database design both conceptually and in practice. I have been
> learning some mySQL and PHP recently and have recently taken on a very
> large database project at my cushy guuv-ment job to publish data to
> the web.
>
> I have encountered some problems, mostly dealing with how to
> restructure this database for optimization in mysql and for
> publishing/searching on the web.
>
> The database I am working with contains a rainfall site number, a
> collection date, a collection value and the status of the data,
> whether the data was incomplete, or if the power went out on a
> station, etc.
>
> I have paired importent columns down to the following: ID (*primary
> key), Site_number, Collection_date, value, and status. I am not even
> sure if this is the correct way to pair the data down however.
>
> The biggest problem is that the database currently lists values for
> every day, for 50 years, for 450 rainfall stations!!! Currently the
> database has over 2 million rows!! Newbie or not, this seems pretty
> excessive and clunky for publishing to the web.
>
> With that said, would I be better off by eliminating the Site_number
> column and making a table for EACH rainfall station (all 450 of them)?
> Should I make a table for each year? Keep in mind that all 450+
> stations have a value for every day for 50+ years, some go back to
> 1910 (90 years). Since zero is an acceptable value for rainfall it is
> possible that much of the value columns are zero.
>
> *also, I have been looking into some books on conceptual database
> design, is database design for mere mortals a recommended book on
> anyones list? I would be interested in some others. thanks in adv
> guys!
>
> dave
Received on Wed Aug 22 2001 - 07:13:53 CEST

Original text of this message