Re: which data model is better?

From: Hans Forbrich <hforbric_at_yahoo.net>
Date: Sat, 17 Jan 2004 23:13:16 GMT
Message-ID: <4009C10C.18AB1F55@yahoo.net>


dx wrote:
>
> Hi Daniel,
>
> Thanks for your comment. But actually, for our application, in 99.99%
> of cases there is no need for a third address. I'm pretty sure there is
> more chance of adding other attributes like spoken_language, birth_date,
> etc. than of adding a third address. So even if you choose separate
> address and phone tables, the separate-table model still has almost the
> same chance of changing as the one-table model.
>
> In my opinion, performance and ease of use are always my goals when
> designing a data model. Take a simple business question like getting all
> the information about customers who live in the city 'TORONTO' and have
> a business phone like '416%'; it's always much harder to get the answer
> from the separate-table model than from
>
> select * from customer where home_city = 'TORONTO' and business_phone
> like '416%';

Retrieving all this data is easily handled by a view on the base tables. IMO, this is not a reasonable excuse to denormalize.
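For example - a minimal sketch, assuming hypothetical CUSTOMER, CUSTOMER_ADDRESS and CUSTOMER_PHONE tables (the names and columns are purely illustrative, not taken from your schema):

  CREATE OR REPLACE VIEW customer_v AS
  SELECT c.customer_id,
         c.first_name,
         c.last_name,
         a.city     AS home_city,
         p.phone_no AS business_phone
  FROM   customer c, customer_address a, customer_phone p
  WHERE  a.customer_id  = c.customer_id
  AND    a.address_type = 'HOME'
  AND    p.customer_id  = c.customer_id
  AND    p.phone_type   = 'BUSINESS';

Your query then stays just as simple:

  SELECT * FROM customer_v
  WHERE  home_city = 'TORONTO' AND business_phone LIKE '416%';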

>
> And how many extra logical I/Os will be performed for the separate-table
> model? Do you really want to trade performance and ease of use for this
> imagined flexibility?

Will there be inserts and updates? If so, will there be indexes across all the subsections of data?

When you are looking mainly at performance, look at the whole picture (including DML): you *might* find that maintaining all the indexes on a denormalized table (such as you propose) actually hurts compared with maintaining the same set of indexes spread across several tables.
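To make that concrete - a rough sketch using the same illustrative names as above: every INSERT into the wide table has to maintain all of its indexes, whereas in the split design adding or changing a phone number only touches the phone table and its own (smaller) indexes.

  -- one wide table: every insert maintains all of these
  CREATE INDEX cust_home_city_ix      ON customer (home_city);
  CREATE INDEX cust_business_phone_ix ON customer (business_phone);
  CREATE INDEX cust_last_name_ix      ON customer (last_name);

  -- split design: a phone change maintains only the phone table's indexes
  UPDATE customer_phone
  SET    phone_no    = '4165551234'
  WHERE  customer_id = :id
  AND    phone_type  = 'BUSINESS';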

Also, having split the data across several tables, I can see situations where the logical I/Os are reduced. Suppose the 'selecting' table fits into fewer blocks - a full scan will find the matching rows faster. (Joins done on a primary key should be fast and need few I/Os.) And if a denormalized row ends up growing larger than a block, you could see negative effects such as row chaining.
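Again only as an illustration (same made-up tables): a narrow phone table packs far more rows per block, so finding the '416%' numbers reads fewer blocks, and the matching customer rows are then picked up by primary key.

  SELECT c.*
  FROM   customer_phone p, customer c
  WHERE  p.phone_type  = 'BUSINESS'
  AND    p.phone_no    LIKE '416%'
  AND    c.customer_id = p.customer_id;  -- primary-key lookup per match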

>
> Yes, I would follow Codd's rules, but I don't want to stick to them
> rigidly.

This does need some detailed benchmarking in your environment, including your actual DML patterns. Especially with 8i, I've found that a good balance of performance is achieved by sticking to the rules.
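One way to run that comparison (just a sketch, using SQL*Plus - AUTOTRACE needs the PLUSTRACE role) is to fire the same workload at both designs and compare the consistent gets, db block gets and redo size:

  SET TIMING ON
  SET AUTOTRACE TRACEONLY STATISTICS

  SELECT * FROM customer
  WHERE  home_city = 'TORONTO' AND business_phone LIKE '416%';

  SELECT * FROM customer_v
  WHERE  home_city = 'TORONTO' AND business_phone LIKE '416%';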
