Re: Regarding Database Design

From: joel garry <joel-garry_at_home.com>
Date: Tue, 12 May 2009 10:50:09 -0700 (PDT)
Message-ID: <60faed63-7ed1-494a-9bfc-fbc3d21cb284_at_k19g2000prh.googlegroups.com>



On May 10, 10:37 pm, DeB <debashish.majum..._at_gmail.com> wrote:
> Hi Faculties,
>
>    Please accept my apologies if this query is out of context of the
> discussions in this forum.
>
>    I've with me a proposed database design which is not normalized
> fully. There are some other tables that we plan to denormalize. The
> targeted database platform is Oracle Database 10g Release 2.
>
>   For ex:- We have a table called user_groups which links the users
> and the groups table. But now we are thinking to eliminate this table
> and have a group_ids column storing comma separated values of groups
> ids from the groups table instead.
>
>    The records will look like -
>
>    groups
>    ------
>
>    group_id                    group_name
>         1       Admin
>         2       Operator
>         3       Power User
>
>    users
>    ------
>
>    user_id                    user_name                group_ids
>         1       User1           1,3
>         2       User2           2,3
>
>    The targeted database will be a typical OLTP system. Please suggest
> if this approach to database design has any fundamental flaws.
>
> Thanks in advance
> DeB

While people have pointed out some issues, they haven't really said why. The why is this: modern database engines are written to expect proper normalization. When you denormalize, you often wind up fighting with basic assumptions that the engine is making, so it has to perform lots of unnecessary work, and your program has to perform even more unnecessary work, so everything winds up working more slowly the more data you have - we call this a scalability issue. You can really wind up shooting your foot, as it can appear faster to denormalize at small scales - and the scaling goes beyond mere data size and physical placement, but also includes concurrency (multiple users accessing the same data). Oracle is quite amazing when it comes to keeping things available in memory, rather than going to disk, for example. It winds up being easy to demonstrate a situation where a silly toy database is faster than Oracle... for a simple report running for one person on a laptop. A typical OLTP system has much more going on than that.

This is so fundamental, you can hardly blame people just skipping over it and calling what you propose stupid. It inevitably is stupid, and most of us have had to deal with it. There are situations where denormalization is called for - but it is the very, very last thing you do if you can't make it work some other way, and nearly always inappropriate in OLTP. Find a book by Tom Kyte with "Oracle" and "Design" in the title.

And yes, I've even let myself be talked into stupid "security enhancement tables" with users and groups and other stuff overlaid into a table with comma separation, not worth it, not worth it, not worth it, not worth it. They only do it so you get roped into having to do data entry to maintain it. Did I mention it is not worth it? Eventually the code to deal with the possiblities can rival the data size, and needs to be maintained all across the app.

jg

--
_at_home.com is bogus.
My first Corvette was very much like the antique in Star Trek.
Received on Tue May 12 2009 - 12:50:09 CDT

Original text of this message