Re: Dreaming About Redesigning SQL

From: Mike Preece <michael_at_preece.net>
Date: 3 Nov 2003 21:21:54 -0800
Message-ID: <1b0b566c.0311032121.223e93e6_at_posting.google.com>


andrewst <member14183_at_dbforums.com> wrote in message news:<3552095.1067851333_at_dbforums.com>...
> Originally posted by Mike Preece
> > andrewst <member14183_at_dbforums.com> wrote in message
> > news:<3544856.1067597724_at_dbforums.com>...
> > > Originally posted by Mike Preece
[snip]
> > > > Point of agreement: "In many cases one side of a many:many
> > > > relationship can be considered to be of primary importance. The
> > > > relational model attaches no more importance to one side than the
> > > > other. The Pick model *allows* multiple secondary values to be grouped
> > > > according to the primary value in their relationship, although it
> > > > doesn't require them to be."
> > > > Fair? I think I'm simply restating what you've already said - although
> > > > I want to be perfectly clear on this before we go on to your further
> > > > point about optimisation v penalisation in the logical models.
> > > Fair. Let's move on!
> > Right! ;o)
> > Oh, just before we do, to be 100% accurate, I think perhaps I should
> > have said: "The Pick model allows attributes that are empty, single,
> > multi or sub-multi values". (Still trying to be as clear as I can
> > here.)
> > Assuming that the requirements dictate that the Person is of primary
> > importance in the Persons:Phones relationship:
> > Pick allows - as a single item (row):
> > File: People
> > ID: Person
> > 001: First Name(s)
> > 002: Family Name
> > 003: Date of Birth
> > 004: Phone number(s)
> > Pick = Relational theory equivalent terms (roughly):
> > File = Relational Table
> > ID = Primary Key
> > Attributes 001-004... within items = Columns in the table
> > In the relational model, as in the Pick model, if everyone had just
> > one unique phone number (or none at all), the phone number could be
> > considered to be an attribute of the person in exactly the same way
> > that their name or date of birth could be.
> > If it becomes necessary to record more than one phone number against a
> > person, does the relational model dictate that the logical schema must
> > change? I don't know the correct terminology, but would the list of
> > phone numbers for a person have to be held separately in its own table
> > and "referred to"?
> > This doesn't have to happen in Pick. An attribute can be multivalued.
> > If we want the data available for a Person, we look at the Person's
> > item. All of the attributes of a Person are available - in the same
> > "row". It doesn't matter whether they have single or multiple values.
> Yes, in the relational model if you had originally allowed for only one
> phone per person by putting a phone_no column in the person table, and
> now wanted multiple phones per person, you would have to move the
> phone_no attribute to a separate table. However, this isn't something
> one has to do very often in my experience, because the process of
> designing a relational database forces one to ask and answer questions
> like "can a person have more than one phone?" up front. And clearly the
> answer is "yes". Of course, there is still a choice of solutions:
>
> 1) Don't care, we just want ONE of the person's phone numbers. This
> solution will probably be regretted in the future.
>
> 2) OK, let's allow (say) 3 columns: home_phone, work_phone,
> mobile_phone. Not an uncommon solution.
>
> 3) Separate table. As many phones as you like!

I take the point that the designer of the database should try to anticipate the requirements - although there is that thing about "best laid plans" of course.

Each of the 3 choices above can equally be applied in Pick, although it's unlikely that any of them would be - and I hope you'll see why...

> > Do you agree that a list of a person's phone numbers can and should be
> > considered an attribute of the person - just as a single number is -
> > provided an acceptable means exists to access all "rows" containing a
> > given phone number?
>
> No, I don't. Well, perhaps I am just being contrary! But if you have a
> list of phone numbers for a person, you probably need further attributes
> to differentiate them, e.g. an indication of whether the phone is home,
> work, mobile; or whether one phone is best for day time and another for
> evenings. So if this is a single attribute, it is actually a relation-
> valued attribute with columns phone_no, phone_type etc. However, I
> would prefer to normalise this data out into a separate table as it
> actually makes it simpler to query in a relational database(!)

OK. This is how it could be done in Pick:

We start with:
File: People
ID: Person

001: First Name(s)
002: Family Name
003: Date of Birth
004: Phone number

We can use exactly the same database design and begin to add second, third... any number of phone numbers to each person. I should perhaps have pointed out that there is no need to reserve nulls for everyone else if just one person has more than one phone - as there would be in a relational table.

Now then, suppose we do want to add "an indication of whether the phone is home, work, mobile; or whether one phone is best for day time and another for evenings". We could simply expand the existing file as follows:

File: People
ID: Person

001: First Name(s)
002: Family Name
003: Date of Birth
004: Phone number(s) (multivalued; controlling)
005: Phone type(s) ('H'ome, 'W'ork, 'M'obile) (multivalued; dependent
on Phone number)
006: Preferred usage period(s) ('D'aytime, 'E'vening) (multivalued; dependent on Phone number)

There is still no need to go to a separate table for this data. It might be worth reminding you that we *could* store this additional data in a separate file. With the Pick model you have more flexibility - more options. You can maintain a many:many relationships as a separate table, as a list of phones for a person or as a list of people against a phone number. You can also store data associated with a person's phone against that person - as in this example. It very much depends on the requirements as to which option is the best fit. Relevance - the real world requirement - is important, as is the relatedness. Disregarding relevance and on concentrating simply on relatedness is *not* the best way to go. It is perhaps also worth pointing out that, by storing this data as variable in both length and number of occurances for each individual, far less disk space is required on the physical level. Also, being able to access all of the data relating to a person's phone numbers, type of phone & preferred usage in one 'hit' results in optimum performance.

Do you see why I say the relational approach is a "she'll be right" approach in comparison - and definitely not the best tool for the job?

Regards
Mike. Received on Tue Nov 04 2003 - 06:21:54 CET

Original text of this message