Re: Dreaming About Redesigning SQL

From: Bob Badour <bbadour_at_golden.net>
Date: Tue, 4 Nov 2003 08:01:57 -0500
Message-ID: <HrCdnX12GIwYPjqiRVn-sw_at_golden.net>


"andrewst" <member14183_at_dbforums.com> wrote in message news:3556922.1067944594_at_dbforums.com...
>
> Originally posted by Mike Preece
>
> > > > Do you agree that a list of a person's phone numbers can and
> > should be
>
> > > > considered an attribute of the person - just as a single
> > number is -
>
> > > > provided an acceptable means exists to access all "rows"
> > containing a
>
> > > > given phone number?
>
> > >
>
> > > No, I don't. Well, perhaps I am just being contrary! But if
> > you have a
>
> > > list of phone numbers for a person, you probably need further
> > attributes
>
> > > to differentiate them, e.g. an indication of whether the phone
> > is home,
>
> > > work, mobile; or whether one phone is best for day time and
> > another for
>
> > > evenings. So if this is a single attribute, it is actually a
> > relation-
>
> > > valued attribute with columns phone_no, phone_type etc.
> > However, I
>
> > > would prefer to normalise this data out into a separate table as
> > it
>
> > > actually makes it simpler to query in a relational
> > database(!)
>
> >
>
> > OK. This is how it could be done in Pick:
>
> >
>
> > We start with:
>
> > File: People
>
> > ID: Person
>
> > 001: First Name(s)
>
> > 002: Family Name
>
> > 003: Date of Birth
>
> > 004: Phone number
>
> >
>
> > We can use exactly the same database design and begin to add second,
>
> > third... any number of phone numbers to each person. I should perhaps
>
> > have pointed out that there is no need to reserve nulls for everyone
>
> > else if just one person has more than one phone - as there would be in
>
> > a relational table.
>
> >
>
> > Now then, suppose we do want to add "an indication of whether the
>
> > phone is home, work, mobile; or whether one phone is best for day time
>
> > and another for evenings". We could simply expand the existing file as
>
> > follows:
>
> >
>
> > File: People
>
> > ID: Person
>
> > 001: First Name(s)
>
> > 002: Family Name
>
> > 003: Date of Birth
>
> > 004: Phone number(s) (multivalued; controlling)
>
> > 005: Phone type(s) ('H'ome, 'W'ork, 'M'obile) (multivalued; dependent
>
> > on Phone number)
>
> > 006: Preferred usage period(s) ('D'aytime, 'E'vening) (multivalued;
>
> > dependent on Phone number)
>
> >
>
> > There is still no need to go to a separate table for this data. It
>
> > might be worth reminding you that we *could* store this additional
>
> > data in a separate file. With the Pick model you have more flexibility
>
> > - more options. You can maintain a many:many relationships as a
>
> > separate table, as a list of phones for a person or as a list of
>
> > people against a phone number. You can also store data associated with
>
> > a person's phone against that person - as in this example. It very
>
> > much depends on the requirements as to which option is the best fit.
>
> > Relevance - the real world requirement - is important, as is the
>
> > relatedness. Disregarding relevance and on concentrating simply on
>
> > relatedness is *not* the best way to go. It is perhaps also worth
>
> > pointing out that, by storing this data as variable in both length and
>
> > number of occurances for each individual, far less disk space is
>
> > required on the physical level. Also, being able to access all of the
>
> > data relating to a person's phone numbers, type of phone & preferred
>
> > usage in one 'hit' results in optimum performance.
>
> >
>
> > Do you see why I say the relational approach is a "she'll be right"
>
> > approach in comparison - and definitely not the best tool for the job?
>
> >
>
> > Regards
>
> > Mike.
>
> No, I totally disagree with you on your last statement. You are
> completely and utterly wrong there, sorry! You have several
> misunderstandings of the relational model within the argument above.
>
>
>
> First of all, as Bob pointed out yesterday, the relational model does
> allow for relation-valued attributes, so that all the phone numbers for
> a person could be stored within the single person record. However, that
> would not be the "traditional" relational approach to this.
>
>
>
> The "traditional" relational approach is, of course, to have a separate
> table for the phone numbers, like this (NB I will use SQL syntax; SQL is
> not trully relational, but never mind):
>
>
>
> CREATE TABLE person_phones
>
> ( person_id INT
>
> , phone_no VARCHAR2(20)
>
> , phone_type VARCHAR2(1) CHECK (phone_type IN ('H','W','M')
>
> , preferred_usage VARCHAR2(1) CHECK (preferred_usage IN ('D','E')
>
> );
>
>
>
> Now, if a person has 1 phone, we will insert 1 record into this table;
> if she has 2 we will insert 2; if none, we will insert none. There is
> no need, as you say there is above, to "reserve nulls for everyone else
> if just one person has more than one phone". That would only apply if
> we had used a separate COLUMN in the person table for each phone type
> (home_phone_no, work_phone_no, mobile_phone_no) - which presumably would
> be the same in Pick if you chose that approach.
>
> In any case, those "reserved nulls" may well occupy NO space, e.g.
> in Oracle any trailing NULL values in a record are simply not
> stored at all.
>
>
>
> (Aside: some people will splutter at me mentioning Oracle (or any other
> SQL DBMS) in a discussion about the relational model. Fair enough.
> The pure relational model is better than SQL, and SQL is better than
> the rest.)
>
>
>
> So your point about saving disk space is debatable at least. In
> practice, I suspect Pick will indeed use less disk space than a DBMS
> like Oracle

Your suspicion is wrong. Pick uses some rather inefficient value encodings.

>, probably because it dates from an era when disk space
> conservation was the holy grail. However, if so it is not the
> relational model that requires more disk space, it is merely the
> physical implementation. If minimising disk space is an issue, a DBMS
> like Oracle offers strategies for doing so, such as clustering
> (physically storing the phone numbers with the person record - without
> changing the logical 2 table design). Clustering also achieves the
> "person together with all phone data in one hit" requirement.

The factor of eight performance improvement we got over Wol's example relied on packing more data into the same space.

> As far as performance is concerned, a DBMS like Oracle has far more
> tricks up its sleeve than merely storing related data physically
> together. There is caching of recently retrieved data in memory; there
> are "index-organized tables" for which person_phones is a likely
> candidate; there are hash join algorithms used when the person and phone
> data for many people is required e.g. by a report or batch process.

Physical independence is good. Pick has none.

> At the end of the day, we are doomed to disagree.

Of course you are. Mike is ignorant, stupid and proud of it. He has no concept of the difference between the logical and the physical, and he lacks any ability to conceive of the difference. His brain is damaged. The question of whether the damage is innate or caused by Pick still intrigues me.

I wonder what sort of experiment one could do to find out.

> You don't know how
> relational works, and I don't know how Pick works (beyond what I've
> learned from you).

I do know how Pick "works". One can summarize it in one word: poorly.

> But the idea that the relational approach is a
> "she'll be all right" approach is utterly absurd! I urge you to educate
> yourself about it and understand why. Unlike Pick, the relational model
> is not a product, it is a theory on which products can be based (wholly
> or partially); a theory whose purpose is to find the BEST way to store,
> manipulate and retrieve data.

You have to remember who you are talking to. Mike is stupid as well as ignorant. I do not believe him capable of educating himself. Pick people are special, and I definitely mean the short bus kind of special. Received on Tue Nov 04 2003 - 14:01:57 CET

Original text of this message