Re: Can we solve this -- NFNF and non-1NF at Loggerheads

From: <lauri.pietarinen_at_atbusiness.com>
Date: 7 Feb 2005 11:10:06 -0800
Message-ID: <1107803406.581986.297810_at_c13g2000cwb.googlegroups.com>


Paul wrote:
> lauri.pietarinen_at_atbusiness.com wrote:
> > You are correct, except for perhaps the eliminating of NULLs (if
> > you think that NULLs are a bad idea). Nulls would be represented
> > by empty sets. Representing outer joins is more natural since rows
> > with no corresponding "child rows" could be handled the same as the
> > ones with child rows. Also, if you have a "multiway" outer join
> > things get pretty tricky. Say if the person is associated with
> > n emails and m phone numbers, you will get n x m rows with the
> > person-columns repeated. These have to be eliminated in the
> > application program.
>
> Could you explain what exactly you mean by "multiway join"? - Google
> isn't being too helpful.
>

Sorry, that was just a term I invented on the fly!

Take, for example the following tables:

PERSON(person_id PK, person_nme);

--person can have multiple email-addresses EMAIL(email_id (PK), person_id(FK), email_txt)

--person can have multiple phonenumbers
PHONE(phone_id (PK), person_id(FK), phone_type, phone_number)

With the following values:

PERSON:

1, TOM
2, DICK
3, HARRY

EMAIL

1, 1, tom_at_acme.com
2, 1, tom_at_hotmal.com
3, 2, dick_at_mycomp.com

PHONE

1, 1, MOBILE, 111-222 -tom
2, 1, OFFICE, 222-333 -tom
3, 1, FAX, 444-555    -tom
4, 3, MOBILE, 666-777 -harry

The SQL to get all email-addresses and phone numbers would be something like this:

select ...
  from PERSON P, EMAIL E, PHONE PH
  where P.person_id = E.person_id and

        P.person_id = PH.person_id

The problems with this SQL is that

  1. You would get 6 rows for TOM, 2 for each phonenumber and 3 for each email address
  2. For Dick and Harry you would not get any rows

Using outer join you can correct problem 2)

*A* solution for 1) would be to nest
the emails and phone numbers, naturally
only in the result, not in the schema!

Another solution would be to use
separate gueries for each set of
emails and phone numbers, but that
WOULD add to the network traffic.

And you can weed the extra rows
out within the application program,
but that does not sound like very
elegant.

We would of course need an API in the
programming language that can understad
this construct, and if we had an IDE
it would have to understand this format
and provide controls accordingly.

>
> > Another benefit is lower network traffic (which can also be
achieved
> > via compression, of course), because the "outer table columns" are
not
> > repeated on each row.
>
> OK, but wouldn't an optimization like this be best done at the
physical
> level rather than the logical level? Premature optimization and all
that.

This optimization would come "for free"
>
> > And further, it would enable a dynamic GUI generation, i.e. a GUI
> > that is generated "on the fly"). And it would also be a "hint" to
> > a report generator which can easily figure out the break point
> > columns.
>
> You mean if the network is so slow that the query result trickle over

> very slowly? Couldn't you get the same effect by just specifying an
> appropriate sort order to the query? I can't see that it would make
much
> difference to a GUI whether the data came over in a nested relation
or not.
>

I mean that one would be able to partly generate applications based on the SQL-query.

regards,
Lauri Pietarinen Received on Mon Feb 07 2005 - 20:10:06 CET

Original text of this message