Re: Can we solve this -- NFNF and non-1NF at Loggerheads

From: Dawn M. Wolthuis <dwolt_at_tincat-group.comREMOVE>
Date: Mon, 7 Feb 2005 20:23:43 -0600
Message-ID: <cu97rq$3qv$1_at_news.netins.net>


"Paul" <paul_at_test.com> wrote in message news:4207ccb9$0$22497$ed2619ec_at_ptn-nntp-reader03.plus.net...
> lauri.pietarinen_at_atbusiness.com wrote:
>> The SQL to get all email-addresses and phone numbers
>> would be something like this:
>>
>> select ...
>> from PERSON P, EMAIL E, PHONE PH
>> where P.person_id = E.person_id and
>> P.person_id = PH.person_id
>>
>> The problems with this SQL is that
>>
>> 1) You would get 6 rows for TOM, 2
>> for each phonenumber and 3 for each
>> email address
>
> OK, but that's because the question you are asking is "show me all
> possible email-phone combinations for each person". Really you want to be
> asking: "show me all emails for each person, and also show me all phone
> numbers" and I think there's no getting away from the fact that this is
> fundamentally two queries under the relational model (i.e. using
> first-order propositions).

I still have to investigate where exactly 1st order predicate logic breaks down when there are sets within sets, but I'm quite sure that the model I'm employing is able to stick with 1st order predicate logic in spite of child tables. I believe that whatever trips the "now we have to move to 2nd order predicate logic" switch isn't required for the modeling I do, but I'll have to research that again. So, this query is just one query in the NF2 model and it might look like this:

LIST People PersonName EmailAddresses PhoneNumbers

(relation name in this language preceeds the list of columns). The output for this would not be a cross-product of phones and e-mail addresses, but would look the way many would want it to look.

>> *A* solution for 1) would be to nest
>> the emails and phone numbers, naturally
>> only in the result, not in the schema!
>>
>> Another solution would be to use
>> separate gueries for each set of
>> emails and phone numbers, but that
>> WOULD add to the network traffic.
>
> Unless you used database drivers that could handle sending multiple
> queries to the database at once. And getting the answer back as multiple
> result sets.
>
>> We would of course need an API in the
>> programming language that can understad
>> this construct, and if we had an IDE
>> it would have to understand this format
>> and provide controls accordingly.
>
> OK so you're suggesting that a nested relation is only of use for
> phsically moving data from inside the database to outside - to some
> external application?

I don't know if that is Lauri's take, but I think that nested relations are useful for logical modeling across the board.

> I still think that this kind of thing is best done outside of the
> relational model, maybe as part of the database drivers. Keep the logical
> core as simple and elegant as possible

I'll claim that the NF2 model is both simpler and more elegant from several perspectives. I'd include a proof of that, but there is no room in this margin, so the proof is for the reader.

> and relegate any possible performance-enhancing tweaks to the physical
> level.

agreed.

> I can see how what you're saying could be useful at the physical level but
> I think it's dangerous to let it creep in to the logical level.
>
> Maybe the application could use some kind of extended-SQL that had nested
> relations, but the database drivers convert it into standard-SQL behind
> the scenes before speaking to the database. And then when the answer comes
> back (maybe as several tables), the drivers can convert it into the form
> you're suggesting before passing it back to the application.

The data don't ever need to be modeled in what-used-to-be-1NF. From the modeler's brain to the logical models and then to the database schema definitions (in a nonRDBMS), logical views, and queries -- child tables can be worked with as nested structures. There are tradeoffs, however, including the fact that the NF2 implementations of which I am aware typically have constraints coded in the same procedural application language as a component of the applications and not the dbms. I know this is considered very bad by some, but it, too, turns out to provide a big bang for the buck and does not, in general, result in poorer quality data from what I have seen. Of course there are lousy systems built with the variety of tools out there, whether using RDBMS or NF2 products. Later. --dawn

> Paul.
Received on Tue Feb 08 2005 - 03:23:43 CET

Original text of this message