Re: Dreaming About Redesigning SQL

From: andrewst <member14183_at_dbforums.com>
Date: Fri, 07 Nov 2003 08:57:53 -0500
Message-ID: <3571843.1068213473_at_dbforums.com>


Originally posted by Mike Preece

> > > Why were RVAs intoduced (or accepted) into relational theory
> in the

> > > firs..., I mean, 10 years ago?

> > If you really want to understand this,

>

> Not really, to be honest. From what I've read I'm pretty sure we've

> had a much better way of doing this in Pick for more than 30 years -

> with multivalued attributes. I wonder where Date & Darwen got the

> idea.

>

> > read The Third Manifesto by C J

> > Date and H Darwen. I haven't got it with me and to be honest, I
> can't

> > remember why. I think it is more a case of accepting it as
> not

> > subverting the relational model rather than advocating doing
> it.

>

> I wonder how many more things they and others will come to accept in

> time to come. I guess it might just help someone to come up with the

> first "true" relational database. "It's relational Jim - just not as

> we know it".

>

As I have said before, the fact that RVAs are acceptable doesn't mean they should be used to model many:many relationships. I would only use an RVA if I considered the RVA data to be some kind of "blob" of data I had no more intention of interrogating in the database than I would a bitmap image. And Date and Darwen were responding to the "demand" for complex data types from the OO camp, not trying to replicate Pick. They say RVAs are "legal", they don't say they should be used instead of normalisation.

>

> > > > The traditional relational solution treats those queries
> substantially

> > > > the same.

> > > Substantially? Why is that word in there?

> > Well, obviously the 2 queries are not THE same query...

> > > > Using an RVA, the code for one query is considerably
> different from

> > > > the other.

> > > Interestingly enough, in Pick they would be:

> > > List Persons Name with PhoneNumber "12345"

> > > and

> > > List Persons "Mike" PhoneNumber

> > i.e. syntactically different:

> > List <table> <attribute1> with <attribute2> <value>

> > and

> > List <table> <Value> <attribute2>

> > In SQL:

> > select person_name from person_phones where
> phone_number='12345';

> > and

> > select phone_number from person_phones where
> person_name=Mike';

> > i.e. in both cases:

> > select <attribute1> from <table> where <attribute2> =
> <vaue>;

>

> OK. Btw - if you consider this syntax thing important (I don't), we

> could...

> List Persons PhoneNumber with a0 = "Mike"

>

Fair enough - as long as your first one was meant to be:

List Persons Name with PhoneNumber = "12345"

Then syntax can be the same. But what is "a0"? Why not "Name"?

>

> > > and would run extremely efficiently regardless of whether
> Mike's got

> > > one or a dozen phones or how many of his numbers are shared by
> other

> > > people.

> > So storing the data assymetrically has no impact on performance
> then,

> > good or bad?

>

> It's logical to store phone numbers against a person. It's also more

> efficient.

>

.. in Pick. It is not more efficient in an RDBMS. But anyway my point was that you are saying that the query by phone would also be "extremely efficient". But presumably not AS efficient as the query by name, since you say that is "more efficient".

So, you have:

Query By Efficiency

Name X

Phone Y

where X > Y

Whereas in relational we have:

Query By Efficiency

Name X

Phone Y

and we can choose whether X = Y, X > Y or X < Y according to decisions made at the PHYSICAL level, which do not require any change at the LOGICAL level, i.e. the SQL for a given query remains unchanged.

You of course will contend that the Pick X is better than the RDBMS X, and I will disagree, but we can't really test that here can we? However, I will say that in Oracle (e.g.) it is perfectly possible that the query to list "Mike" and all his phone numbers can be achieved with 1 disk read, provided the data is clustered and fits into 1 block (typically 4K or 8K); or 0 disk reads if we read the same block earlier and it is still in the cache.

>

> > > Let's look at the evidence:

> > > We agree that the person is more relevant in the persons to
> phones

> > > relationship.

> > But we disagree over whether that is necessarily relevant to
> logical

> > database design.

>

> I thought we agreed that when we were concentrating only on logical

> design.

>

No, unless you mean you agreed that it is irrelevant. All I have agreed, for the sake of argument, is that one relationship may be more relevant than the other to the USER. (I hope) I have never said that it needs to be taken into account during logical design. It does not.

>

> > We agree that it may be relevant to physical database

> > design, but you don't seem to understand or accept that logical
> and

> > physical design are different issues.

>

> It just so happens that the physical implementation is a "natural" one

> in Pick. If you decide it's logical to record phone numbers against a

> person - then that's what you do in Pick and that's how it's stored

> physically. I'm sorry if you folks have a hard time accepting that

> that makes sense. Maybe it's because it's nigh on impossible to do in

> relational without using RVAs.

>

It just so happens that Pick has no concept of logical/physical distinction. That is Pick's handicap, not relational's. Sorry if you have a hard time understanding that. You have NO CHOICE over logical design: it HAS to correspond to the physical design you select for performance reasons; we have choice: our logical design is optimised for querying, our physical design is optimised for performance. Forget RVAs.

>

> Maybe it's time for a new thread. This one has gone OT.

>

Agreed, this should come to a close. If you still don't get what I've been saying, then you never will. This is probably due as much to my shortcomings as a teacher of the relational model as to your failure to understand what I have said.

>

> > > What? We're trying to get someone's phone number here
> guys!

> > > It's simple. Pick does it simply. Someone said earlier in this
> thread

> > > that "perfection" is important to relational database people.
> Is it

> > > really?

> > If all you want is a phone number you can use a notepad. I
> should

> > bloody hope Pick does it simply! So does relational; how could
> it not?

>

> It depends whose number I'm interested in. I have a notepad, sure. In

> it I have a single entry for a person's (or organisation's) name with

> all of their phone numbers (home, work, mobile, fax) listed against

> that single entry. If the number I want isn't in there I'll probably

> use a telephone directory. They also have a single entry for a name

> with multiple phone numbers. I also have Pick running on my computer

> which allows me to store things in exactly the same way. Maybe it's

> wrong for Pick to store things that way because it's both logical and

> physical and apparently that's wrong(?). Anyway, it does allow me to

> type in something like:

>

> List Persons "Mike" Name PhoneNumber PhoneType PhoneUsage

>

> and get the information I want just about as efficiently as possible.

> I just can't think how that could be improved upon - it couldn't be

> much simpler and I can't think how you could design a DBMS that gets

> this data more efficiently than a single direct read of a 2k frame

> from disk (irrespective of the size of the Phones file). Oh I get it -

> Bob's 8-fold improvement jiggery pokery trick? Ha. Haha. Ha. Hmmm... I

> guess my sense of humour must be different to his. Seriously though -

> what is the equivalent SQL statement to get the same information?

>

SQL: select Name, PhoneNumber, PhoneType, PhoneUsage

from person_phones

where Name = "Mike";

Slightly longer, I'll grant you.

And no, we can't beat your single read of 2k from disk, but we can match it (see above).

>

> Would the statement be identical whether the data is coming from the

> PersonsPhones table or from RVAs in the Persons file?

>

No, because those are 2 different LOGICAL designs. But the statement would be identical regardless of any physical optimisations we might make such as:

  1. physically cluster the phone data with the person (like your Persons file with the phone MV attribute)
  2. physically cluster the person data with the phone
  3. store the phone data and person data independently
--
Posted via http://dbforums.com
Received on Fri Nov 07 2003 - 14:57:53 CET

Original text of this message