Re: Dreaming About Redesigning SQL

From: Steve Lancour <stevel_at_lancour.com>
Date: Mon, 27 Oct 2003 11:44:10 -0500
Message-ID: <KcWdnaY9u-hd1wCiRVn-vw_at_comcast.com>


andrewst wrote:
> Originally posted by Steve Lancour
>
>

>>In Pick I'd do it with two files (tables):

>
>
>
>>Persons Table:

>
>
>
>>Marshall

>
>
>>Mike

>
>
>
>>Phones Table:

>
>
>
>>123-4567

>
>
>>234-5678

>
>
>>222-2222

>
>
>
>>So far, we've used the same logical structure.  Where in the
>>relational

>
>
>>model you'd use a third table to represent the intersections of
>>Persons

>
>
>>and Phones I would define a multi-valued attribute (column) in Persons

>
>
>>for Phones and in Phones for Persons:

>
>
>
>>Phones attribute (column) in Persons records:

>
>
>>Marshall  123-4567]234-5678

>
>
>>Mike      234-5678]222-2222

>
>
>
>>Persons attribute (column) in Phones records:

>
>
>>123-4567  Marshall

>
>
>>234-5678  Marshall]Mike

>
>
>>222-2222  Mike

>
>
>
>>Referential integrity is the responsibility of the developer.  In a

>
>
>>poorly-designed system there may be any number of programs updating

>
>
>>these records.  Each such program must enforce the rules and any rule

>
>
>>changes must be made separately in each.  The opportunity for mistakes

>
>
>>leading to corruption is undeniable.

>
>
>
>>In a well-designed system, only one Pick-Basic subroutine handles

>
>
>>updates and deletions to the Person/Phone relations for *all* input

>
>
>>sources (other Pick programs, web interface, batch load, etc.).  This

>
>
>>subroutine may be implemented as a trigger or as part of a business

>
>
>>rules layer "talking to" a variety of "front-end" applications.  If

>
>
>>desired, the two record updates may be enclosed in a transaction to

>
>
>>guarantee integrity.

>
>
>
>
>>In this example, the query to get Marshall's phone numbers would be:

>
>
>
>>select Persons "Marshall" Phones

>
>
>
>
>
>>To query for Persons using 234-5678:

>
>
>
>>select Phones "234-5678" Persons

>
>
>
>
>
>>In both cases, Pick's record-key hashing allows a "direct hit" on the

>
>
>>records requested without the need for an index or a table scan.

>
>
>
>>Steve Lancour 

>
>
> Right, well I'm not in the business of Pick-bashing, really. But we
> could achieve the same "direct hit" performance in Oracle (the SQL
> DBMS I happen to be familiar with) using 2 hash clusters if we wanted
> to - provided we store the person/phone intersection data twice as you
> would in Pick.
>
>
>
> We wouldn't actually do that, of course, because 99.999% of the time
> ensuring data integrity is more important than shaving a CPU millisecond
> off the access time. And as you admit, storing the same data twice
> separately runs an increased risk of corruption.
>
>
>
> So yes, we will add indexes (unless we are sure our tables will always
> be very small), and a typical keyed query may involve a few logical
> reads rather than 1. If the query is performed a lot, the index
> entries are likely to be cached in the server memory, so that the
> number of physical reads may be fewer (zero if we are really lucky);
> if it isn't performed a lot, we probably won't notice the additional
> physical reads anyway.
>
>
> --
> Posted via http://dbforums.com

In my example, if I bracket the two updates inside a transaction I eliminate the possibility of corruption during the update. The opportunity for corruption comes after the fact from machine failure or from an errant program (or programmer). I can reduce the first with hardware redundancy and the second with rigorous software QA. A side benefit of storing the relationships twice is that in the event of a failure I can reconstruct Persons' phone-numbers from the Phones table or vice-versa. If need be, I can perform this reconstruction on a live system while it's in use.

As another poster noted, I could store the multi-valued field only once and create an index to handle queries in the opposite direction. As you note above, context makes all the difference in these design decisions.   I'm not trying to argue that Pick is better than relational as much as it offers similar capability and flexibility.

Steve Lancour Received on Mon Oct 27 2003 - 17:44:10 CET

Original text of this message