Re: Dreaming About Redesigning SQL

From: Steve Lancour <stevel_at_lancour.com>
Date: Thu, 30 Oct 2003 20:05:04 -0500
Message-ID: <mKudnYI38-U5KTyiRVn-gw_at_comcast.com>


Bob Badour wrote:

> "Steve Lancour" <stevel_at_lancour.com> wrote in message
> news:M_GdnSiEV8OzRAKiRVn-ug_at_comcast.com...
> 
>>Bob Badour wrote:
>>
>>>Steve Lancour <stevel_at_lancour.com> wrote in message
>>>news:<KcWdnaY9u-hd1wCiRVn-vw_at_comcast.com>...
>>>
>>>>andrewst wrote:
>>>>
>>>>>Originally posted by Steve Lancour
>>>>>
>>>>>>In Pick I'd do it with two files (tables):
>>>>>>
>>>>>>Persons Table:
>>>>>>
>>>>>>Marshall
>>>>>>Mike
>>>>>>
>>>>>>Phones Table:
>>>>>>
>>>>>>123-4567
>>>>>>234-5678
>>>>>>222-2222
>>>>>>
>>>>>>So far, we've used the same logical structure. Where in the
>>>>>>relational model you'd use a third table to represent the
>>>>>>intersections of Persons and Phones, I would define a multi-valued
>>>>>>attribute (column) in Persons for Phones and in Phones for Persons:
>>>>>>
>>>>>>Phones attribute (column) in Persons records:
>>>>>>
>>>>>>Marshall 123-4567]234-5678
>>>>>>Mike 234-5678]222-2222
>>>>>>
>>>>>>Persons attribute (column) in Phones records:
>>>>>>
>>>>>>123-4567 Marshall
>>>>>>234-5678 Marshall]Mike
>>>>>>222-2222 Mike
>>>>>>
>>>>>>Referential integrity is the responsibility of the developer. In a
>>>>>>poorly-designed system there may be any number of programs updating
>>>>>>these records. Each such program must enforce the rules, and any rule
>>>>>>changes must be made separately in each. The opportunity for mistakes
>>>>>>leading to corruption is undeniable.
>>>>>>
>>>>>>In a well-designed system, only one Pick-Basic subroutine handles
>>>>>>updates and deletions to the Person/Phone relations for *all* input
>>>>>>sources (other Pick programs, web interface, batch load, etc.). This
>>>>>>subroutine may be implemented as a trigger or as part of a business
>>>>>>rules layer "talking to" a variety of "front-end" applications. If
>>>>>>desired, the two record updates may be enclosed in a transaction to
>>>>>>guarantee integrity.
>>>>>>
>>>>>>In this example, the query to get Marshall's phone numbers would be:
>>>>>>
>>>>>>select Persons "Marshall" Phones
>>>>>>
>>>>>>To query for Persons using 234-5678:
>>>>>>
>>>>>>select Phones "234-5678" Persons
>>>>>>
>>>>>>In both cases, Pick's record-key hashing allows a "direct hit" on the
>>>>>>records requested without the need for an index or a table scan.
>>>>>>
>>>>>>Steve Lancour
>>>>>
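For those who think in SQL, the two Pick queries quoted above map onto the
three-table relational design the thread keeps contrasting it with. A minimal
sketch using SQLite from Python (the table and column names are mine, not part
of the original example):

```python
import sqlite3

# In-memory database standing in for the three-table relational design.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE persons (name TEXT PRIMARY KEY);
    CREATE TABLE phones  (number TEXT PRIMARY KEY);
    -- The junction table plays the role of Pick's multi-valued attributes.
    CREATE TABLE person_phones (
        name   TEXT REFERENCES persons(name),
        number TEXT REFERENCES phones(number),
        PRIMARY KEY (name, number)
    );
    INSERT INTO persons VALUES ('Marshall'), ('Mike');
    INSERT INTO phones  VALUES ('123-4567'), ('234-5678'), ('222-2222');
    INSERT INTO person_phones VALUES
        ('Marshall', '123-4567'), ('Marshall', '234-5678'),
        ('Mike', '234-5678'), ('Mike', '222-2222');
""")

# Pick: select Persons "Marshall" Phones
marshalls_phones = [n for (n,) in con.execute(
    "SELECT number FROM person_phones WHERE name = 'Marshall' ORDER BY number")]

# Pick: select Phones "234-5678" Persons
users_of_234 = [n for (n,) in con.execute(
    "SELECT name FROM person_phones WHERE number = '234-5678' ORDER BY name")]

print(marshalls_phones)  # ['123-4567', '234-5678']
print(users_of_234)      # ['Marshall', 'Mike']
```

Note that each person/phone pair is stored exactly once here; both directions
of the lookup come from the same junction table rather than from two mirrored
multi-valued attributes.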
>>>>>Right, well I'm not in the business of Pick-bashing, really. But we
>>>>>could achieve the same "direct hit" performance in Oracle (the SQL
>>>>>DBMS I happen to be familiar with) using 2 hash clusters if we wanted
>>>>>to - provided we store the person/phone intersection data twice as you
>>>>>would in Pick.
>>>>>
>>>>>We wouldn't actually do that, of course, because 99.999% of the time
>>>>>ensuring data integrity is more important than shaving a CPU
>>>>>millisecond off the access time. And as you admit, storing the same
>>>>>data twice separately runs an increased risk of corruption.
>>>>>
>>>>>So yes, we will add indexes (unless we are sure our tables will always
>>>>>be very small), and a typical keyed query may involve a few logical
>>>>>reads rather than 1. If the query is performed a lot, the index
>>>>>entries are likely to be cached in the server memory, so that the
>>>>>number of physical reads may be fewer (zero if we are really lucky);
>>>>>if it isn't performed a lot, we probably won't notice the additional
>>>>>physical reads anyway.
>>>>>
>>>>>--
>>>>>Posted via http://dbforums.com
>>>>
>>>>In my example, if I bracket the two updates inside a transaction I
>>>>eliminate the possibility of corruption during the update.

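To make the bracketing concrete: with the doubled Pick-style storage, both
mirrored records must change together or not at all. A rough Python/SQLite
model of that (the record layout, `]` value mark, and helper name are
illustrative, not from the original Pick code):

```python
import sqlite3

# Two tables modeling the mirrored Pick records with ]-separated multivalues.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person_rec (name TEXT PRIMARY KEY, phones TEXT)")
con.execute("CREATE TABLE phone_rec (number TEXT PRIMARY KEY, persons TEXT)")
con.execute("INSERT INTO person_rec VALUES ('Mike', '234-5678')")
con.execute("INSERT INTO phone_rec VALUES ('222-2222', '')")
con.commit()

def add_link(con, name, number, fail=False):
    """Append the person/phone pair to BOTH mirrored records, or to neither."""
    try:
        con.execute(
            "UPDATE person_rec SET phones = phones || ']' || ? WHERE name = ?",
            (number, name))
        if fail:                 # simulate a crash between the two updates
            raise RuntimeError("crash")
        # The phone record starts empty, so a plain SET suffices for this sketch.
        con.execute("UPDATE phone_rec SET persons = ? WHERE number = ?",
                    (name, number))
        con.commit()
    except Exception:
        con.rollback()           # neither record is left half-updated

add_link(con, 'Mike', '222-2222', fail=True)   # simulated mid-update crash
broken = con.execute(
    "SELECT phones FROM person_rec WHERE name='Mike'").fetchone()[0]
print(broken)  # '234-5678' -- the partial update was rolled back

add_link(con, 'Mike', '222-2222')              # normal case: both commit together
ok = con.execute(
    "SELECT phones FROM person_rec WHERE name='Mike'").fetchone()[0]
print(ok)      # '234-5678]222-2222'
```

The transaction removes torn updates, which is the point conceded; the next
exchange is about a different failure mode, mistakes in the update logic itself.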
>>>How do you eliminate the possibility of corruption due to flaws in
>>>your programming logic or another programmer's logic?

>>I can't *eliminate* the possibility of corruption in Pick any more than
>>I could on any other system if someone with full DBA privileges makes a
>>mistake.

> A DBA's mistake cannot harm any data protected by a declared integrity
> constraint. You postulate two mistakes: one mistake to remove the
> constraint and a second mistake that corrupts the data. That seems
> rather less likely than an application programming error.

No, I postulate only one mistake. If the DBA makes a mistake in setting
up the constraint, the applications may be perfect and yet the data can
still be corrupted. The same would be true of a mistake in the example
Pick code I posted.
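For contrast, here is roughly what a declared integrity constraint buys: once
the rule is stated to the DBMS, a single application-side mistake cannot store
an orphaned link, because the DBMS itself rejects the bad row. A small sketch
in Python/SQLite (table names are illustrative; SQLite stands in for any SQL
DBMS with foreign keys):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked
con.executescript("""
    CREATE TABLE persons (name TEXT PRIMARY KEY);
    CREATE TABLE person_phones (
        name   TEXT NOT NULL REFERENCES persons(name),
        number TEXT NOT NULL,
        PRIMARY KEY (name, number)
    );
""")
con.execute("INSERT INTO persons VALUES ('Marshall')")

# A buggy application tries to file a phone number under a misspelled,
# nonexistent person:
try:
    con.execute("INSERT INTO person_phones VALUES ('Mrshall', '123-4567')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True                       # the DBMS refuses the orphan row

print(rejected)  # True
```

Of course, as the paragraph above argues, this only protects the data if the
constraint itself was declared correctly in the first place.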
