Re: Guidelines to a decent support of surrogate key implementation
Date: 24 May 2007 11:27:43 -0700
Message-ID: <1180031263.364874.174190_at_q75g2000hsh.googlegroups.com>
On 24 mai, 20:14, DBMS_Plumber <paul_geoffrey_br..._at_yahoo.com> wrote:
> On May 24, 9:27 am, "Brian Selzer" <b..._at_selzer-software.com> wrote:
>
>
>
> > CON
> > Keys can change over time. Consider the following:
>
> > R{A, B, C} | AB --> C, and several other relations referencing R[AB]. R:
> > {A:1, B:2, C:3}
> > {A:2, B:2, C:2}
>
> > T1: User 1 reads row {A:1, B:2, C:3}, starts a long-running calculation.
>
> > T2: User 2 updates row {A:1, B:2, C:3} to {A:3, B:2, C:3} updates cascade
> > throughout the database. R:
> > {A:3, B:2, C:3}
> > {A:2, B:2, C:2}
>
> > T3: User 3 updates row {A:2, B:2, C:2} to {A:1, B:2, C:3} updates cascade
> > throughout the database. R:
> > {A:3, B:2, C:3}
> > {A:1, B:2, C:3}
>
> > T4: User 1 updates row {A:1, B:2, C:3} to {A:1, B:2, C:4} -- BUG!!!!
>
> All your example does is show that users who don't understand
> isolation levels will create bugs.
>
> If User 1 was planning to update the tuple they had read, then they
> damn well better ensure that their read is repeatable. This would
> block User 2 and User 3.
>
> How is this any different than the case where you choose to
> concatenate R[A,B] into a single attribute? Let me repeat your example
>
> R { A, B } A->B
>
> [ A:1, B:1 ]
> [ A:2, B:2 ]
>
> T1: User 1 Read(A:1,B:1) (Wander off to compute F(B:1) )
> T2: User 2 Update(A:1,B:1) -> (A:3,B:1)
> T3: User 3 Update(A:2,B:2) -> (A:1,B:2)
> T4: User 1 Update(A:1,X) -> (A:1, F(B:1) ) <-- Bug
>
> Let me also say that the biggest PRO of Cimode's Approach # 1 is
> purely performance. It would be simpler and smaller (fewer indices)
> and would make all kinds of nice things possible wrt.physical query
> plans.
>
> Why has it not been done? Because making this change involves
> serious surgery to the DBMS. Try getting that past the bean
> counters ....
Let me summarize the latest arguments put...
LINE1> People who think that surrogate key should be an internal
physical mechanism to the dbms and invisible to designer who could
focus on the logical selection of primary keys
LINE2> People who think that a trdbms should allow the designer to
have some control
LINE3> People who think there is nothing wrong with current surrogate
key implementations
I personally belong to first category.
What do you think are the Pro/Cons of each approach? Here is mine:
LINE1>
PRO Designer can focus better on logical design. Lesserror prone in
key implementation
Independence between the logical and physical layer isimplemented
at design time.
DBM_Plumber: Performance enhanced because of bettershowplans
CON Surrogate key implementation becoming internaladdressing pointers,
users can not find any more familiarity in surrogate keys. Assuming familiarity is an advantage, that would become a CON.
LINE2>
PRO Familiarity Brian: Keys should be subject to change (a violation of *stability* requirement for pk's) CONS Introducing a part of subjectivity in key generation process DBMS_Plumber: Increases risk of concurrency problems LINE3> PRO ? CONS Keys implemented are not primary key implementations Product oriented bias for definition of keys All consequences arising for not having identityReceived on Thu May 24 2007 - 20:27:43 CEST