Re: does a table always need a PK?
Date: Mon, 01 Sep 2003 13:05:53 +0300
Message-ID: <3F531A01.10709_at_atbusiness.com>
Paul G. Brown wrote:
>Lauri Pietarinen <lauri.pietarinen_at_atbusiness.com> wrote in message news:<bitn92$pb1$1_at_nyytiset.pp.htv.fi>...
>
>
>>Thanks for interesting posting, Paul! Could you give me more details on
>>that last paragraph? When were
>>they (set algebras) tried? By the original System-R team? At some
>>later time? Could it be that
>>it was found hard in the mid 70's but could not be tried again because
>>of SQL-dominance?
>>
>>
[snipped]
> Which gives hope that eliminating duplicates might not be cripplingly
> expensive. We just haven't figured out how to do it efficiently. Well,
> we know how to pull dupes from a given relation, but doing it efficiently
> in a query plan, particularly when time-to-first-row is a sensitive
> performance number, is much harder.
>
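(To make sure I follow the time-to-first-row point, I imagine you mean a plan
for something like this -- the names are just made up:

    SELECT DISTINCT o.customer_id
    FROM   orders o;

    -- A sort-based duplicate elimination is blocking: no row can be returned
    -- until the whole input has been sorted.  A hash-based one can emit a
    -- value the first time it is seen, but must still remember every distinct
    -- value seen so far, and may spill to disk on a large input.

So a plan that has to deduplicate pays either in latency to the first row or
in memory. Please correct me if I have misread you.)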
Paul, most of your quotes come from over 20 years ago, and I am not here to
criticise the decisions made by those pioneers. They did, after all, do good
work on optimisation, and I can sympathise with their concern for performance.
However, this is now 20 years later, and we are stuck with SQL for the time
being. So would you agree with me that the reasons for not "fixing" this
problem (duplicates) have more to do with the large installed base and
commercial interests than with technology?
You must be familiar with BS12, of which Darwen was one of the architects (http://www.mcjones.org/System_R/bs12.html).
He states:
"Because BS12 spurned duplicate rows, it was obliged to make "duplicate removal avoidance" a strong feature of its optimiser--something that SQL implementations are only now beginning to catch up with (in SQL terms, this means not always firing up the duplicate elimination mechanism just because the user said DISTINCT--you might be able instead to prove that there cannot be any duplicates)."
Of course, BS12 did not have a very long life, so I suppose there is not much
data on how it fared in the "real world".
> And to repeat: the point here is not that duplicate values are OK. They're
> not. They complicate design, dash hopes for data consistency, and do make
> certain optimization problems harder.
>
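I agree with that, and even a trivial sketch shows why (the names are
invented):

    -- no key declared, so nothing stops duplicate rows
    CREATE TABLE payments (
        customer  VARCHAR(20),
        amount    DECIMAL(10,2)
    );
    INSERT INTO payments VALUES ('Smith', 100.00);
    INSERT INTO payments VALUES ('Smith', 100.00);

    -- Is the second row a genuine second payment or an accidental re-insert?
    -- The data itself cannot say, so is this sum right or wrong?
    SELECT SUM(amount) FROM payments;                     -- 200.00

    -- And there is no way, in portable SQL, to delete just one of the two:
    DELETE FROM payments
    WHERE  customer = 'Smith' AND amount = 100.00;        -- removes both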
Can you help me? Why do so many posters (some of whom are db-researchers,
some of whom are implementers) claim that the bag model poses no problems?
Or is this question undecidable?
> But they have negative implications.
> And it isn't reasonable to ignore their dark side.
>
It would be interesting to see a current assessment of the negative
implications.
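One implication that I believe still bites is the loss of query equivalences;
for instance (supplier/shipment names invented, and please correct me if I
have this wrong):

    -- Under bag semantics these two are NOT equivalent, so the optimiser is
    -- not free to pick whichever plan is cheaper:
    SELECT s.sname
    FROM   s JOIN sp ON sp.sno = s.sno;      -- one row per matching shipment

    SELECT s.sname
    FROM   s
    WHERE  EXISTS (SELECT 1 FROM sp WHERE sp.sno = s.sno);  -- one per supplier

    -- Under set semantics (or once a DISTINCT has been proved away) both ask
    -- the same question, and the rewrite to the cheaper form is always safe.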
regards,
Lauri Pietarinen
Received on Mon Sep 01 2003 - 12:05:53 CEST
