[OT?] Value of duplicate values
Date: 18 Apr 2002 15:04:11 -0700
Message-ID: <61c84197.0204181404.7f24f292_at_posting.google.com>
Hi all
This is a follow-up to a previous message I posted on comp.databases.theory, which read:
It is a well-known fact that SQL returns duplicate values unless you use the SELECT DISTINCT option. What is almost as well-known (I hope!) is that relational theory is based on sets. I believe that the original motivation for returning bags ("sets" with duplicate values retained) was one of performance: to eliminate the extra step that would be required if duplicates were to be removed. My question is simple: do you personally have code where retaining duplicate tuples is essential, convenient, or otherwise useful? If so, could you give me some insight into what problem it solves?
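A minimal sketch of the bag-versus-set behaviour described above, using Python's standard sqlite3 module. The orders table, its columns, and its contents are hypothetical and serve only as an illustration:

```python
import sqlite3

# In-memory database with a hypothetical table; each row is unique
# because of order_id, but the city column repeats.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO orders (city) VALUES (?)",
    [("Oslo",), ("Oslo",), ("Aarhus",)],
)

# Plain SELECT: projecting away the key yields a bag (duplicates kept).
bag_rows = conn.execute("SELECT city FROM orders ORDER BY city").fetchall()

# SELECT DISTINCT: the extra duplicate-elimination step yields a set.
set_rows = conn.execute(
    "SELECT DISTINCT city FROM orders ORDER BY city"
).fetchall()

print(bag_rows)  # [('Aarhus',), ('Oslo',), ('Oslo',)]
print(set_rows)  # [('Aarhus',), ('Oslo',)]
```

Note that the base table itself contains no duplicate rows; the duplicates appear only once the key column is projected away.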
I did receive valuable information in that thread on whether duplicates are needed. The answer seemed to be that:
- duplicates in base tables were useless.
- sometimes the original input had duplicates in it (cash register applications were used as an example), which of course is no surprise.
- duplicates in the result of a query were accepted on performance grounds.
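One way to read the cash register example is sketched below, again with Python's sqlite3 module. The scans table, its columns, and the prices are assumptions made for illustration: repeated input is stored as distinct rows via a surrogate key, and duplicates then matter only in the projected values that an aggregate sees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: scan_no keeps every scan a distinct row, so the
# base table holds no duplicate tuples even when the same item is
# scanned twice.
conn.execute(
    "CREATE TABLE scans ("
    "scan_no INTEGER PRIMARY KEY, "
    "item TEXT NOT NULL, "
    "price_cents INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO scans (item, price_cents) VALUES (?, ?)",
    [("milk", 120), ("milk", 120), ("bread", 250)],
)

# Summing over the bag of prices counts both milk scans.
total = conn.execute("SELECT SUM(price_cents) FROM scans").fetchone()[0]

# Removing duplicates before summing silently drops one milk scan.
distinct_total = conn.execute(
    "SELECT SUM(DISTINCT price_cents) FROM scans"
).fetchone()[0]

# Alternative without duplicates in the result: one row per item
# with an explicit quantity.
receipt = conn.execute(
    "SELECT item, COUNT(*) AS quantity, SUM(price_cents) "
    "FROM scans GROUP BY item ORDER BY item"
).fetchall()

print(total)           # 490
print(distinct_total)  # 370
print(receipt)         # [('bread', 1, 250), ('milk', 2, 240)]
```

The grouped query carries the same information as the duplicated rows, with the multiplicity made explicit in the quantity column.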
Then why do I pose this question again? Because I would like some more information on the subject, particularly from a pragmatic point of view. So I ask you practitioners out here what your practical experience with duplicate values is. In what situations do you use SELECT DISTINCT? In what situations have you used SELECT DISTINCT but discovered that the resulting query performed just too badly? Have duplicate values ever caused unpleasant surprises for you or your users? What would you do if your favourite DBMS vendor decided to be "more relationally compliant" and suddenly banned duplicate values: would you cry, smile or be indifferent?
Any comments will be greatly appreciated, and let me once again apologize if this topic is considered off-topic.
Kind regards
Peter Koch Larsen
Received on Fri Apr 19 2002 - 00:04:11 CEST
