Re: consolidation of multiple rows

From: Ruud de Koter <nobody_at_internet.org>
Date: Wed, 05 Mar 2008 21:22:01 +0100
Message-ID: <47cf00e7$0$14349$e4fe514c@news.xs4all.nl>


hpuxrac wrote:

> On Mar 5, 12:49 pm, --CELKO-- <jcelko..._at_earthlink.net> wrote:

>> 1) If two of the rows in your raw data staging table have different
>> values for the same attribute, say a phone number, which one do you
>> pick or does it matter? If not, then use this skeleton:
>>
>> SELECT company_name, MAX(phone_nbr), MAX(..), ..
>> FROM RawData
>> GROUP BY company_name;
>>
>> else find the conflicts:
>>
>> SELECT company_name AS conflicting_data
>> FROM RawData
>> GROUP BY company_name
>> HAVING COALESCE (NULLIF(COUNT (DISTINCT phone_nbr), 1), NULLIF(COUNT
>> (DISTINCT ..), 1), .. ) IS NOT NULL ;
>>
> 
> Makes sense ...  perhaps the OP can give us some more background of
> how and why info gets into the raw table.
> 

>> There are a lot of games we can play with a HAVING clause on the raw
>> data. You might want to get a copy of THINKING IN SETS and look at
>> Chapter 10, "Scrubbing Data with Non-1NF Tables" for some more
>> ideas.
> 
> What is "Thinking in sets"? and who wrote it ... not familiar with
> that title.

One could hardly be familiar with the title, as it is copyrighted in 2008 (any decent internet bookstore will show you). Quite unsurprisingly, it is written by Joe Celko. I don't think he has any reason to be ashamed of some shameless self-promotion...

Keep up the good work, mr. Celko!

Ruud. Received on Wed Mar 05 2008 - 14:22:01 CST

Original text of this message