Re: consolidation of multiple rows
Date: Wed, 5 Mar 2008 09:49:13 -0800 (PST)
Message-ID: <b49a366f-7148-4f2e-84ea-f71f9377f814@e10g2000prf.googlegroups.com>
- If two of the rows in your raw data staging table have different values for the same attribute, say a phone number, which one do you pick or does it matter? If not, then use this skeleton:
SELECT company_name, MAX(phone_nbr), MAX(..), ..
FROM RawData
GROUP BY company_name;
else find the conflicts:
SELECT company_name AS conflicting_data
FROM RawData
GROUP BY company_name
HAVING COALESCE (NULLIF(COUNT (DISTINCT phone_nbr), 1), NULLIF(COUNT
(DISTINCT ..), 1), .. ) IS NOT NULL ;
2) If two of the rows in your raw data staging table have the same
value for the same attribute, do you need to know or do you just use
it?
3) If all of the rows in your raw data have NULLs for an attribute, do you use the NULL or want to now about it?
There are a lot of games we can play with a HAVING clause on the raw data. You might want to get a copy of THINKING IN SETS and look at Chapter 10, "Scrubbing Data with Non-1NF Tables" for some more ideas. Received on Wed Mar 05 2008 - 11:49:13 CST