Re: marketing database challenge

From: <joe_celko_at_my-deja.com>
Date: 2000/03/29
Message-ID: <8bt55n$upq$1_at_nnrp1.deja.com>#1/1


>> We have a normalized database table showing product relationships per
customer. The format includes customer_id and product_id. For sake of argument, let's say that the company only sells three different types of products... therefore, customers can have one, two, or three records in the product_relationship table, depending on which products they have purchased. <<

It would if you would post DDL instead of making people guess at the tables.

 CREATE TABLE Purchases
 (customer_id CHAR(10) NOT NULL,
  product_id INTEGER NOT NULL

             REFERENCES Inventory(product_id),   PRIMARY KEY (customer_id, product_id)

>> The challenge is how to create a de-normalized view of this table so
that Marketing end users can easily analyze customers by number of relationships ... However, I'm trying to find a way to make this 'dummyproof'  for end-users. <<

Good luck! "Against invincible stupidity, not even the Gods themselves can prevail" -- Roman proverb

>>One way to accomplish this might be to create a view that would look
like the below:
 customer_id prod_A_ind prod_b_ind prod_c_ind <<

You should be using a report writer, sicne SQL has to know the number of columns in advance to construct a table. But try these versions:

 SELECT DISTINCT customer_id

        (SELECT 'X'
           FROM Purchases AS P1
          WHERE P1.customer_id = P0.customer_id
            AND P1.product_id = 'A') AS product_A,
        (SELECT 'X'
           FROM Purchases AS P2
          WHERE P2.customer_id = P0.customer_id
            AND P2.product_id = 'B') AS product_B,
       (SELECT 'X'
           FROM Purchases AS P3
          WHERE P3.customer_id = P0.customer_id
            AND P3.product_id = 'C') AS product_C
   FROM Purchases AS P0;

This will give you a NULL or an 'X' in the columns, and the query schema is extensible. It will also be slow. Better to use this:

 SELECT customer_id

        SUM(CASE WHEN product_id = 'A'
                 THEN 1 ELSE 0 END) AS product_A,
        SUM(CASE WHEN product_id = 'B'
                 THEN 1 ELSE 0 END) AS product_B,
        SUM(CASE WHEN product_id = 'C'
                 THEN 1 ELSE 0 END) AS product_C,
   FROM Purchases
  GROUP BY customer_id;

This gives you the total of each product, should run fast and the query schema is extensible. If you just want a "yes/no" flag, use the SIGN() function on the SUM() to get a one or zero.

You might want to get a copy of SQL FOR SMARTIES, which is, full of these tricks.

--CELKO-- Sent via Deja.com http://www.deja.com/
Before you buy. Received on Wed Mar 29 2000 - 00:00:00 CEST

Original text of this message