Star Schema from 'Flattened' CSV

From: Aaron Peters <transreductionist_at_gmail.com>
Date: Thu, 2 Nov 2017 13:48:00 -0700 (PDT)
Message-ID: <655c599e-d8c6-4a00-a8a2-4e408748080f_at_googlegroups.com>


[Quoted] CSV Description:

[Quoted] The CSV file contains multiple columns related to the performance of a given product. It has a product ID, a product description, unit sale price, total units sold, total units returned, as well as names/quotes from several vendors giving the price per unit. In the future I expect vendors will be added and will want my Schema to handle this eventuality gracefully.

I. Consider the trivial case where the columns of the CSV reference only 1 vendor price:

CSV COLUMNS: prod_id, prod_desc, unit_price, tot_sold, tot_ret, vend_name, vend_price

STAR SCHEMA: fact_performance

  • unit_price
  • tot_sold
  • tot_ret
  • vend_price

dim_product

  • prod_id
  • prod_desc

dim_vendor

  • vend_name

II. Consider the non-trivial case where the columns of the CSV reference multiple vendor quotes:

CSV COLUMNS: prod_id, ..., vend1_name, vend1_price, vend2_name, vend2_price

STAR SCHEMA: fact_performance

  • total_sales [Quoted]
  • total_units_returned
  • total_units_sold
  • vend1_price
  • vend2_price

dim_product

  • prod_id
  • prod_desc

How do I account for multiple vendors in a dimension table?

If I can offer any clarification please lest me know. Received on Thu Nov 02 2017 - 21:48:00 CET

Original text of this message