my fault, the customer table is a dimension table, not a fact table.
Doh!
- DENNIS WILLIAMS <DWILLIAMS_at_LIFETOUCH.COM> wrote:
> Rachel
> The varchar business on your fact table worries me. I'm not
> trying to be
> critical, but to bring up some issues you may want to consider before
> you
> begin the big load. Unfortunately too many DWs end up with a flawed
> data
> model (ours included) that limit its usability. Sometimes the
> fortunate
> sites are the ones that get it so wrong it must be blasted away and
> recreated. Guess who gets to bear the brunt of that "hasty learning
> exercise"?? Yep, that's right. Here are my thoughts for what they are
> worth.
> 1. Normally the fact table is very large, so it is critical that it
> be
> designed as nearly right as possible. Especially with what is called
> the
> "granularity", the lowest level of data that is stored. You can
> always
> aggregate up, never down. Daily data can be summed to weekly, but we
> can't
> take weekly data and figure out the daily amounts.
> 2. When you say the customers "fill in the blanks", that worries me.
> That
> doesn't sound so much like a DW as an OLTP. Where is the history
> dimension?
> 3. The fact table usually is so large that the information only
> makes sense
> in aggregate. You aren't looking for the particular blue-eyed 23-year
> old
> female from Des Moines, but trying to find HOW MANY blue-eyed, etc.
> This
> means that you won't be scratching around with VARCHAR2 fields with
> query
> operators such as LIKE. Performance would be really BAD.
> 4. I don't know your application, just the minor details you've
> mentioned
> in passing, but consider something like this. The FACT table logs
> each new
> information that a user provides. VARCHAR2 fields. Never UPDATEd,
> just add a
> new record along with the date that record was added. From that we
> create an
> aggregate table CURRFACT. One row per customer. Weekly we scan the
> new
> records added to FACT, pulling new facts, updated facts into
> CURRFACT. Most
> of the fields in CURRFACT are single character flag fields. Bit map
> index
> the heck (sorry, but we have a fierce naughty word scanner) out of
> CURRFACT.
> Performance is awesome. Queries return before the users hit enter
> (just
> kidding). From to time marketing recognizes some relevant fact that
> isn't in
> CURRFACT. You add a new column to CURRFACT and start a really big
> query on
> FACT when you leave for the weekend that will populate the new
> column. There
> may be a few fields like address that you populate in CURRFACT just
> for
> convenience. But you don't search them. Also, if you ever need the
> history
> of how your customers have moved around, you have that data. And
> remember, a
> DW is all about history, never about current information.
> The modeling issues have a lot more about the performance and
> usability
> of the DW than the choices we have as DBAs such as LMT.
> Okay, I'll quit prattling on here. You probably didn't even get a
> say in
> the data model. They never ask the DBA. But if you raise the issues
> beforehand it'll amuse you more when they come back and ask you to
> redo
> everything. Hey, I just noticed that you won't be the production DBA
> on
> this! No worries!
>
> Dennis Williams
> DBA
> Lifetouch, Inc.
> dwilliams_at_lifetouch.com
>
>
> -----Original Message-----
> Sent: Tuesday, September 03, 2002 1:44 PM
> To: Multiple recipients of list ORACLE-L
>
>
> We may end up reworking the extent sizes, right now they are smaller
> than those but we are still in stage one, haven't gone live yet (and
> I
> don't even want to think about what a pain it will be to change
> things
> when we do go live).
>
> Data load test coming up soon, so I'll have a better idea of what I
> need to change things to, if I need to change them.
>
> There is no way the customer row will remain the same size unless I
> change all the varchar fields to char. We have VERY sparse data as
> yet
> and expect to be able to entice customers to "fill in the blanks"
> which
> will cause rows to grow.
>
> We are allocating WAY more space than we need at the moment and will
> be
> closely monitoring growth (once a day data loads) and I can always
> turn
> on autoextend if I need it. But then again, I am not the DBA who will
> be responsible for the production site, at least not for any space
> issues on the production DW.
>
> Rachel
>
> --- DENNIS WILLIAMS <DWILLIAMS_at_LIFETOUCH.COM> wrote:
> > Rachel - I have been using LMTs with uniform extents
> > (Oracle-recommended
> > variation) for a couple of years now with Oracle 8.1.6, and now
> 9.2.
> > The
> > Oracle-recommended extent sizes are 128k, 4m, and 128m. As far as
> > PCTFREE,
> > et. al., these are at the table level, so my opinion would be that
> > the
> > guidelines for these are unchanged from the pre-LMT days. The key
> > issue with
> > the highly updated customer table would be whether the size of the
> > row is
> > changing. If you can keep the row size constant, then you won't
> wind
> > up with
> > chained rows. The biggest issue facing you is whether you turn
> > AUTOEXTEND
> > on. I did that and have encountered relatively few problems. Well,
> > one
> > problem. I had tables set with large NEXT extents to minimize
> > extents, and
> > when one extended boy did my sys admin get excited. I changed that.
> > A bigger issue in building your data warehouse is whether you
> can
> > use the
> > partitioning option. Most of our queries were taking more than 2
> > minutes and
> > I was able to partition and bring that down below 10 seconds. The
> > users were
> > pretty excited.
> >
> > Dennis Williams
> > DBA
> > Lifetouch, Inc.
> > dwilliams_at_lifetouch.com
> >
> >
> > -----Original Message-----
> > Sent: Tuesday, September 03, 2002 10:49 AM
> > To: Multiple recipients of list ORACLE-L
> >
> >
> > time for me to ask the experts again.
> >
> > My data warehouse will be 9.2, with all locally managed
> tablespaces.
> > We
> > will be following what I have taken to calling the "Goldilocks"
> > principle -- that of small, medium and large tablespace extent
> sizes,
> > with variations in that we will separate indexes and data, and will
> > have even more separation for our fact tables into partitioned
> tables
> > and tablespaces.
> >
> > However, now comes the time for me to work out storage clauses. And
> a
> > quick read through the docs leaves me wondering if I should just
> turn
> > on automatic segment-space management and not worry about setting
> > PCTFREE, PCTUSED and FREELIST parameters. I can't find any real
> > information or bugs on MetaLink either.
> >
> > Does anyone have any experience, good OR bad, with using this
> > feature?
> > If you are doing data warehouse work, what are good values for the
> > parameters if I DO use them? One fact table is likely to be highly
> > updated (customer info) as we collect more and more specific
> > information from customers. The rest will be, as you would expect
> > from
> > a DW, mostly inserts.
> >
> > Help?
> >
> > Thanks!
> >
> > Rachel
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Yahoo! Finance - Get real-time stock quotes
> > http://finance.yahoo.com
> > --
>
=== message truncated ===
Do You Yahoo!?
Yahoo! Finance - Get real-time stock quotes
http://finance.yahoo.com
--
Please see the official ORACLE-L FAQ: http://www.orafaq.com
--
Author: Rachel Carmichael
INET: wisernet100_at_yahoo.com
Fat City Network Services -- (858) 538-5051 FAX: (858) 538-5051
San Diego, California -- Public Internet access / Mailing Lists
--------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from). You may
also send the HELP command for other information (like subscribing).
Received on Tue Sep 03 2002 - 15:38:39 CDT