Re: On the subject of Data Warehouses, Data Cubes & OLAP....
Date: Thu, 23 Oct 2003 12:48:12 -0400
Message-ID: <yv2cncJdW_MemwWiU-KYiQ_at_golden.net>
"DataMan" <dataman_at_ev1.net> wrote in message
news:vpfms541cpv8ec_at_corp.supernews.com...
>
> "Bob Badour" <bbadour_at_golden.net> wrote:
> >"DataMan" <dataman_at_ev1.net> wrote in message
> >news:vpedvrc3kbf74_at_corp.supernews.com...
> >>
> >> "Bob Badour" <bbadour_at_golden.net> wrote:
> >> >"DataMan" <dataman_at_ev1.net> wrote in message
> >> >news:vpdppi7ccdn0f1_at_corp.supernews.com...
> >> >>
> >> >> "Bob Badour" <bbadour_at_golden.net> wrote:
> >> >> >"DataMan" <dataman_at_ev1.net> wrote in message
> >> >> >news:vpdad0mimma52e_at_corp.supernews.com...
> >> >> >>
> >> >> >> Most OLTP systems do support the information requested. It's
just
very
difficult
> >> >> >> to retrieve.
> >> >> >
> >> >> >That is a remarkable and very interesting assertion. What can you
offer
to
> >> >> >support such a remarkable assertion?
> >> >> >
> >> >> The support information or difficult to retrieve statement?
> >> >
> >> >I do not find the "support information" assertion remarkable, and I
doubt
> >> >anyone would. What can you offer to support your remarkable assertion
that
> >> >information is difficult to retrieve?
> >> >
> >> I have no idea what you find remarkable.
> >
> >I already told you what I found remarkable. Can you offer any support for
> >your remarkable assertion?
> >
> >
> >> The world is full of remarkable
> >> people all finding different things remarkable. I find it remarkable
that
> >> you find no value in dimensional data warehouses.
> >
> >Did I say that?
> >
>
> No, but since you make assumptions, I thought I would too.
I suppose we all make assumptions, and I try to make mine explicit. To which assumptions do you refer?
> >> As your opinion on this
> >> matter is already formed, and many books on the subject have
unsuccessfully
> >> shown you the value of data warehouses, I don't believe I can sway your
opinion
> >> on the subject.
> >
> >Did I say that?
> >
>
> ibid.
Again, to which assumptions do you refer?
> >> If you would like some recommendations on books that cover
> >> the material in depth, I would be happy to provide some.
> >
> >That seems a very long way of saying you have nothing to offer to support
> >your remarkable assertion. Apparently, you simply assumed it was true
> >because you heard it somewhere, and now you are out of your depth.
> >
>
> That's quite possible. But this is fun, so I continue. Please reference
> the attached data model. This is the standard movies data model supplied
> by CA when purchasing ERwin. If I wanted a report that provided the
amount
> of rentals customer's are making using electronic payments by store, the
> resulting sql would be more complex than a dimensional model built around
> a transaction fact table with associated dimensions for customer, store,
> etc.
I don't see any attachments. Regardless, how does the requirement exceed the capabilities and features of a snapshot?
> >> >> >> Additionally when the business requires enterprise level
reporting
> >> >> >> across the various functional applications, a warehouse becomes
a
necessity.
> >> >> >> You know the scene, large company with 10 different order entry
systems
> >> >> >> for each functional area of the company (i.e. cost center). One
for
small
> >> >> >> biz, one for consumer, one for z product, one for y product, etc.
And
of
> >> >> >> course there was never any upfront effort by a data management
organization
> >> >> >> to ensure consistent representation of enterprise level entities.
So
you
> >> >> >> end up with a hodge podge set of data that must be abstracted to
make
consistent
> >> >> >> within a DW environment. And since you're already there go ahead
and
put
> >> >> >> it in a dimensional format so somebody can understand it and
easily
pull
> >> >> >> data/reports.
> >> >> >
> >> >> >What exactly is a dimensional format? How does it differ from
relational
> >> >> representation?
> >> >> They are both relational. Assuming both the application database
and
the
> >> >> data warehouse are both stored in relational databases. Was your
question
> >> >> directed to differences between a data warehouse database (let's say
OLAP
> >> >> for convenience) and an application database (let's say OLTP for
convenience)?
> >> >> The primary difference is purpose. OLTP must support consistent
real
time
> >> >> data. OLAP only need support historical reporting.
> >> >
> >> >So then, OLAP is just another word for "snapshot" as in "a derived
stored
> >> >relation" ?
> >> >
> >> I'm sorry, I don't understand your meaning in "a derived stored
relation".
> >
> >What part do you not understand? Do you know what a relation is? Do you
know
> >the difference between something that is stored versus something that is
not
> >stored? Do you know what it means to derive something?
> >
>
> The derived stored relation part. Yes. Yes. Yes.
If you know what a relation is and you know what a derived relation is and you know what a stored relation is, what causes you trouble with understanding a derived stored relation? It all seems quite clear to me.
> >> I was trying to illustrate one of the basic differences between an
application
> >> database and a dimensional data warehouse.
> >
> >You appear to have regurgitated meaningless pap. If you were illustrating
> >anything, you could offer support for your assertions.
>
> I thought it was meaningful and remarkable. Oh well. I guess in a
nutshell,
> the only support I can offer is empirical. Most of the business community
> I have dealt with have difficulty navigating and understanding what is
shown
> in complex, highly normalized data models.
Why would anyone navigate something that does not require navigation? The idea seems somewhat absurd to me. Where can I find the raw data for your empirical measurements? Can you email them to me?
> In fact, most business folks
> don't care.
I agree. Most of them accept meaningless pap without objection.
> They just want the application to work and they want their reports.
Is there supposed to be some meaningful point in the above observation?
> Those same folks who previously needed the IT shop to develop and write
> reports for them are now using BI tools to do their own data mining, data
> queries against dimensional models. Why or how they can do one and not
the
> other is obviously a matter for debate. I see the success of the DW
industry
> and it's prevalence in corporate world and decide that people see value in
> the technology.
What technology is that? Be specific: We are all engineers here.
> >> And that difference was that
> >> the application database is constantly changing over time as the
business
> >> data changes. OTOH, in the data warehouse the data is never updated.
> >
> >Really? Now that is a truly remarkable statement! Do you realise this
>
> Sweet. Now I'm making progress!
>
> >fundamentally alters every cosmogeny I have ever heard: "In the beginning
> >was the data warehouse..." One immediately wants to ask whence this
eternal
> >data came, but Ockham already took care of that one.
> >
>
> Great statement! But I had to look up Ockham. You are a scholar! The
jury's
> still out on the gentleman part.
Definitely not a gentleman. The jury came back on that verdict ages ago.
> >> So
> >> a report run in a data warehouse today will be the tomorrow and the
month
> >> after that.
> >
> >What would be the point in running the same report on unchanging data?
> >
>
> You want two copies?
Okay. But wouldn't a photocopier or Kinkos be cheaper?
> End of year summary?
Since nothing gets updated, what would be the point?
> Year to date?
Since nothing changes, what would the year to date show? Why wouldn't one just report on the data at the beginning of the year?
> >> >> >How or why can it improve understanding? How have you measured
this
improved
> >> >> understanding?
> >> >> I didn't say that it does.
> >> >
> >> >"And since you're already there go ahead and put it in a dimensional
format
> >> >so somebody can understand it and easily pull data/reports."
> >> >
> >> >It sounds to me like you did. If you did not say that, what the hell
did
you
> >> >intend to say?
> >> >
> >> Hmmm, I guess I did say that. Sorry I was busy at work and responded
too
> >> quickly to the post. A fully normalized model for a large application
can
> >> have 100's of entities. A dimensional model is built to support
specific
> >> reporting and consists of a fact table and its related dimension
tables.
> >
> >Yes, potentially hundreds of dimension tables... Does a data warehouse
> >comprising multiple data marts have only a single fact table or is it not
a
> >dimensional model?
> >
>
> I guess it's possible for a single fact table to have hundreds of
dimensions.
> I've not seen one. I usually see 5 to 15 dimensions. A dimensional
model
> supports only specific reporting. As such you would need to create a fact
> for each type of reporting required.
I see. We have lots of dimensions and lots of fact tables. How does one know which dimensions and facts to use?
> >> Occasionally, a dimension will also have a relationship to another
table
> >> resulting in the snowflake term.
> >
> >So, if we rearrange our entities into a snowflake pattern, we get a data
> >warehouse?!?
> >
>
> I'm not quite sure how you came to that conclusion. There is the
application
> database(s) and there's the data warehouse. "Snowflaking" entities in an
> application database doesn't make a DW.
Then what does make a DW? Received on Thu Oct 23 2003 - 18:48:12 CEST