Re: On the subject of Data Warehouses, Data Cubes & OLAP....
Date: Thu, 23 Oct 2003 13:45:41 -0000
Message-ID: <vpfms541cpv8ec_at_corp.supernews.com>
"Bob Badour" <bbadour_at_golden.net> wrote:
>"DataMan" <dataman_at_ev1.net> wrote in message
>news:vpedvrc3kbf74_at_corp.supernews.com...
>>
>> "Bob Badour" <bbadour_at_golden.net> wrote:
>> >"DataMan" <dataman_at_ev1.net> wrote in message
>> >news:vpdppi7ccdn0f1_at_corp.supernews.com...
>> >>
>> >> "Bob Badour" <bbadour_at_golden.net> wrote:
>> >> >"DataMan" <dataman_at_ev1.net> wrote in message
>> >> >news:vpdad0mimma52e_at_corp.supernews.com...
>> >> >>
>> >> >> Most OLTP systems do support the information requested. It's just
very
difficult
>> >> >> to retrieve.
>> >> >
>> >> >That is a remarkable and very interesting assertion. What can you
offer
to
>> >> >support such a remarkable assertion?
>> >> >
>> >> The support information or difficult to retrieve statement?
>> >
>> >I do not find the "support information" assertion remarkable, and I doubt
>> >anyone would. What can you offer to support your remarkable assertion
that
>> >information is difficult to retrieve?
>> >
>> I have no idea what you find remarkable.
>
>I already told you what I found remarkable. Can you offer any support for
>your remarkable assertion?
>
>
>> The world is full of remarkable
>> people all finding different things remarkable. I find it remarkable
that
>> you find no value in dimensional data warehouses.
>
>Did I say that?
>
No, but since you make assumptions, I thought I would too.
>
>> As your opinion on this
>> matter is already formed, and many books on the subject have
unsuccessfully
>> shown you the value of data warehouses, I don't believe I can sway your
opinion
>> on the subject.
>
>Did I say that?
>
ibid.
>
>> If you would like some recommendations on books that cover
>> the material in depth, I would be happy to provide some.
>
>That seems a very long way of saying you have nothing to offer to support
>your remarkable assertion. Apparently, you simply assumed it was true
>because you heard it somewhere, and now you are out of your depth.
>
That's quite possible. But this is fun, so I continue. Please reference the attached data model. This is the standard movies data model supplied by CA when purchasing ERwin. If I wanted a report that provided the amount of rentals customer's are making using electronic payments by store, the resulting sql would be more complex than a dimensional model built around a transaction fact table with associated dimensions for customer, store, etc.
>
>> >> >> Additionally when the business requires enterprise level reporting
>> >> >> across the various functional applications, a warehouse becomes
a
necessity.
>> >> >> You know the scene, large company with 10 different order entry
systems
>> >> >> for each functional area of the company (i.e. cost center). One
for
small
>> >> >> biz, one for consumer, one for z product, one for y product, etc.
And
of
>> >> >> course there was never any upfront effort by a data management
organization
>> >> >> to ensure consistent representation of enterprise level entities.
So
you
>> >> >> end up with a hodge podge set of data that must be abstracted to
make
consistent
>> >> >> within a DW environment. And since you're already there go ahead
and
put
>> >> >> it in a dimensional format so somebody can understand it and easily
pull
>> >> >> data/reports.
>> >> >
>> >> >What exactly is a dimensional format? How does it differ from
relational
>> >> representation?
>> >> They are both relational. Assuming both the application database and
the
>> >> data warehouse are both stored in relational databases. Was your
question
>> >> directed to differences between a data warehouse database (let's say
OLAP
>> >> for convenience) and an application database (let's say OLTP for
convenience)?
>> >> The primary difference is purpose. OLTP must support consistent real
time
>> >> data. OLAP only need support historical reporting.
>> >
>> >So then, OLAP is just another word for "snapshot" as in "a derived stored
>> >relation" ?
>> >
>> I'm sorry, I don't understand your meaning in "a derived stored relation".
>
>What part do you not understand? Do you know what a relation is? Do you
know
>the difference between something that is stored versus something that is
not
>stored? Do you know what it means to derive something?
>
The derived stored relation part. Yes. Yes. Yes.
>
>> I was trying to illustrate one of the basic differences between an
application
>> database and a dimensional data warehouse.
>
>You appear to have regurgitated meaningless pap. If you were illustrating
>anything, you could offer support for your assertions.
>
I thought it was meaningful and remarkable. Oh well. I guess in a nutshell,
the only support I can offer is empirical. Most of the business community
I have dealt with have difficulty navigating and understanding what is shown
in complex, highly normalized data models. In fact, most business folks
don't care. They just want the application to work and they want their reports.
Those same folks who previously needed the IT shop to develop and write
reports for them are now using BI tools to do their own data mining, data
queries against dimensional models. Why or how they can do one and not the
other is obviously a matter for debate. I see the success of the DW industry
and it's prevalence in corporate world and decide that people see value in
the technology.
>
>> And that difference was that
>> the application database is constantly changing over time as the business
>> data changes. OTOH, in the data warehouse the data is never updated.
>
>Really? Now that is a truly remarkable statement! Do you realise this
Sweet. Now I'm making progress!
>fundamentally alters every cosmogeny I have ever heard: "In the beginning
>was the data warehouse..." One immediately wants to ask whence this eternal
>data came, but Ockham already took care of that one.
>
Great statement! But I had to look up Ockham. You are a scholar! The jury's still out on the gentleman part.
>
>> So
>> a report run in a data warehouse today will be the tomorrow and the month
>> after that.
>
>What would be the point in running the same report on unchanging data?
>
You want two copies? End of year summary? Year to date?
>
>> >> >How or why can it improve understanding? How have you measured >this
improved
>> >> understanding?
>> >> I didn't say that it does.
>> >
>> >"And since you're already there go ahead and put it in a dimensional
format
>> >so somebody can understand it and easily pull data/reports."
>> >
>> >It sounds to me like you did. If you did not say that, what the hell
did
you
>> >intend to say?
>> >
>> Hmmm, I guess I did say that. Sorry I was busy at work and responded
too
>> quickly to the post. A fully normalized model for a large application
can
>> have 100's of entities. A dimensional model is built to support specific
>> reporting and consists of a fact table and its related dimension tables.
>
>Yes, potentially hundreds of dimension tables... Does a data warehouse
>comprising multiple data marts have only a single fact table or is it not
a
>dimensional model?
>
I guess it's possible for a single fact table to have hundreds of dimensions. I've not seen one. I usually see 5 to 15 dimensions. A dimensional model supports only specific reporting. As such you would need to create a fact for each type of reporting required.
>
>> Occasionally, a dimension will also have a relationship to another table
>> resulting in the snowflake term.
>
>So, if we rearrange our entities into a snowflake pattern, we get a data
>warehouse?!?
>
I'm not quite sure how you came to that conclusion. There is the application database(s) and there's the data warehouse. "Snowflaking" entities in an application database doesn't make a DW.
>
>> Certainly, a fact table in conjunction
>> with its dimensions is more comprehensible than a large normalized
application
>> database.
>
>Since it is so certain, you will have no difficulty offering some support
>for the assertion.
>
>
>> I would offer that the comprehension is solely related to the
>> business reporting and not the business rules stored in the application
data
>> structure.
>
>Business rules have no meaning or effect on reports?!? Or are you saying
>that people understand the report without understanding the business?
>
>
>> Thank you for allowing me to clarify that statement.
>
>Um, you're welcome, I guess.
>
>
>> My purpose
>> was not to sell data warehouses or dimensional models. It was just a
comment
>> to the complexity inherent in some of today's applications and their
corresponding
>> data structures.
>
>What complexity?
>
>
>> >> >What features does it provide that facilitate reporting?
>> >> >
>> >> The data structure allows for easier retrieval.
>> >
>> >Since the logical data structure is a relation in both cases, I fail
to
see
>> >how it alters anything with respect to retrieval. Perhaps you would like
to
>> >clarify your remarkable statement?
>> >
>> >
>> Again I don't understand your statement "the logical data structure is
a
>> relation in both cases".
>
>Did you not claim both models are relational? In the relational model, the
>only logical structure for representing data is the relation. If both are
>relational, the data structure for both is the relation. Is it not?
>
>
>> The data structure of a dimensional model is different
>> than the corresponding application data source(s) structure.
>
>If the data structure for dimensional models is not the relation, what is
>it?
>
>
Received on Thu Oct 23 2003 - 15:45:41 CEST