Re: On the subject of Data Warehouses, Data Cubes & OLAP....
Date: Thu, 23 Oct 2003 21:16:54 -0000
Message-ID: <vpgha67etvtd6f_at_corp.supernews.com>
"Bob Badour" <bbadour_at_golden.net> wrote:
>"DataMan" <dataman_at_ev1.net> wrote in message
>news:vpfms541cpv8ec_at_corp.supernews.com...
>>
>> "Bob Badour" <bbadour_at_golden.net> wrote:
>> >"DataMan" <dataman_at_ev1.net> wrote in message
>> >news:vpedvrc3kbf74_at_corp.supernews.com...
>> >>
>> >> "Bob Badour" <bbadour_at_golden.net> wrote:
>> >> >"DataMan" <dataman_at_ev1.net> wrote in message
>> >> >news:vpdppi7ccdn0f1_at_corp.supernews.com...
>> >> >>
>> >> >> "Bob Badour" <bbadour_at_golden.net> wrote:
>> >> >> >"DataMan" <dataman_at_ev1.net> wrote in message
>> >> >> >news:vpdad0mimma52e_at_corp.supernews.com...
>> >> >> >>
>> >> >> >> Most OLTP systems do support the information requested. It's
just
very
difficult
>> >> >> >> to retrieve.
>> >> >> >
>> >> >> >That is a remarkable and very interesting assertion. What can you
offer
to
>> >> >> >support such a remarkable assertion?
>> >> >> >
>> >> >> The support information or difficult to retrieve statement?
>> >> >
>> >> >I do not find the "support information" assertion remarkable, and
I
doubt
>> >> >anyone would. What can you offer to support your remarkable assertion
that
>> >> >information is difficult to retrieve?
>> >> >
>> >> I have no idea what you find remarkable.
>> >
>> >I already told you what I found remarkable. Can you offer any support
for
>> >your remarkable assertion?
>> >
>> >
>> >> The world is full of remarkable
>> >> people all finding different things remarkable. I find it remarkable
that
>> >> you find no value in dimensional data warehouses.
>> >
>> >Did I say that?
>> >
>>
>> No, but since you make assumptions, I thought I would too.
>
>I suppose we all make assumptions, and I try to make mine explicit. To which
>assumptions do you refer?
>
>
>> >> As your opinion on this
>> >> matter is already formed, and many books on the subject have
unsuccessfully
>> >> shown you the value of data warehouses, I don't believe I can sway
your
opinion
>> >> on the subject.
>> >
>> >Did I say that?
>> >
>>
>> ibid.
>
>Again, to which assumptions do you refer?
>
>
>> >> If you would like some recommendations on books that cover
>> >> the material in depth, I would be happy to provide some.
>> >
>> >That seems a very long way of saying you have nothing to offer to support
>> >your remarkable assertion. Apparently, you simply assumed it was true
>> >because you heard it somewhere, and now you are out of your depth.
>> >
>>
>> That's quite possible. But this is fun, so I continue. Please reference
>> the attached data model. This is the standard movies data model supplied
>> by CA when purchasing ERwin. If I wanted a report that provided the
amount
>> of rentals customer's are making using electronic payments by store, the
>> resulting sql would be more complex than a dimensional model built around
>> a transaction fact table with associated dimensions for customer, store,
>> etc.
>
>I don't see any attachments. Regardless, how does the requirement exceed
the
>capabilities and features of a snapshot?
>
What does this question have to do with the my previous point? Either the query is simpler or it's not. The board prevents uploading binary data. If you're interested, I can send it to you. You'll need ERwin to read it.
>
>> >> >> >> Additionally when the business requires enterprise level
reporting
>> >> >> >> across the various functional applications, a warehouse becomes
a
necessity.
>> >> >> >> You know the scene, large company with 10 different order entry
systems
>> >> >> >> for each functional area of the company (i.e. cost center).
One
for
small
>> >> >> >> biz, one for consumer, one for z product, one for y product,
etc.
And
of
>> >> >> >> course there was never any upfront effort by a data management
organization
>> >> >> >> to ensure consistent representation of enterprise level entities.
So
you
>> >> >> >> end up with a hodge podge set of data that must be abstracted
to
make
consistent
>> >> >> >> within a DW environment. And since you're already there go ahead
and
put
>> >> >> >> it in a dimensional format so somebody can understand it and
easily
pull
>> >> >> >> data/reports.
>> >> >> >
>> >> >> >What exactly is a dimensional format? How does it differ from
relational
>> >> >> representation?
>> >> >> They are both relational. Assuming both the application database
and
the
>> >> >> data warehouse are both stored in relational databases. Was your
question
>> >> >> directed to differences between a data warehouse database (let's
say
OLAP
>> >> >> for convenience) and an application database (let's say OLTP for
convenience)?
>> >> >> The primary difference is purpose. OLTP must support consistent
real
time
>> >> >> data. OLAP only need support historical reporting.
>> >> >
>> >> >So then, OLAP is just another word for "snapshot" as in "a derived
stored
>> >> >relation" ?
>> >> >
>> >> I'm sorry, I don't understand your meaning in "a derived stored
relation".
>> >
>> >What part do you not understand? Do you know what a relation is? Do you
know
>> >the difference between something that is stored versus something that
is
not
>> >stored? Do you know what it means to derive something?
>> >
>>
>> The derived stored relation part. Yes. Yes. Yes.
>
>If you know what a relation is and you know what a derived relation is and
>you know what a stored relation is, what causes you trouble with
>understanding a derived stored relation? It all seems quite clear to me.
>
Great. What's your point then?
>
>> >> I was trying to illustrate one of the basic differences between an
application
>> >> database and a dimensional data warehouse.
>> >
>> >You appear to have regurgitated meaningless pap. If you were illustrating
>> >anything, you could offer support for your assertions.
>>
>> I thought it was meaningful and remarkable. Oh well. I guess in a
nutshell,
>> the only support I can offer is empirical. Most of the business community
>> I have dealt with have difficulty navigating and understanding what is
shown
>> in complex, highly normalized data models.
>
>Why would anyone navigate something that does not require >navigation? The
idea seems somewhat absurd to me.
And to alot of other people. Some people figured out how to get around all the navigation.
>Where can I find the raw data for your
>empirical measurements? Can you email them to me?
>
Isn't that an oxymoron?
>
>> In fact, most business folks
>> don't care.
>
>I agree. Most of them accept meaningless pap without objection.
>
>
>> They just want the application to work and they want their reports.
>
>Is there supposed to be some meaningful point in the above observation?
>
>
>> Those same folks who previously needed the IT shop to develop and write
>> reports for them are now using BI tools to do their own data mining, data
>> queries against dimensional models. Why or how they can do one and not
the
>> other is obviously a matter for debate. I see the success of the DW
industry
>> and it's prevalence in corporate world and decide that people see value
in
>> the technology.
>
>What technology is that? Be specific: We are all engineers here.
>
Ummm, data warehousing?
>
>> >> And that difference was that
>> >> the application database is constantly changing over time as the
business
>> >> data changes. OTOH, in the data warehouse the data is never updated.
>> >
>> >Really? Now that is a truly remarkable statement! Do you realise this
>>
>> Sweet. Now I'm making progress!
>>
>> >fundamentally alters every cosmogeny I have ever heard: "In the beginning
>> >was the data warehouse..." One immediately wants to ask whence this
eternal
>> >data came, but Ockham already took care of that one.
>> >
>>
>> Great statement! But I had to look up Ockham. You are a scholar! The
jury's
>> still out on the gentleman part.
>
>Definitely not a gentleman. The jury came back on that verdict ages ago.
>
>
>> >> So
>> >> a report run in a data warehouse today will be the tomorrow and the
month
>> >> after that.
>> >
>> >What would be the point in running the same report on unchanging data?
>> >
>>
>> You want two copies?
>
>Okay. But wouldn't a photocopier or Kinkos be cheaper?
>
>
>> End of year summary?
>
>Since nothing gets updated, what would be the point?
>
All 12 months in one report. Don't you get one of those from you CC company? Do you keep it or throw it away?
>
>> Year to date?
>
>Since nothing changes, what would the year to date show? Why wouldn't one
>just report on the data at the beginning of the year?
>
How about trending? Don't folks at your company want to see how sales of a certain product are trending on a monthly, quarterly basis? This is real easy to do with a dimensional model BTW. Try it in a normalized data model (if they keep history at all).
>
>> >> >> >How or why can it improve understanding? How have you measured
this
improved
>> >> >> understanding?
>> >> >> I didn't say that it does.
>> >> >
>> >> >"And since you're already there go ahead and put it in a dimensional
format
>> >> >so somebody can understand it and easily pull data/reports."
>> >> >
>> >> >It sounds to me like you did. If you did not say that, what the hell
did
you
>> >> >intend to say?
>> >> >
>> >> Hmmm, I guess I did say that. Sorry I was busy at work and responded
too
>> >> quickly to the post. A fully normalized model for a large application
can
>> >> have 100's of entities. A dimensional model is built to support
specific
>> >> reporting and consists of a fact table and its related dimension
tables.
>> >
>> >Yes, potentially hundreds of dimension tables... Does a data warehouse
>> >comprising multiple data marts have only a single fact table or is it
not
a
>> >dimensional model?
>> >
>>
>> I guess it's possible for a single fact table to have hundreds of
dimensions.
>> I've not seen one. I usually see 5 to 15 dimensions. A dimensional
model
>> supports only specific reporting. As such you would need to create a
fact
>> for each type of reporting required.
>
>I see. We have lots of dimensions and lots of fact tables. How does one
know
>which dimensions and facts to use?
>
The one that you need. Is this a trick question?
>
>> >> Occasionally, a dimension will also have a relationship to another
table
>> >> resulting in the snowflake term.
>> >
>> >So, if we rearrange our entities into a snowflake pattern, we get a data
>> >warehouse?!?
>> >
>>
>> I'm not quite sure how you came to that conclusion. There is the
application
>> database(s) and there's the data warehouse. "Snowflaking" entities in
an
>> application database doesn't make a DW.
>
>Then what does make a DW?
>
>
That's really a good question. Because there are lots of people who do data
warehousing and probably really are not. Let me think about this for awhile
and see if I can come up with the characteristics of a data warehouse versus
an application database. What are the characteristics of your DW?
Received on Thu Oct 23 2003 - 23:16:54 CEST