Re: Modeling Data for XML instead of SQL-DBMS

From: mAsterdam <mAsterdam_at_vrijdag.org>
Date: Fri, 27 Oct 2006 01:51:20 +0200
Message-ID: <45414961$0$330$e4fe514c_at_news.xs4all.nl>


dawn wrote:
> mAsterdam wrote:

>> dawn wrote:
>>> mAsterdam wrote:
>>>> ... the logical model is the most complete, detailed level
>>>> you can get to /without/ specifying the implementation plan.
>>>> I don't think I should unlearn that.

[snip (un)muddle]

> In order to clarify, my question would be whether the logical model is
> data model independent. The conceptual data model is data model
> independent. The logical data model could be defined as
> data-model-dependent or could still be independent of data model
> employed by the target DBMS. My understanding was that it was
> data-model dependent and pretty much resembled the implementation data
> model (which might be adjusted specific to the toolset used, however).
>
> So, I guess I'm interested in the very first data model that is no
> longer agnostic about the target environment. Is that or is it not the
> logical data model?

I may have learned definitions a long time ago, but I never felt the need to fall back on them. I just don't use these terms in a clean, universally correct, ivory tower way.
I do try to use appropriate wordings, taking the purpose and context into account.

I could use them in a design session like so: 'Hey guys, that's an implementation issue, we'll deal with that later. For now we have to limit our discussion to the logic of the data itself.', relying on the audience's connotations with 'logic' and 'implementation'.

Even your 'very first data model that is no longer agnostic about the target environment' makes me wonder - can there be such a model: a data model and non-agnostic about the target environment at the same time? Such a model is not about the data, it is about how to handle the data.

When the implementation is in SQL the schema can be very close to the logical data model, so the distinction isn't important most of the time. In other environments you can have a physical model, elements of which will have to be thoroughly associated with elements from the logical data model - but I would not call this physical model a data model - I'd call it a storage model if I had to classify it. If somebody else called it a physical data model, I would not interrupt; the message is clear. Calling it a logical model /would/ make me object; it's wrong: the physical model is not /about/ the logic, it presupposes the existence of a logical data model.
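
A minimal sketch of that association, assuming a made-up fact type (person has phone; none of these names come from a real schema): the same logical facts are mapped once to flat, SQL-style rows and once to a nested, hierarchical record. Both storage shapes map back to the same logical facts, which is all the sketch tries to show.

```python
# Hypothetical logical facts: (person, phone) pairs, independent of any storage.
logical_facts = {("ann", "555-0100"), ("ann", "555-0101"), ("bob", "555-0200")}


def to_rows(facts):
    """SQL-style storage shape: one flat row per fact."""
    return sorted(facts)


def to_nested(facts):
    """Hierarchical storage shape: one record per person with a list of phones."""
    nested = {}
    for person, phone in sorted(facts):
        nested.setdefault(person, []).append(phone)
    return nested


def rows_to_facts(rows):
    """Map flat rows back to the logical facts."""
    return set(rows)


def nested_to_facts(nested):
    """Map nested records back to the logical facts."""
    return {(person, phone) for person, phones in nested.items() for phone in phones}


rows = to_rows(logical_facts)
nested = to_nested(logical_facts)

# Both storage models presuppose, and can be traced back to, the same logic.
same_logic = rows_to_facts(rows) == nested_to_facts(nested) == logical_facts
print(same_logic)
```

The two storage shapes differ, the logical facts do not; that is the sense in which I'd keep 'storage model' and 'logical data model' apart.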

Another point:
"
>>>> ... the logical model is the most complete, detailed level
>>>> you can get to /without/ specifying the implementation plan.
"
is not a complete definition (and does not try to be). It is just a demarcation of the boundary between logic and implementation. The demarcation on the other side, conceptual vs. logical, is more complicated, and at the same time it's IMHO less important to have a strict line there, see below.

[snip]

>>>>> It is common to let "logical
>>>>> data model" refer to this implementation data model -- the model of the
>>>>> data as specified to the API used for retaining data beyond the
>>>>> run-time of a particular software application, for example.

>>>> In which circles? Can you provide a reference?

>>> I think it comes from the Date/Darwen/Pascal side of the house, but
>>> at this point I'm just looking at Pascal's paper to verify that (so
>>> Date and Darwen might suggest otherwise).
>> Well, the way you are using (or, better /were/ using, you promised :-)

>
> Yes, but I do need a corrected definition so that I am not guessing,
> OK?

Maybe I am overlooking the obvious, but I can't come up with a (to me) satisfactory definition at this time.

By now it is clear that your question is about implementation strategy, not about logical models. Doesn't that stop you from having to guess?

Finding a good clean definition may be more work than is called for - unless of course someone else has an acceptable one for your purpose.

[snip]

>>>>> The requirement to retain data beyond a particular application run-time
>>>>> is a requirement that mixes the two, it seems.
>>>> I don't think so. I think it is where /sharing/ starts - as soon
>>>> as the next run-time incarnation may differ from an earlier one.

>>> OK, if that is how you define "sharing" then yes, the data are to be
>>> shared.

>> How would you define sharing?

> From prior discussions, I was understanding that "sharing" as in "large
> shared data banks" indicates that the database was to be shared by
> multiple points with no assumption that any entity controlled all
> entities who are sharing.

I could go with that.

> Each entity sharing this database would need
> to be able to do so without assuming any coordination with others who
> are sharing it. How do you define sharing?

A try:
Sharing data: Use of the same data from more than one point of view.

[snip]

> OK, I'll review all the feedback and come back with a revised question
> (once I have proper definitions for the logical data model and know
> precisely for which model we need to know whether persistence will be
> handled with UniData or DB2, for example).

[snip]

>> Say we have a logical model - now we decide to implement using
>> hierarchical tools /without/ specifying which one (IMS, Lotus Domino,
>> XML, just to name a few alternatives) - now what? Which choices
>> do we have to make?

>
> Yes, yes, this is very close to my question. Conceptual model is
> independent of any target environment. Then there is a logical data
> model. If that is independent of any target environment as well (I
> still need a def), then we could have a subsequent question (if you are
> as old as me, then you can put it in a diamond shape with the words
> "relational model" and a question mark) of whether we are using a
> product that implements the relational model or not. If yes then we
> would take the logical model and prepare a relational implementation
> model from it, putting data in 1NF, addressing such issues as the SQL
> NULL. If no, then ... (this is where my question is).

Dunno about the 1NF/list diamond, but NULLs to me are not only markers for the absence of (a) value, they are also the sign of the absence of sufficient effort put into the logical data model.
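
One way to read that remark, sketched with hypothetical tables (person_wide, person, person_phone are invented for illustration, and this is only one possible decomposition): when 'person has phone' is modeled as its own fact type, a missing phone is simply a missing row, and no NULL is needed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Variant A: one wide table; a missing phone shows up as a NULL.
    CREATE TABLE person_wide (name TEXT PRIMARY KEY, phone TEXT);
    INSERT INTO person_wide VALUES ('ann', '555-0100'), ('bob', NULL);

    -- Variant B: the phone fact is modeled separately; no fact, no row.
    CREATE TABLE person (name TEXT PRIMARY KEY);
    CREATE TABLE person_phone (
        name  TEXT PRIMARY KEY REFERENCES person(name),
        phone TEXT NOT NULL
    );
    INSERT INTO person VALUES ('ann'), ('bob');
    INSERT INTO person_phone VALUES ('ann', '555-0100');
""")

# Variant A needs a NULL marker to say that bob has no known phone.
nulls_in_wide = con.execute(
    "SELECT COUNT(*) FROM person_wide WHERE phone IS NULL"
).fetchone()[0]

# Variant B expresses the same situation by the absence of a row.
people_without_phone = [
    row[0]
    for row in con.execute(
        "SELECT name FROM person "
        "WHERE name NOT IN (SELECT name FROM person_phone) ORDER BY name"
    )
]
print(nulls_in_wide, people_without_phone)
```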

> Now, if "logical data model" is defined to assume the relational model,

I did not assume that. Some do, but I don't. I have actually seen complete logical data models which served non-SQL implementations.

> which is the way I was using the term (apparently incorrectly), then we
> need to move the diamond shape with the question mark in it between the
> conceptual and logical models, which is where I started.
>

>> Is that what your question is about?

>
> Yes!
>
>> I could imagine useful treatment of this problem
>> in the abstract, but I am not aware of such treatment.

>
> Nor am I, so I'm asking around.
>
>>> I don't think we have cdm, ldm, pdm or implementation data model
>>> in our glossary, but I'm not looking at it to verify that. Is there
>> I see no need to include them. The most basic misunderstanding
>> in the OP (as I see it) is specifically about the distinction
>> between logical and implementation-specific, no need to mix
>> in even more types of models.

>
> OK, then if you could just define them for me, it would be most
> helpful. I'll see if I should revise my understanding (as indicated in
> my Naked Model blog entry), based on your definition.

I have seen this used for the distinction between conceptual and logical data model:
A conceptual model can be from one point of view (application or process), the logical model has to cater for all points of view. In this case I would choose not to use these terms, though. I'd say for instance: (single) process data model versus integrated model.

>>> someone who has laid out defs that you like so I can start there in
>>> forming the question?

>>>>>> How can that possibly help?
>>>>> The implementation data model for data that a software component passes
>>>>> to an SQL-DBMS is often quite different from the implementation data
>>>>> model for that same conceptual data model when software other than a
>>>>> SQL-DBMS is used to store and retrieve said data.

>>>> That implementation is relevant is no reason for mixing
>>>> it with logical model issues. You say you accept the terminology
>>>> change, but you seem reluctant to do away with the old one.
>>> I tried to change to say "implementation data model" instead of
>>> "logical data model."
>> I don't think "implementation data model" as a term
>> helps (no objection to casual use, of course).

>
> I'm open to whatever works for you. I just need those definitions.

If you do - I can't help you there, but I don't think you do. Surely clean definitions could be helpful, but there are other ways to cut scope and disambiguate.

[snip submarine]

>> Just work with /real/ examples (user-validated sentences) instead of
>> abstract things, and you will immediately reap some benefits without
>> the need to deeply study ORM.

>
> Yes, and I do like working with user-validated sentences.
>
>> Now if you bump against specific
>> modeling difficulties using that approach search for that
>> problem on the ORM sites - or even ask here; I think Hugo is
>> still lurking here :-)

>
> Yes, I prepared a small ORM diagram to test it out and it seemed too
> complex for sharing with a user compared to an ERD (or simple UML for
> that matter). Users like to see properties grouped with entities just
> as I do. OK, OK, propositions -- I mean that users like to see many of
> the nouns from some of the sentence collected together.

Yes, please don't let the overload of graphical stuff push away the central issue: use /real/ facts.

[snip bang for the buck]

>> In SQL, order is supposed not to carry meaning by itself.

>
> Yes, unlike sentences.

Indeed.

>> If some order has a meaning, it has to be made explicit, e.g. by
>> using a rank attribute. A presented set can have a differently
>> ordered second presentation, without having a different set.
>> In documents, if the order changes, you have another document.

>
> Agreed.
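
A small sketch of the point about order, using made-up values: two presentations of one set compare equal, two orderings of one document do not, and an explicit rank attribute (the name is only an illustration) is what lets a meaningful order survive storage in a set.

```python
# Two presentations of the same set: the order carries no meaning.
first_presentation = {"pear", "apple", "fig"}
second_presentation = {"fig", "pear", "apple"}
same_set = first_presentation == second_presentation

# As documents (sequences), a changed order gives a different document.
doc_a = ["pear", "apple", "fig"]
doc_b = ["fig", "pear", "apple"]
same_document = doc_a == doc_b

# If the order means something, make it explicit with a rank attribute.
ranked = {("pear", 1), ("apple", 2), ("fig", 3)}
restored = [item for item, rank in sorted(ranked, key=lambda pair: pair[1])]

print(same_set, same_document, restored == doc_a)
```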

[snip]

> Only because I need to rephrase the question and am apparently using
> the term Logical Data Model incorrectly, yet I'm not certain whether if
> you and I are both given the same conceptual data model and you are
> implementing it in Oracle and I in UniData, whether we might have the
> same logical data model, although different implementation data models,
> or whether our logical data models would differ. Mine would include
> multi-valued attributes, for example. Thanks for any clarification.

Mine do, too - in theory. When everybody knows we'll implement in SQL, MVAs do tend to get avoided. This is letting implementation guide the logic - strictly a no-no.

Received on Fri Oct 27 2006 - 01:51:20 CEST
