Re: Order & meaning in a proposition

From: Lemming <thiswillbounce_at_bumblbee.demon.co.uk>
Date: Tue, 06 Apr 2004 13:55:43 +0100
Message-ID: <dk8570p9lapb8kh7uar9psihtat6ojgh07_at_4ax.com>


On Mon, 5 Apr 2004 19:02:30 -0500, "Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote:

[snip]

>Sample proposition:
>
>Pat is the host who seated the President and the Secretary of the Interior
>
>If we have a relational model for this proposition, we will end up splitting
>this proposition up and will undoubtedly lose the order of those who were
>seated. If Pat seated others too, we will also lose the fact that these two
>seemed to have been seated together or in close proximity of time or place.
>There is nothing explicit about the ordering, nor is it considered
>important, perhaps, for our software application. However, there is an
>ordering here that is not arbitrary -- the President was listed first as an
>indication of the relative importance of the two who were seated.

I think you are seeing information here which isn't present. The only information here is that Pat, a host, seated two specified persons. The order in which Pat seated them is not explicit, nor is there any information as to whether they were seated close together in time or in space. We may infer those other aspects, but we cannot do so with certainty.

Pat, for example, could be a long-time "host" whose claim to fame is on one occasion having seated the President and on another completely separate occasion having seated the Secretary of the Interior.

Hence, a normalised model loses none of the information you list above, since it wasn't there in the first place.

We don't even know who it was that Pat seated. Was the President George Bush, Bill Clinton, or even Abraham Lincoln? The trick is in recognising the difference between what information *is* there, and what information *might* be there.
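To make that concrete, here is a minimal sketch of a normalised capture of exactly what the proposition states, and no more. It uses Python's sqlite3 module; the table and column names are my own invention, and I store the guests by office rather than by name precisely because the proposition doesn't tell us who held the offices:

    import sqlite3

    # Record only what the proposition asserts: Pat is a host, and Pat
    # seated two persons identified by office, not by name.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE host (host_name TEXT PRIMARY KEY);
        CREATE TABLE seating (
            host_name  TEXT NOT NULL REFERENCES host,
            guest_role TEXT NOT NULL,
            PRIMARY KEY (host_name, guest_role)
        );
        INSERT INTO host VALUES ('Pat');
        INSERT INTO seating VALUES ('Pat', 'President');
        INSERT INTO seating VALUES ('Pat', 'Secretary of the Interior');
    """)

    # The original proposition is recoverable in full:
    for (guest,) in con.execute(
            "SELECT guest_role FROM seating WHERE host_name = 'Pat'"):
        print("Pat seated the", guest)

Note that nothing about order or proximity appears in the schema, because nothing about order or proximity was asserted.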

>Even if
>Pat seated the Secretary of State later, it is likely relevant that such
>information is in a separate proposition from the one above.

Quite.

>Once we split apart a proposition in such a way that we cannot get the
>original proposition back, even if we THINK we are getting the important
>aspects of it back, we have lost some of the meaning we intended to capture.

If you lose important information, then you are doing it wrong. If you need to know, for example, where and when the President was seated, you need to capture it. But as I said, that information is not present in the original proposition. To capture it with any degree of certainty, we need more information.
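If the where and the when do matter, the schema has to say so explicitly, using information gathered from outside the proposition. Again a sketch with invented names; the extra columns are placeholders for whatever the users actually say matters:

    import sqlite3

    # When, where, and on what occasion must be captured as explicit
    # attributes -- which takes information the original proposition
    # does not supply. Column names are invented placeholders.
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE seating (
            host_name  TEXT NOT NULL,
            guest_role TEXT NOT NULL,
            occasion   TEXT NOT NULL,  -- which event, per the users
            seated_at  TEXT,           -- when, if anyone recorded it
            seat_no    TEXT,           -- where, per the seating plan
            PRIMARY KEY (host_name, guest_role, occasion)
        );
    """)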

>This is an off-the-top-of-my-head example of where one might lose
>information when normalizing data and likely not a very good example
>compared to what might be lost in a typical business application. However,
>the point is that the process of normalizing data makes it sometimes
>impossible to retrieve the original propositions, thereby losing some
>information.

If a statement is information-rich, then a good analyst will be able to extract the relevant information and capture it in such a way that the aspects of the information which are of importance are preserved. He would resolve the uncertainty over might-be-there information by going back to the users and asking them whether that information is important. If it was, he'd ask them to define it more precisely, perhaps by reference to a seating plan or to the occasions on which Pat seated the two persons.

>A data modeling process that respects the integrity of the stored
>propositions so that they can be retrieved again has something going for it
>that the relational model lacks, it seems. Any thoughts? Thanks. --dawn

I'm intrigued -- a modelling process with no entropy sounds like a nice thing to have. Do you have a particular method in mind?

Lemming

-- 
Curiosity *may* have killed Schrodinger's cat.
Received on Tue Apr 06 2004 - 14:55:43 CEST
