Re: Lucid statement of the MV vs RM position?

From: dawn <dawnwolthuis_at_gmail.com>
Date: 2 May 2006 10:15:50 -0700
Message-ID: <1146590150.439574.310940_at_y43g2000cwc.googlegroups.com>


JOG wrote:
> dawn wrote:
> > JOG wrote:
> > > dawn wrote:
> > > > JOG wrote:
> > > > > dawn wrote:
> > > > > > Jan Hidders wrote:
> > > > > > > dawn wrote:
> > > > > > > > Jan Hidders wrote:
> > > > > > > > > dawn wrote:
> > [time for a snip]
> > > So your remaining objections with the RM establishment (as I have read
> > > them) seem to distill to whether a statement such as:
> > >
> > > "Barney is_colour green and is_colour purple"
> > >
> > > translates best to which of the following propositions:
> > >
> > > 1) colour(Barney, green) && colour(Barney, purple)
> > > 2) colour(Barney, {green, purple} )
> > >
> > > Would you not say that with a bit of logical manipulation we could
> > > probably show that the first is preferable? J.
> >
> > No. If there were no human beings in the mix, no input, and no output
> > to and from human beings.
>
> Well queries interact with the logical model, not humans

I beg to differ. The logical model does not initiate anything. There is an asker and a responder. The asker is either a person or a software program written by a person (or generated from specs written by a person).

> or their
> notion of an entity directly. It's a layer down.

Whatever layer it is in, somewhere in the mix is a human being, often a software developer (found in some layer of humanity).

> I have read and
> understand your standpoint on this, but I really think there is a
> distinction to be made - I try and describe it below.
>
> > and no interpretation of the meaning of the data ever
>
> I don't think this follows: both mean the same thing - the original
> proposition.

I don't recall what my statement above was about, so I cannot respond to this.

> > then 1) has the charm of mathematical simplicity (1st order
> > logic) with no downside.
>
> Consider that the propositions:
>
> 1) "Barney has green fur and purple fur",
> "Oscar has green fur and black fur"
>
> are logically identical to:
>
> 2) "Barney and Oscar have green fur",
> "Barney has purple fur",
> "Oscar has black fur"
>
> If we're agreed there then consider that the first - in MV style -
> would give something like:
> colour( Barney, {green, purple} ) &&
> colour( Oscar, {green, black})
>

so far, so good

> But the second set of propositions in MV style gives us:
> colour( green, {Barney, Oscar} ) &&
> colour( purple, {Barney}) &&
> colour( black, {Oscar})

Not if you use solid multi-value data modeling. Barney and Oscar are name attributes for strong entities, in this case of type Character. I would not model strong entities in this way (an example of a best practice). Strong entities are modeled as "entities", which in the case of MV translates to "files". Properties of entities such as colour [sic ;-) ] would go away if the strong entity were gone. So, I would model that as a property list. The second proposition then becomes

Character( Barney, {green, purple} )
Character( Oscar, {green, black} )
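
To make the "property list" idea concrete, here is a minimal sketch in Python. The `Character` class and the `colours_in_use` helper are hypothetical names invented for illustration; they do not come from any MV product. The sketch only shows that when colours are held as a property of the strong entity, removing the entity removes its colours with it.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    """A strong entity; colours are a multivalued property of it (assumed model)."""
    name: str
    colours: set = field(default_factory=set)


# One "file" of Character records, keyed by name (illustrative data from the thread).
characters = {
    c.name: c
    for c in (
        Character("Barney", {"green", "purple"}),
        Character("Oscar", {"green", "black"}),
    )
}


def colours_in_use(chars):
    """Collect every colour that is still a property of some character."""
    return set().union(*(c.colours for c in chars.values()))


before = colours_in_use(characters)
del characters["Barney"]          # the strong entity goes away...
after = colours_in_use(characters)  # ...and "purple" goes with it
```

In this sketch "green" survives the deletion only because Oscar still has it; there is no separate record for green-ness to keep alive.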

> The first "human interface" refers to entities "Barney" and "Oscar".
> The second, has an "interface" consisting of the entities
> "green"-ness, "purple"-ness and "black"-ness.

Not entities. Entities are things. An entity is a person, place, thing, or event. Person, place, and event (e.g. transaction) are easy enough to identify. For things, if you cannot hit it with a stick or print it out on paper, then think twice about whether it is an entity or a property of an entity.

> Which is correct? Neither
> and both,

The first, not the second.

> because they are artifices. Who knows which will be
> appropriate to the user?

The systems analyst had better be able to give it a good shot. If they cannot tell whether green-ness is an entity or a property of an entity with respect to the organization's requirements, I would be very surprised.

> An XML or MV style prejudges that decision,

No, it makes a distinction that those modeling with the RM do not make.  I am an entity and my shoe size is a property of me. It can change. It could even go away if my feet were amputated (sheesh, apologies for the example), but only as long as I am an entity an organization cares about would that property be relevant. If I am not of interest, then, by definition, my feet are not of interest either. In other words, if you are interested in shoe size, you are interested in me because my shoe size is a property of me.

[I realize that if a company is doing a shoe size survey and has no interest in collecting other data about people, then the shoe size might be a property of a survey or some other entity.]

> and with large shared data, whose future use is unpredictable, it seems
> essential to me to avoid that interface-prison.

I disagree, as you have likely guessed. Yes, new requirements regularly prompt one change or another, but I would dare say that more changes are forced when a property like color goes from single-valued to multivalued, prompting a schema redesign (a new table), than when color is modeled as a multivalued attribute to start with.

> With RDBMS, despite their 3VL issues, there is no prejudice as to this
> choice - both sets of propositions, 1 and 2, encode down to the same
> things - just as they should being logically identical.
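
The claim quoted just above can be checked with a small sketch in Python. The dictionaries and function names are hypothetical, chosen only for illustration, and the data is the Barney/Oscar example from this thread. The two MV-style groupings (by character and by colour) are unnested into flat (character, colour) pairs, which is roughly what a relational encoding would store, and then compared.

```python
# Grouping 1: colours nested under each character.
by_character = {
    "Barney": {"green", "purple"},
    "Oscar": {"green", "black"},
}

# Grouping 2: characters nested under each colour.
by_colour = {
    "green": {"Barney", "Oscar"},
    "purple": {"Barney"},
    "black": {"Oscar"},
}


def pairs_from_character_grouping(grouping):
    """Unnest {character: {colours}} into a set of (character, colour) pairs."""
    return {(who, colour) for who, colours in grouping.items() for colour in colours}


def pairs_from_colour_grouping(grouping):
    """Unnest {colour: {characters}} into a set of (character, colour) pairs."""
    return {(who, colour) for colour, whos in grouping.items() for who in whos}


flat_1 = pairs_from_character_grouping(by_character)
flat_2 = pairs_from_colour_grouping(by_colour)
same = flat_1 == flat_2
```

Both groupings flatten to the same four pairs, so the flat form carries no preference for either nesting. Whether that neutrality is a virtue or a lost distinction is exactly what is being argued in this thread.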

> > I reserve the right to change my mind on that, however. Cheers! --dawn
>
> I'd reckon that this is the most important quality anyone can have in
> life. That and a high tolerance for alcohol.

I have the first even if I argue until I fully understand another position, but don't let me drink more than one martini. --dawn

Received on Tue May 02 2006 - 19:15:50 CEST
