Re: Objects and Relations

From: David BL <davidbl_at_iinet.net.au>
Date: 21 Feb 2007 23:21:07 -0800
Message-ID: <1172128867.199095.298920_at_p10g2000cwp.googlegroups.com>


On Feb 22, 6:09 am, "JOG" <j..._at_cs.nott.ac.uk> wrote:
> On Feb 20, 2:18 pm, "David BL" <davi..._at_iinet.net.au> wrote:
>
> > [snip]
> > I still think we have a quite different understanding of what (in
> > practical terms) we mean by "entity". I think you're still
> > associating it with ERMs (even if you distinguish instance from type)
> > while I'm not. Let me explain what I mean by that...
>
> > You question whether written papers have an authors property or vice
> > versa. For me that question only arises at the point where you try
> > to create an ERM.
>
> Well yes I do see that. But I /don't/ think those issues start at the
> ERM, so I'm more than happy to address things outside the E/RM, at a
> 'real world' level (apologetic quotes for obvious reasons).
>
> > Since we agree it is generally better to go
> > directly to relational schema design it seems a moot point. We
> > understand that the many to many relationship between authors and
> > written papers is nicely represented in its own relation.
>
> > What I'm saying is that it doesn't indicate any problem with our
> > understanding of the entities as we perceive and understand them in
> > real life.
>
> Ok I've got to stop you there for a moment, because I need to get over
> one hurdle first - I am still not clear at all what the definition of
> your 'entity' concept is? And at the low level we're addressing this,
> it is vital I understand what you define the term to mean otherwise we
> just end up talking past each other again.
>
> Currently I realise it is interchangeable with 'thing', and I also
> know that you _don't_ define it as just a collection of attributes and
> values, but above that I only have the statement that "if you point at
> a set of atoms and call it Fred, then that would be an entity" from a
> post way back in the thread.
>
> Obviously I think there are serious problems with the 'pointing at
> atoms' description, but instead of jumping on it (I realise this is
> usenet and it was probably a quick post) I'd appreciate it if you
> could expand your definition.
>
> > This distinction comes down to the difference between
> > model and what is modelled. The question of where to put an
> > attribute is yet another one of the adhoc "features" of the ER
> > modelling process.
>
> > If one is going to represent the fact that a particular author has
> > written a particular paper, surely one is comfortable with being able
> > to identify the human and the paper in real life. Otherwise why
> > bother storing the fact in the first place?
>
> I'm not ignoring your points here, I just need that definition first.
>
> > Obviously the actual human and written paper are quite different from
> > the models of those things associated with an ERM. As I see it, the
> > various and substantial limitations of ER models should not be taken
> > as evidence that the actual entities for which we want to store
> > knowledge about are "arbitrary concoctions". I have a feeling you
> > are throwing out the baby with the bath water!
>
> > Nevertheless entities like humans are too complex for us to ever
> > expect to reasonably characterise with simple mathematical models.
> > In that sense I completely agree with you that the right way to store
> > information about them is through relations, and expect many tuples in
> > many relations to be used to provide detailed information about a
> > single person.
>
> > > (or to bring it back to the OP, why would a struct ever be preferable
> > > to a relational encoding?)
>
> > > > > Does that make more sense as to the semantic difference that you are
> > > > > (perhaps) obsessing over, but now seems to be preventing further
> > > > > conversation?
>
> > > > I would suggest we avoid generalised statements and use examples to
> > > > clarify what is meant. I agree that definitions of terms are a likely
> > > > cause of difference of opinion.
>
> > > I imagine it probably is.
>
> > > ('Course, don't forget I didn't think your initial conjecture was
> > > testable, or that its comparison made sense.)
>
> > > > > > I ask again, what's you point about the difficulties of
> > > > > > classification?
>
> > > > > What difficulties?
>
> > > > You pointed out (correctly) that it is difficult to have a type called
> > > > "book" and to know what attributes it should have.
>
> > > > I regard that as a classification problem. It doesn't imply that a
> > > > particular entity is illusionary or subjective.
>
> > > > > > > If it helps given the E/R-style 'entity' terminology you are holding
> > > > > > > onto, you might consider that I view /everything/ as an "associative
> > > > > > > entity". But of course I would not call it that.
>
> > > > > > > > It is well known that classification of entities is
> > > > > > > > ad hoc. Fortunately in DB systems we tend to state facts about
> > > > > > > > particular things far more often than sets of things.
>
> > > > > > > > If I were to place an actual book in front of you, you could think of
> > > > > > > > hundreds of objective propositions about it. Actually the number of
> > > > > > > > possible propositions you could state about the book would seem almost
> > > > > > > > unlimited.
>
> > > > > > > > If you were given a different book, again there would be countless
> > > > > > > > propositions you could state about it. Now the book may have some
> > > > > > > > fundamental differences. Therefore attributes relevant to the first
> > > > > > > > book may not make sense for the second book and vice versa. This
> > > > > > > > makes classification of books difficult. However we both agree that
> > > > > > > > the RM copes well with that because it can represent knowledge about a
> > > > > > > > single book across lots of different relations. RM has no need to
> > > > > > > > develop a class hierarchy in the manner of OO (or indeed E/R
> > > > > > > > diagrams).
> > > > > > > It is good we are agreed of the benefit there, and an important point
> > > > > > > not to forget in all of this.
>
> > > > > > > > What is more fundamental - facts about a particular entity, or the
> > > > > > > > entity itself? Surely the facts are secondary - at least for physical
> > > > > > > > entities.
>
> > > > > > > Well of course I don't accept there is anything but facts and values,
> > > > > > > so your question is nonsensical to me.
>
> > > > > > I guess your statement that entities are illusionary is nonsensical to
> > > > > > yourself as well then.
>
> > > > > > > Remember that there is practical evidence of this standpoint having
> > > > > > > merit. For instance Symbolic AI died in the 1970's - a very real,
> > > > > > > practical example of how trying to manipulate these elusive 'entities'
> > > > > > > results in failure. Situated and Nouvelle AI was born from this and
> > > > > > > I'd encourage you to check this area out - "Elephants don't play
> > > > > > > chess" by Brooks, is a good starting point.
>
> > > > > No comment on this? I've offered two points indicating how thinking
> > > > > in terms of entities can be unproductive - the lack of success of E/R
> > > > > modelling in replacing RM (which was its original goal), and the
> > > > > collapse of the entity-based manipulation of Classical AI in the 70's.
> > > > > There is insight in both of these.
>
> > > > IMO the insight in the first is the difficulty of an objective
> > > > classification of things.
>
> > > > The second I cannot properly comment on - I'm out of my depth. My
> > > > impression is that AI in general hasn't lived up anywhere near to the
> > > > original promises and there are more explanations than you can poke a
> > > > stick at. One explanation I favor is that it is difficult to close
> > > > the association between reasoning and meta-reasoning, leading to the
> > > > so-called ghost in the machine. I find it telling that it's not
> > > > generally possible to apply one's own high level understanding of
> > > > "truth" to the very sentences that are used to reason about those
> > > > truths. As a simple example, 3+4 and 7 are equal and yet not equal.
>
> > > If you get chance, that Brooks "Elephants don't play chess" paper is
> > > extremely good. "The Owl and the Electric Encyclopedia" (1991 Journal
> > > of AI) is also recommended.
>
> > I intend to have a look when I get the time.
>

I doubt whether it's possible to define what "entity" means in a mathematical way that would satisfy you, because it relates to our perceptions outside of any mathematical formalism. Therefore I can't produce an axiomatic definition of entity. Indeed my whole point is that it seems to relate to our awareness of, and preparedness to identify, physical things around us, and even non-physical things like companies, independently of any particular mathematical model.

Furthermore the conjecture of the OP depended on distinguishing this notion of entity, which I previously described as being "outside of the abstract machine". I can see why you might say it isn't cogent, because it's indeed outside the scope of any formalism.

It seems to me that it only makes sense to say that a relation (which is mathematically just a set of tuples) actually represents a set of facts after we apply a mapping, in the manner of an interpretation, from the keys in the RM formalism to the "actual" things that we perceive. Both the "actual things" and the mapping are outside of the RM formalism itself.

I apologize if this seems rather hand-wavy. I feel like there is something useful in it, even if I'm having trouble expressing it. Consider the following relational schema:

    person(N) :- String with identity N is a name of a person
    string(S,I,C) :- String with identity S has character C at index position I

As far as the RM formalism is concerned this schema is quite respectable.

I find it interesting that as far as our perception of entities goes we would tend to say that a person *must* be dealt with relationally, but the case is not so clear for strings. It seems to relate to the fact that strings are already mathematical things that can be directly represented on a computer, so it isn't crucial to deal with them only by stating facts about them.

An important observation is that the string relation completely characterises a given string, whereas we can never hope to completely characterise a given human no matter how many relations we use. That seems to be an objective difference.
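To make that concrete, here is a rough Python sketch (not part of any RM product, just an illustration). It assumes the string(S,I,C) relation is held as a set of tuples; the identities "s1" and "s2" and the function names are made up for the example. The point is that the facts about one string identity are enough to rebuild the whole string, and the whole string is enough to regenerate exactly those facts.

```python
# Hypothetical contents of string(S, I, C): "string with identity S has
# character C at index position I". The identities "s1"/"s2" are invented.
string_rel = {
    ("s1", 0, "F"), ("s1", 1, "r"), ("s1", 2, "e"), ("s1", 3, "d"),
    ("s2", 0, "A"), ("s2", 1, "n"), ("s2", 2, "n"),
}


def reconstruct(rel, s):
    """Rebuild the string with identity s from its (S, I, C) facts."""
    chars = sorted((i, c) for (sid, i, c) in rel if sid == s)
    return "".join(c for _, c in chars)


def to_facts(s, text):
    """Produce the (S, I, C) facts that describe text under identity s."""
    return {(s, i, c) for i, c in enumerate(text)}


# Round trip: the facts determine the string, and the string the facts.
fred = reconstruct(string_rel, "s1")
fred_facts = to_facts("s1", fred)
```

No comparable round trip exists for person(N): there is no function from a finite set of tuples back to the human.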

We can see how strings can successfully be replaced by ADTs. You could say they can be "absorbed" into the abstract machine. I find this interesting even if I'm not sure how to express the idea formally. Furthermore I can see arguments in favor of not dealing with strings relationally. In particular I can see how a domain expert is not at all interested in thinking of strings as having identity. ADTs have the very useful and important property that they hide identity of internals behind a value type.
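A small Python sketch of what I mean by hiding identity, under the assumption that person(N) and string(S,I,C) are stored as plain sets and that Python's built-in str plays the role of the ADT. The identities "s1" and "s2" are invented for the example.

```python
# Hypothetical facts: two distinct string identities with the same content.
string_rel = {
    ("s1", 0, "A"), ("s1", 1, "n"), ("s1", 2, "n"),
    ("s2", 0, "A"), ("s2", 1, "n"), ("s2", 2, "n"),
}

# person(N): N is the identity of a string that is a name of a person.
person_rel = {"s1", "s2"}


def value_of(rel, s):
    """Collapse the facts about string identity s into a single str value."""
    chars = sorted((i, c) for (sid, i, c) in rel if sid == s)
    return "".join(c for _, c in chars)


# With the string absorbed into a value type, the identities disappear:
# both "s1" and "s2" denote the same value, so the set holds one name.
person_names = {value_of(string_rel, n) for n in person_rel}
```

In the relational encoding the domain expert has to decide whether "s1" and "s2" are the same name; with the value type that question cannot even be asked.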

Another factor that motivates me is that according to our perceptions there are a finite number of humans in the world, and furthermore we can impose additional closed world constraints so that our databases are only as large as they need to be. However the set of strings is countably infinite. So what does it mean if only a subset of all possible strings is present in the relation? This is another reason to question its moral character.

Received on Thu Feb 22 2007 - 08:21:07 CET
