Re: Entity and Identity

From: Nilone <reaanb_at_gmail.com>
Date: Wed, 5 Aug 2009 04:30:24 -0700 (PDT)
Message-ID: <9f14252c-c9fa-47f0-bebf-afdd1073b951_at_n11g2000yqb.googlegroups.com>


On Aug 5, 1:24 am, rp_at_raampje.(none) (Reinier Post) wrote:
> David BL wrote:
>
> [...]
>
> >> >In OO, objects are state machines, not "entities" about which we may
> >> >want to record information.
>
> >> Now you're exaggerating.  In (class-based) OO, classes are
> >> abstract data types, that *can* be used to model stateful objects,
> >> (e.g. state machines), but objects can be stateless just as well.
> >> This is why an OO language is general purpose in nature.
>
> >I want to clarify what you mean by stateless objects.  Note firstly
> >that objects with no members or only immutable members can be deemed
> >to have a single state. It's not clear to me whether that means they
> >are "stateless".
>
> Immutable.  No variables.  Constructed and yielded, but never changed.
> I believe this is becoming more and more popular.

Stateless is a misnomer. Immutable is descriptive. I prefer to call them values, and the classes that create them, types.
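
A minimal sketch of what I mean by a value and its type, in Python. The Point class and its moved() method are hypothetical names made up for illustration; the only library feature assumed is the standard frozen dataclass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A type: its instances are values, constructed and yielded but never changed."""
    x: int
    y: int

    def moved(self, dx: int, dy: int) -> "Point":
        # An "update" yields a new value; the original is left untouched.
        return Point(self.x + dx, self.y + dy)


p = Point(1, 2)
q = p.moved(1, 0)  # p is still Point(1, 2); q is a different value
```

Assigning to p.x raises an error, which is the sense in which "immutable" describes these objects better than "stateless" does.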

>
> >Also can't they be regarded as trivial state machines?
>
> Single-state machines, perhaps.

I prefer to keep the two separate. Subtyping is viable for types, but not for state machines in general. Implementation inheritance is suitable for types, but not for state machines in general. Interface polymorphism is suitable for state machines, not for types. State machines have methods and state; types have attributes.

>
> A less extreme alternative is to use classes to define traditional
> data types, atomic or structured.  *Only* variables.  Trivial
> state machines, with the states being the values of the variables.
> This programming style is called 'procedural' or 'imperative',
> it predates OO and relational databases and it's not going to go away.

I know these objects as tuples or records, and think of them as a set of variables.

>
> Finally, ADTs can be used to express nontrivial state machines,
> i.e. in which dynamic constraints (formal or informal)
> are considered part of the specification.

Huh? Classes *ARE* the ADTs which are used to express nontrivial state machines.

>
> >Note furthermore that objects of classes with no member
> >variables that implement abstract interfaces normally have type
> >information in order to support dynamic polymorphism.  Do you regard
> >such objects as stateless?
>
> No.   And I agree with Nilone that polymorphism on state machines
> is pretty hairy.

When the concepts are clearly separated, the hairiness disappears. Polymorphism for state machines is achieved via interface inheritance, and you may only assume what the interface explicitly specifies. If it doesn't specify semantic invariants, you can't just assume them.
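
As a rough illustration (a sketch, not a definitive design): the Container interface below is a hypothetical name, and it deliberately states no ordering invariant. Two state machines implement it, and a client written against the interface can rely only on the three operations, not on the order in which items come back.

```python
from collections import deque
from typing import Protocol


class Container(Protocol):
    """Hypothetical interface: names operations, specifies no ordering invariant."""
    def push(self, item: int) -> None: ...
    def pop(self) -> int: ...
    def is_empty(self) -> bool: ...


class Lifo:
    """State machine that returns the most recently pushed item first."""
    def __init__(self) -> None:
        self._items: list[int] = []

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items


class Fifo:
    """State machine that returns the oldest pushed item first."""
    def __init__(self) -> None:
        self._items: deque[int] = deque()

    def push(self, item: int) -> None:
        self._items.append(item)

    def pop(self) -> int:
        return self._items.popleft()

    def is_empty(self) -> bool:
        return not self._items


def drain(container: Container, items: list[int]) -> list[int]:
    """Client code: uses only what the interface specifies."""
    for item in items:
        container.push(item)
    out = []
    while not container.is_empty():
        out.append(container.pop())
    return out
```

Both drain(Lifo(), [1, 2, 3]) and drain(Fifo(), [1, 2, 3]) are legitimate under this interface, even though they return the items in opposite orders. Code that assumed one particular order would be assuming an invariant the interface never gave.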

>
> >Also what exactly do you mean by an abstract data type?  For example,
> >is an abstract interface for an output stream an ADT?
>
> I believe most people would call it that, including the intended behavior.
> In an OO programming language, you don't typically specify behavior
> within the language.  E.g. an interface IStack with methods
> push, pop and isempty doesn't usually specify within the language
> that an implementation is supposed to behave like a LIFO rather than a
> FIFO, yet this is probably the intent of whoever wrote that interface.

I wouldn't call an interface an ADT. ADTs (as described on Wikipedia, among others) are state machines for (usually homogeneous) collections of values.

>
> >I would expect
> >an ADT to be limited to types where it can be assumed that objects of
> >that type are deemed to be variables that hold an abstract value
> >(that's what a *data* type is).  That makes me think that all classes
> >define types and/or implementations of abstract state machines but
> >only some do so for ADTs.
>
> For me it's the reverse: all classes implement ADTs,
> whereas only some define data types.

For me, classes can be used to implement state machines (which include ADTs) OR data types OR interfaces. The last two require specific limits on how you define the class.

>
> >The first book I ever read on OO (back in the early 90's) introduced
> >the subject using a hierarchy of employee classes - and it talked
> >about how it's useful to have a polymorphic GetPay() method, e.g. so a
> >sales employee can override to add a commission on top of the
> >inherited base salary.
>
> Looks like a good illustration of the mechanism to me.

Implementation inheritance + overriding is a mistake.

<snip discussion of modeling employees>

When modeling real-world entities, you should understand the difference between types, values, interfaces and state machines. If you create an Employee type then you can create a Manager type which is a subtype of it. If you create an Employee state machine then you can have mutable state and behavior. If you create an Employee interface then you can derive a Manager interface from it, but then you can't reason about the behavior of polymorphic objects.
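
A rough sketch of the first two cases, in Python. All names here (Employee, Manager, EmployeeMachine, the status strings) are hypothetical and chosen only for illustration. The Employee type is an immutable value with a Manager subtype; the state machine holds the current value and changes it only through its methods.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Employee:
    """Employee as a type: an immutable value."""
    name: str
    salary: int


@dataclass(frozen=True)
class Manager(Employee):
    """A subtype: every Manager value is also an Employee value."""
    reports: int = 0


class EmployeeMachine:
    """Employee as a state machine: mutable state, changed only via methods."""

    def __init__(self, employee: Employee) -> None:
        self.employee = employee
        self.status = "active"

    def give_raise(self, amount: int) -> None:
        if self.status != "active":
            raise RuntimeError("only an active employee can receive a raise")
        # The machine moves to a new value; the old value is not modified.
        self.employee = replace(self.employee, salary=self.employee.salary + amount)

    def terminate(self) -> None:
        self.status = "terminated"
```

For example, EmployeeMachine(Manager("Ann", 100, reports=3)) followed by give_raise(10) leaves the machine holding Manager("Ann", 110, reports=3), while the original Manager value still has a salary of 100.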

> >Are you saying that OO is used to process the information, but not to
> >record the information?
>
> I'm saying that processing, behavior, business logic, also needs to go
> somewhere, in an organized, maintainable way; database models can't
> express it, and OO models typically provide space for it.

You *can* express all knowledge of a state machine, including behavior, in a relational database. I previously posted a link to a paper that did exactly that. See
http://www.ohiolink.edu/etd/send-pdf.cgi/Punnam%20Pradeep%20Kumar.pdf?acc_num=kent1226606883

Since state machines are a model of computation, all behavior, business logic, etc. could be expressed as such.

The failing lies in current database management systems, which cannot compile or translate your class into this form and then simulate or execute the resulting models. It is not a failing of database theory.
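
To make the idea concrete, here is a minimal sketch using Python's standard sqlite3 module. It is not the method from the paper linked above; the turnstile machine, the transition table, and its column names are all assumptions made up for this example. The behavior is stored as a relation of (state, event, next_state) tuples, and "executing" the machine is just a lookup.

```python
import sqlite3

# Hypothetical turnstile machine stored as a relation (illustrative names only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transition ("
    " state TEXT NOT NULL,"
    " event TEXT NOT NULL,"
    " next_state TEXT NOT NULL,"
    " PRIMARY KEY (state, event))"
)
conn.executemany(
    "INSERT INTO transition VALUES (?, ?, ?)",
    [
        ("locked", "coin", "unlocked"),
        ("locked", "push", "locked"),
        ("unlocked", "coin", "unlocked"),
        ("unlocked", "push", "locked"),
    ],
)


def step(state: str, event: str) -> str:
    """Look up the successor state; the behavior lives entirely in the relation."""
    row = conn.execute(
        "SELECT next_state FROM transition WHERE state = ? AND event = ?",
        (state, event),
    ).fetchone()
    if row is None:
        raise ValueError(f"no transition from {state!r} on {event!r}")
    return row[0]


def run(state: str, events: list[str]) -> str:
    """Simulate the machine by applying each event in order."""
    for event in events:
        state = step(state, event)
    return state
```

Here the step() and run() functions play the part of the simulator that I am saying a DBMS could provide itself; the knowledge of the machine is all in the table.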

> Unless you don't use that space in your design, of course,
> and treat objects as structured variables, defeating their purpose.
>
> >The problem I see is that most generally tuples in a set don't have an
> >immutable identifier allowing them to be mapped to a variable in a
> >useful manner.
>
> Now you're reversing problem and solution - variables are the problem!
>
> >For example, how would a set<int> be mapped to a set
> >of variables (by analogy to an O/R mapping)?
>
> Why would you want to do that?  The fewer variables, the better!

What's wrong with variables?

>
> >How do you efficiently
> >locate a variable for a given value?  It would seem necessary to
> >record a map from int to the address of the associated variable.  When
> >a variable is updated, it is necessary to update the map.  However
> >what happens if one of the variables is changed so it holds the same
> >value as one of the other variables?  Does it suddenly disappear?  How
> >does the code avoid access violations?
>
> Yes, aliasing is a problem.  The solution is to use as few references
> as possible.  That's what the ADTs are for.

It is not references that are the problem, but references to shared mutable objects. BTW, I believe your use of ADT here is what I would call a tuple or record.

Received on Wed Aug 05 2009 - 13:30:24 CEST