Re: Mixing OO and DB

From: David BL <davidbl_at_iinet.net.au>
Date: Mon, 18 Feb 2008 17:44:58 -0800 (PST)
Message-ID: <bebd4b85-06ba-4940-9687-9970e6d6a95d_at_i7g2000prf.googlegroups.com>


On Feb 19, 12:41 am, "Dmitry A. Kazakov" <mail..._at_dmitry-kazakov.de> wrote:
> On Mon, 18 Feb 2008 05:58:06 -0800 (PST), David BL wrote:
> > On Feb 18, 7:23 pm, "Dmitry A. Kazakov" <mail..._at_dmitry-kazakov.de>
> > wrote:
> >> On Sun, 17 Feb 2008 21:56:18 -0800 (PST), David BL wrote:
> >>> On Feb 16, 9:03 pm, "Dmitry A. Kazakov" <mail..._at_dmitry-kazakov.de>
> >>> wrote:
> >>>> I mean that you need not to resort to pointers or more generally to
> >>>> internal representation in order to say that Rectangle value can be
> >>>> obtained from a ColouredRectangle one.
>
> >>> Agreed. However note firstly that I introduced pointers in order to
> >>> talk about substitutability of variables. Secondly, I don't deny it's
> >>> *possible* to obtain a Rectangle value from a ColouredRectangle one.
> >>> I don't think this should ever happen implicitly.
>
> >> Why not, If you can do it safely?
>
> > I believe implicit conversions should be information preserving.
>
> Yes, if not defined by user.
>
> >>>>> Perhaps the C.Date approach needs a different terminology (ie than
> >>>>> subtype) to avoid this confusion. When Date says integers is a sub-
> >>>>> type of reals, he means for example that nonzero integers have
> >>>>> inherited real-valued multiplicative inverses. In mathematical
> >>>>> definitions of integers that's inappropriate, but for software
> >>>>> development I need to see a good reason to disallow it. Can you
> >>>>> provide one?
>
> >>>> No, it is OK. That is a specialization. You constrain a type upon
> >>>> subtyping. The constraining manifests by losing some operations. There is
> >>>> nothing wrong with that. But also there is nothing wrong with
> >>>> generalization. And also there is nothing wrong with mixed cases.
>
> >>> With C.Date's notion of sub-type you always inherit all operations
> >>> from the super-type. You can add more operations, but cannot lose any
> >>> (or redefine any).
>
> >>> For example, if 16 bit integers subtype 32 bit integers, then 16 bit
> >>> integers inherit addition modulo 2^32 (stored in a 32 bit result).
> >>> That doesn't stop 16 bit integers from introducing addition modulo
> >>> 2^16 as a distinct operator. Overloading of operator+ is just a
> >>> syntactic issue.
>
> >> It is not. The user of 16-bit integer has to know which + to take. As a
> >> counter example consider big and little endian encoded integers. One is a
> >> subtype of another. Now you have two overloaded operations + (actually more
> >> for mixed cases) one of them is rubbish.
>
> > We have conflicting definitions of the word "type". You're relating a
> > (value) type to a particular physical representation. I'm thinking of
> > it at a more logical level (as Date defines it - a value type is a set
> > of values plus operations on those values). Implementation details
> > like big versus little endian are irrelevant at the logical level.
> > For a given n, there is exactly one addition modulo n operator.
>
> How can you implement integer if you don't know endianness? It is nice to
> talk about logical views as long as there is hardware that implements them.
> But any hardware deals with a representation. A logical model just does not
> stand the trivial requirement to be able to describe itself.

Think of it this way... A string class defines both

  1. an interface (with well-defined contracts on the operators); and
  2. an implementation (which defines both a physical representation and an implementation of all the operators that fulfills the documented interface).

Do you agree that it is meaningful to think of these aspects as distinct, and somewhat independent? For example, assuming design by contract, it is possible to change the implementation without changing the interface.
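
Here's a rough sketch of what I mean, in C++ (MyString, Length and CharAt are names I've just made up for illustration): the contracts on the public operators are the interface; the private representation could be replaced, say by a reference-counted buffer, without any caller noticing.

#include <cstddef>
#include <cstring>

class MyString
{
public:
    explicit MyString(const char* s)
        : len(std::strlen(s)), data(new char[len + 1])
    {
        std::memcpy(data, s, len + 1);
    }
    ~MyString() { delete[] data; }

    // Contract: returns the number of characters.
    std::size_t Length() const { return len; }

    // Contract: only valid for 0 <= i < Length().
    char CharAt(std::size_t i) const { return data[i]; }

private:
    // Representation: one of many possible; hidden from users.
    std::size_t len;
    char* data;

    MyString(const MyString&);            // copying omitted in this sketch
    MyString& operator=(const MyString&);
};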

You seem to be thinking type = class, whereas I'm thinking of a type as, by definition, relating only to the public interface. More formally, a value type is a set of values plus operations on those values. It is divorced from any particular implementation.

In fact it is extremely important to understand that the same (logical) value can be encoded in different ways within the same program. For example, a UnitSquare value can be encoded in a manner suitable for arbitrary quadrilaterals, for rectangles, for squares, or very specifically for a unit square.
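
A sketch of that last point (GeneralRect and UnitSquareRep are invented names): one logical type, two physical encodings of the very same value.

class Rectangle
{
public:
    virtual ~Rectangle() {}
    virtual float Width() const = 0;
    virtual float Height() const = 0;
};

// One representation: stores both extents explicitly.
class GeneralRect : public Rectangle
{
public:
    GeneralRect(float w, float h) : w(w), h(h) {}
    float Width() const { return w; }
    float Height() const { return h; }
private:
    float w, h;
};

// Another representation: stores nothing at all, because a unit
// square's extents are fixed. GeneralRect(1, 1) and UnitSquareRep()
// encode the same logical value.
class UnitSquareRep : public Rectangle
{
public:
    float Width() const { return 1.0f; }
    float Height() const { return 1.0f; }
};

// Code written against the logical type cannot tell them apart.
float GetArea(const Rectangle& r) { return r.Width() * r.Height(); }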

> >>>>> I don't think you need the mapping to give an ellipse value from a
> >>>>> circle value. I would rather say the set of circle values is a subset
> >>>>> of the set of ellipse values.
>
> >>>> Disagree, that would heavily limit the system, and likely make it
> >>>> inconsistent. The only sense in which circle value should ellipse value is
> >>>> to keep the internal representation same. Why should I care?
>
> >>> It doesn't limit the system in the way you suggest. You can have many
> >>> different representations of the same (logical) value. For example
> >>> polar versus cartesian representation of a point.
>
> >> OK, but then "same logical value" is merely a class of equivalence of some
> >> "physical values." My point is that the latter are fundamental while the
> >> former are derived.
>
> > Are you suggesting there can be any number of levels of equivalence
> > classes at ever increasing logical levels?
>
> Why not?

A value-type is by definition a set of values plus operators on those values. There can only be one concept of equivalence for a given value-type.
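
A toy example of what I mean (Rational and Equivalent are hypothetical): many physical encodings denote one logical value, and the type supplies exactly one test of equivalence, stated purely at the logical level.

// Representation: a fraction, not necessarily normalised, so the
// same logical value has many encodings: {1, 2}, {2, 4}, {3, 6}, ...
struct Rational
{
    long num;
    long den;   // assumed nonzero in this sketch
};

// The one and only equivalence for the Rational value-type
// (ignoring overflow in this sketch): a/b == c/d iff a*d == c*b.
bool Equivalent(Rational a, Rational b)
{
    return a.num * b.den == b.num * a.den;
}

// Equivalent of {1, 2} and {2, 4} is true: distinct representations,
// one logical value. A different notion of equivalence would demand
// a distinct type.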

> > I think we're coming at this from different directions. You want to
> > start with the physical implementation and ask what it maps to at some
> > logical level - almost like reverse engineering!
>
> Sure, the model should be sufficient to capture any underlying hardware.

I'm a great believer in the principle "less is more". I want tight constraints on the semantics of the code so that it is easy to understand its behaviour, without sacrificing expressiveness or performance. I want to write code at a well-understood logical level, to give the compiler plenty of flexibility for optimisation.

A starting point is to have a very simple and clear understanding of "type".

> > I want to do the
> > reverse which is more like design. My approach is more restrictive -
> > but IMO for good reasons.
>
> > I suggest a value type is by definition a set of (logical) values plus
> > operators on those values. There is no such thing as "physical
> > values". When a programmer defines a concrete type it is necessary
> > to provide at least one physical implementation. I have generally
> > been ignoring that in all our discussions about value-types and sub-
> > typing. A physical representation is generally something to be
> > hidden away, so that users of the type deal with it *only* at the
> > logical level. Users can specify values and use all the operators,
> > including the test for value equivalence. There can only be one
> > definition of equivalence of values of a given type. If another
> > notion of value equivalence is required then a distinct type must be
> > defined.
>
> I don't see how this could work. You will need some set of built-in pure
> "logical" types which representation cannot be described. This is
> unsatisfactory, because there will be no way to validate an implementation
> of your logical hardware against the specifications. There actually will be
> no specifications. Another problem is that I don't want any built-in types,
> or at least as few of them as possible.

Yes, there will be built-in value types, e.g. 32-bit integers. These have well-defined semantics. All the programmer needs to know is that they map very directly and efficiently to the hardware. Little- versus big-endian encoding is irrelevant to the logical-level semantics of the type.

Why do you think built-in value types are a problem? I believe they should be treated like any other type, apart from a user-defined implementation being optional. For example, built-in types should take part in the type hierarchies.
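
For example (just a sketch): the logical semantics of a 32-bit unsigned integer, addition modulo 2^32 and so on, are identical on every machine; byte order only becomes visible if you deliberately step outside the logical interface and inspect the raw representation.

#include <cstdio>
#include <cstring>

int main()
{
    // Logical level: 32-bit unsigned addition is modulo 2^32
    // everywhere, so this prints 2 on any machine.
    unsigned int x = 0xFFFFFFFFu;   // assuming 32-bit unsigned int
    unsigned int y = x + 3u;
    std::printf("%u\n", y);

    // Representation level: endianness is observable only by
    // inspecting the raw bytes behind the logical interface.
    unsigned int v = 1;
    unsigned char bytes[sizeof v];
    std::memcpy(bytes, &v, sizeof v);
    std::printf("%s\n", bytes[0] == 1 ? "little endian" : "big endian");
    return 0;
}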

> >> Note that this is the same as in mathematics where Q
> >> "contains" Z, while members of Q aren't numbers at all, but sets of ordered
> >> pairs
>
> > Z is only uniquely defined up to isomorphism. I think that explains
> > why Q can be defined as sets of ordered pairs and yet contains Z.
>
> So what is wrong with handling values this way? Why is the isomorphism of
> little endian to big endian numbers worse? Why do you want to have some other
> level behind it? My approach looks much simpler.

I'm not sure what you are saying - you need to elaborate.

> >>> I can imagine a language with the same goals for run time efficiency
> >>> as C++, and the following function can be compiled for many different
> >>> implementations of Rectangle, including specialisations like Square or
> >>> UnitSquare.
>
> >>> float GetArea(Rectangle r)
> >>> {
> >>> return r.Width() * r.Height();
> >>> }
>
> >> Yes, but to be able to judge about *different* representations you have to
> >> make them distinct things => different values. First you have shown
> >> equivalence, you can forget about differences. But not before that.
>
> > I disagree. In practice the specialisations come from the code
> > written at the logical level. Throughout most of the code Rectangle,
> > Square and UnitSquare are (logical) types divorced from assumptions
> > about how they are implemented. It is only at the point of definition
> > that we provide one or more possible physical representations.
>
> > I would rather write as much of my code as possible at a logical
> > level, and assume the compiler can select appropriate physical
> > implementations for me. If it can't then I would rather give it as
> > few hints as possible - because then I can make dramatic changes to
> > the physical implementation very quickly in order to fine tune
> > performance, and in such a way that it doesn't obscure the logical
> > meaning of the code.
>
> I don't see any disagreement here. So long as the representations are hidden
> from public view, the types are equivalent. You can develop your code
> disregarding the difference. But the compiler, who knows the differences, may
> optimize your code as necessary. Note that the compiler may actually change
> your representation to an equivalent etc. In order to validate all these
> changes you have to be able to describe them in some language. The point is
> that it has to be the same language.
>
> >>>>>> As for subtyping, to me, it is just a relation between types. I don't
> >>>>>> expect it to deliver substitutability. It cannot. What it can and should,
> >>>>>> is to help me to deal with the problem.
>
> >>>>> Can it? I'm interested to know what you have in mind.
>
> >>>> There are many cases where languages could be safer. For example:
>
> >>>> 1. multiple dispatching should ensure overriding of all cross variants
> >>>> (dispatch may not fail).
>
> >>> I wonder whether dynamic dispatch is only relevant to the LSP notion
> >>> of sub-typing, ie not to value types. Perhaps not, because union
> >>> value-types seem to need it.
>
> >> If you want to have classes of values, these will be polymorphic values,
> >> which shall dispatch. Consider:
>
> >> class Boolean
> >> {
> >> public :
> >> virtual String Image (); // returns "true" or "false"
>
> >> };
>
> >> class Tristate : public Boolean
> >> {
> >> public :
> >> virtual String Image (); // returns "true", "false", "unknown"
>
> >> };
>
> > I'm afraid this gives me more questions than answers! I don't know
> > what a Boolean actually is. Can it have 3 states?
>
> No, Boolean is defined over {0, 1}. Tristate is derived from it and is
> defined over {0, 1, _|_}.

I don't understand the semantics of sub-typing in this case. Can a Tristate value be implicitly converted to a Boolean value? If so, what happens when it's in the unknown state?
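
To make the question concrete, a hypothetical sketch (ToBoolean and its treatment of unknown are my inventions, not anything proposed above): an implicit Tristate-to-Boolean conversion has to do something with the unknown state, and every option either discards information or fails at run time.

#include <stdexcept>

enum Tristate { False, True, Unknown };

// A hypothetical narrowing conversion. What should it do with
// Unknown? Map it arbitrarily to false? Fail? Neither choice is
// information preserving, which is why making the conversion
// implicit seems hard to justify.
bool ToBoolean(Tristate t)
{
    switch (t)
    {
    case False: return false;
    case True:  return true;
    default:    throw std::domain_error("unknown has no Boolean value");
    }
}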
