Re: Questions on possreps
Date: Tue, 31 May 2011 00:26:32 -0700 (PDT)
On May 30, 5:09 pm, Erwin <e.sm..._at_myonline.be> wrote:
> On 30 mei, 05:04, David BL <davi..._at_iinet.net.au> wrote:
> > On May 29, 8:12 am, Erwin <e.sm..._at_myonline.be> wrote:
> > > Haven't gone through the other responses first, perhaps I should have,
> > > but anyway here goes (see inlined answers) :
> > > On 26 mei, 06:04, David BL <davi..._at_iinet.net.au> wrote:
> > > > 3) Is it correct that a POSSREP typically doesn't unambiguously define
> > > > a method for encoding values of type T on some physical medium?
> > > The POSSREPs as defined in the TTM literature have nothing to do with
> > > physical encoding.
> > > > 4) Consider a POINT value is encoded in computer memory using a '\0'
> > > > terminated ASCII string such as "(1.34,-6.2)", and must be parsed
> > > > according to some well defined grammar in order to retrieve the x,y
> > > > coordinates. Does this qualify as a CARTESIAN representation?
> > > It could qualify as a valid POSSREP for the POINT type, but it is a
> > > different one from the CARTESIAN possrep typically used in
> > > Date&Darwen's relational writings.
> > What do you mean it could qualify as a valid POSSREP for POINT when
> > you just stated POSSREPS have nothing to do with physical encoding?
> Sorry. Missed the "suppose ... is encoded in memory as a nul-
> terminated string".
> I meant to say that there is nothing in the concept you describe that
> would make it an "invalid" (logical) possrep. In fact, ALL types can
> "conceptually" be regarded as having a "STRING" possrep, which
> produces the "externalization" of any value of the type. More or less
> equivalent to Java's toString() method that exists for all objects.
I'm not sure what you are saying there. I'll try to describe how I'm thinking about it in more detail:
Types, and hence values and operators, are pure mathematical abstractions. Nested operator invocations provide an entirely sufficient means of (logically) representing values. I find it superfluous to introduce POSSREPs for this purpose (as though operators were inadequate!), along with all the other confusing and redundant vernacular: "atomic", "encapsulated", "scalar", "structure", "selector", "dummy type", etc. In fact I'm now thinking of POSSREPs as merely a peculiar syntactic sugar for declaring operators.
Operators, being pure abstractions, must be distinguished from any implementation as executable routines on computers. In fact operators are the basis for data representation, not just the basis for defining calculations to be executed. It seems the data-representation perspective was missed when the redundant idea of POSSREPs was introduced, and yet, strangely, it was made apparent when each POSSREP implicitly declared a kind of operator called a selector.
Let STRING be a type. POINT and STRING are distinct types. I see no need to complicate things by trying to formalise some notion of a "possible representation" of POINT involving STRING. All that's required are /explicit/ unary operators that map POINT to STRING and vice versa.
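To make the "explicit unary operators" idea concrete, here is a minimal Python sketch. The names Point, cartesian, tostring and topoint are my own illustrative choices and are not Tutorial D syntax; the grammar accepted by topoint is a deliberately simple assumption (plain decimal numbers only, no exponent notation).

```python
from dataclasses import dataclass
import re


@dataclass(frozen=True)
class Point:
    """An immutable POINT value; the fields are an implementation detail."""
    x: float
    y: float


def cartesian(x: float, y: float) -> Point:
    """Binary operator (a 'selector' in TTM terms) that denotes a POINT value."""
    return Point(float(x), float(y))


# Assumed grammar: "(" number "," number ")" with plain decimal numbers.
_NUMBER = r"[-+]?\d+(?:\.\d+)?"
_POINT_RE = re.compile(rf"\(\s*({_NUMBER})\s*,\s*({_NUMBER})\s*\)")


def tostring(p: Point) -> str:
    """Unary operator POINT -> STRING."""
    return f"({p.x!r},{p.y!r})"


def topoint(s: str) -> Point:
    """Unary operator STRING -> POINT; rejects strings outside the grammar."""
    m = _POINT_RE.fullmatch(s)
    if m is None:
        raise ValueError(f"not a POINT literal: {s!r}")
    return cartesian(float(m.group(1)), float(m.group(2)))


p = cartesian(1.34, -6.2)
s = tostring(p)       # a STRING value, distinct from the POINT value p
q = topoint(s)        # maps back to an equal POINT value
```

POINT and STRING stay distinct types here; the only link between them is the pair of ordinary operators, with no separate notion of a "possible representation" required.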
Even though my example involved a string representation I consider it to be a physical representation of the value
tostring(cartesian(1.34,-6.2)) = "(1.34,-6.2)"
since I expressly stated that a POINT value was encoded.
The way in which a region of memory is interpreted as a value depends on some context. Given that physical representations of one type are typically implemented in terms of physical representations of other types, it is not generally possible to assume there is some unique context for interpreting a region of memory as a value.
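As a rough illustration of that context dependence (a sketch only; the eight bytes are simply a prefix of my example string, and the choice of interpretations is arbitrary), the same region of memory can be read as several unrelated values in Python:

```python
import struct

# One fixed region of memory: eight octets.
raw = b"(1.34,-6"

# Context 1: the octets are ASCII character codes.
as_text = raw.decode("ascii")

# Context 2: the octets are an unsigned 64-bit integer, little-endian.
as_u64_le = struct.unpack("<Q", raw)[0]

# Context 3: the same integer type, but big-endian byte order.
as_u64_be = struct.unpack(">Q", raw)[0]

# Context 4: the octets are an IEEE 754 double, little-endian.
as_double = struct.unpack("<d", raw)[0]

print(as_text, as_u64_le, as_u64_be, as_double)
```

Nothing in the bytes themselves says which reading is intended; that information has to come from outside the region.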
> However, in its role as a "physical possrep", there is more to it than
> what you say. It is not sufficient to say "encoded as a nul-
> terminated string" to determine the actual bit pattern for a value.
> Because the encoding of the string itself is also relevant, and a
> possible source of differences. UTF-8 gives different bit patterns
> compared to UTF-16. Byte ordering conventions produce different bit
> patterns for any kind of integer numbers (except the single-byte ones
> of course). etc. etc. Switching the order of the X and Y components
> in an encoding that is based on the (two-component) CARTESIAN possrep,
> will produce different bit patterns too.
If I had said a Unicode string then I would agree that the encoding needs to be specified as well. However, I specified ASCII, a character encoding that defines a correspondence between 7-bit patterns and character symbols. RFC 20 suggests embedding each code in an octet with the high-order bit always 0, so I think it can be assumed there is no ambiguity on machines where the octet is the native addressable unit.
ASCII was incorporated into the Unicode character set as the first 128 symbols, so the ASCII characters have the same numeric codes in both sets. This allows UTF-8 to be backward compatible with ASCII. However, UTF-8 is a variable-length encoding of Unicode (one to four bytes per character) and is therefore distinct from ASCII, even though the two coincide on those first 128 characters.
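For what it's worth, the effect of the encoding choice on the bit pattern is easy to check. The following Python sketch encodes my example string under several encodings; the explicit terminator bytes are appended by hand, since Python strings are not nul-terminated.

```python
text = "(1.34,-6.2)"

# One octet per character, high-order bit always 0, plus a '\0' terminator.
ascii_bytes = text.encode("ascii") + b"\x00"

# For text restricted to the first 128 code points, UTF-8 yields the same octets.
utf8_bytes = text.encode("utf-8") + b"\x00"

# UTF-16 uses 16-bit code units, so byte order becomes part of the encoding.
utf16_le = text.encode("utf-16-le") + b"\x00\x00"
utf16_be = text.encode("utf-16-be") + b"\x00\x00"

print(ascii_bytes.hex(" "))
print(utf16_le.hex(" "))
print(utf16_be.hex(" "))

# Outside the ASCII range, UTF-8 needs more than one octet per character.
non_ascii = "\u00e9".encode("utf-8")   # U+00E9, e with acute accent
```

So the ASCII and UTF-8 patterns coincide for this particular string, while the two UTF-16 variants differ both from ASCII and from each other.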
> That was my point where I said "targeted to the machine" : specifying
> a physical possrep requires the specification of _every single detail_
> that plays a role in determining the ultimate bit pattern. So for
> this reason, your CARTESIAN possrep does not qualify as a valid
> physical possrep, because it is incomplete.
I agree that a physical representation must be well defined.
> > > > 5) Is it correct to say that in Tutorial D types are sometimes defined
> > > > in terms of possreps?
> > > Yes. In fact this holds for all of them, except then the "basic" ones
> > > such as INTEGER, BOOLEAN, ...
> > Would you say union types are defined in terms of POSSREPS?
> Oops. You're probably right. I wasn't thinking of those. UNION
> types typically will not even have a POSSREP of their own (although
> it is always possible to conceive a toString()-like POSSREP which
> includes both "actual" typename and externalization-of-value). But
> beware that in the Manifesto, declaring a UNION type is _not_ merely a
> matter of saying :
> TYPE PLANE_FIGURE IS ELLIPSE UNION POLYGON; /* where ELLIPSE and
> POLYGON were formerly root types */
> Strictly speaking, the Manifesto requires that "simultaneously", the
> declaration of the existing root types is changed to
> TYPE ELLIPSE IS PLANE_FIGURE ... , TYPE POLYGON IS
> PLANE_FIGURE ... ; /* note the comma, as opposed to a semicolon */
It seems awkward to need to specify one subtype relationship in two places.
Received on Tue May 31 2011 - 02:26:32 CDT