Re: Questions on possreps

From: Erwin <e.smout_at_myonline.be>
Date: Sat, 28 May 2011 17:12:47 -0700 (PDT)
Message-ID: <d52cbcb5-b2e8-4576-9ebd-b4f027a3f265_at_16g2000yqy.googlegroups.com>


Haven't gone through the other responses first, perhaps I should have, but anyway here goes (see inlined answers) :

On 26 mei, 06:04, David BL <davi..._at_iinet.net.au> wrote:
> On page 116 of An Introduction to Database Systems (8th edition) Chris
> Date provides the following example:
>
>     TYPE POINT
>         POSSREP CARTESIAN { X RATIONAL, Y RATIONAL }
>         POSSREP POLAR { R RATIONAL, THETA RATIONAL };
>
> A footnote on that page states "Tutorial D uses the more accurate
> RATIONAL over the more familiar REAL".
>
> I find this quite confusing, hence the following questions:
>
> 1) In the opening paragraphs of section 5.3 which is where POSSREPs
> are described, the word 'representation' is often qualified by
> 'physical'. Is it correct to assume that a POSSREP is indeed a
> physical representation (and has nothing to do with declaring "logical
> representations" which is a term I've seen mentioned a few times on
> this news group)?

No. In fact when speaking very strictly, they never are. Physical representations are sequences of bits, directed to the machine. Logical representations are representations in terms of values (of other types, except in the case of "basic" types such as INTEGER, ...). Those values are intended for the human, and are, at some point, always "tokenized" into text strings, because that's the only way humans can ever perceive values spawned by the machine.

For example, the temperature value zero degrees centigrade might be logically defined in terms of the number 273.15 (Kelvin), or it might be logically defined in terms of the number 32 (Fahrenheit), and showing such a value to the user will require these numbers to be "externalized"/"tokenized" into the text strings "273.15", "32", ...

Of course in many cases, one of the logical possreps will be the same as the one on which the physical representation is based, e.g. the physical representation for the temperature value "zero degrees centigrade" might be based on the Fahrenheit temperature scale, and thus the encoding of the number 32 will determine the bit pattern for this temperature value, but strictly speaking, even then the "physical" and the "logical" representations are still distinct.
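A minimal Python sketch of that distinction (Python only stands in for a D here; the class `Temperature`, its selectors and the `the_*` accessors are hypothetical names, not anything from Date's text). The private field plays the role of the physical representation, assumed here to be Fahrenheit-based, while the two selector/accessor pairs play the role of the logical possreps:

```python
from fractions import Fraction

ZERO_C_IN_K = Fraction(27315, 100)  # 273.15, kept exact on purpose


class Temperature:
    """Hypothetical type with two logical possreps: KELVIN and FAHRENHEIT."""

    def __init__(self, stored_fahrenheit):
        # Stand-in for the physical representation; users never touch it.
        self._f = Fraction(stored_fahrenheit)

    # Selectors, one per possrep.
    @classmethod
    def fahrenheit(cls, f):
        return cls(Fraction(f))

    @classmethod
    def kelvin(cls, k):
        return cls((Fraction(k) - ZERO_C_IN_K) * Fraction(9, 5) + 32)

    # THE_-style accessors, one per possrep component.
    def the_fahrenheit(self):
        return self._f

    def the_kelvin(self):
        return (self._f - 32) * Fraction(5, 9) + ZERO_C_IN_K

    def __eq__(self, other):
        return isinstance(other, Temperature) and self._f == other._f

    def __hash__(self):
        return hash(self._f)


freezing = Temperature.fahrenheit(32)
# "Externalizing" the value for a human means turning a component into text.
print(float(freezing.the_kelvin()), float(freezing.the_fahrenheit()))
```

Switching the private field to Kelvin would change nothing a user of the class can observe, which is the sense in which the logical and the physical stay distinct even when they coincide.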

> 2) Is it reasonable for me to assume a physical representation of type
> T means some unambiguously defined method for encoding values of type
> T on some physical medium?

Yes.

> 3) Is it correct that a POSSREP typically doesn't unambiguously define
> a method for encoding values of type T on some physical medium?

The POSSREPs as defined in the TTM literature have nothing to do with physical encoding.

> 4) Consider a POINT value is encoded in computer memory using a '\0'
> terminated ASCII string such as "(1.34,-6.2)", and must be parsed
> according to some well defined grammar in order to retrieve the x,y
> coordinates. Does this qualify as a CARTESIAN representation?

It could qualify as a valid POSSREP for the POINT type, but it is a different one from the CARTESIAN possrep typically used in Date&Darwen's relational writings.

> 5) Is it correct to say that in Tutorial D types are sometimes defined
> in terms of possreps?

Yes. In fact this holds for all of them, except for the "basic" ones such as INTEGER, BOOLEAN, ...

> 6) Date states that types are by definition abstract (i.e. type means
> Abstract Data Type).  Can I assume by abstract he is referring to pure
> mathematical abstractions which are completely divorced from real
> world objects? If that is the case, is it not strange to define the
> abstract (like POINT) in terms of possible physical representations
> (like CARTESIAN or POLAR)?

See my previous remark. POSSREPs have nothing to do with physical representations. So if he requires types to be defined in terms of POSSREPs with components of other types, he is not relying on physical representations.

> 7) Is it true that the DDL doesn't separate out the abstractions from
> the physical representations, and therefore it is not possible to read
> the abstractions without being exposed to physical representation
> details?  Is it true that software developers cannot extend or modify
> the physical representations without updating the definitions of the
> abstractions?

D is targeted at defining/describing the abstractions, and nothing but that. That approach does "separate out" the abstractions from the physical, because the physical is left entirely unaddressed, and free for the implementer to choose. In "Databases, Types and the Relational Model", they further comment (something of the ilk) that "of course there will have to be some kind of language that defines the physical stuff, but we do not deal with that".

> 8) In that same footnote mentioned above it is stated RATIONAL might
> be a built-in type with more than one declared possible
> representation.  Can I assume RATIONAL is necessarily a type, and not
> a possrep?  Are possreps allowed to reference other possreps?  I would
> actually have thought it only makes sense to define a possrep in terms
> of other possreps.  E.g. in an implementation language like C we may
> write
>
>     struct Cartesian { float x,y; };
>
> In this case, on a given platform, a particular representation of a
> geometrical point is unambiguously defined in terms of particular
> representations of numbers.  If RATIONAL is a type what does it mean
> to define a physical representation CARTESIAN in terms of a pure
> abstraction RATIONAL?  Does this mean Date thinks of POSSREPS as
> abstractions?

A type definition must have at least one POSSREP definition. A POSSREP definition is a set of COMPONENTNAME/TYPENAME declarations, plus an optional CONSTRAINT declaration. Hence a POSSREP definition depends on other types, and on any constructs that are needed to build valid CONSTRAINT expressions (essentially, these constructs are the language's operators).

Of course, there is "intertwining" between type declarations and the language's operators, because any type declaration automatically brings a bunch of operators into existence, but strictly speaking, a POSSREP definition does not depend on other POSSREP definitions, at least not "directly". Of course it might depend on, e.g., the THE_ operators that are introduced into the language as a consequence of the declaration of some other POSSREP, but that is not really a "direct reference" to that other POSSREP.
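As a rough illustration in Python (not Tutorial D; the class, the method names and the `R >= 0` constraint are assumptions made up for this sketch), each possrep below is expressed purely in terms of component types and ordinary operators, and the `the_*` accessors exist only because the components were declared:

```python
import math


class Point:
    """Hypothetical POINT type with CARTESIAN and POLAR possreps."""

    def __init__(self, x, y):
        # Hidden storage; which possrep it resembles is an implementation choice.
        self._x, self._y = float(x), float(y)

    @classmethod
    def cartesian(cls, x, y):
        # CARTESIAN possrep: two components, no constraint.
        return cls(x, y)

    @classmethod
    def polar(cls, r, theta):
        # POLAR possrep: two components plus an assumed constraint R >= 0.
        if r < 0:
            raise ValueError("POLAR constraint violated: R must be >= 0")
        return cls(r * math.cos(theta), r * math.sin(theta))

    # One accessor per declared component of each possrep.
    def the_x(self):
        return self._x

    def the_y(self):
        return self._y

    def the_r(self):
        return math.hypot(self._x, self._y)

    def the_theta(self):
        return math.atan2(self._y, self._x)


p = Point.cartesian(3, 4)
print(p.the_x(), p.the_y(), p.the_r())
```

Neither selector mentions the other possrep; `polar` depends only on the component type and on operators such as `cos` and `sin`.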

> 9) Is it correct to assume that in this usage of RATIONAL it is not
> meant to be associated with a TYPE/POSSREP which has the purpose of
> representing rationals exactly using a pair of "big integers" as long
> as memory is not exhausted, since such a type would not normally
> support the operators such as SQRT, SIN, COS and ARCTAN which appear
> on pages 117,118 and typically return irrationals?

Imo, there is great confusion over this subject, even in the TTM discussion list. RATIONAL is often spoken of in a way that suggests it is indeed a numerator/denominator pair, but your remark about SQRT is valid. Date has a tendency to disregard "imprecise" stuff such as the typical mantissa+exponent types, and the operators associated with them.
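The tension can be shown with Python's `fractions.Fraction`, used here only as an assumed stand-in for a numerator/denominator RATIONAL (nothing in Date's text says RATIONAL is implemented this way). Arithmetic stays exact, but SQRT cannot return a value of the same type:

```python
from fractions import Fraction
import math

third = Fraction(1, 3)
sixth = Fraction(1, 6)

exact_sum = third + sixth        # stays within the rationals, no rounding
root = math.sqrt(Fraction(2))    # leaves them: the result is a float

print(exact_sum, type(exact_sum).__name__)
print(root, type(root).__name__)
```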

> 10) Is Date probably thinking that a floating point representation is
> in practise used for RATIONAL?
> 11) Does the footnote imply Date assumes that floating point numbers
> should be regarded as representing certain rationals exactly and only
> thinks of the operators as approximate?

My personal opinion : it's hard to tell what Date "thinks" on this subject. I've never seen anything written that made that clear to me.

> 12) Am I correct in thinking that Date should instead think of floats
> as an approximate representation of the reals?  My reasons for
> thinking this are:
>
>   a) since only certain rationals can be represented exactly, we are
> forced
>      to represent most numbers only approximately.
>
>   b) given that the operations are typically approximate, the results
> of
>      calculation are typically approximate.  Therefore the floats
>      themselves are an approximation.
>
>   c) the abstract operators such as exp,log,sqrt,sin,cos,tan are
> unary
>      operators on the reals, so it makes little sense to implement
> them for
>      floating point numbers and yet regard floats as approximations
> to
>      rationals.

There may be prescriptions in the Manifesto that don't fully line up with the properties of the typical mantissa+exponent types and the operators associated with them. Perhaps (BIG perhaps !) that is the reason for his tendency to disregard such types.
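For what it is worth, the reading from question 11 (floats denote certain rationals exactly, and only the operators are approximate) can be made concrete. The Python sketch below assumes IEEE 754 double-precision floats and takes no position on which reading Date intends:

```python
from fractions import Fraction

# Each float denotes one rational exactly, but usually not the one typed in.
tenth = Fraction(0.1)
print(tenth == Fraction(1, 10))   # the float 0.1 is not the rational 1/10
print(tenth.denominator)          # a power of two

# The operator is where the approximation enters: the exact sum of the
# two rationals differs from the rational denoted by the float result.
exact_sum = Fraction(0.1) + Fraction(0.2)
rounded_sum = Fraction(0.1 + 0.2)
print(exact_sum == rounded_sum)
```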

> 13) Is it appropriate for assumptions about types employed in POSSREP
> declarations to be visible in the signatures of the operators at the
> logical level of discourse?  For example, that the return type of
> THE_X is RATIONAL:
>
>     OPERATOR THE_X ( P POINT ) RETURNS RATIONAL;

Don't see a problem. This operator coming into existence is a direct consequence of the fact that POINT is declared to have an X component of type RATIONAL in one of its declared possreps. So it's a direct consequence of the type definition, which is itself also at the logical level of discourse, no ?

> 14) Is there a requirement that each possrep be complete (meaning it
> can represent all possible values of the given type)?  I note that on
> page 606 it states "Part of the definition of any given type is a
> specification of the set of all legal values of that type".  Is it
> assumed this specification normally comes implicitly from one or more
> possreps, and is this a reason that on page 115 it is stated "... we
> do require that values of type T have at least one possible
> representation"?

From the manifesto :

"For all values v [of a type] and for all i [denoting the first, second, third, ... component of some possrep], it shall be possible to "retrieve" the value of Ci for v."

So, for all temperature values, it should be possible to retrieve "THE_KELVIN" and "THE_FAHRENHEIT" and ...

However, it is nowhere stated that an invocation of such an operator must also _succeed_ for all v !!! The manifesto is notoriously silent on the subject of runtime exceptions.

Unless you interpret such a runtime exception as a "failure to retrieve the Ci value", and thus as a violation of the prescription. Under this interpretation, types must indeed be "complete" in the sense you suggest.
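A hypothetical Python sketch of the two readings (the `Length` type and its WHOLE_CM possrep are invented for illustration and appear nowhere in the Manifesto). The accessor exists for every value, yet invoking it does not succeed for every value:

```python
from fractions import Fraction


class Length:
    """Hypothetical type with possreps METRES {M RATIONAL} and WHOLE_CM {CM INTEGER}."""

    def __init__(self, metres):
        self._m = Fraction(metres)

    @classmethod
    def metres(cls, m):
        return cls(m)

    @classmethod
    def whole_cm(cls, cm):
        return cls(Fraction(cm, 100))

    def the_m(self):
        return self._m

    def the_cm(self):
        cm = self._m * 100
        if cm.denominator != 1:
            # Runtime exception: this value has no WHOLE_CM representation.
            raise ValueError("value has no WHOLE_CM representation")
        return cm.numerator


print(Length.whole_cm(250).the_m())
try:
    Length.metres(Fraction(1, 3)).the_cm()
except ValueError as exc:
    print("the_cm failed:", exc)
```

If the exception counts as a failure to retrieve the component, WHOLE_CM is not a legitimate possrep for this type; if it does not, the possrep is incomplete but allowed.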

> 15) Does Date assume that the set of legal values of type POINT is
> well defined?  What does that mean if floats are used, thought of as
> representing certain rationals exactly but exact conversions between
> the two representations don't generally exist?

His tendency to duck the issue of "imprecise" types and operators, you know ...
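The difficulty in the question is easy to exhibit. For the point with Cartesian components (1, 1), the polar R component would have to be the square root of 2, which is not a rational, so no float (and no numerator/denominator pair) can hold it exactly. A small Python check, assuming ordinary double-precision floats:

```python
from fractions import Fraction
import math

r = math.hypot(1.0, 1.0)    # float approximation of the R component
r_exact = Fraction(r)       # the rational that this float denotes exactly

# Squaring that rational exactly never gives 2, whatever float r is.
print(r_exact ** 2 == 2)
print(float(r_exact ** 2 - 2))   # size of the discrepancy
```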

> 16) Does Date outlaw types which are uncountably infinite (because
> there is no representation with a finite encoding for every value)?

I wouldn't say he "outlaws" them. He merely observes that, by definition, such types cannot be handled by computer systems, and since he cannot escape from that law, he bends with it. Just like we are forced to observe that we cannot escape gravity (we can escape from the earth, but we cannot escape from gravity).

> 17) Why are only some operators called "selectors"?  I would have
> thought this distinction only arises from physical implementation
> concerns.  Is it not true that at an abstract level any operator that
> returns values of type T can be regarded as a means of selecting
> values of type T?

You're right. The term was probably introduced to be able to speak of "operators" in general, instead of having to continuously say "operators and/or literals". Just like OO texts might want to talk of "methods" in general, rather than "methods and/or constructors".

Note that literals and selectors are not the same thing. All literals are selectors, but not all value selectors are literals. Literals are value selectors with no arguments, while there might be other value selectors that do take arguments. But the term "value selector" is restricted to those cases where the arguments match the POSSREP declaration:

VAR X RATIONAL;
VAR Y RATIONAL;
VAR P POINT INIT (CARTESIAN(X,Y)); /* this is a value selector but not a literal */
P := SHIFTHORZ(P, 2); /* although SHIFTHORZ returns a POINT value, and must thus itself _contain_ a POINT value selector, SHIFTHORZ itself is not a "value selector" */
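The same distinction, loosely transcribed into Python (the names `Point`, `cartesian` and `shift_horz` are hypothetical, and `Fraction` is only an assumed stand-in for RATIONAL):

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Point:
    x: Fraction
    y: Fraction


def cartesian(x, y):
    """Value selector: its parameters mirror the CARTESIAN possrep components."""
    return Point(Fraction(x), Fraction(y))


def shift_horz(p, dx):
    """Ordinary operator: returns a Point, but its parameters are not possrep components."""
    return cartesian(p.x + dx, p.y)   # the selector is invoked inside


x, y = Fraction(1, 2), Fraction(3)
p = cartesian(x, y)      # selector invocation with variable arguments
p = shift_horz(p, 2)     # yields a Point without being a selector
print(p)
```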

> 18) On page 606 it is stated "The physical representation of such
> values is always hidden from the user" and in the same paragraph "each
> type has at least one possible representation that is explicitly
> exposed to the user".  How does one reconcile these apparently
> contradictory statements?  What is Date saying exactly?

It's not contradictory because all POSSREPs are logical, even if the physical representation has a "remarkable resemblance" to one of them.

> 19) Can the definition of nonscalar be formalised?  What is meant
> precisely by "user visible, directly accessible components"?  Why
> isn't a POINT a nonscalar given that user access is provided to
> X,Y,R,THETA through operators such as THE_X?   On the topic of
> physical representations Date states "... they can certainly have
> components - but ... any such components will be hidden from the
> user".  What does he mean by hidden?  Since tuple and relation types
> have physical representations why doesn't Date's latter statement
> apply to them as well?

Date is on record saying that "the distinction [between scalar and nonscalar] is probably not so important". And Darwen is on record saying that "All types are first-class citizens.".

Received on Sun May 29 2011 - 02:12:47 CEST
