Re: On specialization constraints time of application

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sun, 14 Jun 2009 02:31:13 -0400
Message-ID: <SO0Zl.25789$as4.24386_at_nlpi069.nbdc.sbc.com>


"Cimode" <cimode_at_hotmail.com> wrote in message news:e8559d73-1998-4caf-8192-072a07cb9054_at_i28g2000prd.googlegroups.com...
> On 13 juin, 19:26, Tegiri Nenashi <TegiriNena..._at_gmail.com> wrote:
>> On Jun 10, 6:07 am, Cimode <cim..._at_hotmail.com> wrote:
>>
>> > > Since a TRDBMS should allow constraint specialization in a
>> > > declarative way...
>>
>> IMO, TRDBMS is not a good term. First, why relational model applied to
>> database management field only? Somewhere on databasecolumn.com I
>> remember reading that Dr.Codd's initial intention was general purpose
>> programming language; narrowing the scope to DBMS allowed to avoid
>> competition with traditional PL.
> Agreed.
>
> By TRDBMS, I meant a system that allows an inner logical
> representation that allows to operate soundly relations (thanks to a
> relation declarative operating language) and that truly implements
> independence between the physical and logical layers as defined
> according to RM. TRDBMS is here to be understood as a hypothetic
> objective for what a relational system (be it inside or outside the
> scope of databases) should do.
>
>> Also I consider the other TRDBMS features such as elimination of NULLs
>> and duplicates as trivial. If there only for these one could challenge
>> TRDBMS added value, which brings us to the TRDBMS crux feature --
>> declarative constraints.
> I wish I had your confidence. Since decomposition is the only correct
> way to deal with missing information, I can tell you that buidling a
> non direct image system that automatically decomposes relations and
> implements Darwen approach to missing information is not what I would
> call a trivial task. It took me one year to get this one to work.

Decomposition is only one way to deal with missing information, and all it does is move the problem. Under the Closed World Interpretation, the absence of a tuple in a relation does not indicate that information is missing, but instead denies it. Decomposition as a mechanism for dealing with missing information, therefore, is in my opinion incompatible with the Closed World Interpretation. Darwen's is an inconsistent approach because some relations represent what is true, and some represent what is known to be true. Those that represent what is known to be true fall under the Open World Interpretation. That's one problem.

Darwen's approach is directed at the elimination of nulls, but there are in fact two basic sorts of nulls: nulls that indicate that there should be a value, but it hasn't yet been supplied, and nulls that indicate that a particular attribute is not applicable to what is represented by the tuple. Darwen's approach is in accord with the Closed World Interpretation when it is applied only to nulls of the 'inapplicable' sort. Decomposition is therefore a good, and in my opinion, the correct mechanism for dealing with inapplicable nulls. But of course, that leaves nulls of the 'applicable' sort.

It is my opinion that nulls are not inherently flawed, but that it is the misinterpretation, and the consequent misuse of nulls that are. If each instance of null is /always/ an indication that information should have been supplied, but hasn't--that is, if there is always only the one accepted interpretation of the symbol, null, then each null can be replaced by an expression of the form,

'Exactly one d such that d is an element of D'

where D is the domain of the constant that should have been supplied.

In this way, each tuple that can be in a 'relation'--even if it is incomplete--maps to a formula in a first order language that can be assigned a truth value under an interpretation. I'm quoting 'relation' here because in a relation, every tuple has exactly the same number of components as the schema has attributes, but if incomplete tuples are permitted, which is the case when nulls are allowed, then some of the tuples can have fewer components than the schema has attributes. It should be noted, however, that an instance of the schema /is/ a relation just in case all values have been supplied, which isn't true when inapplicable nulls are permitted.

This interpretation of null is in accord with the Closed World Interpretation because each tuple that appears in a 'relation' represents something that is true, not just what is known to be true. The distinction is subtle, and some might call it nitpicking, but I think it bears directly on how each tuple maps to something in the Universe. I think it is counterintuitive for a tuple in a base relation to represent the fact that something is not known, which is one of the consequences of Darwen's approach.

>
>> What are constraints? They are equations and inequalities in
>> relational algebra, or some propositions in relational calculus.
> Both have advantages that unfortunately mutually exclusive. RA has
> stronger expressive power while RC allows better abstraction. I am
> trying to get the best of both by developping a language whose
> specifications I began to some monthes ago.
>
>
>> Comparing the two, textbooks always point to the fact that RA has more
>> procedural feel to it, compared to RC. Indeed, query has to specify RA
>> operations is order, much like commands in traditional PL. This idea
>> always bothered me, and only recently I found a satisfactory answer to
>> it.
> Yes. See above.
>
>> A good illustration for declarative specification is an often cited
>> SQRT example. Consider solving quadratic equation:
>>
>> x*x = 9
>>
>> Programmatic system where a user asks the system a question about the
>> roots of x*x = 9 is declarative. Contrast this to procedural solution
>> -- a function that returns a square root of number. It has to
>> implement some algorithm, and algorithms are inherently procedural.
>> Let's write what the function is doing as a formula
>>
>> x = SQRT(9)
>>
>> Now, compare both identities. The first one is constraint, the second
>> one is query! So if a user is allowed to write a query as a
>> constraint, that is using algebraic expressions with variables on both
>> sides of equal sign, then we immediately have declarative query.
> It is interesting to compare viewpoints but you are using an argument
> of traditional algebra to solve relational algebra.
>
> To clarify...Since I assume that a relational system does operates
> only through relational algebra and I consider x as a relation x*x = 9
> is not equivalent to x = sqrt(9) as it would dbe the case in
> traditional algebra. I consider "*" as a cartesian product between
> two unknown relations named x.
>
> x in equation would be solved as x = 9
> while x = 3 in the second equation.
>
>
>> So suppose we allow relation variables in a query and elevate queries
>> from being written as RA expressions (which are simple identities
>> "x=queryExpr") to RA equations. This is already done in one case!
>> Transitive closure written as Common Table Expression is an equation
>> that has a variable on both sides of the identity. Can we benefit from
>> writing some other (nonrecursive) queries as equations too? My
>> proposed term for those "genuine declarative queries".
> Once they get adequately represented as a set of constraints
> associated with a header set, nothing in fact precludes the assignment
> of relations to other relations in a system, since the assignement of
> the constraints and header becomes the assignment of the relation. If
> I write down R1 = (R2 * R3), I only assign the set of constraints of
> R2 * R3 at compile time to ato a specific header set called R1.
>
>> Using algebra for constraint specialization and genuine declarative
>> queries, however, has a major obstacle. One can accomplish very little
>> if the algebra is (utilizing Marshall's term) misshapen.
> Agreed. I am working on a different layer than RM. My effort is at
> building a computing system (and subsequent engine) while keeping some
> consistence with previous work. Much more of work remains to be done
> on that layer than on the layer of RM itself.
Received on Sun Jun 14 2009 - 08:31:13 CEST

Original text of this message