Re: Declarative constraints in practical terms

From: dawn <dawnwolthuis_at_gmail.com>
Date: 23 Feb 2006 21:09:21 -0800
Message-ID: <1140757761.408859.142050_at_t39g2000cwt.googlegroups.com>


ralphbecket_at_gmail.com wrote:

> dawn wrote:
> > ralphbecket_at_gmail.com wrote:
> > > CORRECTNESS
> > > In (a) the constraints can be expressed declaratively
> > > and directly (i.e., ideally with nothing more than a syntactic
> > > reformulation of the specification of the constraints) to the
> > > DBMS, whereas in (b) the constraint specification must be
> > > converted into some (probably non-declarative) programming
> > > language.
> >
> > This is something I don't yet "get."
>
> Specifications should be (and usually are) given in a declarative
> language (e.g., first order logic) which can only state *what* the
> invariant is, not *how* it should be computed or maintained.

I understand that people state this as if it were obvious that a) we must specify what not how and b) we must constrain ourselves to 1st order predicate logic and c) having a mathematically simpler sublanguage to work with makes for better quality code. I do see something elegant about being able to call a proposition p and a constraint q and then ask p ^ q? But we can do that in a general purpose language too, right? We don't have to use the full expressiveness of the language (if there is some reason not to) nor does using a functional language (or other) make it harder to code the average constraint, I would think.
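
To make the "p ^ q" remark concrete, here is a minimal sketch, in Python, of stating constraints as side-effect-free predicates in a general-purpose language and conjoining them. The row layout and the names keys_unique, salary_non_negative, all_of and admits are hypothetical, invented for this illustration; they do not come from any DBMS or library.

```python
# Each constraint is a pure predicate over the whole state (a list of rows).
# It says *what* must hold and never mutates anything.

def keys_unique(rows):
    """No two rows share the same key."""
    keys = [row["key"] for row in rows]
    return len(keys) == len(set(keys))


def salary_non_negative(rows):
    """Every salary is zero or greater."""
    return all(row["salary"] >= 0 for row in rows)


def all_of(*constraints):
    """Conjunction of constraints: holds only if every one of them holds."""
    def combined(rows):
        return all(check(rows) for check in constraints)
    return combined


def admits(rows, new_row, constraint):
    """Would the constraint still hold if new_row were asserted as well?"""
    return constraint(rows + [new_row])


# q is the combined constraint; each candidate row plays the role of p.
q = all_of(keys_unique, salary_non_negative)
state = [{"key": 1, "salary": 100}]
accepted = admits(state, {"key": 2, "salary": 50}, q)
rejected = admits(state, {"key": 1, "salary": 50}, q)
```

This only shows that such constraints can be stated without loops over mutable state. It does not address who enforces them when several writers update the data at once.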

I know that this is generally accepted as if it were obvious, so at some point the light bulb might go on for me.

> If the specification is implemented in an adequately expressive
> declarative implementation language (such as that provided by
> the DBMS), at most all one needs to do is make slight changes to
> the syntax in the spec. to convert the specification to the
> implementation language.

Similarly in whatever language, I would think.

> (How the implementation should work
> is left up to the DBMS.) Correctness for this and all future
> constraints is ensured by *once* proving the correspondence
> between the specification language and the implementation
> language. This is usually trivial.
>
> Virtually all industrially used languages however (such as Java)
> are imperative. That means they are based almost entirely on the
> notion of specifying *how* a process should be carried out, but
> pay little or no attention to saying anything about *what* the
> process is (or should be) computing. This is where all the bugs
> creep in: a programmer has to read the spec., code up an
> implementation, then prove *formally* that the implementation
> meets the spec. The last step is very difficult (logical descriptions
> of imperative languages have to be rather low level) and is
> invariably omitted.

OK, I think I'm following you here. As long as we use a sublanguage that is sufficiently restrictive rather than using a general purpose language, we can execute formal proofs that the implementation meets the specification (provided the specification is written in some way?) The reason we restrict ourselves with a sublanguage instead of using a full-featured language is somehow related to the provability of the code. Is there a clearly established correlation between this type of provability and quality? If so, why do we ever use a general purpose language? Even if it is required for some things, why do we use a general purpose language for so much that doesn't require it?

> > I don't have a problem thinking in terms of "given this input (and
> > pre-conditions), what is the output?"
>
> This sounds like you're thinking imperatively (IO is necessarily
> an imperative concept). IMHO it's better to think in terms like
> "this table must maintain a one-to-one mapping from keys to
> values" or "the values in this column of the table must form a
> contiguous set" or "all keys in this table must be members of
> the Employees set".

Why do you think this is a better way to think? I know a lot of people think this, but I don't see that as obvious at all.
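
For comparison, here is a sketch of how two of the three invariants quoted above might be handed to a DBMS declaratively, using SQLite through Python's standard sqlite3 module. The table and column names (employees, badges, emp_id, badge_no) and the helper names build_db and try_insert are assumptions made up for this example. The "contiguous set" invariant is left out because, as far as I know, SQLite has no direct declarative form for a condition spanning all rows of a column.

```python
import sqlite3


def build_db():
    """Create an in-memory database whose schema states the invariants."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE badges (
            -- UNIQUE on both columns: a one-to-one mapping from key to value.
            -- REFERENCES: every key must be a member of the employees set.
            emp_id   INTEGER NOT NULL UNIQUE REFERENCES employees(emp_id),
            badge_no INTEGER NOT NULL UNIQUE
        )
        """
    )
    return conn


def try_insert(conn, sql, params):
    """Attempt one insert; return False if the DBMS rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
try_insert(conn, "INSERT INTO employees VALUES (?)", (1,))
try_insert(conn, "INSERT INTO employees VALUES (?)", (2,))
first_badge = try_insert(conn, "INSERT INTO badges VALUES (?, ?)", (1, 100))
shared_badge = try_insert(conn, "INSERT INTO badges VALUES (?, ?)", (2, 100))
unknown_emp = try_insert(conn, "INSERT INTO badges VALUES (?, ?)", (3, 300))
```

Note that the application code never says how to check anything. It only attempts the insert and the DBMS decides, which is roughly the "what, not how" position being argued in the quoted text.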

>
> > Do coders who write constraints
> > declaratively code more accurately and more quickly?
>
> Yes, in my experience - but then I am a Mercury developer.
> Plug: www.mercury.cs.mu.oz.au

OK, so you really like this way of coding. Is there proof this is better or is it a matter of taste and different brains working differently?

> > Does it code an

I suspect I meant "cost" there

> > organization less if all constraints throughout software are coded in a
> > declarative language? I gather that some think the answer to that is
> > "yes" but I don't know of any evidence of that. Do you?
>
> Well... I know of one commercial company that has just replaced a
> 1.5 million line Java multiple insurance database system with 3000
> lines of Mercury plus some meta-data files. The Mercury system is far
> more capable in that it supports all possible products the insurance
> company can offer, whereas the Java version had to have separate
> code for separate products and consequently only supported
> a very limited subset of what the company was prepared to offer
> customers.
>
> This is just one data point, but it is quite a spectacular data point.

I've got my anecdotes too in other areas, and I'll accept yours as potentially indicative.

> > > This is an extra opportunity to introduce bugs,
> > > design errors, and performance problems that simply cannot
> > > arise in (a).
> >
> > I don't see why not. If a function arises in my requirements and I
> > have to translate it into a declaration, wouldn't we have a similar
> > scenario?
>
> As I said above, converting a declarative spec.

what if my spec is imperative?

> into a declarative
> implementation language is usually very easy to get right: there is
> often a simple correspondence between the spec. language and
> the declarative implementation language.

and a simple correspondence between an imperative spec and an imperative language?

> With an imperative language like Java I have to write a program and
> then prove that the result of executing that program (which is written
> in terms of *how* computation should proceed, not *what* is being
> computed) meets the spec. Nobody does the last step

and they do with SQL constraints?

> (because it
> is very hard) and that is where implementation bugs come from.
> Anything other than exhaustive testing (i.e., of all possible inputs)
> or a formal proof is not sufficient to make the case.

I understand. I just have not seen anyone actually do a formal proof of any code, even if it were possible.
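
As a small illustration of the "exhaustive testing" alternative mentioned in the quoted text, the sketch below compares a declarative statement of the "contiguous set" invariant with an imperative implementation over every input in a small bounded domain. All names (spec_contiguous, impl_contiguous, first_disagreement) are hypothetical, and a bounded check like this is not a proof for inputs outside the bound.

```python
from itertools import product


def spec_contiguous(values):
    """Specification: the distinct values are exactly min..max, no gaps."""
    if not values:
        return True
    return set(values) == set(range(min(values), max(values) + 1))


def impl_contiguous(values):
    """Imperative implementation: sort, then walk adjacent pairs."""
    ordered = sorted(set(values))
    for i in range(1, len(ordered)):
        if ordered[i] - ordered[i - 1] != 1:
            return False
    return True


def first_disagreement(domain, max_len):
    """Try every tuple over domain up to max_len; return a counterexample
    where the implementation and the specification differ, or None."""
    for length in range(max_len + 1):
        for candidate in product(domain, repeat=length):
            if spec_contiguous(candidate) != impl_contiguous(candidate):
                return candidate
    return None


# Checks every tuple of up to four values drawn from 0..4.
counterexample = first_disagreement(range(5), 4)
```

Even for this tiny case the number of inputs grows quickly with the bound, which is presumably why nobody does this, or the formal proof, for real application code.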

> > This makes sense to me. It brings me to the question I'm not supposed
> > to ask about whether there is any data to support a conclusion that
> > constraints specified to a DBMS provide better performance in the final
> > solution than those specified in other code.
>
> Do you really find it plausible that human coders could do better than
> an automatic optimiser on a day to day basis? Especially where these
> coders are already struggling with concurrency and distribution and
> don't have access to internal DB structures?

I can imagine that much can be automated and packaged in reusable libraries. I'm not suggesting that we want to write custom code for anything more than what is coded in constraints in a declarative language, but I am suggesting that the language could be any language.

>
> > I hear you on this one too. OK, so the DBMS software is written once
> > to address threading and such no matter what constraint parms are
> > passed to it. Otherwise someone has to write a framework that
> > addresses these in its interactions with the dbms. Once that was
> > written, however, we would be in the same position in either case,
>
> Not really. *You* would be in a roughly equivalent position, but other
> companies would not. Plus, you won't be taking proper advantage of
> improvements in optimisation made by the DBMS vendor when you
> upgrade. You've also given yourself a real maintenance problem.
>
> > > TRUST
> > > The constraint language used in the DBMS (a) is going to
> > > have seen much more testing and use than the code developed
> > > in the roll-your-own approach in (b).
> >
> > I would agree there are likely more lines of code in (b) but
> > constraints are code too, and often not that easy to verify and debug,
> > I would think, especially if handled in a separate process than other
> > code written to handle whatever new requirements are being addressed.
>
> I should clarify: when I talk about implementing constraints
> declaratively, I'm talking about using a real declarative language,
> not stored procedures written in Java or anything like that.

Yes, I understand. I still don't understand precisely why, but maybe eventually...

> > > APPLICATIONS
> > > Application programmers will be coding to an industry
> > > standard interface in (a). I expect it would also be much easier
> > > to interface other tools and products using (a) rather than (b).
> >
> > I can imagine that in a given scenario there might be more tools
> > related to SQL than libraries in Java, but I'm not sold on this one.
>
> Eh? It's not just Java in general, it's *your* Java DB interface
> library. Nobody outside your company is going to be coding to
> that interface.

OK, now I understand what you meant.

> > > There is no way to fully protect against human error, the best one can
> > > do is try to minimise the number that go undetected. The trouble with
> > > plan (b) is that it has so many more sources of trouble than (a).
> >
> > With (a) you also have that there are often two or more separate groups
> > of developers required to implement changes in an application, using
> > multiple languages, where there can be a corporate culture that is not
> > particularly conducive to communication among them. This can introduce
> > more problems than it solves.
>
> But there is an industry standard language for this which everybody has
> to use at some point: SQL!

I don't have to. I do, but I don't have to. I know a lot of folks who write data-based software and don't use SQL. You can find some of them at comp.databases.pick. Soon you will find others that work with XML data sources.

> > > If you like, (a) has fewer moving parts.
> >
> > It seems to me that it has the same number, but fewer that are not
> > delivered, tested, etc by the DBMS provider.
>
> No, no, no! (a) has fewer moving parts (zero, in fact) because it
> never says anything about *how* things should be done.

"It" (that being the entire software solution) includes the constraint engine, which surely has moving parts.

> It just says *what*

The added custom code for the specific constraints is as you indicate, but the entire solution includes both what and how.

> should always be true. With (b) it's exactly the opposite: it's
> nothing *but* moving parts!

Given this "selling point" of yours, you might find this somewhat amusing
http://www.tincat-group.com/mewsings/2006/01/data-movement.html

Cheers! --dawn

Received on Fri Feb 24 2006 - 06:09:21 CET
