Re: Aggregates: Largest Groups

From: paul c <toledobythesea_at_oohay.ac>
Date: Sun, 11 Apr 2010 01:13:41 GMT
Message-ID: <9n9wn.1420$Z6.1405_at_edtnps82>


Sampo Syreeni wrote:
> On Apr 10, 6:58 pm, paul c <toledobythe..._at_oohay.ac> wrote:
>
>> An average of averages is generally not equal to an overall average.
>
> No, and this gives rise to one of the most annoying statistical
> paradoxes out there as well: the Simpson-Yule one.
>
> Still, we can always guess as to the intention behind the original
> question, and go from there. Perhaps that means that we would end up
> with an entire family of algorithms instead of just one. Let's say,
> variations on a theme. There's no harm in that -- the number would be
> few, and probably somebody somewhere would find each one of them
> useful in time.
>
> That doesn't really impact my point about data languages being short
> in expressive power, though. I doubt any such algorithm or its
> criteria could be cleanly and succinctly stated in an existing data
> manipulation language.
> --
> Sampo
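
For anyone who wants the quoted point about averages made concrete, here is a minimal sketch in Python. The group names and numbers are made up for illustration and come from nowhere in this thread. It only shows that the unweighted mean of per-group means differs from the mean over all rows when group sizes differ, and that weighting by group size recovers the overall figure.

```python
# Hypothetical data: two groups of very different sizes.
groups = {
    "a": [10, 20],                    # 2 rows
    "b": [30, 30, 30, 30, 30, 30],    # 6 rows
}

def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)

# Per-group averages, then the unweighted average of those averages.
group_means = {name: mean(rows) for name, rows in groups.items()}
average_of_averages = mean(list(group_means.values()))

# The overall average is taken over every row, ignoring group boundaries.
all_rows = [x for rows in groups.values() for x in rows]
overall_average = mean(all_rows)

# Weighting each group mean by its group size gives the overall average back.
weighted_average = (
    sum(group_means[name] * len(rows) for name, rows in groups.items())
    / len(all_rows)
)

print(average_of_averages, overall_average, weighted_average)
```

The two answers agree only when every group has the same number of rows, which is why the intention behind the original question matters.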

I think I agree with that, but this doesn't mean I'm inclined that way. Rather, I think it is still the case that nobody has even specified a minimal language for a 'Codd machine', because even such a magnificent effort as Tutorial D goes far beyond what is necessary. In the case of Tutorial D, I'm pretty sure that this is because of the 'procedural' perspective, and also because D&D pay homage to physical traditions involving concurrency and device techniques and to an implied seamless progression away from SQL. I don't say their motivation is wrong for them, but personally I couldn't care less about the SQL legacy or all the mis-specified dbs out there, many of them built on ambiguous requirements.

For example, Date, and I presume Darwen, holds that the Information Principle means that all information is recorded in D&D-style relvars. But as stated, there is no reason I can see why relvars must be reference-able in a language; the environment could make them implicit. From a purist's point of view, aspects such as transactions are also meaningless: it would be enough to have a single-threaded process for writing, and this process could be supplied with the results of all queries that determine what is to be written. There is no 'need' to start a transaction.

It is easy enough to implement write-once variables instead of 'relvars'; let me call them 'wovars'. E.g., R might be 'understood' as a relational value available to be referenced by a minimal language's equations. R' might be understood to be a replacement for R, R'' a replacement for R', and so forth. As the LISP people might say, 'lots of silly idiotic apostrophes', but this has a tremendous upside for pedestrians like me: no assignment, so no need for 'multiple assignment' and no problems of sequence.
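
To give a rough idea of what I mean by wovars, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the class name `WriteOnceStore`, the methods `define` and `latest`, and the representation of a relation as a frozenset of tuples are all hypothetical, and none of it is taken from Tutorial D or from SQLite.

```python
class WriteOnceStore:
    """Hypothetical 'wovar' environment: each name is bound exactly once."""

    def __init__(self):
        self._bindings = {}

    def define(self, name, relation):
        """Bind name to an immutable relation value; rebinding is an error."""
        if name in self._bindings:
            raise ValueError(f"{name!r} is already bound; define a successor")
        self._bindings[name] = frozenset(relation)
        return self._bindings[name]

    def __getitem__(self, name):
        """Look up the relation value bound to name."""
        return self._bindings[name]

    def latest(self, base):
        """Follow R, R', R'', ... and return the newest bound name."""
        if base not in self._bindings:
            raise KeyError(base)
        name = base
        while name + "'" in self._bindings:
            name += "'"
        return name


env = WriteOnceStore()
env.define("R", {("alice", 1), ("bob", 2)})
# Each 'update' is an equation defining a new value in terms of older ones.
env.define("R'", env["R"] | {("carol", 3)})
env.define("R''", env["R'"] - {("bob", 2)})

print(env.latest("R"))      # the newest successor of R
print(sorted(env["R"]))     # the original value of R is unchanged
print(sorted(env["R''"]))
```

Since nothing is ever assigned to twice, the order in which the equations are evaluated only has to respect which names each one refers to.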

I often think that had today's physical speeds and memory capacities been available forty years ago, nobody would have thought twice about avoiding double reads. The SQLite virtual machine offers a very viable platform for doing this. A proof-of-concept wouldn't be very hard to produce even in JavaScript/ECMAScript and not much harder in faster but ironically less portable C. Firefox provides some graphical support for all this. I doubt the author of SQLite had anything other than SQL in mind for a user language, and it does seem that certain parts of his VM are tailored towards SQL, but I don't think they'd be too troublesome. Such a language would be pretty much a direct implementation of D&D's A-algebra or very close to it, perhaps with slightly different fundamental ops.

I've thought more than once about trying this because I think it is easily within the reach of a solo developer, e.g., less than 15,000 lines total, given the right starting environment and assuming pretty dedicated habits and discipline. Not sure I have those anymore; too many other hobbies.

As for your orientation, personally I think it's more important to first see a true Codd machine; after all, the host languages I've suggested are more or less capable of what you want.

Received on Sun Apr 11 2010 - 03:13:41 CEST
