Re: MV and SQL

From: dawn <dawnwolthuis_at_gmail.com>
Date: 15 Jan 2006 07:00:34 -0800
Message-ID: <1137337234.155021.225780_at_g49g2000cwa.googlegroups.com>


David Cressey wrote:
> DonR responded to me, a while back with this:
>
> > My guess is the proprietary MV program can be written quicker and with
> > finer details than a combination of database constraints and SQL
> > whatif's. I think the problem you and other SQL programmers have in
> > understanding MV is that you are trapped in the SQL box. ;-)
>
>
> I didn't want to respond to this, because I was more interested in what Don
> had to say about Pick than I was in arguing about what box I might or might
> not be in. But now, maybe, it's time to bring this up.
>
> I can't speak for others, but I can say, for sure, that I am not trapped in
> any SQL box. Before I learned SQL, I had gained some proficiency in
> assembler, Fortran, BASIC, Lisp, Algol, Pascal, and Datatrieve. I've used
> SQL to good advantage in a certain class of data storage and access
> problems, but I have by no means forgotten everything that I had learned in
> the quarter century that preceded my involvement with SQL. I have used SQL
> because it is useful and not because it is pristine and pure.

Yes, I'll agree that I have not noticed any SQL religious zeal from you.

> The description of Pick that DonR has given is extremely useful, but it
> skirts one aspect of the MV data model (if I can call it that without
> starting a whole different argument). I'm referring to the multivalue
> feature. If you take away the multivalue feature, the Pick model looks
> very much like dozens of other prerelational models, and differs only in
> trivial details.

When I started my research to figure out what made it such a productive, albeit often outdated, environment, I found a series of other features that make it different from using VSAM, IMS, or other pre-relational models. With a COBOL VSAM solution you could have embedded tables too, IIRC (the OCCURS clause pops to mind).

> Rapid access to a record in a file by means of a key is a "good thing"
> (tm), but it's hardly unique to Pick.
> Structuring records as hierarchies of values, with field separators to
> indicate the tree structure is, in some contexts, a "good thing" (tm), but
> it's hardly unique to Pick. Using indexes to tie together data stored in
> different record types is a "good thing" (tm), but it's hardly unique to
> Pick. Even the people who are "trapped inside the SQL box" know how to use
> foreign keys, and generally know how to use indexes, if only by relying on a
> good optimizer.
>
> What seems to make the MV model unique is the "multivalue" feature. BFaux
> alluded to that in a post where he shows some queries that illustrate how
> handy it is to use this feature. But the examples he shows are only
> trivially different from examples that could be constructed using two design
> methods commonly used by SQL people in the construction of application
> systems. (IMO, they are bad designs for bad systems).

Some other features are: variable-length everything; data stored as strings, converted to other formats and types for various purposes (this is one flavor of loose typing); vocabulary files, including a master dictionary that has vocabulary for all data, metadata, and database code (and all app code, until client/server permitted other languages); metadata that is descriptive, not prescriptive; and virtual fields associated with a vocabulary. I know these are not all clear from my brief statement here. I'll be showing various aspects in the blog over the year.
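
To make the nested record structure concrete, here is a rough sketch in Python of how a Pick-style record is delimited. The delimiter byte values (0xFE for attribute marks, 0xFD for value marks, 0xFC for sub-value marks) follow classic Pick conventions; the customer record itself is made up for illustration.

```python
# A minimal sketch of a Pick-style multivalued record: fields are
# separated by attribute marks (AM), repeating values within a field
# by value marks (VM), and sub-values by sub-value marks (SVM).
AM, VM, SVM = "\xfe", "\xfd", "\xfc"

def parse_record(raw):
    """Explode a delimited record into a list of fields, each a list
    of values, each a list of sub-values."""
    return [[value.split(SVM) for value in field.split(VM)]
            for field in raw.split(AM)]

# A hypothetical customer record: a name, then two phone numbers in
# one multivalued field, then an address whose parts are sub-values.
record = ("Smith" + AM +
          "555-1234" + VM + "555-9876" + AM +
          "12 Main St" + SVM + "Springfield")
fields = parse_record(record)
```

Note there is no schema in sight: the same `parse_record` handles any record, and the shape (how many values, how many sub-values) is whatever the data says it is.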

> The first is the Entity-Attribute-Value (EAV) model. In this model a
> single table is used to store a three dimensional relation, one of entity,
> attribute, and value. It's trivial to extend this model to cover
> multivalues. The EAV model has been much discussed in this newsgroup, and
> I think it's been demonstrated to be a "bad thing" (tm).

I agree that, in general, it is not a highly useful approach.
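
For readers who haven't followed the earlier EAV threads, here is a rough sketch of the design being criticized, using SQLite. The table and column names are illustrative, not from any real system; the point is that "multivalues" fall out for free because nothing stops repeating an attribute for the same entity.

```python
import sqlite3

# The EAV design: one generic table of (entity, attribute, value)
# triples stands in for every "real" table in the schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    ("cust1", "name",  "Smith"),
    ("cust1", "phone", "555-1234"),
    ("cust1", "phone", "555-9876"),  # second value, same attribute
])

# Retrieving a multivalued attribute is just a filter on two columns.
phones = [v for (v,) in conn.execute(
    "SELECT value FROM eav WHERE entity = ? AND attribute = ?",
    ("cust1", "phone"))]
```

The cost, of course, is that every constraint, type, and join the DBMS would normally enforce now lives in application code, which is much of why the design gets a bad name here.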

> The second design is the "one true code table" (OTCT) design, in which new
> types of codes can be stored on the fly, without having to do any DDL. That
> merely moves a data management function from being reflected in the database
> metadata, to being reflected in an off-line data dictionary, or to being
> handled informally by programmers. This has also been much discussed in
> this newsgroup, and it think it has also been demonstrated to be a "bad
> thing" (tm).

I plan to address that one in a blog entry. I was using the OTLT term that Celko and others have used, but have yet to research which acronym is more common for this.
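
For reference, a rough sketch of the "one true code table" design David describes, again in SQLite with made-up names: every kind of code shares one generic table, so a new code type arrives via INSERT rather than DDL.

```python
import sqlite3

# OTCT/OTLT: one generic table for all code lookups in the system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE codes (code_type TEXT, code TEXT, description TEXT)")
conn.executemany("INSERT INTO codes VALUES (?, ?, ?)", [
    ("order_status", "O",  "Open"),
    ("order_status", "C",  "Closed"),
    ("country",      "US", "United States"),  # new "type", no ALTER TABLE
])

desc = conn.execute(
    "SELECT description FROM codes WHERE code_type = ? AND code = ?",
    ("order_status", "C")).fetchone()[0]
```

As David says, the data-management function hasn't disappeared; it has just moved out of the database metadata and into an off-line dictionary or the programmers' heads.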

> Without having written software in Pick or that family myself, I have only a
> hazy idea of what MV consists of, gathered from what people have written in
> here.

Even after working with it for years, I could not pinpoint what it was that made it such a productive environment. I have a few more hints on that after some studying of other approaches.

> As near as I can make out, it consists of deferring a certain disambiguation
> from file definition time, or data update time, to retrieval time. I'm
> talking about the disambiguation between a simple data item, and a set (or
> list, if you prefer) that consists of a single data item. Semantically,
> they are not the same thing, and this can matter, in some circumstances,
> although I can't think of one off the top of my head.
>
> Now this deferred disambiguation can be construed as a form of "late
> binding", and late binding is, arguably, a "good thing" (tm).

I had not thought of calling the descriptive dictionaries "late binding," but they are, in a way. They are so late that they are never bound until someone uses one in a query. They are not used even as "copy books" in most MV systems, much to the dismay of developers when they realize this.
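
A rough sketch of what "never bound until someone uses one in a query" means in practice: the data file holds raw delimited strings, and a dictionary of field definitions, including a computed "virtual" field, is consulted only when a query actually runs. All names and the dictionary layout here are illustrative, not any real Pick dictionary format.

```python
# The data file stores untyped, delimited strings.
AM = "\xfe"  # attribute mark, classic Pick convention
data_file = {"SO1001": "widget" + AM + "3" + AM + "2.50"}

# The dictionary describes the data but is not enforced on update;
# each entry says how to extract (and type-convert) a field on demand.
dictionary = {
    "DESC":  lambda rec: rec.split(AM)[0],
    "QTY":   lambda rec: int(rec.split(AM)[1]),
    "PRICE": lambda rec: float(rec.split(AM)[2]),
    # a virtual field: derived at query time, never stored
    "EXT_PRICE": lambda rec: (int(rec.split(AM)[1]) *
                              float(rec.split(AM)[2])),
}

def query(key, field):
    # Binding happens here, at retrieval time, not at update time.
    return dictionary[field](data_file[key])
```

Nothing stops an update from storing "three" in the QTY slot; the error surfaces only when a query binds the dictionary entry to the data, which is exactly the trade-off David raises below.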

> But there
> are plenty of circumstances where the early binding of a relatively formal
> data model to the SQL DDL execution time has been, in my experience a "good
> thing", precisely because of the discipline it imposes.

Another good point. The flip side of this is that the binding is there from the start of the prototype. Remember back when prototypes grew into systems, and then when that was declared a bad thing? That is how most Pick systems are written. Since you don't have to build much into the model from the start, you can evolve it easily. I realize that sounds like a bad thing to those who have learned you should always throw away your prototypes, but now think about how systems change over time. The same agility that permits you to make an early version live, and then modify it until you have the system you wanted, also permits you to keep modifying it over time.

> This is the area where SQL and MV fundamentally diverge, IMO. I'd like to
> see more about MV, how the Pick system (and, perhaps, the rest of the
> family) exploits the MV feature. Whether my guess of MV as being a case of
> late binding is or is not correct. And, in particular, I'd like to see how
> Pick people avoid certain pitfalls that some other people, including but
> not limited to SQL programmers, avoid by living with early binding.

You are asking for some things I intend to deliver in the blog and couldn't put all into one response, but I'm sure others can add to this. I think the answer to this question is that pitfalls are not avoided, but the possible risks are mitigated. As one revered Pick professional says, "You get enough rope to hang yourself." (I just quoted him on that in comp.databases.pick too.)

cheers! --dawn

Received on Sun Jan 15 2006 - 16:00:34 CET