Re: Lucid statement of the MV vs RM position?

From: dawn <dawnwolthuis_at_gmail.com>
Date: 24 Apr 2006 15:05:19 -0700
Message-ID: <1145916318.980049.203040_at_j33g2000cwa.googlegroups.com>


David Cressey wrote:
> <ralphbecket_at_gmail.com> wrote in message
> news:1145510297.269563.19460_at_j33g2000cwa.googlegroups.com...
> > A number of people on this group are proponents of
> > Pick (MultiValue) DBMS. I've been trying to find a
> > definition for MultiValue to give me a better handle
> > on the arguments MV types often advance against
> > the relational model.
> >
> > As I understand it, an MV database is a collection
> > of files, a file is a collection of records, records in
> > a file all have the same structure, a record is indexed
> > by a unique key, a record is a collection of fields, a
> > field is a collection of (atomic?) values.
>
> The best description of the data structure in a Pick file was given a few
> months ago, in here, by DonR, IIRC. By going to Google groups, you may be
> able to dredge it up. Or DonR, or the actual author, may be willing to
> give it again.
>
> If you are familiar with the basics of data manipulation in files in the era
> before database management systems, much of Pick will be familiar to you.
> My understanding, which comes from reading the description I've already
> mentioned, agrees for the most part with what you wrote.
>
> Pick data is stored in files. Files are made up of records. Records are
> set up for direct access. As I understand it, records are accessed by
> number, not by key.

Records are accessed by key. Each file has a single primary key, which may be a multi-part key (similar to a composite/compound key).

> In any event, translating a key to a number is a small
> problem, probably offered as a service inside Pick.

Scratch the above statement.

> Records are made up of
> ASCII characters. Certain special characters are used to separate fields
> within records, and sub-fields within fields, and values within
> sub-fields.

Yes, record marks, field marks, value marks, and sub-value marks are characters such as ^253.
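
To make the delimiter scheme concrete, here is a minimal Python sketch. It assumes the conventional Pick mark characters (attribute/field mark 254, value mark 253, sub-value mark 252); the function name and the sample pizza record are made up for illustration and are not from any particular Pick implementation.

```python
# Assumed delimiter code points, conventional in Pick-style systems.
AM = chr(254)   # attribute (field) mark
VM = chr(253)   # value mark
SVM = chr(252)  # sub-value mark


def parse_record(raw):
    """Split a raw record string into fields -> values -> sub-values."""
    return [
        [value.split(SVM) for value in field.split(VM)]
        for field in raw.split(AM)
    ]


# Hypothetical record: a size field, then a multi-valued toppings field.
raw = "large" + AM + "onions" + VM + "mushrooms"
record = parse_record(raw)
print(record)  # [[['large']], [['onions'], ['mushrooms']]]
```

Note that the record is just a string; the nesting only appears once something decides to split on the marks.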

> So far this is just a rather simple minded specific case of a hierarchy.

In some sense it is. However, having worked with VSAM files and related applications, it is really very different. For example, a file of source or object code is a record in a Pick file. Dictionaries are queried just like files are. Virtual fields are used extensively.

> Nothing to offer scathing criticism about, but nothing to wax lyrical about
> either.

I understand that as an outside perspective. That was mine for the first few months after joining a shop where I had developers working with it.

> Where it starts to get interesting is in the following: many sub-fields
> consist of only one value, but this can be for one of two reasons: The
> first is that the context is such that multiple values would be meaningless.
>
> The second is a case of a list with only one element in the list. Thus a
> pizza with onions on it will have only one value, "onions" in the
> appropriate place.

This actually describes values rather than sub-values.

> A pizza with onions and mushrooms on it will have two
> values in the same place, separated by whatever the separator is.

Yes.

> This raises two questions in my mind:
>
> First, how does one disambiguate between a list consisting of only one
> entry, and the entry itself. In other words how does one disambiguate
> between "onions" and "(onions)" since they both have the same
> representation in Pick.

Because the "schema" is descriptive rather than prescriptive, whether an attribute is considered single- or multi-valued depends on your dictionary, that is, on how you define the attribute. A multi-valued attribute with exactly one value is stored the same, and looks the same at the logical level, as a single-valued attribute.
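
A rough sketch of that point in Python: the stored bytes for "onions" are identical either way, and only the dictionary entry decides whether a reader gets back a scalar or a one-element list. The dictionary layout here (a name mapped to a position and an "S"/"M" flag) is a simplified assumption for illustration, not the actual format of a Pick dictionary item, and positions are 0-based only because Python lists are.

```python
VM = chr(253)  # assumed value mark

# Hypothetical dictionary: attribute name -> (field position, "S" or "M").
DICTIONARY = {
    "SIZE": (0, "S"),      # single-valued
    "TOPPINGS": (1, "M"),  # multi-valued
}


def read_attribute(fields, name):
    """Interpret a stored field according to its dictionary definition."""
    position, kind = DICTIONARY[name]
    stored = fields[position]
    if kind == "M":
        return stored.split(VM)  # always a list, even with one value
    return stored                # taken as a single value


fields = ["large", "onions"]  # one topping, stored as a plain string
print(read_attribute(fields, "SIZE"))      # large
print(read_attribute(fields, "TOPPINGS"))  # ['onions']
```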

> The second question for me is how one distinguishes between a list used as a
> poor man's representation of a set, and a list used as a list, where
> placement in the list is supposed to carry information.

That's a good question and I think we have discussed this one before. The only hints in the vanilla dictionary combined with the data are in the name, description, and values. Developers have complete control over whether they treat it as a list, a bag, or a set, and they do not advertise this anywhere other than the code.

Many people add attributes to the dictionary, but even then I suspect that few have found a need to specify whether the attribute really should be ordered or not. I'm certain there can be misunderstandings, but I don't even have anecdotes of this lack of metadata causing significant problems for an organization.

> Thus the question is whether the list "(onions, mushrooms)" conveys the
> same information or different information than the list "(mushrooms,
> onions)".

Because the system is so language based, this concern is much like a concern for any list that a person might give in a sentence. It is often clear whether the order was important or not. When it is not clear, ask someone. I realize this is a less than satisfying answer, but it is the situation.
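
The question can be stated in a few lines of Python. This is only an illustration of the three possible readings, with a hypothetical `kind` argument standing in for whatever the developer (or the end-user) has in mind; nothing like it exists in the vanilla dictionary.

```python
from collections import Counter


def same_information(a, b, kind):
    """Compare two stored value lists under a given interpretation."""
    if kind == "list":   # order and duplicates both matter
        return list(a) == list(b)
    if kind == "bag":    # duplicates matter, order does not
        return Counter(a) == Counter(b)
    if kind == "set":    # neither order nor duplicates matter
        return set(a) == set(b)
    raise ValueError("unknown kind: " + kind)


x = ["onions", "mushrooms"]
y = ["mushrooms", "onions"]
print(same_information(x, y, "list"))  # False
print(same_information(x, y, "bag"))   # True
print(same_information(x, y, "set"))   # True
```

The stored data is the same in all three cases; only the comparison rule, which lives in application code, changes the answer.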

> The answer I keep getting from Pickies (after I've stripped away the
> veneer) seems to be: "It's all in the mind of the programmer! Isn't that
> wonderful!"

Whether something implemented as a list is treated as a list, bag, or set is controlled by developers, but the answer to which one it is lives in the minds of the end-users too.

> My response is that it's not wonderful.

I agree that it is not tight. I wish I could come up with any time this has botched things up. I can think of a time when a VAR inserted items in a list in an order that was different than the customer thought it should be. I'd be happy to have it be tighter (add a specifier to the metadata to identify whether something is a logical list, bag, or set), but it really doesn't seem to bubble up as a significant issue.

> The whole reason I migrated from
> files to databases was to get away from data whose description was buried in
> some other programmer's mind.

I was happy to migrate from VSAM to IMS for the same reason. I don't know what you could have told me that would have made me think it was a good idea to use Pick. But having done so, it sure does seem to have some advantages that could help the industry for the future.

> If you think that keeping the ultimate key to
> decoding the data ought to be in the mind of the programmer, then I think
> you should stay away from databases. Either that or hang a suitable warning
> sign in front of any databases you have built.

I really do understand why you say that. It is very frustrating that I have not yet been able to explain it.

> > If that is correct, it seems to me that MV is an
> > implementation technology and the RM is a logical
> > formalism and that to compare the two is to compare
> > apples and oranges.

That would be the case except that you can take the same business domain and implement it with either. So, in the end, they are tools used to solve pretty much the same problems and address the same opportunities.

> > That said, debate on the topic still goes on and on
> > in this group, so I assume I have failed to grasp
> > something important about MV. Is there a clear,
> > *concise* explanation somewhere of (a) a formal
> > (preferably set theoretic) model of MV, and (b)
> > how integrity constraints are expressed and enforced
> > in an MV database?
>
> Once again, "it's all in the mind of the programmer". Pick programmers are
> supposed to be so smart that they never create rogue programs or unintended
> data.

No, in fact you likely need more smarts to implement the same set of requirements in just about any other environment. There is simply a different approach to quality assurance than with a SQL-DBMS.

> I'll believe it when I see it.

I have some ideas along those lines. Someday, maybe.

cheers! --dawn

Received on Tue Apr 25 2006 - 00:05:19 CEST
