Re: 3vl 2vl and NULL

From: David Cressey <david.cressey_at_earthlink.net>
Date: Thu, 08 Dec 2005 14:10:26 GMT
Message-ID: <m9Xlf.846$nm.372_at_newsread2.news.atl.earthlink.net>


"dawn" <dawnwolthuis_at_gmail.com> wrote in message news:1133970641.054944.281370_at_f14g2000cwb.googlegroups.com...

> I have no interest in any data that is not ever accessed. The services
> or API to the stored data is of primary interest to me. How the data
> are actually written to the storage medium is of little interest to me.
> The model used by a developer (or other end-user) when accessing the
> data and metadata are definitely of interest to me.
>

This sounds a lot like "physical data independence" to me. But when I evaluate the stuff you claim to love in terms of physical data independence and logical data independence, it appears weak in comparison to the SQL-relational stuff.

The classic Pick file structure, as explained in here by DonR, has a lot of dependencies in it. Don's explanation is that "Pick programmers are a practical lot". What I interpret this to mean is that they write code to make sense out of the data. I question whether that approach, in the long run, is better than creating self-describing data.

> Agreed
> Agreed.
> Agreed.

We do agree about a lot of things.
>
> > However (and it's a big however) just as Dawn doesn't feel that database
> > designers are capable of defining the world for application programmers,
>
> I don't recall ever thinking of it that way, but I can see why you say
> that. I am more of an end-user (aka developer) looking to those
> defining the API for databases to provide me with an
> API/language/whatever that meets my requirements.
>

I certainly recall you saying as much, several times. I'm too lazy to do the research needed to pull up your exact words, but your philosophy of "programmers rule" shows through clearly in most of what you write.

> Application developers (I've programmed a little in the past decade,
> but not enough to claim to be a programmer, I suspect) are data
> modelers.

Some of them don't use this term, and some of them model data in extremely shortsighted ways, for purposes of succeeding at a well-defined project. I can't be critical of that, in its own context. Succeeding at a project is better than failing at it. But building an enterprise database is a lot more than succeeding at a lot of tiny programming projects.

>They model data for all sorts of purposes: e.g. data entry
> screens, data collection from devices, printers, databases, file
> systems, data exchange.

Information sharing goes way beyond data exchange. In the cases where I've seen "regular programmers" model data exchange, it generally devolves into bilateral negotiations between two camps who each want to minimize the disruption to their own physical data model that already exists in a legacy system. There's nothing wrong with that, but it isn't the same skill as planning a database that's going to get a big bang for a lot of bucks.

> I understand that there could be a division of
> labor where one developer specializes strictly in data modeling for
> persistence while another specializes in data modeling for user
> interfaces, for example. If a development team does not have that
> luxury, then individuals might need to be able to model for all
> purposes. I still think it is possible for a single person to do
> end-to-end development, although it requires a number of skills, not
> all of which will be executed perfectly.

It has nothing to do with a division of labor. An excellent programmer can become an excellent database designer without ceasing to be an excellent programmer. In fact, every database designer should periodically go back and do some programming, in order not to "lose the common touch", as Kipling might put it.

>
> A seasoned developer is a data modeler. If the API for the database is
> good enough, it should not be brain surgery to extend that knowledge to
> persistence.
>

As I keep saying, persistence is relatively straightforward. Sharing is not.

> >
> > And data sharing is where, in my view, you get the bang for the buck out of
> > database development.
>
> Absolutely.
>
> > I suspect that almost all of the projects that have
> > yielded disappointing "bang for the buck" using products like Oracle are
> > projects where the amount of data sharing is so small, that a less
> > ambitious approach (like using Pick) might have yielded the same results,
> > with less time and expense.
>
> I don't think that is where the distinction is, although the more
> developers you have using an API, the more standardized you want that
> API to be. So you do have a point that in an environment where there
> are hundreds of developers reading and writing to the same database, it
> is important to have one team manage the integrity constraints, whether
> coded in SQL, COBOL, or Pick services. A Pick approach is only "less
> ambitious" in that it is less work, not in that it provides much else
> in the "less" category as best I can tell.
>
This is where you have stated your point again and again without demonstrating it to those of us who speak SQL and do not speak Pick.

> > What I think Dawn is missing is the big picture about "large shared
> > databases".
>
> On the other hand, I think you know that I understand your point and
> simply disagree with it.

You keep saying this, over and over again, but I remain unconvinced. I take into consideration all the other things you say in this newsgroup, and that leads me to believe that you do not understand my fundamental point.

>
> > And I just about concur with "Spight's Law": sooner or later,
> > you're going to need a DBMS.
>
> Agreed.
>
> > And the more adapted you've become to a file
> > system implementation, the harder it's going to be to transition to a DBMS.
>
> But it need not be a SQL-DBMS. There are and will be easier to use
> approaches that do not conform to the Information Principle and include
> constructs such as lists. Just to bring it around to the subject line,
> these almost all employ a 2VL rather than a 3VL, thankfully.

There are 3VL's whose third value is not "UNKNOWN". "UNKNOWN" is a poor choice of name, because it indicates blurry semantics. The semantics of all well-formed domains is that the distinct values are mutually exclusive. UNKNOWN is semantically NOT exclusive of TRUE or FALSE. Hence, [TRUE; FALSE; UNKNOWN] is a poorly formed domain.
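
For anyone who wants to poke at the behaviour in question, here is a rough sketch using SQLite through Python's sqlite3 module. SQLite is only an assumed stand-in for an SQL DBMS, and the helper names, the table t and the column x are made up for illustration. It shows the third value (surfaced as NULL, which Python sees as None) collapsing to TRUE or FALSE in some expressions, and a WHERE clause treating UNKNOWN as "not true".

```python
import sqlite3


def eval_sql(expr):
    """Evaluate a scalar SQL expression; SQL NULL comes back as None."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(f"SELECT {expr}").fetchone()[0]
    finally:
        conn.close()


def count_rows(predicate):
    """Count rows of a hypothetical table t(x) holding 1, 2 and NULL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (None,)])
        query = f"SELECT COUNT(*) FROM t WHERE {predicate}"
        return conn.execute(query).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Comparing with NULL yields the third truth value.
    print(eval_sql("NULL = 1"))
    # The third value can still resolve to TRUE or FALSE.
    print(eval_sql("NULL OR 1"), eval_sql("NULL AND 0"))
    # "p OR NOT p" does not select every row once a NULL is present.
    print(count_rows("x = 1 OR NOT (x = 1)"))
```

The last query selects only two of the three rows: for the NULL row the predicate evaluates to UNKNOWN, and WHERE keeps only rows whose predicate is TRUE.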

That doesn't mean that SQL isn't a better interface than the APIs you are so fond of. I've never had any problems modeling a 2VL universe of discourse into an SQL implementation. But then again, I tend to follow my own advice about mapping NULL data into a 2VL universe: filter 'em out.

Received on Thu Dec 08 2005 - 15:10:26 CET
