Re: 3vl 2vl and NULL

From: dawn <dawnwolthuis_at_gmail.com>
Date: 30 Jan 2006 19:09:23 -0800
Message-ID: <1138676963.801089.76610_at_g47g2000cwa.googlegroups.com>


Frank Hamersley wrote:
> dawn wrote:
> > Frank Hamersley wrote:
> >>dawn wrote:
> >>>Frank Hamersley wrote:
> >>>>dawn wrote:
> >>>>>Frank Hamersley wrote:
> >>>>>>dawn wrote:
> >>>>>>>Frank Hamersley wrote:
> >>>>>>[....]
> >>>>>>>If the data model visualized on the screen has one instance of an ID
> >>>>>>>and the SQL view backing that screen has many instances of this same ID
> >>>>>>>value, then the data models are different.
> >>>>>>
> >>>>>>Are we talking about the model or manifestations of it (instances)?
> >>>>>
> >>>>>We are talking about the logical model itself, but perhaps using that
> >>>>>term differently. A logical model that is in first normal form is
> >>>>>different from a model of that same data that is not. They are not the
> >>>>>same model simply because they model the same thing. Trees, networks,
> >>>>>and relations might all model the same data, but they are not the same
> >>>>>model. The model associated with a screen of data is not typically a
> >>>>>no-repeating-groups relation as required if it were to be output
> >>>>>directly from a SQL-92 view.
> >>>>
> >>>>You are presuming that visibility of all the data on the "screen" is
> >>>>mandated. Personally I have no such hang-ups*.
> >>>
> >>>I'm presuming no such thing. I'm talking about the model of the data
> >>>needed for the screen. To answer David's question too, if you were to
> >>>prepare a diagram of your choice, perhaps UML these days, of the
> >>>logical data model that a software application needs for interfacing
> >>>with the database and the logical model it needs for the UI, choosing
> >>>the same entities and attributes to model, your UML diagrams would be
> >>>different. Given there is an industry arising from this along with the
> >>>OO-RM impedence mismatch, I think this is rather well established.
> >>
> >>G'day Dawn - just back from a break in South West region of WA - I can
> >>recommend it but leave your diet at home and be prepared to get salty!
> >>
> >>On the UML stream, I have to admit my recent deployments (as measured in
> >>your BfB terms) I am situated in a very small team where documentation
> >>of this type is not required. Notwithstanding that I can accept that
> >>communication from designer to developer could be difficult if the UML
> >>were as you describe. BTW this has been the case since Adam but is not
> >>offered as an excuse.
> >
> > Welcome back. UML was not the point, just a tool to show that the data
> > model related to the front-end and that of the back-end are necessarily
> > different if you use the RM for your DBMS modeling. If you use a
> > two-valued-logic and non-1NF data model end-to-end for your software
> > development, you need not suffer this mismatch.

>

> In my view this is not axiomatic. My experience suggests if the
> mismatch exists simply switching methodologies or tools does not
> necessarily eliminate a problem. Sure it is possible in some cases to
> eliminate, but it also a possibility that the problem is simply deferred
> to a later time or moved to another parties sphere of interest. Most IT
> types (especially when under delivery pressure) suspend their critical
> analysis at the point that a problem disappears from their view, which
> is not always (or even often) a holistic or corporate approach.
>

> >>>We take data in from a screen as strings, change them to objects based
> >>>on specifications in code or parameter data, and then to relations
> >>>based on specifications in code or parameter data, with each process
> >>>validating everything along the way. The RDBMS takes those relations,
> >>>validates with constraint logic likely written in a different language
> >>>than the other validations, and turns the data into strings to be
> >>>stored.
> >>>
> >>>That doesn't seem efficient in either machine or people time for both
> >>>development and maintenance activities. Or am I missing something?
> >>
> >>Isn't this what computers do for a crust! I'm not sure why the
> >>"strings" specifically are so interesting to you. I assume it is a
> >>Pick'ism to contemplate data in terms of strings?
> >
> > To some extent, perhaps (somewhat similar to XML developers thinking in
> > strings), but my reason in this case is that although you might cast to
> > another type (even on the way in after testing for compatibility), all
> > data input through a UI can be handled as a string, the empty string
> > being one such. The string could be seen as the top object for data
> > entry, with all other types inheriting from it. A number is string
> > input that can be cast to a numeric type, for example. (I realize this
> > might be heresy to some)
>
> OK - my view differs in that the bit is the "top object".

All the bits I've seen are single-bit strings ;-) I do make a distinction between binary MIME types and that which can be entered via a standard keyboard (roughly speaking). Query languages, for example, operate quite differently on text than on pictures or songs.

>

> >>In my view data is
> >>simply data - f someone feels compelled to cast it to another form so be
> >>it, but it is not expected that the DBMS is forcing you to do so.
> >
> > In my view data can be binary (some MIME-type) such as pictures or
> > video, or it is strings, possibly cast to whatever else will bear it.
> >
> >>>>* see below.
> >>>>
> >>>>>>[..]
> >>>>>>
> >>>>>>
> >>>>>>>>I think you have expressed most clearly the core of the problem
> >>>>>>>>troubling you. It seems you can't accommodate the SQL outcomes because
> >>>>>>>>it doesn't have a "shape" that you are comfortable with.
> >>>>>>>
> >>>>>>>I'm actually quite comfortable with it, but I don't see an advantage to
> >>>>>>>using it to the extreme typically employed with SQL-DBMS's.
> >>>>>>
> >>>>>>I'm missing something - what is the "extreme" bit?
> >>>>>
> >>>>>If there really is a reason to have data in base relations with no
> >>>>>repeating groups, surely that requirement goes away with relations that
> >>>>>are logical views of the data. SQL permits views not to be in 2NF,
> >>>>>3NF, BCNF, 4NF, ... but took this notion of no repeating groups to the
> >>>>>extreme and requires it even in logical views of the data.
> >>>>
> >>>>As DC said I don't see any reason for exchange of information from the
> >>>>DBMS to the Screen App to be exclusively 1NF in the strictest sense. I
> >>>>suspect you feel that "softness" weakens the process more so than the
> >>>>2VL/MV no translation mantra holds.
> >>>>
> >>>>I disagree - there are measurable gains to be had in adopting the RM for
> >>>>the storage engine that,
> >>>
> >>>Can you point me to some emperical data that indicates that? And even
> >>>if that were the case, are there corresponding gains to making the RM
> >>>the interface between dbms and humans (developers)?
> >>
> >>I have no empirical data (for or against).
> >
> > What are these "measurable gains" then? What do they measure? Who has
> > done such measuring?
>

> First, the most obvious gain is a repeatable capability to consistently
> disseminate and/or interrogate a large and complex dataset amongst many
> people all with varied interests and in so doing avoid the need to
> inculcate them with all the various nuances a 2VL programmer might dream
> up to persist each item of data.

Now that has GOT to be tough to measure.

> After factoring in a life cycle and
> unavoidable changes in staffing these nuances can become debilitating.

I've worked with plenty of systems that are not based on the RM with code > 20 years old and counting. So, this measurement, too, could be difficult to quantify. I've looked at possibilities for what might be quantifiable and nothing looks very straight-forward. I know I don't have emperical data related to bang for the buck or any other metric for non-SQL or non-RDBMS compared to SQL-DBMS's, but I sure have seen a lot of claims about how the RM has been shown to be superior to all other models. You just claimed that there were measureable gains with the RM. I'm just wondering what these measurements are and how I can duplicate whatever experiment these come from.

> The rigour inherent in the RM goes some way to reducing this risk (of
> course it is not foolproof).

Agreed on both counts.

> Of course the extant business rules buried in the data itself are an
> unavoidable cost of entry to the club - that is a given regardless of
> the RM's involvement.

>

> Secondly, the measured aspect as inferred above is the linear aspect of
> continuing to deal with the data, rather than an unquantifiable risk of
> non-linear cost/effort in a less rigourous implementation.

If all code associated with SQL-DBMS systems were strictly SQL, then perhaps one could argue this. However, it appears to me that SQL-DBMS systems still have stored procedures and other code associated with them for UI's and other processes. So how can you measure this gain and related risks?

> Thirdly I have done the measuring over the last 25 years!

Actual measurements you can share or anecdotes like mine? It isn't easy to get beyond anecdotes and we each have different experiences. Can we collect meaningful data with a reproducable experiment?

> There are
> other independent sources however - one I find quite humorous (but a sad
> comment on IT nonetheless) is visible at http://thedailywtf.com.

That's a new one for me--I took a quick peek, will look more later.

> In all
> my casual viewings of the material on this site, by far and away the
> predominant examples are from 2VL exponents or as in todays case 2VL
> types trying to railroad an SQL/RM database.

>

> >> On anecdotal terms however
> >>and relating to some of the commonly held (as far as I believe) precepts
> >
> > Yes, there are some such that need fixin'
>
> Which other precepts and what needs fixin' is also eminently debatable!

Undoubtedly

> >>of good IT such as independence (separation of logical and physical) and
> >>simplicity (some will dispute this of course)
> >
> > I tend to want things simple for me (and those around me) in contrast
> > to insisting everything be as simple as possible for the developers of
> > my tools or for the tools themselves.

>

> My view is less self centred - I always look to maximise the long term
> corporate benefit even if it is more work for moi at the outset!

OK, I'll give you that one. I happen to be a one man (zero man, to be precise) company right now, but I'm interested in how to improve conditions for software developers in order to maximize the long term corporate benefit.

> >>are on offer. Sure, these
> >>are not necessarily unique to the RM, but I feel
> >>
> >>>>IMO, outweighs any perceived cost of developing
> >>>>and maintaining a dialect to transfer information from it to a
> >>>>application. In my view the 2VL/MV actually allows, nay encourages,
> >>>>that "softness" to find its way into the repository, which in the wrong
> >>>>hands is a Bad Thing tm.
> >>>
> >>>There are greater risks for some types of problems in one scenario and
> >>>others in the other. I don't know how to go about getting emperical
> >>>data about the quality of data and processes over time using one
> >>>approach compared to another.
> >>
> >>...which is why IT can't match (say) the medical sciences and the
> >>epidemiological approach they deploy to determine efficacy (even if its
> >>not foolproof at least there is a basis). IT as I have bemoaned oft
> >>before has barely achieved stone-age techniques let alone producing a
> >>guilded master craftsman or sanctionable professional.
> >
> >
> > We are in agreement on that.
>
> Cheers, Frank.

Yup. --dawn Received on Tue Jan 31 2006 - 04:09:23 CET

Original text of this message