Re: Order & meaning in a proposition

From: Dawn M. Wolthuis <dwolt_at_tincat-group.com>
Date: Wed, 7 Apr 2004 16:12:57 -0500
Message-ID: <c51qsr$c0n$1_at_news.netins.net>


"Eric Kaun" <ekaun_at_yahoo.com> wrote in message news:wEZcc.9262$zM5.3375_at_newssvr32.news.prodigy.com...
> "Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote in message
> news:c517ks$lal$1_at_news.netins.net...
> > "Eric Kaun" <ekaun_at_yahoo.com> wrote in message
> > news:I1Scc.51787$_37.3056_at_newssvr16.news.prodigy.com...
> > > "Dawn M. Wolthuis" <dwolt_at_tincat-group.com> wrote in message
> > > news:c4vdsj$n7k$1_at_news.netins.net...
> > > > Bingo - I'm referring to stuff that no one would consider "important
> > > > enough" from a data processing standpoint, but still provides
> > > > information that need not be lost (in all cases). Why lose ordering
> > > > if you don't have to?
> > >
> > > Why leave its importance implicit (e.g. to be used or ignored, but in
> > > either case ASSUMED by application developers) if you don't have to?
> >
> > The information of which I speak (and my example is not great) is that
> > which we would recognize as insignificant and when we ask the user, to be
> > certain we are making correct assumptions, they will acknowledge the data
> > as unimportant. In fact, it is not important, it is just a little more
> > information that could perhaps seep into our brains and help us solve a
> > business problem better, unwittingly. I recognize that by turning
> > sentences into data to which we will apply predicate logic, we are losing
> > some of this "fruffy" information. But, if we don't lose anything from a
> > logic perspective in keeping such things as the ordering of multiple nouns
> > in a sentence, then let's not.
>
> But there is a big difference between a list or array and a set. Are your
> multivalued attributes really sets ordered by data entry sequence? If so,
> that's a data structure we should perhaps know something more about. And if
> it's not a set, but a list, then it's important for counts that the app
> know that.

A couple of months ago, I collected definitions of collection, set, map, list, array, bag, etc. I suspect that the values a multivalued variable can hold are best described as a "list", although there is an index available for use with database functions, so you could also think of it as an array (and some folks consider those to be synonyms anyway). But from the perspective of the model, the way I describe it, it is a function -- a mapping.
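
To make the list / array / mapping distinction concrete, here is a rough sketch in Python. Python is used purely for illustration and says nothing about how a PICK system stores anything; the names `majors`, `as_mapping`, `as_set`, `as_tuples` and `rebuilt` are made up for this example.

```python
# Illustrative only: one multivalued attribute seen three ways.
majors = ["Math", "Philosophy"]      # list: ordered, duplicates allowed

# The same values as a function (mapping) from position to value.
as_mapping = {i: v for i, v in enumerate(majors, start=1)}

# As a plain set of values: the ordering is no longer represented.
as_set = set(majors)

# A SET of (position, value) tuples still carries the logical order,
# so the original list can be rebuilt from it.
as_tuples = {(i, v) for i, v in as_mapping.items()}
rebuilt = [v for _, v in sorted(as_tuples)]
```

The point of the last two lines is that "ordered list" and "set of tuples" need not be in conflict: the order lives in the position component of each tuple.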

But why do you think an app needs to know "counts"? That's not typically needed with PICK apps. You store what you take in and you show it back to users. You ask questions like:

LIST STUDENTS WITH EVERY MAJOR NOT EQUAL "PHILOSOPHY"

Input screens might provide scrollable windows for entering and maintaining the lists, for example. They don't need to count the number of entries.
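
As a rough analogy only, a question of that shape can be answered over nested lists without ever counting entries. The sketch below is Python, not PICK; the sample data and the function name `every_major_not_equal` are invented for illustration, and treating a student with no majors as a match is an assumption of this sketch, not a claim about how any PICK system behaves.

```python
# Hypothetical sample data: each student has a list of majors.
students = {
    "David": ["Math", "Philosophy"],
    "Ann": ["Physics"],
    "Lee": [],
}

def every_major_not_equal(records, value):
    """Return the students for whom every listed major differs from value.

    Assumption: a student with an empty list qualifies, because Python's
    all() is True for an empty sequence.
    """
    return [
        name
        for name, majors in records.items()
        if all(major != value for major in majors)
    ]

selected = every_major_not_equal(students, "Philosophy")
```

No length or count of the list is consulted anywhere; the test simply walks whatever values are present.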

> I'm not necessarily disagreeing with you - just pondering the implications
> of a novel new (?) data structure.

Nah -- just a list (ordered), brother, where sometimes the ordering is relevant and sometimes it isn't -- use as you like.

> > For example, if there are strict rules at the pizza place (yes, that is
> > another thread) that we put mozzarella on before we put parmesan on the
> > pizza and when the order pops up it says
> >
> > Pizza  Mozzarella
> >        Parmesan
> >
> > Then that helps us in some little bit, even though it is redundant
> > information. It wouldn't be worth collecting this information
> > specifically in some sort of ordering data element, but if we pop this
> > puppy into the database and it keeps the order that we stated it in, so
> > much the better, right?
>
> One benefit of a set is that a DBMS implementation can do what it likes, and
> for optimization, order can be significant. Furthermore, I do think it's
> important to distinguish what data structure you really mean. Depends on the
> answer to the above question I asked.

I'm referring to a list as a logically ordered list (which could be a SET of tuples, for example) and not referring to how the database chooses to store the data for optimization.

> > > I disagree completely. A relational structure allows the simple
> > > generation of any number of hierarchies, without favoring one. Unless you
> > > enjoy coupling the internals of your app to every communication it has to
> > > make with the outside world, that's a Good Thing (tm 2004, Martha Stewart
> > > Inc.).
> >
> > This reminds me of some new customers to a UniData (an IBM PICK database)
> > application who told me that they would like to use ODBC to retrieve their
> > data for an XML document. I pointed out to them that the UniData data had
> > information such as a student and their "set of majors" (whether stored
> > that way or virtual data) so you could ask it to
> >
> > LIST STUDENTS MAJORS
> >
> > and you would get one record for each student, with their list of majors.
> >
> > If you then use ODBC to access it, you are doing a virtual normalization
> > of the data, then you are taking that data and mapping it to XML, where, in
> > this case, they needed one document per student (I'm truncating the
> > example for simplicity purposes).
> >
> > Similarly, we can take what is in our brain as a statement about David
> > majoring in Math and Philosophy and turn that into two propositions about
> > David prior to designing the schema for a PICK database where we would
> > then pour those two statements back together. But why go through that
> > silly exercise?
>
> For the benefit of other queries. Yes, if all you're ever doing is
> displaying that proposition about David, then fine. But I've yet to see such
> an app that wouldn't want to say something (for example) about counts within
> majors (or some other grouping that requires an inversion of the nesting you
> propose), and then you have much more work.

How much work "you" have depends on the functions available to you. If you have functions that do what you want with a list, then you are fine, but there could be times when one would have to write a new function.

> In short: my users have always been able to surprise me in the queries,
> reports, and additional apps they want. A normalized (non-1NF) structure has
> always been my friend in this record. And my Java development has led me to
> further loathe the enforced nesting of "Entity A" inside "Entity B" - you
> can have the nesting both ways (e.g. use a graph), but that path is fraught
> with bugs.

Very interesting -- my Java work is what brought me to get more upset about JDBC and 1NF structures -- Java can handle more complex objects than that.

> > Skip the 1NF in the process and you can go "from your brain to
> > your data structure" even without a lot of theory classes in between.
>
> For prototyping that would be a good idea - to help narrow down the
> requirements. I just wouldn't want to gamble my data structures and queries
> on one particular nesting.

Yes -- it is important when using such a model to use an environment that is much more (dare I say) agile than what we have with many RDBMSs. If we can "refactor" our data structures and change the database handily without huge costs, then we don't have to determine in advance that we think all data elements are the same. It does take experience to make the right calls in this regard.

> > And then the translation to XML is quite obvious too. Get SQL, ODBC, and
> > the relational 1NF, uh, hogwash out of the middle -- it only makes for
> > extra steps. --dawn
>
> And doing a report of, say, students listed beneath their majors wouldn't
> require much work if you've nested majors inside students? Or perhaps
> listing students and the prereqs for a major, which would then require that
> "major" be a first-class "entity"?

Oddly enough, you can get symmetric queries in PICK in spite of multivalues because it is based on vocabulary. So, you create a field named "STUDENTS" within the vocabulary of the MAJORS function and you can type:

LIST MAJORS STUDENTS
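
The "inversion of the nesting" being discussed can be sketched in a few lines of Python. This only illustrates the idea of deriving students-per-major from majors-per-student; it is not how a PICK system computes such a field, and the function name `invert` and the sample data are hypothetical.

```python
from collections import defaultdict

def invert(students_to_majors):
    """Turn {student: [majors]} into {major: [students]}, keeping input order."""
    majors_to_students = defaultdict(list)
    for student, majors in students_to_majors.items():
        for major in majors:
            majors_to_students[major].append(student)
    return dict(majors_to_students)

# Hypothetical sample data.
by_student = {"David": ["Math", "Philosophy"], "Ann": ["Math"]}
by_major = invert(by_student)

# Counts within majors fall out of the inverted view.
counts = {major: len(names) for major, names in by_major.items()}
```

Whether this counts as "much more work" presumably depends on whether the environment already offers something like `invert` or leaves you to write it yourself.
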
> I've never found the "additional work" to be onerous.
>
> - erk

I wish I could put my finger on it -- I feel like I'm circling the issue and not quite putting my finger on just what makes the PICK model so much easier for the initial development of software and then for the ongoing maintenance. Still plenty of mysteries for me, but maybe if I surround it, a light will go on.

Smiles. --dawn