Re: repeating groups

From: dawn <dawnwolthuis_at_gmail.com>
Date: 19 Feb 2006 22:13:03 -0800
Message-ID: <1140415983.632579.49520_at_z14g2000cwz.googlegroups.com>


Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:
> > >
> > > I see no reason to add bags. In fact, I'm not even sure I
> > > believe a bag is an actual data structure. It seems more
> > > like a traversal strategy on a set. What makes you think
> > > you need bags?
> >
> > Your logical predicates and related propositions can then look very
> > much like the language they are modeling.
>
> This is a non-goal for me.

I know. The intersection of our interests seems to be the interface between the user of a database and the database management software. Metaphors of all kinds are useful for interface design, including visual, language, and mathematics. If you try to use understandable variable names, you recognize the importance of modeling with language.

> We are doing math here, not
> literature.

I wouldn't call propositions "literature."

> I love literature, but it does not inform the field
> of data modelling.
>
>
> > E.g. John flipped the coin and got heads, heads, tails, heads.
> > Mary's majors have been math, philosophy, business, math.
>
> Neither of these are well-formed propositions as I understand
> the term.

Think of the intersection of our interests in that interface. These are English propositions. I don't care if you need monadic second order logic or need to split these out into multiple propositions under the covers.

> Either:
> John flipped some coins and got heads 3 times, tails 1 time.
> or
> John flipped some coins and got [heads, heads, tails, heads].
>
> Those are propositions.

Yes, those are logical propositions.
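
For readers who want to see the two propositions side by side, here is a rough Python sketch. It is an illustration added for clarity, not something either poster wrote: it assumes `collections.Counter` is an acceptable stand-in for a bag, and the variable names are hypothetical. It shows that the counts can be derived from the sequence, while the sequence cannot be recovered from the counts.

```python
from collections import Counter

# Proposition with order: John got [heads, heads, tails, heads].
flips_as_list = ["heads", "heads", "tails", "heads"]

# Proposition without order: heads 3 times, tails 1 time.
# Counter is used here as a stand-in for a bag (multiset); that is an assumption.
flips_as_bag = Counter(flips_as_list)

# Dropping multiplicity as well leaves a plain set.
flips_as_set = set(flips_as_list)

# A different ordering of the same flips gives the same bag,
# so the order is no longer recoverable once we move to counts.
reordered = ["tails", "heads", "heads", "heads"]
same_bag = flips_as_bag == Counter(reordered)
same_list = flips_as_list == reordered
```

Here `same_bag` is true and `same_list` is false, which is the "either you care about the order or you don't" distinction in miniature.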

> Either you care about the order or you don't.

At any point in time.

> Can you think of some operation you want to do that couldn't
> (easily) be satisfied by one of the above? Up to and including
> "implement bag operations."

I don't need child bags any more than I need child sets. I'm good with child lists that I can interpret as having a meaningful order or not.

> > We will always be entering data in some order and
> > showing it in some order,
>
> Usually, not always. And even so, this fact doesn't
> constrain our data models. Unless perhaps you are
> only interested in a model designed to support
> data entry and display.
>
> > so you can call these both lists, but if the user doesn't care
> > about the order, it is just a bag to them.
>
> Assuming we agree that sometimes we want order and
> sometimes not, we would also presumably agree we
> want data structures for both cases. Would you rather
> have the set {list, bag} or the set {list, set} available
> to you? I would prefer the latter.

I'm good with functions (such as relations) with no order and not nested with child lists. So, I'll choose {list, function} ;-) So, yes, {list, set}.

<snip>
> > Because logical nested bags, sets, and lists
> > are all implemented as lists,
>
> Implementation doesn't matter logically;

Sorry, I chose the wrong word for your world. Conceptual bags, sets, and lists can all be placed in the logical model as lists.

> that's pretty much
> the definition of implementation.

My implementations are at a higher level than yours, but I'll try to do better with terminology.

> > if I want to indicate that the ordering matters, I call it an
> > ordered list.
>
> Sure. One needs lists; we agree on this. The question
> is whether one needs bags. I claim not.

I claim not, too. But I might store a conceptual bag or set in a logical-model list. Because of that, I tend to refer to lists (which might conceptually be sets, bags, or lists) and ordered lists (which are conceptually ordered as well as being lists in the logical model). You don't have to like this terminology or care why I use a phrase you think is redundant (but isn't), and I am not trying to argue that it is a good choice of terminology, but I thought an explanation of it might be of interest, especially since I know you are ignoring bags. Bags need not be in an implementation, but lists will be used for logical bags if the only other choice is a set.

Did that explain better what I was trying to say?

> (Again, the canonical term for this is "sequence" or "list."
> Since "unordered list" is a contradiction in terms, "ordered
> list" is redundant. You might just as well say, "ordered,
> ordered list.")
>
>
> > > > It works OK for me to handle nested sets,
> > > > bags, and lists all as lists.
> > >
> > > I believe it works "OK"; which is to say, mostly except for
> > > some of the time. I have higher aspirations than that, though.
> >
> > But you gain some conceptual simplicity for users this way, so it could
> > be the optimum solution.
>
> Well, I've got 20+ years experience using systems that support
> only ordered data as primitive, (Fortran, C, C++, Java, etc.)
> and I am confident that this solution isn't enough. It is definitely
> *not* optimum. As I have said before, I want lists *and* I want
> sets.

Agreed. However, you could use the model that is working quite far and wide of having sets (relations, functions) to model entities, allowing attribute lists within those. You need not permit attribute sets (relational-valued attributes) if you have list-valued attributes. You may, but it adds to the complexity perhaps without sufficient benefit.
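
A minimal sketch of that shape in Python, offered only as an illustration: a relation modelled as a function from primary key to attributes, with one list-valued attribute. The relation name `students`, the key `"mary"`, and both helper functions are hypothetical; the majors come from the example earlier in the thread, and reading the last entry as "current" assumes the list is kept in chronological order.

```python
# Hypothetical "students" relation: a function (dict) from primary key
# to a tuple of attributes, one of which is list-valued.
students = {
    "mary": {
        "name": "Mary",
        "majors": ["math", "philosophy", "business", "math"],
    },
}

def current_major(student_id):
    """Read the list as ordered: the last entry is the most recent major.

    Assumes the list is stored in chronological order.
    """
    return students[student_id]["majors"][-1]

def has_majored_in(student_id, subject):
    """Read the same list with order ignored: a membership test only."""
    return subject in students[student_id]["majors"]
```

The same stored list serves both readings; whether the order is meaningful is decided by the code (or person) consuming it, not by the attribute's type.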

> Lists have a great bang-for-the-buck, but they don't have
> as much bang as sets do.

The combination is strong.

> (BTW, who are these users we're talking about? Are they
> programmers?

Yes.

> Since this is a theory newsgroup, one can
> assume we're discussing systems built for trained/educated
> people, and we don't have to truncate the system prematurely,
> cutting corners so we don't have to make them think too hard.)

Right.

> > > It works *better* to handle lists as lists and sets as sets.
> >
> > I'm not sure about that.
>
> I am.

I know.

> Also, I am hard pressed to imagine a convincing argument that
> says, instead of treating X as X and Y as Y, just treat everything
> as X and that's best.

Now there's a condemnation of the RM if ever I heard it. But again, these things don't land on your plate as one thing or another, they are designed. We can constrain the model to only permit attributes in relations; to permit relation-valued attributes; or to permit lists in place of relations or whatever. I'm saying that if you broaden the RM to include list-valued attributes instead of relation-valued attributes, you get significant benefits as you have both sets (relations) and lists. If you also add in RVAs, there is definitely an added benefit, but it might not balance the cost of added complexity. I'm not adamant about this -- I can roll with both, but the model I see used successfully has only lists within relations. I would like to start there and move forward.

> The most straightforward, the most conceptually simple thing to
> do, is to handle lists as lists and sets as sets.

And bags as bags ;-)

> Consider: we
> can also use sets to model lists. We can, in fact, use sets to
> model everything. How happy are you with that approach?

Hey, I hear you. I agree that you get more features by having sets (including relations & functions), bags, and lists. I can tell you that the logical data modeling for products that have nested lists within relations (functions as they have a designated primary key) hits a sweet spot that I haven't found in other products. You might be able to hit a sweeter one by pouring more into it, but you might not.
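
To make the "use sets to model lists" remark concrete, here is one possible encoding, sketched in Python under the assumption that a list is represented as a set of (position, value) pairs. The function names are hypothetical, and this is only one of several ways to do it.

```python
def list_to_set(items):
    """Encode a list as a set of (position, value) pairs."""
    return set(enumerate(items))

def set_to_list(pairs):
    """Recover the list by sorting the pairs on position only."""
    return [value for _, value in sorted(pairs, key=lambda pair: pair[0])]

flips = ["heads", "heads", "tails", "heads"]
encoded = list_to_set(flips)    # an unordered set of four pairs
decoded = set_to_list(encoded)  # the original list, order restored
```

The round trip works, but every list operation now has to manage positions by hand, which is one way to read the question "how happy are you with that approach?"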

> > > > If the system provides nested lists with
> > > > an arbitrary number of attributes, my experience is that the user can
> > > > distinguish whether the ordering is relevant to them or not.
> > >
> > > Sure. Most of the time. See above.
> >
> > Then I will add "and the added complexity and potential need to switch
> > between such types might negate any benefits there would be in making a
> > distinction between them in the software."
>
> Honestly, how hard is it to say of a collection, do we care about
> the order or not? Are you saying that a professional who can handle
> doubly nested loops, learning complicated libraries, and data modelling,
> can't handle the awesome responsibility of deciding whether a
> collection is a list or a set?

Nope, I'm not saying that.

> It seems a less difficult choice than
> deciding between, say, an int and a long.

I don't have a good sense of what changes to the MV data model would take it off that sweet spot. That's my hesitation.

> > > Also, the user is not the only entity in play here.
> >
> > But maybe the one that has been ignored for too long
>
> I see no evidence that they have ever been ignored.
> And if they had, what would it matter? What matters
> is getting the right answer to this question.

Your question is clearly different from mine, then. I've gotta turn in and it looks like there is lots more below. Sorry to cut it short.
--dawn

Received on Mon Feb 20 2006 - 07:13:03 CET
