Re: Table design question

From: Bob Badour <bbadour_at_golden.net>
Date: Mon, 9 Feb 2004 23:31:37 -0500
Message-ID: <_aqdnd5uYueo_bXdRVn-hg_at_golden.net>


"D Guntermann" <guntermann_at_hotmail.com> wrote in message news:HsuL0s.AI3_at_news.boeing.com...
> [snip]
>
> > > >
> > > > The only reason you are trying to enumerate "six thousand
> > > > non-consecutive integers" is because you are under the mistaken
> > > > impression that the set of users who actually use a system is a
> > > > domain, when it isn't.
> > >
> > > This is quite a presumption. While it could be one reason, it's not
the
> > > only possible reason for defining a "complex" enumerated domain.
> > >
> > > Quite the contrary, if a system is "modeled" to represent something in
> the
> > > real world upon which we want to make decisions or have the system
> execute
> > > upon, then modeled domains "by constraint" at the logical level are
> entirely
> > > valid. I further submit that having the database management system
> enforce
> > > domains and therefore its consistency is not only desireable, but
> necessary.
> > >
> > > Tomorrow, a new user may be enrolled, another
> > > > may leave etc.
> > >
> > > Then you change the implementation (the implemented constraint) to
> reflect a
> > > change in the logical model as dictated by the user or business. It's
a
> > > fact of life that systems in the real world evolve, but it doesn't
> change
> > > the fact that logic should to be deterministic to reflect the real
world
> and
> > > provide some modicum of behavior that is both expected and that
conforms
> to
> > > the model.
> > >
> > > By contrast, the domain "user_id" is the (fixed for
> > > > all time) set of all possible values from which users may be given a
> > > > user_id. Of course, it is conceivable that an organisation might
> > > > decide in advance that all user_id values must be taken from the
> > > > enumerated set {1, 112, 314, ... 9997}, but if so then of course
they
> > > > pay the penalty in the complexity of enforcing the constraint.
> > >
> > > I think everyone agrees that there is a penalty involved with
enforcing
> > > certain constraints to ensure quality of the system, but that is not
> what is
> > > at issue. People argued, and still argue, that association by value
as
> a
> > > performance liability for the same reasons - to justify their rational
> for
> > > not accepting something.
> > >
> > > Much
> > > > more typical would be "between 5 and 10 alhphanumeric characters" or
> > > > "an integer between 1000 and 9999".
> > > >
> > > I don't understand the criteria upon which you decide that one degree
of
> > > specificity in defining a domain of representations in the form of
> strings
> > > or numbers is necessarily better than another.
> > >
> > > For example, if I devised a logical model for "processes" where I
defned
> > > states of a process at a logical level as only being started and
> completed,
> > > I'd want to ensure that 1) the system only allowed for two state
values
> for
> > > any given process; 2) being forced to use SQL enforcement at a lower
> > > representation level, I chose two representations and ensured that
each
> > > representation identified the correct state; and 3) that the DBMS
could
> > > enforce state transitions if it was so desired.
> > >
> > > I could choose any two strings or set of two numbers to represent
those
> > > states, but declaring and defining column variables as VARCHAR(7) or
> > > INTEGER(1) will do nothing to enforce the integrity of my model from a
> data
> > > management perspective. If I chose to represent my domain as the
> strings,
> > > 'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the
> VARCHAR(7)
> > > domain definition will not enforce the two state domain in that it
> allows
> > > for 562,949,953,421,302 states other than the two I want - assuming an
> > > alphabet that allows for 2^7 symbols, which is equivalent to 28
> characters
> > > and the disallowance of null. In the case of INTEGER(1), the DBMS
> would
> > > allow for 8 additional states that are not part of my logical domain.
> > >
> > > Since we are dealing in large part with string and number
> representations,
> > > we might consider a defined language (sequences of symbols composed of
> > > alphabets upon which rules of syntax and grammer are applied) a proxy
> for a
> > > defined domain.
> > >
> > > If you want to argue the point by saying that you have to generalize
> data
> > > and domain definition to allow for evolution and incremental change to
> what
> > > one today defines as the domain, then who is to say the VARCHAR(7) or
> > > NUMBER(1), or alternatively, alphanumeric consisting of between 5 and
10
> > > characters, is sufficient? Why not just call everying attributeN and
> > > defined everying as alphanumeric 100 wide character strings?
> > >
> > > Even more importantly, what assurances can the DBMS give me that my
> universe
> > > or model of a shared real world is properly being represented at the
> current
> > > time?
> > >
> > > > I agree that SQL database don't support domains very well, but the
> > > > example being discussed here is NOT a domain.
> > >
> > > Yes it is. It's just a rather inconvenient domain.
> > >
> > > Best regards,
> > >
> > > Dan
> >
> > What can I say? I really thought this was quite a simple concept, and
> > yet you have completely failed to comprehend me.
>
> Well, I don't think I did. But I can see where the disconnect is. You
> presume that the enumerated domain under discussion represents actual
users
> at time *now* as part of a set of propositions that meet some predicate.
I
> did not see where that was clearly stated in Mr. Sherrill's example and
one
> of your statements allowed for the domain being discussed to be an
> enumerated domain of "allowable" values, which, given the contradiction,
> probably confused me.

This thread has a history going back to a question posted Jan 23, and it always seemed clear enough to me that the unary table under discussion was a table of past and current users. Well, at least as clear as any business rule can be in a newsgroup exchange--a strong reason to avoid giving specific design advice in newsgroups.

Tony had already provided a concise, accurate and useful answer to the question on Jan 24 when Mike piped up on Jan 25 with his "Smells like teen spir..." er...um... "Smells like a domain" post. At best, Mike's "Smells like a domain" post was trivially true and uninformative. At worst, uninformed.

> Another part of the disconnect is that we are discussing this at two
> different levels - your point, if I understand you correctly, is that you
> want to arbitrarily define some pool of numbers that works as an identity
> function, generating user ids for users, now and forever, which I admit is
> entirely valid.
>
> My argument was from a more generalized point of view and did not work
under
> the assumption that the domain under discussion reflected actual current
> users, in contrast with the pool of allowable user_id's; but rather
> reflected the viewpoint that the domain was an enumerated domain,
> representative of any domain that might be finite under some model of
logic.

The claim that an explicitly and singly enumerated domain would be required was a smokescreen or straw man thrown up by Mike to defend the indefensible. That's not to say such a domain cannot exist -- just that it was not a necessity nor even apparently desirable in this specific case.

> Examples of this might be social security numbers or phone numbers.

A domain or data type might consist of the set of values { blue, 3, 1, cat } and one of the operations might include the ordinal position of the value in the ordered list: blue, 4, the, 1, cat, green, for, 3

Tony conceded the general case that a domain can have any set of arbitrary unrelated values (well, unrelated except that they are values of the domain in question that is) noting that the unrelatedness necessarily complicates the specification, which is why I wondered whether you intended to talk past him. I would agree his concession could have been worded more explicitly as conceding the general case.

> Are you seriously
> > saying that if I run a business with 5 employees whom I assign the
> > user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id
> > column are in the DOMAIN "user_id" that is restricted to the values
> > (5, 7, 42, 101 and 123).
>
> I won't belabor the point.
>
> I think Bob is right in that we are talking past one another, but
> nonetheless

At least, it was unintentional and seems to be all cleared up now.

>, is there some criteria that states that a domain constraint is
> ok in one instance but ridiculous in this particular case? Where is that
> line? I suppose its where the user draws it.

Tony was not objecting to domain constraints; he objected to the characterization that a unary relation is necessarily a domain. Unary relations are quite acceptable as are nullary relations (ie. degree = 0); albeit, SQL arbitrarily prohibits nullary tables.

> So if tomorrow I hire a 6th employee I have
> > to redfine the "user_id" DOMAIN to include another value?
> >
> See leading comments....
>
> > Sorry, but you are talking nonsense.
>
> I'm sorry you feel that way. Hopefully we can put it behind us. I still
> stand behind my comments though, given my own set of assumptions.

I think you and Tony vehemently agree on that score. Received on Tue Feb 10 2004 - 05:31:37 CET

Original text of this message