Re: Table design question

From: Tony <andrewst_at_onetel.net.uk>
Date: 9 Feb 2004 15:13:16 -0800
Message-ID: <c0e3f26e.0402091513.701f59fa_at_posting.google.com>


"D Guntermann" <guntermann_at_hotmail.com> wrote in message news:<Hstwz4.59I_at_news.boeing.com>...
> "Tony" <andrewst_at_onetel.net.uk> wrote in message
> news:c0e3f26e.0402090330.1ba1c048_at_posting.google.com...
> > Mike Sherrill <MSherrillnonono_at_compuserve.com> wrote in message
> news:<pp7c20dospovtg3vv925vhtrtikgdvm190_at_4ax.com>...
> > > On Fri, 30 Jan 2004 12:46:56 -0500, "Bob Badour" <bbadour_at_golden.net>
> > > wrote:
> > >
> > > >> Try thinking about it this way . . .
> > > >>
> > > >> At the conceptual level, a domain is just a data type, and a type is,
> > > >> among other things, a set of all possible values. One way to handle
> a
> > > >> set of values is to store them in a table.
> > > >
> <snip>
>
> > > In designing a SQL database based on a logical model, you have to map
> > > types, relvars, and assignments (among other things) from the logical
> > > model to features supported by the target platform.
> > >
> > > This
> > >
> > > TYPE USER_ID POSSREP ( INTEGER )
> > > CONSTRAINT THE_USER_ID IN {1, 112, 314, ... 9997};
> > >
> > > (where the enumeration of values is unavoidable) *might* map to
> > > something like this for a primitive data type
> > >
> > > CREATE DOMAIN User_ID AS INTEGER;
> > > CONSTRAINT user_id_values
> > > CHECK (VALUE IN (1, 112, 314, ... 9997));
> > >
> > > But that probably won't work. (And it's not even worth trying for any
> > > type more complex than a primitive data type.) Most SQL systems won't
> > > let you have more than a few hundred values in that CHECK constraint.
> > >
> > > Instead, you might map the relational TYPE...CONSTRAINT to a SQL
> > > CREATE DOMAIN statement, a CREATE TABLE statement, one INSERT
> > > statement for each value, one or more CHECK constraints, a bucketful
> > > of REVOKE statements, a FOREIGN KEY constraint from each table in
> > > which User_ID is (cough) used, and some administrative procedures to
> > > make sure the table is always used. *If* the designer is careful.
> > >
> > > Do you know a better way to map that TYPE...CONSTRAINT to a SQL dbms
> > > like, say, Mimer SQL? (I mention Mimer because you can download it
> > > from www.mimer.com.) Let's say you have six thousand non-consecutive
> > > integers to enumerate.
> >
> > The only reason you are trying to enumerate "six thousand
> > non-consecutive integers" is because you are under the mistaken
> > impression that the set of users who actually use a system is a
> > domain, when it isn't.
>
> This is quite a presumption. While it could be one reason, it's not the
> only possible reason for defining a "complex" enumerated domain.
>
> Quite the contrary, if a system is "modeled" to represent something in the
> real world upon which we want to make decisions or have the system execute
> upon, then modeled domains "by constraint" at the logical level are entirely
> valid. I further submit that having the database management system enforce
> domains and therefore its consistency is not only desireable, but necessary.
>
> Tomorrow, a new user may be enrolled, another
> > may leave etc.
>
> Then you change the implementation (the implemented constraint) to reflect a
> change in the logical model as dictated by the user or business. It's a
> fact of life that systems in the real world evolve, but it doesn't change
> the fact that logic should to be deterministic to reflect the real world and
> provide some modicum of behavior that is both expected and that conforms to
> the model.
>
> By contrast, the domain "user_id" is the (fixed for
> > all time) set of all possible values from which users may be given a
> > user_id. Of course, it is conceivable that an organisation might
> > decide in advance that all user_id values must be taken from the
> > enumerated set {1, 112, 314, ... 9997}, but if so then of course they
> > pay the penalty in the complexity of enforcing the constraint.
>
> I think everyone agrees that there is a penalty involved with enforcing
> certain constraints to ensure quality of the system, but that is not what is
> at issue. People argued, and still argue, that association by value as a
> performance liability for the same reasons - to justify their rational for
> not accepting something.
>
> Much
> > more typical would be "between 5 and 10 alhphanumeric characters" or
> > "an integer between 1000 and 9999".
> >
> I don't understand the criteria upon which you decide that one degree of
> specificity in defining a domain of representations in the form of strings
> or numbers is necessarily better than another.
>
> For example, if I devised a logical model for "processes" where I defned
> states of a process at a logical level as only being started and completed,
> I'd want to ensure that 1) the system only allowed for two state values for
> any given process; 2) being forced to use SQL enforcement at a lower
> representation level, I chose two representations and ensured that each
> representation identified the correct state; and 3) that the DBMS could
> enforce state transitions if it was so desired.
>
> I could choose any two strings or set of two numbers to represent those
> states, but declaring and defining column variables as VARCHAR(7) or
> INTEGER(1) will do nothing to enforce the integrity of my model from a data
> management perspective. If I chose to represent my domain as the strings,
> 'STARTED' and 'COMPLET', or alternatively as 1 and 2, then the VARCHAR(7)
> domain definition will not enforce the two state domain in that it allows
> for 562,949,953,421,302 states other than the two I want - assuming an
> alphabet that allows for 2^7 symbols, which is equivalent to 28 characters
> and the disallowance of null. In the case of INTEGER(1), the DBMS would
> allow for 8 additional states that are not part of my logical domain.
>
> Since we are dealing in large part with string and number representations,
> we might consider a defined language (sequences of symbols composed of
> alphabets upon which rules of syntax and grammer are applied) a proxy for a
> defined domain.
>
> If you want to argue the point by saying that you have to generalize data
> and domain definition to allow for evolution and incremental change to what
> one today defines as the domain, then who is to say the VARCHAR(7) or
> NUMBER(1), or alternatively, alphanumeric consisting of between 5 and 10
> characters, is sufficient? Why not just call everying attributeN and
> defined everying as alphanumeric 100 wide character strings?
>
> Even more importantly, what assurances can the DBMS give me that my universe
> or model of a shared real world is properly being represented at the current
> time?
>
> > I agree that SQL database don't support domains very well, but the
> > example being discussed here is NOT a domain.
>
> Yes it is. It's just a rather inconvenient domain.
>
> Best regards,
>
> Dan

What can I say? I really thought this was quite a simple concept, and yet you have completely failed to comprehend me. Are you seriously saying that if I run a business with 5 employees whom I assign the user_id's 5, 7, 42, 101 and 123 then any tables that have a user_id column are in the DOMAIN "user_id" that is restricted to the values (5, 7, 42, 101 and 123). So if tomorrow I hire a 6th employee I have to redfine the "user_id" DOMAIN to include another value?

Sorry, but you are talking nonsense. I will add a new user to the USERS table with a user_id taken from the DOMAIN of user_id's, which may be (for my system) a number between 1 and 999, or a character string of up to 10 characters. I am NOT restricted to the list id user_id's that have currently been allocated to existing employees.

My point is not that an enumerated domain is impossible, but that if one is defined it should be complete. It does not consist solely of the subset of values that I happen to have used to date. Received on Tue Feb 10 2004 - 00:13:16 CET

Original text of this message