Re: Why all the max length constraints?

From: J M Davitt <jdavitt_at_aeneas.net>
Date: Sun, 28 May 2006 02:01:21 GMT
Message-ID: <Rv7eg.41177$mh.17895_at_tornado.ohiordc.rr.com>


dawn wrote:
> J M Davitt wrote:
>

>>dawn wrote:
>>
>>>[OK, here is my next "stupid question" as I cut a path in my study of
>>>the RM.  Those teachers who just want to tell this student how ignorant
>>>she is are welcome to sit this out as I really am hoping to
>>>understand.]
>>
>>[Quiz time: what are scalars, tuples, and relations?]

It's a quiz. These are fundamental terms in the relational model.

> You define scalar and I'll define tuples and relations.
> btw, I think my def of relation is already in the cdt dictionary.
>
>

>>>In SQL-DBMS's, like VSAM (and other indexed sequential files before
>>>them) a lot of attributes are specified with max length constraints.
>>
>>I don't mean to pick nits, but I don't grok "VSAM" and SQL-DBMS's.

>
>
> It is not a nit pick for you to point out things you don't fully
> understand.

Okay, let's be explicit: what do SQL and VSAM have to do with each other?

>

>>IIRC, VSAM provided for an OCCURS-like construction in record layouts -
>>but all that meant was that you could have a variable number of fields,
>>all of fixed width, up to some specified maximum.

>

>
> Indexed sequential files were a natural progression from cards. It is
> easy to see how fixed lengths arose out of working with card decks. I
> wondering if the concept was simply never dropped or had a reason for
> continuing.

Fine. But off-point. Isn't it true that the variable in VSAM means a range of fixed-width fields?

>
>

>>MicroData, Pick,
>>Progress, &c, mean very different things when they describe things as
>>variable.

>
>
> The first two should mean the same if you are referring to MicroData
> Reality. There are some good stories on the splitting of ways between
> Dick Pick and Don Fuller, but yes, I'm sure that "variable" is used for
> different things.

And doesn't their use of the term variable differ from the use of the term in VSAM? And don't you want to clear this up before moving forward? After all, you tossed VSAM and "all data as variable in length" into the discussion about fixed-width attributes. (Or, more precisely, you used the phrase "max length constraints."

>
>

>>>While there are some attributes where this constraint is related to a
>>>conceptual constraint (from the analysis phase), these lengths are
>>>often introduced for the logical model or implemenation in the DBMS.
>>>
>>>In other words, when mapping from the conceptual (analysis) to the
>>>logical (design) data models (pick the terms you like best for these),
>>>these constraints are designed for many attributes that have no such
>>>conceptual/business limits (if implemented with a paper system, there
>>>would be no such limit, for example).
>>>
>>>Is there something about the RM that would prompt all (or most?)
>>>existing implementations (however flawed) to drive developers to add in
>>>these constraints for performance, space saving, or other reasons?
>>
>>No.
>>
>>
>>>I
>>>realize there can be variable length attributes, but specifying a max
>>>field length still seems to be the norm (is that still the case?)
>>
>>Yes.
>>
>>
>>>As many of you know, I work with database management systems that treat
>>>all data as variable in length, while one might specify a length for
>>>display purposes.
>>
>>I can't imagine that it's useful for 'Smith, Joseph' and 'Smith, John'
>>to appear as identical values when, say, displayed in a field of eight
>>characters.

>
>
> Obviously not if truncated. The default with Pick would be to wrap in
> any display. This is usually good, but has its own issues too, of
> course. For every solution...
>
>
>> I also work with products where all data are of variable
>>length.  (There is a maximum, but it's huge.)  PITA.  This mis-feature
>>accounts for a fair number of support calls.

>
>
> What is the best mitigation strategy in your case? If you know that an
> attribute must have no more than two characters, what do you do?

Restrict it to one or two characters, obviously. Wouldn't you?

>

>>>Thanks for any insights into database attribute length constraints,
>>>their purpose (is it related to output with fixed fonts, database
>>>performance, size or what?), and any correlation to implementations
>>>(again, however flawed) of the RM,
>>
>>The issue has nothing to do with the relational model.

>
>
> I know nothing of how dbms products are designed internally. I have
> this idea that possibly the fact that everything in XML, Pick, and
> elsewhere is ordered gives them more of a likelihood of permitting
> variable length and handling it well since you can tag-delimit values.
>
>
> Is there possibly something about how RDBMS's would need to allocate
> space for columns or something that might make it advantageous for
> there to be a max length? The other thought I had was that working
> with sets and joins might mean that implementing with everything
> variable length could be problematic for allocation of space somehow,
> but I (obviously) don't know.

But, you know, there is always a "max length." It may be a large value, but it's there. IIRC, 32K is common in the products I cited in my earlier post. Can you verify that?

>
>

>>if there is such.  Could a vendor
>>
>>>write an implementation of the RM where length constraints are as rare
>>>as they are in the conceptual model without introducing performance or
>>>any other issues?

There are already many products that can do that. But adding the phrase "without introducing performance or other issues" is naive. You should know better; everything has a cost.

>
> Is your answer to this question "yes"? If so, why hasn't anyone done
> so? It seems it would help with the goal of decoupling the physical
> from the logical. If there is no conceptual reason for a limit,
> couldn't the dbms take care of such physical issues?

This is not what others are referring to when they mention separation between logical and physical design.

>
> I'm looking right now at all of the aspects of a conceptual design that
> need to be adjusted in what I was calling the logical (data) design
> (the one implemented by means of a person or program interfacing with
> the DBMS) and how such designs are done differently using the RM (or an
> implementation thereof, however flawed) or MV.
>

That's a step down the wrong path. How could any subsequent design effort change the conceptual model? It's either complete and correct -- or it isn't.

> Thanks for your help. --dawn
>
Received on Sun May 28 2006 - 04:01:21 CEST

Original text of this message