Re: Why all the max length constraints?

From: David Cressey <dcressey_at_verizon.net>
Date: Sun, 28 May 2006 11:36:54 GMT
Message-ID: <qXfeg.336$f_4.40_at_trndny03>


"dawn" <dawnwolthuis_at_gmail.com> wrote in message news:1148773922.267863.170930_at_j55g2000cwa.googlegroups.com...
> [OK, here is my next "stupid question" as I cut a path in my study of
> the RM. Those teachers who just want to tell this student how ignorant
> she is are welcome to sit this out as I really am hoping to
> understand.]
>
> In SQL-DBMS's, like VSAM (and other indexed sequential files before
> them) a lot of attributes are specified with max length constraints.
> While there are some attributes where this constraint is related to a
> conceptual constraint (from the analysis phase), these lengths are
> often introduced for the logical model or implemenation in the DBMS.
>
> In other words, when mapping from the conceptual (analysis) to the
> logical (design) data models (pick the terms you like best for these),
> these constraints are designed for many attributes that have no such
> conceptual/business limits (if implemented with a paper system, there
> would be no such limit, for example).
>
> Is there something about the RM that would prompt all (or most?)
> existing implementations (however flawed) to drive developers to add in
> these constraints for performance, space saving, or other reasons? I
> realize there can be variable length attributes, but specifying a max
> field length still seems to be the norm (is that still the case?)
>
> As many of you know, I work with database management systems that treat
> all data as variable in length, while one might specify a length for
> display purposes.
>
> Thanks for any insights into database attribute length constraints,
> their purpose (is it related to output with fixed fonts, database
> performance, size or what?), and any correlation to implementations
> (again, however flawed) of the RM, if there is such. Could a vendor
> write an implementation of the RM where length constraints are as rare
> as they are in the conceptual model without introducing performance or
> any other issues?
>
> TIA. --dawn
>

I don't think it's a stupid question.

Back when I was first getting adjusted to DEC Rdb/VMS, I noticed that COBOL programmers tended to prefer fixed length fields (yes, fields) for everything, while BASIC programmers tended to prefer variable length character strings. Not surprising, since that is how their programming language trained them to think. Later on, when I switched over to Oracle, I noticed that Oracle designers tended to use VARCHAR2 in placed where I would have used CHAR.

In part that's driven by different performance characteristics of Oracle and Rdb (at that time). In part it's a cultural difference.

I think the specification of a MAX for variable length strings is a performance oriented feature, rather than a logical feature. I believe that a system could operate entirely without upper limits on data block size. I have designed a set of data blocks that work like this, but I haven't yet put it to the test.

The more interesting question is whether variable length is a feature of the problem domain or of the solution domain. That gets to be an interesting question when a database is being designed as a subsystem of a larger system. In the data architecture of the larger system, the choice for fixed length fields may be made for reasons that pertain to the solution domain rather than the problem domain.

From the perspective of the database designer, the data architecture of the larger system is part of the problem domain. There are exceptions to this, when the architect of the database can influence the data architecture of the larger system. Received on Sun May 28 2006 - 13:36:54 CEST

Original text of this message