RE: RE: bytes vs chars

From: Robert Freeman <rfreeman_at_businessolver.com>
Date: Tue, 15 Mar 2016 20:37:10 +0000
Message-ID: <BY2PR10MB0744DC88D929DD131855A404D5890_at_BY2PR10MB0744.namprd10.prod.outlook.com>



Oh, this topic gives me heartburn... Both sides have points, but I also think both sides are missing THE point. Data types and lengths are constraints - and should be treated as such.

Neil - I don't think anyone is suggesting that we make every column something on the order of a VARCHAR(2000) just in case. I would call that an extreme case, and in any discussion of worth I think we throw out the extreme cases and send any DBA who supports them back to DBA 101 and basic data modeling. :) (Hope I have not just inadvertently sent Dave to the woodshed.)

Dave - Legacy applications are plentiful and rife with the issues you point out. Y2K was a lesson in failure to look into the future. :) I would just hope that we don't design future applications with fudge factors instead of better analysis.

Let's look at your examples - I am sure many of us have seen these:

> How many have dealt with ZIPCODE NUMBER(5)?
> How many have dealt with ZIPCODE NUMBER(6)?
> Should it be ZIPCODE VARCHAR2(6)?
> Possibly ZIPCODE VARCHAR2(10)?
> How about ZIPCODE VARCHAR2(64)?

This question really boils down to a question of best practices, standards, development methodologies and... the fog of the future (which is largely what I think the zip code examples are about). I've seen many databases developed for US types of data (e.g., a 9-digit numeric SSN, or ZIP+4 zip codes). Then, suddenly, globalization comes calling. Enter the "fog of the future" - the unforeseen. Nobody thought about it in the beginning - nobody dreamed they would be selling into Canada, Mexico, Japan, China and the UK. Still, I don't see the fog of the future as justification for not following best practices.
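To make that concrete, here's a minimal sketch of what I mean by treating size as a deliberate constraint - size the column for the formats actually in scope today and enforce them explicitly. (Table, column and constraint names here are my own illustrations, not from anyone's actual schema.)

```sql
-- Hypothetical example: postal code sized and constrained for the
-- formats the business actually supports right now (US and Canada).
ALTER TABLE customers ADD (
  postal_code VARCHAR2(10)
);

ALTER TABLE customers ADD CONSTRAINT chk_postal_code CHECK (
     REGEXP_LIKE(postal_code, '^[0-9]{5}(-[0-9]{4})?$')               -- US ZIP / ZIP+4
  OR REGEXP_LIKE(postal_code, '^[A-Z][0-9][A-Z] [0-9][A-Z][0-9]$')    -- Canadian format
);
```

When Japan or the UK come calling, supporting them becomes a deliberate, reviewed schema change rather than a silent assumption buried in a too-wide column.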

Why do I say best practices? It's because I think we need to look at the size of a column as a form of constraint. No different than an FK, PK, check or not null constraint. We tend to be very methodical when creating these kinds of constraints, and I see no reason why we would not do the same with respect to column definitions. I think it's important to pay attention to the notion of data type sizing as a constraint. Why?

Constraints have incredible value in that they protect the integrity of the data. Column length and type semantics are central in that role.

What happens when a new batch job, power user or application feature manages to put a 101-character comment into a VARCHAR2(200) column - when the rest of the application assumes the maximum length of that column is 100 characters? The application fails, often in spectacular ways... nothing makes a user's day brighter than a Java core dump.
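A sketch of how this sneaks in (table names are made up for illustration): one table in the pipeline is declared looser than the real business rule, so the database happily accepts data the rest of the stack can't handle.

```sql
-- The "real" rule is 100 characters, but a staging table was created looser.
CREATE TABLE comments_stage (comment_text VARCHAR2(200));
CREATE TABLE comments       (comment_text VARCHAR2(100));

-- The batch load succeeds into staging: 101 characters fit in VARCHAR2(200).
INSERT INTO comments_stage VALUES (RPAD('x', 101, 'x'));

-- The downstream insert then fails with ORA-12899 (value too large for
-- column) - or worse, an application reading staging overruns the
-- 100-character buffer it allocated.
INSERT INTO comments SELECT comment_text FROM comments_stage;
```

With the length constraint declared correctly everywhere, the bad row is rejected at the first insert, where the fix is cheap.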

What is really more impactful: the app crashing because data integrity has been violated in some unexpected way, or having to rework the application and database design to conform to a new set of business rules? What good is an application if its data has become corrupted? I've seen cases where uncaught exceptions from reads - where there was more data than memory allocated to store it - resulted in corrupt data being written back out to the database.

Sorry this is so long... one last point - Oracle has come a LONG way in providing DBAs with the functionality to make smooth iterative changes. Considering that, I think the justification for throwing a Hail Mary at column sizing is even less compelling.
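For instance (again with illustrative names), widening a VARCHAR2 column when the business rule genuinely changes is, in recent Oracle releases, a quick metadata-only DDL rather than a disruptive rebuild:

```sql
-- Widening a VARCHAR2 column is fast: existing rows are untouched,
-- only the dictionary metadata changes. Shrinking or changing the
-- data type is the hard direction, not widening.
ALTER TABLE customers MODIFY (postal_code VARCHAR2(20));
```

So there's little penalty for sizing tightly today and widening deliberately tomorrow, once the new requirement is actually known.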

Cheers!

RF
<<previous comments snipped for brevity>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Mar 15 2016 - 21:37:10 CET