Re: Avoiding New Fields Causing Ambiguity Errors

From: Topmind <topmind_at_technologist.com>
Date: 9 May 2002 13:21:50 -0700
Message-ID: <4e705869.0205091221.2c5a5bf9_at_posting.google.com>


71062.1056_at_compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0205071303.1594e17c_at_posting.google.com>...
> >> There is the possibility when adding a new field [sic] that an
> existing query in an application(s) might "break" due to field [sic]
> name ambiguity being introduced. Other than relying on regression
> testing, approaches to avoid or reduce this include: <<
>
> Do the regression testing, or write a tool which will fully qualify
> column names.

I addressed the second suggestion in a nearby reply.

>
> >> 1. Have a systematic way to check for fields with the same name(s)
> as the candidate field(s) already in the schema somewhere. This may
> lead to names like "X_status" and "X_title" instead of "status" and
> "title" where "X" is the table name. (This practice makes queries
> simpler in many cases anyhow.) <<
>
> This will screw up your data model -- a data element gets its name
> from what it means LOGICALLY, not where is PHYSICALLY located.
> Mulitple names for the same data element is a REAAAAALLLLLY bad idea.

I am not quite sure what you mean. The overlap is caused by at least these 3 things:

  1. Different entities happen to have the same kind of data. Common examples include Title and Status. Almost any entity you can think of may have a title or status field (or synonyms like "name", "description", etc.). I try to avoid these by having names like "Invc_status" (invoice status) and "Invc_title". But, I don't always get to make the call.
  2. Key fields. I like the convention of "x_ref" for foriegn keys and "x_id" for the primary key. Still, there may be conflicts if 3+ tables are joined even under this convention.
  3. Bad normalization (duplication). Don't blame me, I just inherited the tables, I didn't make them. Vendors cannot always assume that table designers are sober.

>
> >> 2. Have the query engine *ignore* duplicates by picking the oldest
> table and/or fields [sic], or even have ranks assigned to tables or
> fields [sic] in case there is a conflict. <<
>
> You added the new column to the query for a reason (I hope), so you
> want to see both columns in the results.

Not necessarily. Sometimes they are #2 or #3 above, in which case they may be the same thing. IOW, it does not matter which one is chosen, but the query engine still complains because it does not know they are really duplicates.

Hmmmm. Perhaps if
the query engine could make smart use of such information given to say a data dictionary, then it wouldn't have to complain in those cases.

> And -- think about it --
> tables cannot have duplicate names in the same schema. So this
> sentence makes no sense, unless you are deliberately messing up the
> database for some reason.
>
> Yes, fully qualify the names of coumns when there is a conflict. Use
> meaningful alias table names in the FROM clause. My personal
> convention is a one or two letter abbreivation followed by an integer.

For table aliases or column names or column aliases?

Thanks for your feedback,
-T- Received on Thu May 09 2002 - 22:21:50 CEST

Original text of this message