Re: Ref. table performance in Oracle

From: Barry Hawes <bjh01_at_earthlink.net>
Date: 1997/04/29
Message-ID: <3365EC34.2767_at_earthlink.net>#1/1


Philip Zubaly wrote:
>
> Hello --
>
> I have some general questions on schema design for performance in Oracle.
> Specifically, I'm looking for some guidelines on when to use reference tables
> with integer PK's (sequence-generated) with RI to child tables vs. storing the
> actual character values in the child tables.
>
> Oracle documentation that I have read indicates that the use of 'codes' is an
> outdated practice. The justification is that: storage is inexpensive, and
> requiring data-entry personnel to remember codes leads to data-entry errors.
> The latter justification seems outdated to me, considering that the vast
> majority of front-end tools allow the 'code' or 'ID' to be invisible to the
> user through the use of combo or list boxes for selection. This, of course,
> requires the use of another table however. The documentation indicated that
> storage of the full english values is more appropriate. I have been unable to
> find information on performance issues of this decision.
>
> Consider an example where a widget can have 30 character Keywords associated to
> it in a m-m relationship. The way I see it there are several ways to handle
> this scenario.
>
> 1. KEYWORD reference table:
> KEYWORD_ID (INT)(PK)
> KEYWORD (VARCHAR2(30))(Alt unique index)
>
> WIDGET_KEYWORD associative table:
> WIDGET_ID (INT)(PK)
> KEYWORD_ID (INT)(PK)
>
> 2. KEYWORD reference table:
> KEYWORD (VARCHAR2(30))(PK)
>
> WIDGET_KEYWORD associative table:
> WIDGET_ID (INT)(PK)
> KEYWORD (VARCHAR2(30)(PK)
>
> 3. Same as 2. above, but dispose of the reference table.
>
> From a front-end development standpoint, scenario 3 is lacking in that a list
> of previously used keywords is not easily available for display/selection if
> desired.
>
> Scenario 1 will save space due to the use of Integer values to represent the
> keywords, but requires that the KEYWORD table be queried to view the keywords
> for a widget, which is not required for scenario 2.
>
> What are the performance implications of indexing on numeric columns (as in 1)
> vs. indexing on character columns (as in 2)?
>
> Any discussion on this matter is greatly appreciated--
>
> Phil Zubaly
> Z-Ware Engineering

  I don't see any space saving in scenario 1 as you'll be building  an altenate unique index on keyword anyway for value querying. If  anything, you'll be needlessly wasting the space of a redundant  unique identifier in keyword_id. I'd even go as far as to question  the need for widget_id instead of widget_name or another more  meaningful column in the widget table.

  Response time of course will depend upon many variables and needs  to be tested with actual volumes, but in scenario 1 any keyword  insert/delete is now performing 2 index updates instead of just one.  Consider frequency in this case.

  A bigger impact may be allowing keyword to be a variable length field,  test it as fixed length also, high volatility of a variable length  field can be an impact.

                                                           -bjh
Received on Tue Apr 29 1997 - 00:00:00 CEST

Original text of this message