Re: Storing Data in a Standard Format or not

From: Bradley Crockett <crockett_at_horizon.bc.ca>
Date: Mon, 10 Feb 2003 03:57:15 GMT
Message-ID: <rv6e4vk0ss7l7hrsvktvamkc6j2kr4mn9v_at_4ax.com>


Phil, be sure to keep the original data. You could also offer an additional 'converted-to-standard' representation to make access easier. Once you take on the responsibility of converting that data to a standard format, you inherit any data quality issues that go with it.
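
As a rough sketch of that idea in Python, using the latitude/longitude example: keep the value exactly as it arrived, record which format it was in, and store a derived decimal-degrees value beside it. The names here (Coordinate, from_degrees_minutes, the "DM"/"DD" format tags) and the "DD MM.m H" input layout are made up for illustration; they are assumptions, not anything taken from the actual datasets.

```python
from dataclasses import dataclass


def degrees_minutes_to_decimal(degrees: int, minutes: float, hemisphere: str) -> float:
    """Convert degrees, decimal minutes and a hemisphere letter to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not 0 <= minutes < 60:
        raise ValueError(f"minutes out of range: {minutes!r}")
    value = abs(degrees) + minutes / 60.0
    # South and west are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


@dataclass(frozen=True)
class Coordinate:
    original_text: str      # exactly as supplied by the external dataset
    original_format: str    # hypothetical tag: "DM" (degrees minutes) or "DD" (decimal degrees)
    decimal_degrees: float  # derived "standard" value, can be recomputed from the original


def from_degrees_minutes(text: str) -> Coordinate:
    # Assumed input layout: "<degrees> <minutes> <hemisphere>", e.g. "48 46.5 N".
    deg, minutes, hemi = text.split()
    converted = degrees_minutes_to_decimal(int(deg), float(minutes), hemi)
    return Coordinate(text, "DM", converted)


def from_decimal_degrees(text: str) -> Coordinate:
    # Already in the standard form; the original string is still kept verbatim.
    return Coordinate(text, "DD", float(text))


if __name__ == "__main__":
    for coord in (from_degrees_minutes("48 46.5 N"), from_decimal_degrees("-123.7")):
        print(coord)
```

Because the original text is never overwritten, a bug in the conversion can be fixed later by recomputing the derived column rather than by going back to the data supplier.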

Brad

On 9 Feb 2003 19:21:33 -0800, pjmaechling_at_yahoo.com (Phil Maechling) wrote:

>We are constructing a database from a collection of external
>datasets.
>The external datasets talk about the same information but will
>represent it in different formats.
>
>As an example, latitude and longitude.
>One data set will use degrees minutes, the other will use decimal
>degrees. (This is a simple example; others are more complex conversions
>between formats.)
>
>I am looking for discussion of issues relating to this type of
>problem.
>We are discussing whether to convert the external data elements to a
>standard representation, or whether to preserve the original formats.
>
>We came up with these options, and tradeoffs:
>
>(1) Store original representation.
> No standard format. Not easy to read.
>(2) Store one "standard" format.
> Must convert all entries to standard format. Lose original
>representations. Must show how we converted to standard format.
>(3) Store one format, and a flag indicating the representation.
> Must define flags for all representations. Users must convert.
>(4) Store two formats, original and a standardized format.
> Two versions of the truth. Makes using data much easier.
>
>Other users must face this issue. What are the standard solutions?
>On the theoretical side, I keep thinking that the representation is
>analogous to a "units" issue.
>For every numeric field in the database, we must know the units. Isn't
>format similar to (or the same as) units?
>Can't we just store the original format and keep the units and the
>format in a document somewhere?
>This however, makes programmatic access to the data very difficult.
>Thanks for any suggestions on how this is handled or on useful
>discussions of the tradeoffs.
>Phil Maechling
>pjmaechling_at_yahoo.com
>maechlin_at_usc.edu



Bradley Crockett
Duncan BC Canada

Received on Mon Feb 10 2003 - 04:57:15 CET
