Storing Data in a Standard Format or not
Date: 9 Feb 2003 19:21:33 -0800
Message-ID: <202415c5.0302091921.54be5ddc_at_posting.google.com>
We are constructing a database from a collection of external datasets.
The external datasets describe the same information but represent it in different formats.
As an example, take latitude and longitude.
One dataset will use degrees and minutes, the other will use decimal
degrees. (This is a simple example; others involve more complex conversions
between formats.)
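
To make the latitude/longitude case concrete, here is a minimal sketch of the conversion, assuming the degrees-minutes values arrive as whole degrees, decimal minutes, and a hemisphere letter. The function name and the argument layout are my own assumptions for illustration, not something taken from any of the datasets.

```python
def degrees_minutes_to_decimal(degrees: int, minutes: float, hemisphere: str) -> float:
    """Convert degrees + decimal minutes + hemisphere letter to signed decimal degrees.

    Assumed convention: N and E are positive, S and W are negative.
    """
    if not 0 <= minutes < 60:
        raise ValueError("minutes must be in the range [0, 60)")
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = abs(degrees) + minutes / 60.0
    return -value if hemisphere in ("S", "W") else value


print(degrees_minutes_to_decimal(34, 30.0, "N"))   # 34.5
print(degrees_minutes_to_decimal(118, 15.0, "W"))  # -118.25
```

Even this simple case forces decisions (sign convention, how many decimal places to keep), which is part of why the conversion itself needs to be documented.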
I am looking for discussion of issues relating to this type of
problem.
We are discussing whether to convert the external data elements to a
standard representation, or whether to preserve the original formats.
We came up with these options, and tradeoffs:
(1) Store original representation.
No standard format. Not easy to read.
(2) Store one "standard" format.
Must convert all entries to standard format. Lose original
representations. Must show how we converted to standard format.
(3) Store one format, and a flag indicating the representation.
Must define flags for all representations. Users must convert.
(4) Store two formats, original and a standardized format.
Two versions of the truth. Makes using data much easier.
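
As a rough illustration of option (4), the sketch below keeps the original text, a flag naming its format, and a derived standardized value side by side. The class name, the "DD"/"DM" flags, and the assumed "<degrees> <minutes> <hemisphere>" text layout are all hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    original_text: str      # value exactly as it appeared in the external dataset
    original_format: str    # hypothetical flag: "DD" (decimal degrees) or "DM" (degrees minutes)
    decimal_degrees: float  # standardized value derived from original_text


def parse_coordinate(text: str, fmt: str) -> Coordinate:
    """Build a record that keeps the original value and a standardized one."""
    if fmt == "DD":
        value = float(text)
    elif fmt == "DM":
        # Assumed layout: "<degrees> <minutes> <hemisphere>", e.g. "118 15.0 W"
        deg_s, min_s, hemi = text.split()
        if hemi not in ("N", "S", "E", "W"):
            raise ValueError(f"unknown hemisphere: {hemi!r}")
        value = int(deg_s) + float(min_s) / 60.0
        if hemi in ("S", "W"):
            value = -value
    else:
        raise ValueError(f"unknown format flag: {fmt!r}")
    return Coordinate(text, fmt, value)


c = parse_coordinate("118 15.0 W", "DM")
print(c.original_text, c.original_format, c.decimal_degrees)
```

Because the standardized value is always computed from the original, the original stays the single source of truth and the second column can be regenerated if the conversion rules change.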
Other users must face this issue. What are the standard solutions?
On the theoretical side, I keep thinking that the representation is
analogous to a "units" issue.
For every numeric field in the database, we must know the units. Isn't
format similar to (or the same as) units?
Can't we just store the original format and keep the units and the
format in a document somewhere?
This, however, makes programmatic access to the data very difficult.
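
One way around the programmatic-access problem might be to keep the units and format description as data rather than in a document. The sketch below assumes a small per-field metadata table and a lookup of converters keyed by format name; every field name, format name, and the "<signed degrees> <minutes>" layout here is a made-up example.

```python
def _dm_to_decimal(raw: str) -> float:
    # Assumed layout: "<signed degrees> <minutes>", e.g. "-118 15.0"
    deg_s, min_s = raw.split()
    value = abs(int(deg_s)) + float(min_s) / 60.0
    return -value if deg_s.startswith("-") else value


# Hypothetical metadata: units and format recorded per field, in machine-readable form.
FIELD_METADATA = {
    "station_lat": {"units": "degrees", "format": "decimal"},
    "event_lat": {"units": "degrees", "format": "degrees_minutes"},
    "event_lon": {"units": "degrees", "format": "degrees_minutes"},
}

# One converter per format name, each returning decimal degrees.
CONVERTERS = {
    "decimal": float,
    "degrees_minutes": _dm_to_decimal,
}


def read_standardized(field: str, raw: str) -> float:
    """Look up the field's declared format and convert the raw value."""
    fmt = FIELD_METADATA[field]["format"]
    return CONVERTERS[fmt](raw)


print(read_standardized("station_lat", "34.5"))     # 34.5
print(read_standardized("event_lon", "-118 15.0"))  # -118.25
```

With this arrangement the stored values stay in their original form, and a program can still get a standard representation without anyone reading a separate document.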
Thanks for any suggestions on how this is handled or on useful
discussions of the tradeoffs.
Phil Maechling
pjmaechling_at_yahoo.com
maechlin_at_usc.edu
Received on Mon Feb 10 2003 - 04:21:33 CET