Re: Units of Measurement in the Database Model

From: David J. Aronson <dja2001_at_att.net>
Date: Mon, 15 Apr 2002 18:58:20 GMT
Message-ID: <3CBB21D5.DD95CDAD_at_att.net>


"John R. Lewis" wrote:

> I am designing the Data Model for a large enterprise application, and
> have run into a problem. I solicit your feedback.
>
> The Problem:
>
> How should the database be organized in regards to measurements
> recorded, and the units of measurement (UOM) used to record them?

FWIW, I used to work for a company that did real property planning (R&K Engineering, at http://www.rkeng.com). The records of individual facilities (e.g., a clinic, a sports field, etc.), in the Army's database, included how many units it had, and what the units were. IIRC (it was several years back), they used a two-letter field, usually either SF (sq. feet) or SY (sq. yds.), but occasionally also LF (linear feet) or EA (eaches; no size-units applicable). It worked quite well for our purposes. I don't think there was an additional table explaining them, but there were so few of them it wasn't necessary for the humans, nor did we have any use for it in programming. In a more "dimensionally diverse" system, I suppose it could be useful. So, count this as a default vote for door #4, at least pending any further info why it might not be suitable for your particular application.

> 1) Don't Record UOM in Data Model

...
> PRO

...
> * Smaller Database size

This is not as big a concern as it used to be. Disk space is much cheaper now, as are CPU cycles.

> CON
> * All code that accesses the data must be configured identically, or
> the data is meaningless.

It's this shifting from data to code, that allows the data to stay simple -- but at such a cost! Programmers might be cheaper than they used to be, but nowhere near so much so as hardware.

> 2) Hardcode the UOM in the Data Model
...
> PRO

...
> * Smaller Database size.
> * Most convenient data model for which to write application and
> presentation logic.

I disagree. If you're going to store very diverse things in the database, *without making each thing that can be measured and stored a separate type of record/object/whatever*, this will explode the number of columns/attributes, and use only one of each possible measurement. Likewise, the programmer will have to account for each possible one.

Or am I misunderstanding what you mean?

> 3) UOM Lookup Table in the Data Model
...
> PRO

...
> * Smaller Database size

I don't see how. The overhead of recording each record's UOM along with a reference to that record, must be larger than not recording it.

> 4) UOM Foreign Key per measurement
...
> PRO
> * No UOM confusion
> * The UOM can vary from record to record.

It could also form the basis for far more power and flexibility! You can further describe the UoM to denote which ones are distance, area, volume, time, mass, force, energy, temperature, etc., and how to convert from one to another. (Okay, for ones with an arbitrary zero, like temperature, it can get a bit more hairy to automate, but for simple base types (e.g., feet and meters, or seconds and hours), a trivial conversion factor would suffice. If you want to allow "hairy" mechanisms, you can even think up a new table to hold conversions such as how to combine kilograms, meters, and seconds to get newtons, joules, pascals, etc.)

> CON
> * Database size may become an issue

That depends how big and diverse your database (i.e., record size and number of UoMs) already is. In something like the Army real property database, a two letter code doesn't add much space, and the UoMs were so few that a lookup table would have been of trivial size.

Sorry to ramble on, but this reminded me very much of when I had an idea for a computer language, back in college. I thought that a lot of programming mistakes could be avoided if the concept of strong typing could be extended from mismatches like "int versus char" to ones like "feet versus pounds". I came up with a syntax, and some concepts about how a compiler could construct and consult tables to go about unit-checking the attempted calculations and automating conversions. That was before the popularity of object-oriented programming. Much of this could be done that way, though it would be a bit tedious. Perhaps one day I will write the base class library, to allow people to substitute various classes for simple integers. Would anybody really like to see that? For that matter, does anyone know if much (or any!) work has already been done along those lines?

-- 
David J. Aronson, Software Engineer FOR HIRE IN PHILADELPHIA AREA
Resume, and other details, online at: http://dja2001.home.att.net
Received on Mon Apr 15 2002 - 20:58:20 CEST

Original text of this message