assigning codes to events
Date: 18 Jun 2002 17:05:37 -0700
Message-ID: <d8eb949e.0206181605.6a3361f2_at_posting.google.com>
I often run into the issue of needing a particular set of codes to represent some set of actions within a system. For instance, consider the (somewhat) trival example of the postal system. You send a letter. You then need a set of codes to describe the outcome of that letter (assume you don't care about resending in case of failure, etc.). Assume one of the following things can happen to the letter:
- Delivered successfully
- Lost in mail (permanantly)
- Returned to sender
My approach to creating codes to symbolize the state of the letter would be to create the following:
Letters
LetterID DateSent ResultCode
--------- ---------- -----------
1 5-15-2002 1 2 5-16-2002 1 3 5-17-2002 3 4 5-17-2002 2 5 5-18-2002 3
ResultCodes
ResultCode Meaning WasDelivered ----------- ---------------------------- ------------- 1 Delivered successfully 1 2 Lost in mail (permanantly) 0 3 Returned to sender 0
If later the requirements of the system changed and required the system to keep track of whether or not the letter was returned because of a bad address or lack of postage, I would tend to change the database to the following:
Letters
LetterID DateSent ResultCode
--------- ---------- -----------
1 5-15-2002 1 2 5-16-2002 1 3 5-17-2002 4 4 5-17-2002 2 5 5-18-2002 5
ResultCodes
ResultCode Meaning WasDelivered WasReturned ----------- ---------------------------- ------------- ------------ 1 Delivered successfully 1 0 2 Lost in mail (permanantly) 0 0 3 Returned to sender 0 1 4 RTS - Insufficient postage 0 1 5 RTS - Address unknown 0 1
I realize that there are quite a few other ways to handle this situation, but that is the one I would tend to choose. I have a colleague that has quite a different opinion. His approach would be the following:
Original:
Letters
LetterID DateSent ResultCode
--------- ---------- -----------
1 5-15-2002 100 2 5-16-2002 100 3 5-17-2002 300 4 5-17-2002 200 5 5-18-2002 300
ResultCodes
ResultCode Meaning
----------- ---------------------------- 100 Delivered successfully 200 Lost in mail (permanantly) 300 Returned to sender
Revised requirements:
Letters
LetterID DateSent ResultCode
--------- ---------- -----------
1 5-15-2002 100 2 5-16-2002 100 3 5-17-2002 310 4 5-17-2002 200 5 5-18-2002 320
ResultCodes
ResultCode Meaning
----------- ---------------------------- 100 Delivered successfully 200 Lost in mail (permanantly) 300 Returned to sender 310 RTS - Insufficient postage 320 RTS - Address unknown
The application logic would then know that all codes 300 - 399 meant that the letter was returned to sender and that the different codes within that range specify why the letter was returned. If you're looking for all returned mail, you search for the range, otherwise, you search for a particular code.
To me, this approach seems to violate the first normal form by making the ResultCode field non-atomic. It then indicates both that the letter was returned and the reason why the letter was returned. Am I correct in this or am I misinterpreting the 1NF?
Whether or not my reasoning as to why I think the second approach is flawed is correct, I would like opinions as to the pros and cons of either approach and if there are any worthy third options (I'm pretty much certain there are). Also, if I am correct in my conclusion that the second approach is problematic, why are there so many systems created that have codes whose values are grouped according to meaning? I know I have written interfaces to quite a few third party systems (in several different industries) where this is the case. To me, the second approach seems very much like the old BASIC line numbering scheme... 10, 20, 30... oops, I need a line between 10 and 20...15. I may be being a bit overcritical here, thus my request for opinions for or against either method.
Thanks in advance,
Chris Cubley
www.queryplan.com
Received on Wed Jun 19 2002 - 02:05:37 CEST