Re: Modeling Address using Relational Theory

From: dawn <dawnwolthuis_at_gmail.com>
Date: 4 Sep 2005 10:57:22 -0700
Message-ID: <1125856642.232802.269730_at_g49g2000cwa.googlegroups.com>


Marshall Spight wrote:
> dawn wrote:
> > I was hoping to hear from you on this one, Marshall. Do you agree that
> > defining something as a list, even if using no insert or delete
> > operators on it, has advantages over manually enumerating the list
> > values using attribute names that have ordering information in them and
> > then coding the logic for the order in the application code?
>
> Not so much. What operations you need to do are quite significant
> from my perspective.

From mine as well. A very significant and oft-used operation that is done with an address is to order its lines. In fact, these two lines (setting aside, for this discussion, the addressee line at the top of an address block and the city, postcode, etc. line at the end) often have no parsing or other operations applied to them other than placing them in the position in which they were identified. Ordering is THE KEY function that is applied to these attributes -- would you agree with that?

> Also, I don't believe addr1 and addr2
> actually do have any ordering information between them,

I guess I got my answer. Do you mean that the semantics of these two lines includes no ordering? What might you name them otherwise to avoid all hint of ordering information in the naming? All I could come up with was to name one with the term "Optional" designated somehow so that you know to put the required one first. But that still provides you with required ordering information in the name.

From a programmer's perspective, the ordering operation is one of the few things we have to care about with these text fields, perhaps along with max length. We need to put one of these first and the other second. When these are two separate attributes, the way we know which one is ordered before the other is by reading the attribute names. However, when these are combined into a single list attribute, we don't have to glean the ordering data from the attribute names.
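
A minimal sketch of the contrast I have in mind, in Python purely for illustration. The class names, field names, and sample address are hypothetical, not taken from any real schema:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AddressNumbered:
    # The order lives only in the attribute names "addr1" and "addr2".
    addr1: str
    addr2: Optional[str] = None

    def block(self) -> List[str]:
        # The application hand-codes the order it read out of the names.
        return [line for line in (self.addr1, self.addr2) if line]


@dataclass
class AddressListed:
    # The order is part of the value itself.
    address_lines: List[str] = field(default_factory=list)

    def block(self) -> List[str]:
        # Nothing to decide here: the list already carries the order.
        return list(self.address_lines)


numbered = AddressNumbered("Dept. of Examples", "123 Hypothetical St")
listed = AddressListed(["Dept. of Examples", "123 Hypothetical St"])
```

Both produce the same address block; the difference is only in where the ordering knowledge is kept.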

Is it the case that you do not see the ordering function as an advantage of declaring these addressLines to be a list attribute, or that it is not sufficient to justify use of a list from your perspective? I don't understand how you could claim the former. In implementations where lists are either non-existent or have added complications, I can see how you might think the latter.

> just
> like I don't think firstName, lastName has any order, even
> though *conventionally* *in the USA* we *present* one always
> before the other. For both names and address lines, in other
> countries the conventions may be different.

I completely agree when it comes to the name and do not see the same pattern with the address.

> If we called them familyName and givenName, would it make them
> seem less ordered? If we called them streetAddress and apartmentNumber
> would that make them seem less ordered?

Yes. Is that what these are for addresses world-wide? That is not my experience. Where is the PO Box? Sometimes the apartmentNumber is properly on the line with the streetAddress, for example. While I've read U.S. Postal standards, I'm not aware of world-wide standards, so if you know of any, please point me to them. Perhaps these two address lines do have more structure than I'm aware of.

>
> > Do you still think that addressLine1, addressLine2, each single-valued
> > is a better approach than addressLines that is a multivalued list?
>
> Yes. (Based on my experience with not needing to do list operations
> on them; I understand your experience is different.)

This one is not an experience issue. Isn't "ordering" a list operation? It isn't an operation ON the list, it is an operation on the data -- it is intrinsic to the type "list", whereas otherwise you need to apply the operation to two separate attributes.

It might sound like I'm rehashing, but I think that there is the potential here for me to communicate this question in a way that I get back the type of information I am seeking. I'm trying to understand if there is a logical rationale, aside from constraints in the target dbms, for a conceptual or logical model of address to split these lines out into addr1, addr2.

>
> > Is
> > that because there are only two values in this list? If there were 15,
> > would it still be the best approach?
>
> If you had 15 of something, and they were in no way interchangable,
> and they each had specific semantics, then I'd want to refer to
> them by name.

Would the logic be the same if someone were to suggest that word1 of a description, word2, ... word20 should be split out into twenty attributes rather than putting them into a space-delimited single attribute? If not, what am I missing?

> Consider any Java class with 15 properties that
> just all happen to be ints or strings or whatever. Would you
> define your get() methods as get1(), get2(), or would you use
> getBudget(), getCurrency(), getBalance(), etc?

I'm in complete agreement that if the attributes can be named in such a way as to not imply ordering (as addr1, addr2 do), then I would surely not put them in a Java array, nor in separate fields numbered 1-n. I'm good with getBudget(), for example. I'm not as good with getJanBudget(), getFebBudget(), ... getDecBudget(). In that case, put the budgets in a single attribute and have getBudget() return an array. Similarly with our addr example.
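
To make the budget case concrete, here is a rough sketch (again Python for illustration only; the Account class, its method names, and the figures are all made up):

```python
from typing import List

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


class Account:
    """Hypothetical account holding one budget figure per month."""

    def __init__(self, monthly_budgets: List[float]) -> None:
        if len(monthly_budgets) != len(MONTHS):
            raise ValueError("expected one budget per month")
        self._budgets = list(monthly_budgets)

    def get_budget(self) -> List[float]:
        # One accessor returning the ordered values, instead of
        # get_jan_budget(), get_feb_budget(), ..., get_dec_budget().
        return list(self._budgets)

    def get_budget_for(self, month: int) -> float:
        # month is 1-based (1 = January); the position carries the meaning.
        return self._budgets[month - 1]


account = Account([100.0 * m for m in range(1, 13)])
december = account.get_budget_for(12)
```

The monthly values are no more interchangeable here than they would be as twelve named attributes; the list simply records their order once instead of encoding it in twelve names.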

> If they *were* interchangable, though, I'd be more likely to
> use a list.

The January budget is not interchangeable with the December budget, but that doesn't mean that we would not want these in an array, right? If we are tracking some production metric on a daily basis, even though each day is surely separate and we would not flip values of metric65 and metric87, for example, it would be silly to make attributes metric1, metric2, ..., metric366, right?

> If I were modelling a chessboard, say to solve
> the 8 queens problem, each one of the 8x8 squares is distinct
> from the next, but they're all instances of the same thing.
> I'd certainly use a list of lists and not make up 64 names,
> or even use 64 attributes with names like square35. I'd probably
> use a list even if it was 2x2.

But these squares are not interchangeable either.

> So I guess there's a matter of perspective in deciding which
> of the above two paragraphs best describes addr1 and addr2.
> To me it's the first paragraph; to you it's the second.

If I thought this was just perspective, I would not be so persistent. Based on your description, I'm thinking there is some world-wide standard related to the second & third lines of an address of which I am not aware that would render these lines as having different meanings. I'll try to figure that out, but if you can point me to something relevant, that would be appreciated. Thanks. --dawn

>
> Marshall
Received on Sun Sep 04 2005 - 19:57:22 CEST
