Re: Modeling Address using Relational Theory

From: dawn <dawnwolthuis_at_gmail.com>
Date: 4 Sep 2005 13:42:02 -0700
Message-ID: <1125866522.648736.150850_at_g47g2000cwa.googlegroups.com>


Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:
> > A very significant and oft used operation that is
> > done with address is to order the lines of the address.
>
> That's not an operation. You don't start with them in one
> state and end with them in another.

If you have attr1, attr2 as attributes in a relation, they are unordered in the source and ordered in the sink where you have an address (not a representation of one).

> There's no function;
> you don't have inputs and output.
>
> There's no place in your code where you sort city, state,
> and zip, for example.

I'm still not tapped into your thinking, but this statement is irrelevant to me.

>
> > Ordering is THE KEY function
> > that is applied to these attributes -- would you agree with that?
>
> No. The order you are speaking of is not an attribute of the data;

I don't understand what the data are that are being modeled if they are not two lines of addresses that happen to be ordered.

> it's an attribute of the presentation of the data,

It is definitely that too, but being ordered is also an attribute of the addressLines themselves, whether you represent them with one on top of the other or one to the right of the other. That is because they have an ordering independent of the representation.

> which is
> consistent with the fact that we're speaking of attributes
> of the metadata.

No, I'm talking about attributes of the address lines themselves. It is only when someone uses the relational model to model these as addr1, addr2 that you end up with ordering information in the metadata. That is what I'm thinking ought to be avoided.

> The operations we look at to decide whether something is a list
> or not is the operations on the data, NOT the operations on
> the metadata.

Agreed. One operation on the data is an operation to order addr2 after addr1. If the data are modeled as a list, no operation needs to be coded in application logic, as it is already encoded in the attribute type.
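
A minimal sketch of that distinction, in Python. The class and function names here (AddressTwoAttrs, AddressWithList, and so on) are hypothetical and chosen only for illustration; they do not come from any real schema or library. With two separate attributes, the consumer has to spell out which one comes first; with a list attribute, the consumer just reads the list.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AddressTwoAttrs:
    # Hypothetical model: the ordering is implied only by the attribute names.
    addr1: str
    addr2: str


@dataclass
class AddressWithList:
    # Hypothetical model: the ordering is carried by the list type itself.
    address_lines: List[str] = field(default_factory=list)


def lines_from_two_attrs(address: AddressTwoAttrs) -> List[str]:
    # The consumer hand-codes "addr1 comes before addr2".
    return [address.addr1, address.addr2]


def lines_from_list(address: AddressWithList) -> List[str]:
    # No ordering logic is written here; the list already has an order.
    return list(address.address_lines)
```

Whether this counts as an operation on the data or only on its presentation is exactly the point under dispute; the sketch only shows where the ordering knowledge ends up living in each model.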

>
> > Do you mean that the semantics of these two
> > lines includes no ordering?
>
> Yes.

I am working to comprehend this. I simply don't see it, but if you are convinced you are right about that, I will work on it some more.

>
> > What might you name them otherwise to
> > avoid all hint of ordering information in the naming?
>
> I see no need to rename them. The names are hints to the
> human; they have no meaning to the computer.

But why make the application developer, for example, interpret the operation implicit in the names and then hand-code it, when we could have created a list so that no hand-coded logic prompted by the hints in the attribute names is required? (That was poorly worded, sorry.)

>
> > From a programmer's perspective, the ordering operation is one of the
> > few things we have to care about with these text fields, perhaps along
> > with max length.
>
> Only in the UI. In the business logic, there is no concept of
> one being ahead of the other.

Whether it is the UI or a web service or any other target for these data, the logic taking the attribute values to the target needs to be hand-coded for the ordering in the one case of two strings, while it can be a string list in the other case. This does not have to do with representation -- it has to do with meaning.

>
> > We need to put one of these first and the other second.
>
> What exactly do you mean by "put"? Are you talking about the
> in-memory layout of a C struct?

I'm talking about the meaning of the data.

> I believe you're talking about
> the UI, which has no relevance for the semantics of the fields.

I agree that this is not related to the UI. It is related to meaning only.

>
> > When these are two separate attributes, the way we know which
> > one is ordered before the other is by reading the attribute names.
>
> That's just as true for city, state, and zip, though; and you've
> agreed they aren't lists. So this criterion is not sufficient.

I disagree. The representation of city, state, and zip might be ordered, but the meaning is not. The meaning of addr1 and addr2 is an ordered list of these two values.

>
> > Is it the case that you do not see the ordering function as an
> > advantage of declaring these addressLines to be a list attribute, or
> > that it is not sufficient to justify use of a list from your
> > perspective?
>
> I don't see the ordering function in the first place.

You see that a list has an ordering function, right? You just do not see that the two address lines form a list, as I understand it -- is that correct?

>
> > > just
> > > like I don't think firstName, lastName has any order, even
> > > though *conventionally* *in the USA* we *present* one always
> > > before the other. For both names and address lines, in other
> > > countries the conventions may be different.
> >
> > I completely agree when it comes to the name and do not see the same
> > pattern with the address.
>
> To me they are the same.
>
>
> > This one is not an experience issue. Isn't "ordering" a list
> > operation?
>
> No. *Sorting* is a list operation.

tomato, tomato
Fine, a list is sorted then, rather than ordered. Please replace the word "ordered" in anything above by "sorted".

> Note that one never sorts
> the address components.

Sure we do -- all the time, but the algorithm is really, really easy. If the data are stored in addr1 and addr2, we use the algorithm of starting with the value in ...1 and then ...2
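
To make that "really, really easy" algorithm concrete, here is a rough Python sketch. It assumes, purely for illustration, that a row arrives as a dict and that the address-line attributes follow the naming pattern addr<N>; the function name ordered_address_lines is hypothetical. The ordering information is recovered from the attribute names, that is, from the metadata.

```python
import re


def ordered_address_lines(row: dict) -> list:
    # Pick out attributes named addr<N> and remember the numeric suffix N.
    numbered = []
    for name, value in row.items():
        match = re.fullmatch(r"addr(\d+)", name)
        if match:
            numbered.append((int(match.group(1)), value))
    # Order the values by suffix: the value in ...1 first, then ...2, and so on.
    numbered.sort(key=lambda pair: pair[0])
    return [value for _, value in numbered]
```

In practice nobody writes this loop; the same ordering is simply hard-coded wherever addr1 is emitted ahead of addr2.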

> Let's put it another way. What if we went in to your source code
> and globally renamed addr1 to addr2 and vice versa. The UI would
> stay the same, though. No code would break. In fact, if you stripped
> the binaries, the before-and-after would be binary-identical.

It would be like swapping budget1 to budget2. The data would now be inaccurate even though the code could stay the same.

> This demonstrates that there are no list *operations* being done
> on the fields.

If it makes no difference whether value1 is placed into addr1 or addr2 and same for value2, then I would agree, but I think that most people maintaining addresses care a lot about whether the code swaps these two values. Agreed?
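
A small Python sketch of the value-swapping case. The render function and the sample address values are made up for illustration. The code is identical in both calls; only the placement of the values into addr1 and addr2 differs, and the resulting address text is not the same.

```python
def render(addr1: str, addr2: str) -> str:
    # Same code before and after the swap: always emit addr1, then addr2.
    return addr1 + "\n" + addr2


# Hypothetical sample values, stored one way round and then the other.
original = render("Attn: Accounts Payable", "123 Main St")
swapped = render("123 Main St", "Attn: Accounts Payable")

# The code did not change, but the two results differ.
results_differ = original != swapped
```

This illustrates swapping the stored values, which is a different change from globally renaming the identifiers in the source code.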

>
> > It isn't an operation ON the list, it is an operation on
> > the data -- it is intrinsic in the type "list" where otherwise you need
> > to apply the operation to two separate attributes.
>
> I don't agree that it's an operation on the data.
>
>
> > It might sound like I'm rehashing, but I think that there is the
> > potential here for me to communicate this question in a way that I get
> > back the type of information I am seeking. I'm trying to understand if
> > there is a logical rationale, aside from constraints in the target
> > dbms, for a conceptual or logical model of address to split these lines
> > out into addr1, addr2.
>
> Sure. As I've said, I wouldn't model addr1 and addr2 as a list in
> a Java object, even in the absence of, say, jdbc. They'd be separate
> properties, just as city and zip would be.
>
Hmm. OK, just trying to understand.
>
> > Would the logic be the same if someone were to suggest that word1 of a
> > description, word2, ... word20 should be split out into twenty
> > attributes rather than putting them into a space-delimited single
> > attribute? If not, what am I missing?
>
> No.
>
> You're missing the fact that your string of words doesn't have
> any structure beyond "description" but the address does.

This is where I'm sure I'm missing something. What is the structure of the 2nd & 3rd address lines that I'm referring to as addr1 and addr2? I know there are a bunch of possible components, but I'm unaware of a world-wide structure on this data. It seems to be handled programmatically as two strings (with an order, in my experience).

>
> > The January budget is not interchangeable with the December budget, but
> > that doesn't mean that we would not want these in an array, right?
>
> Yeah, "interchangeable" was a poor choice of words.
>
>
> > > If I were modelling a chessboard, say to solve
> > > the 8 queens problem, each one of the 8x8 squares is distinct
> > > from the next, but they're all instances of the same thing.
> > > I'd certainly use a list of lists and not make up 64 names,
> > > or even use 64 attributes with names like square35. I'd probably
> > > use a list even if it was 2x2.
> >
> > But these squares are not interchangeable either.
>
> They kind of are, though. The only relationship they have
> with each other is a spatial one. 2,2 isn't any different
> than 3,5, except for its position. They have no differing
> semantics. The semantics are all semantics for the entire
> grid, not for any given square.

OK, and that is what I thought these two address lines were too. So, I'm missing some key piece of data about the structure of address lines. Any clue where I might find that?

>
> > If I thought this was just perspective, I would not be so persistent.
> > Based on your description, I'm thinking there is some world-wide
> > standard related to the second & third lines of an address of which I
> > am not aware that would render these lines as having different
> > meanings.
>
> I really haven't done all that much with addresses and I'm not all
> that into ISO standards. (Sorry, Joe. :-) I'm just going off of
> caffeine, stubbornness, and my own understanding of semantics.
> Plus some address work I did on a big webapp.

I'm a recovering caffeine addict, and I did read the ISO standard once upon a time, but I'm going on experience and a desire to prepare a relational model and a non-relational model that includes an address. I want to be fair to relational theory and I cannot believe that addr1, addr2 would be acceptable in theory, even if it is in practice. But I'm starting to think that everyone else knows something about how lines 2 & 3 of an address can be flipped around to represent the exact same address when I thought the ordering was important to the meaning.

Thanks for your help. It sounds like I have some studying to do. cheers! --dawn

>
> Marshall
Received on Sun Sep 04 2005 - 22:42:02 CEST
