Re: Modeling Address using Relational Theory

From: dawn <dawnwolthuis_at_gmail.com>
Date: 4 Sep 2005 19:26:37 -0700
Message-ID: <1125887197.151786.138880_at_g49g2000cwa.googlegroups.com>


Marshall Spight wrote:
> Ha ha! I'm posting on this topic again.
>
> dawn wrote:
> > Marshall Spight wrote:
> > > dawn wrote:
> > > > Marshall Spight wrote:
> > > > A very significant and oft used operation that is
> > > > done with address is to order the lines of the address.
> > >
> > > That's not an operation. You don't start with them in one
> > > state and end with them in another.
> >
> > If you have attr1, attr2 as attributes in a relation, they are
> > unordered in the source and ordered in the sink where you have an
> > address (not a representation of one).
>
> I can't parse this. What's the sink?

Any target for the data values.

> And to me, when you draw
> a distinction between an address and a representation of an
> address, I think, mailbox-beside-the-curb vs. record in a table.
> So I guess I'm saying I don't get what you're saying here.

So do you think that sorting only relates to representation and not the reality of what is being modeled? I have two daughters and it should come as no surprise that the oldest was born first. The word "oldest" implies this. Children are an ordered list that is typically modeled with birthdates giving the ordering information. These values are not just ordered when listed, however; the birth pains for the first really did come before the birth pains for the second, whether or not I represent this data in any order at all. My point is that order is not just a representation issue -- it has to do with the reality being modeled.

>
> > > it's an attribute of the presentation of the data,
> >
> > it is definitely that too, but it is an attribute of the addressLines
> > that they are ordered too, whether you represent them with one on top
> > of the other or one to the right of the other. That is because there
> > is an ordering to them, outside of the representation.
>
> I disagree. The fields in a C struct, the properties in a Java class,
> and the columns in an SQL table are all unordered.

Yes, but the property that has a type of string array has order defined within it. In Java, lists are typically modeled as lists (arrays), so I see this problem more in relational models than OO designs.
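
A rough sketch of the distinction, in Python rather than Java for brevity. The class names, field names, and sample values are all made up for illustration; the only point is that the fields of a record are identified by name, while a list-typed field carries its order as part of its value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FlatAddress:
    # Hypothetical two-attribute style: the order lives only in the names.
    addr1: str
    addr2: str
    city: str


@dataclass
class ListAddress:
    # Hypothetical list style: the order is part of the attribute's type.
    address_lines: List[str] = field(default_factory=list)
    city: str = ""


flat = FlatAddress(addr1="Dept. of Computing", addr2="123 Main St", city="Springfield")

# Naming the fields in a different order builds the same record.
same_flat = FlatAddress(city="Springfield", addr2="123 Main St", addr1="Dept. of Computing")

listed = ListAddress(address_lines=["Dept. of Computing", "123 Main St"], city="Springfield")

# Reordering the elements of the list builds a different value.
swapped = ListAddress(address_lines=["123 Main St", "Dept. of Computing"], city="Springfield")
```

Here `flat == same_flat` holds while `listed == swapped` does not, which is the sense in which the list has order "within" it and the record's fields do not.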

> Yes, they have
> layout in memory and in SQL's case, notational order, but in the
> platonic sense, they are unordered. Their meaning is derived
> from their names, not their position.
>
> This is *exactly* the same thing as the difference between
> ordered pairs and named attributes, (a topic we actually
> agree on. :-)
>
>
> > > The operations we look at to decide whether something is a list
> > > or not is the operations on the data, NOT the operations on
> > > the metadata.
> >
> > agreed. One operation on the data is an operation to order addr2 after
> > addr1. If the data are modeled as a list, no operation needs to be
> > coded in application logic as it is already encoded in the attribute
> > type.
>
> There's nowhere in the code where you *do* anything to put
> these fields in order.

I might agree if it were the case that you could collect these data from a user in any order on an input form. I can put State/Province before City and people might have to change their habits, but they will be able to adjust much more handily than if I put addr2 before addr1 on a form. Gene was right when he said that you could code this as a single value and use markup of some sort to distinguish the second value from the first. Putting them in the opposite order would make a whole lot less sense than putting other values in an order that is not common. It would be more like having the user write a Word document with the last line on top.

> > But why are we making the application developer, for example, have to
> > interpret the operation implicit in the names and then hand-code it
> > when we could have created a list so that no additional hand-coded
> > logic prompted by the hints in the attribute names was required? (that
> > was poorly worded, sorry).
>
> No, I got you.
>
> I don't see that we're doing that at all. What special thing does
> he (or she, grumble) have to do for addr1 and addr2 that he (or she)
> doesn't also have to do for city state zip?

If doing a "dump" of values for almost any purpose, most values can be in any order, including PostCode and State/Province, but addr2 must follow addr1.

> If we're not talking
> about UI, then the answer is nothing. So the "operation" in the
> above paragraph doesn't exist.

If the data always stay in the database and are not used anywhere else, then there is no need to order these values. That reminds me of the librarian who was in a great mood because "all the books have been checked back into the library except for one". Yes, I am referring to USING the data, and you can consider all uses to be representation, I guess. Hmmm.

>
> > Whether it is the UI or a web service or any other target for these
> > data, the logic taking the attributes values to the target needs to be
> > hand-coded for the ordering in the one case of two strings, while it
> > can be a string list in the other case.
>
> Could you find a snippet of code where it has to do that?

select addr1, addr2 from Address;
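
To make that concrete, here is a small sketch using Python's standard sqlite3 module with an in-memory database. The table layout and the sample row are assumptions for illustration only. It shows that the order of the two lines is something the person writing the query types out by hand in the select list.

```python
import sqlite3

# In-memory database with a hypothetical Address table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Address (addr1 TEXT, addr2 TEXT, city TEXT)")
conn.execute(
    "INSERT INTO Address VALUES (?, ?, ?)",
    ("Dept. of Computing", "123 Main St", "Springfield"),
)

# The order of the address lines is written out by hand in the select list.
row = conn.execute("SELECT addr1, addr2 FROM Address").fetchone()
label_lines = list(row)

# Writing the columns the other way round is equally valid SQL,
# but the lines come back in the opposite order.
reversed_row = conn.execute("SELECT addr2, addr1 FROM Address").fetchone()
reversed_lines = list(reversed_row)

conn.close()
```

Nothing in the schema itself says that addr1 comes before addr2; the developer reads that from the names and encodes it in the query.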

> I'd be interested to see what specifically you mean. I don't
> believe any such logic exists that doesn't also exist for
> the other fields.

There might be an analysis of the data that makes use of the city, but doesn't include any other data values from the address. Can you think of any use for addr1 that is not a use for addr2 or vice versa? They are really the same attribute value, just placed into two attributes, as best I can tell.

>
> > > That's just as true for city, state, and zip, though; and you've
> > > agreed they aren't lists. So this criterion is not sufficient.
> >
> > I disagree. The representation of city, state, and zip might be
> > ordered, but the meaning is not. The meaning of addr1 and addr2 is an
> > ordered list of these two values.
>
> Well, this is the fundamental core of the disagreement. I believe
> we can resolve the general discussion if and only if we can resolve
> this specific point. To me, city, state, zip, addr1 and addr2 are
> all of a kind with respect to intra-attribute ordering.

OK. I think you are right that this is the crux of the matter. addr2 has meaning to you outside of addr1. So, what is that meaning? It isn't just the second line of the address as that would relate it to addr1. So, what is it?

>
> > You see that a list has an ordering function, right?
>
> No! There is no function. There is no spoon. There is no
> mountain. It's only in the head of the human that this
> order exists.

I think I'm using the term list the way you said you use it. It includes a function that maps the positive integers (or the non-negatives for many computer languages) to the values. You can call them sorted by this index, but that is equivalent for me to saying they are ordered. There is a 1st one, a 2nd, and so on.
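
A minimal sketch of that reading of "list", assuming 1-based positions. The helper name `line_at` and the sample values are hypothetical.

```python
from typing import Dict, List

address_lines: List[str] = ["Dept. of Computing", "123 Main St"]


def line_at(position: int) -> str:
    """Map a 1-based position to the value stored at that position."""
    if position < 1 or position > len(address_lines):
        raise IndexError(f"no address line at position {position}")
    return address_lines[position - 1]


# The same function written out explicitly as position -> value pairs.
as_mapping: Dict[int, str] = {
    position: value for position, value in enumerate(address_lines, start=1)
}
```

With this view, "the 1st one" and "the 2nd one" are just `line_at(1)` and `line_at(2)`, and no separate sorting step is involved.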

>
> > > > This one is not an experience issue. Isn't "ordering" a list
> > > > operation?
> > >
> > > No. *Sorting* is a list operation.
> >
> > tomato, tomato
> > Fine, a list is sorted then, rather than ordered. Please replace the
> > word "ordered" in anything above by "sorted".
>
> No, I'm not being terminological; I was trying to get at a specific
> difference. The difference is between 1) a *static property*
> which is the order of the fields in the mind of the human,
> and which has no representation inside the computer, and 2)
> an actual coded-into-machine-language function that takes an
> unordered list of data and puts it into a specific order: a
> sorting operation.

Get your head out of the relational implementations and think of this not as two attributes, but as a single attribute with the type being a "list". That type of "list" comes packaged with an ordering function.

> I agree that 1) has an order. I assert that in sense 2) there
> is no order.

I agree. I'm talking about an attribute defined as a list.

> I tried to make this distinction with some lame adjective-vs-verb
> play; it didn't work. But I wasn't just faulting you for word
> choice; I at least try not to do that.
>
>
> > > Let's put it another way. What if we went in to your source code
> > > and globally renamed addr1 to addr2 and vice versa. The UI would
> > > stay the same, though. No code would break. In fact, if you stripped
> > > the binaries, the before-and-after would be binary-identical.
> >
> > It would be like swapping budget1 to budget2. The data would now be
> > inaccurate even though the code could stay the same.
>
> No! I *didn't* say to swap all the values. I just said
> to *rename* all the *variables* inside the source code. The
> values all stay put.

OK, got it.

> You know that variable names are not generally present in
> the production binaries of statically typed languages, right?
> (Maybe you didn't know that; your background is a lot different
> from mine.)

oddly, I did know that from my reading, but it's just a fluke -- good idea to assume I'm ignorant on such matters and I'll assume I ought to catch up someday

>
> > > This demonstrates that there are no list *operations* being done
> > > on the fields.
> >
> > If it makes no difference whether value1 is placed into addr1 or addr2
> > and same for value2, then I would agree, but I think that most people
> > maintaining addresses care a lot about whether the code swaps these two
> > values. Agreed?
>
> Again, I was not proposing swapping the values. I was only making
> a change to the metadata.

OK.

>
> > > > Would the logic be the same if someone were to suggest that word1 of a
> > > > description, word2, ... word20 should be split out into twenty
> > > > attributes rather than putting them into a space-delimited single
> > > > attribute? If not, what am I missing?
> > >
> > > No.
> > >
> > > You're missing the fact that your string of words doesn't have
> > > any structure beyond "description" but the address does.
> >
> > This is where I'm sure I'm missing something. What is the structure of
> > the 2nd & 3rd address lines that I'm referring to as addr1 and addr2?
> > I know there are a bunch of possible components, but I'm unaware of a
> > world-wide structure on this data. It seems to be handled
> > programmatically as two strings (with an order, in my experience).
>
> I said the address has structure; I didn't say addr1/addr2 had
> structure.

I agree with that, but I'm still missing something.

> > > > But these squares are not interchangeable either.
> > >
> > > They kind of are, though. The only relationship they have
> > > with each other is a spatial one. 2,2 isn't any different
> > > than 3,5, except for its position. They have no differing
> > > semantics. The semantics are all semantics for the entire
> > > grid, not for any given square.
> >
> > OK, and that is what I thought these two address lines were too. So,
> > I'm missing some key piece of data about the structure of address
> > lines. Any clue where I might find that?
>
> Best I can do: what is some ***operation*** (in other words,
> some function or method expressed in source code) that you
> perform on addr1 and identically on addr2, that you *don't*
> also perform on city/state/zip?

It is a mapping of the number 1 to addr1 and the number 2 to addr2.

> If you can find one, they start to look like a list. If you
> can't find one, they start to look like they have
> the same relationship with each other that they have to
> city/state/zip.

I agree.

> > I'm a recovering caffeine addict,
>
> Mmmmm, coffee.

geekier than that -- mountain dew -- they have got to get me a splenda version soon or I might just go off the wagon

>
> > and I did read the ISO standard once
> > upon a time, but I'm going on experience and a desire to prepare a
> > relational model and a non-relational model that includes an address.
> > I want to be fair to relational theory and I cannot believe that addr1,
> > addr2 would be acceptable in theory, even if it is in practice.
>
> The consensus here has been that addr1/addr2 do not form a repeating
> group. I'm not aware of any other theoretical basis for objecting
> to them.

I really, really wish I understood that. Can you think of any two values that would form a repeating group then? Is there no such thing as a repeating group of cardinality 2? If there is, what does it possess that addr1 and addr2 do not possess?

>
> > But
> > I'm starting to think that everyone else knows something about how
> > lines 2 & 3 of an address can be flipped around to represent the exact
> > same address when I thought the ordering was important to the meaning.
>
> You can't flip them around! Neither can you swap city for state,

I can swap the city & the state anywhere that people use them except in representations that require them in a particular order. It is intrinsic in addr1 & addr2 that I cannot switch their order, however.

> nor zip for addr2. But this is consistent with my claim that
> addr1/addr2/city/state/zip all have the same interrelationship.
>

We might get there on this one yet. Thanks for your stubbornness on it. In the next round I might just say "uncle" but I'm thinking you should instead.
smiles. --dawn

> Marshall
Received on Mon Sep 05 2005 - 04:26:37 CEST
