Re: Demo: Modelling Cost of Travel Paths Between Towns

From: Hugo Kornelis <hugo_at_pe_NO_rFact.in_SPAM_fo>
Date: Tue, 16 Nov 2004 21:45:42 +0100
Message-ID: <41nkp0dqn1c797o9lspa25ogp5hmgaf29f_at_4ax.com>


On 16 Nov 2004 11:47:27 -0800, Neo wrote:

>> > redundant strings: Modifying one will corrupt your data.
>>
>> Okay, so to prove that storing the same string twice is not redundant, ...
>
>Are you reading what you are writing? Of course representing (storing
>at logical level) the same string twice is redundant. Isn't that
>simply obvious?

Hi Neo,

Silly you - if it was obvious, why are we having this discussion? To me, it's obvious that you are posting utter nonsense.

>> ...example:
>>
>> Mary likes John
>> Paula likes John
>>
>> .. [Mary now like Paul not John]...
>>
>> Mary likes Paul
>> Paula likes John
>>
>> I modified one. My data is not corrupt.

Any reason why you had to leave the remaining "QED" out of this quote? Even without indicating that you snipped my quote? I will gladly agree that I did something similar with one of your messages yesterday - but I clearly indicated that I did it, made clear that it was intended as a joke and included the snipped part of your message after my joke. You, on the other hand, are attempting to twist my words by incomplete quoting.

Note to the group: after not paying his debts, not keeping his promises, making false claims about his products, copyright violation, lying, flooding and spamming, Neo is now also guilty of post editing.

>First, above example is deficient in that a program could not
>guarantee that John in line 1 and 2 are the same. You shouldn't rely
>on matching strings (or have you forgotten the lesson of the same flaw
>in your RM Solution #1 for Common Ancestor Report). Much like towns in
>Celko's example, persons in your example needs to be in a separate
>table to avoid redundancy.

Since you obviously fail to grasp even the most basic prerequisites, I guess I'll have to spell it out. I'll use short sentences:

* It was an example.
* It was simplified.
* I assumed first name uniquely identifies a person in this example.
* Real world is not so simple.
* I make different data models for the real world.

Based on the assumption that first name uniquely identifies a person, John in line 1 IS the same as John in line 2. Without that assumption, the example sentences would not even have been facts; they'd have been meaningless strings of words.

>Assuming, you meant to keep the example simple and that John on line 1
>and 2 are the same person, then you do have redundant data (the
>person's name). For example, after having entered the data, you
>realize that John's name is really spelled Johnn. Changing the first
>John to Johnn, corrupts the data. A program can no longer determine
>that Johnn on the first line is the same as John on the second line.

Regardless of the model, ANY time a change is not entered correctly, the data will become corrupt. Changing Mary's preference is not the same kind of data modification as changing John's name - they require the execution of different code. I presume I don't have to spell out the correct UPDATE statement to change John's name to Johnn without corrupting the data.
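For anyone who does want it spelled out, here is a minimal sketch, assuming the example facts live in a hypothetical table Likes(liker, liked) where first name identifies a person. The table, the column names and both helper functions are my assumptions for illustration; Python's sqlite3 module is used only to make the SQL self-contained and runnable.

```python
import sqlite3

# Hypothetical schema: one row per "X likes Y" fact, first name used as the key.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Likes (
        liker TEXT NOT NULL,
        liked TEXT NOT NULL,
        PRIMARY KEY (liker, liked)
    )
""")
conn.executemany(
    "INSERT INTO Likes (liker, liked) VALUES (?, ?)",
    [("Mary", "John"), ("Paula", "John")],
)

def rename_person(conn, old, new):
    """Correct a person's name: every reference changes, in one transaction."""
    with conn:
        conn.execute("UPDATE Likes SET liker = ? WHERE liker = ?", (new, old))
        conn.execute("UPDATE Likes SET liked = ? WHERE liked = ?", (new, old))

def change_preference(conn, liker, old, new):
    """Change one fact: only the row for this liker is touched."""
    with conn:
        conn.execute(
            "UPDATE Likes SET liked = ? WHERE liker = ? AND liked = ?",
            (new, liker, old),
        )

# John's name turns out to be spelled Johnn: both rows follow.
rename_person(conn, "John", "Johnn")
rows = sorted(conn.execute("SELECT liker, liked FROM Likes"))
```

The two functions are deliberately separate: a spelling correction and a change of preference are different operations, so they run different statements with different WHERE clauses.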

>The reason your example is flawed is because you did not modify one of
>the redundant John's, you modified a relation involving John ("Mary
>like John" to "Mary like Paul").

I changed one of the so-called "redundant" strings, just as you indicated. I can't help that this modified a relationship - it's what you asked for.

>> The RM doesn't believe in "one model fits all".
>
>Then you are wiser as some RM zealots here have been propogating to
>the contrary for years.

<Sigh> And yet again, I'll have to spell out the obvious. The quote should have read >>The RM doesn't believe in "one data model fits all"<<

Oh and by the way - I am not in the camp that advocates that the RM is the one and only Holy Model. I firmly believe that there is no single model that will fit all purposes.

(I also have clear evidence that there is at least one model that will fit no purpose, but let's not go there - not again).

>> If a customer's business requires operations on a symbol
>> (or rather: character) level, the RM is quite capable of handling it.
>
>I would say RM is capable of it, but it is impractical.

Say what you want.

> All TM/XDb2
>dbs are normalized to individual symbols (just by entering simple
>scripts).

Whereas no client actually needs the references to their business objects to be broken up.

> Nearly all RM dbs aren't. Would you be willing to
>script/query some examples to prove or disprove this assertion? For
>example find all things named John in a normalized/null-less db that
>might contain persons, horses, robots, etc. With XDb2, the query is
>"%.name='john'".

Of course I would - no problem. Of course, I demand to know the full specifications (in business terms) before I start; you can't change them afterwards. And though I normally don't do this, in *your* case, I'll have to demand advance payment.

Post the specifications; I'll make an estimate of the time required to create what you need. I'll start working on it as soon as my bank confirms receipt of the payment.

>> However, I have not yet encountered any such business.
>
>Then you probably haven't dealt with the business of AI type
>applications. While quite an important scope, business applications
>aren't the entire scope.

That's correct. Could you elaborate? I have a hard time trying to imagine how the 'o' in 'John' should be of specific interest in any application that doesn't involve typesetting (in which case ONLY the 'o' and its position relative to 'J' and 'h' is relevant, not the full string 'John') or numerology.

>> > [Rhetorical] Why can't one represent the symbol X the same way as
>> > representing the person john in RM?
>>
>> One can: CREATE TABLE Symbols (symbol char(1) NOT NULL PRIMARY KEY)
>
>RM can as you show above which is the same method I showed in OT "A
>Normalization Question". It's not that RM can't, but RMer willingly
>shortcut true relational methodology because it is too cumbersome with
>little benefit for their scope.

"True relational"? Are you referring to the Third Manifesto or am I missing something else here?

Other than that, I have no idea what you mean. I know all the words, but somehow they don't connect into meaningful sentences. That's probably all my fault (English is not my native language) - but please explain what you mean.

>> John is a thing (or rather: a person). 'John', 'X', '-> John', 1FB54A and
>> 110010100111010011001 are all references to that thing (using different
>> referencing schemes), not the thing itself.
>
>The flaw in your understanding of normalization is in the above
>paragraph. True, the thing in the db representing John is not the same
>as the real John. In TM/XDb2, there is only one thing within the db
>that represents the person John. Then there are as many
>data-independent references to that one John (within db) as needed.

So you refer to John once by name and many times through a pointer. Good for you. You still refer to John many times.

At the logical level, it's not relevant *HOW* a reference to John is stored, but *IF* a reference to John is stored. You are the one who insists that we should only consider the logical level, yet you are also the one who keeps on hammering about the technicalities of how XDb stores its references and how that is so superior. Frankly, I don't give a hoot about how a reference is stored; that just distracts me from my work.

> In
>your simple example, you have represented John twice with no
>guaranteed mechanism to tie them together. There is no strong
>relationship tying the two together as does a data-independent
>ref/id/link.

Yes, there is: 'John' = 'John'. Based on the presumption that first name uniquely identifies a person within the context of the simplified example, that is all you need.
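A minimal sketch of that comparison, again assuming a hypothetical table Likes(liker, liked) with first name as the identifier (the names are mine, for illustration only). The two rows are tied together by nothing more than equality of the stored values, which is all a join needs.

```python
import sqlite3

# Hypothetical schema: first name is assumed to identify a person.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Likes (
        liker TEXT NOT NULL,
        liked TEXT NOT NULL,
        PRIMARY KEY (liker, liked)
    )
""")
conn.executemany(
    "INSERT INTO Likes (liker, liked) VALUES (?, ?)",
    [("Mary", "John"), ("Paula", "John")],
)

# Self-join on value equality: which pairs of people like the same person?
# The condition a.liked = b.liked is the 'John' = 'John' test.
shared = conn.execute("""
    SELECT a.liker, b.liker, a.liked
    FROM Likes AS a
    JOIN Likes AS b
      ON a.liked = b.liked
     AND a.liker < b.liker
""").fetchall()
```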

> You have already demonstrated the weakness/inflexibility
>of using strings as refs in your RM Sol#1.

Lying again, hmm? Aside from one error (that I found and fixed in my second solution), it was a perfect solution for the requirements you posted. It stopped being a perfect solution when you changed the requirements.

> The ref should be
>completely independent of the thing being represented (ie don't use
>its name) and the ref should be mostly hidden from the user (ie can
>you find one in XDb2 scripts?).

You really think I read those things? Gosh - this group is called comp.databases.theory, not comp.databases.scripts-that-look-like-C.

>The problem with multiple things in a db representing the real John
>(ie 'John' and 'John') is how does a program guarantee their
>synchronization. You can't guarantee it without data-independent refs
>to the original in db. This is why I can corrupt your data by changing
>the first John to Johnn.

See above.

>> You still have trouble understanding the basic fact that redundancy in the
>> relational model is not about storing _references_ to a "thing" twice, but
>> about storing FACTS about "things" twice. Until you understand that basic
>> fact, you'll never understand basic normalization.
>
>Funny, you know it, but don't know it. Your explanation is exactly why
>you have redundant data, because you are storing facts about things
>twice and not a reference to the original fact in db there after. In
>your simple example, you stored john (a fact) twice.

Bwahahahahahaha!

You call 'john' a fact?????

Bwahahahahahahahahahaha!!!

(Thanks for making me laugh, Neo!)

> The second john
>does not have a ref to the original (which should be in T_Person). If
>the second John had a ref to the original John, then changing the
>original John to Johnn does not corrupt the db.

That, dear Neo, depends on WHY the "original" John is changed to Johnn. Or Paul.

Best, Hugo

-- 

(Remove _NO_ and _SPAM_ to get my e-mail address)
Received on Tue Nov 16 2004 - 21:45:42 CET
