Re: Fails Relational, Fails Third Normal Form

From: Derek Asirvadem <derek.asirvadem_at_gmail.com>
Date: Sat, 7 Feb 2015 17:46:38 -0800 (PST)
Message-ID: <abc8bebc-4e27-4af4-a16d-dd94ad61f21c_at_googlegroups.com>

> On Sunday, 8 February 2015 05:39:26 UTC+11, Nicola wrote:
>
> > > To find the keys, you
> > > must determine which functional dependencies hold.
> >
> > That is a very slow method, but yes. The tables are dead simple, and the
> > attributes should be well-known to anyone who has any experience at all.
>
> Sure. If you find the keys directly, in fact you have defined some
> functional dependencies (from the keys). But, in general, you must
> ensure that no other relevant semantic constraints (not necessarily only
> functional) are missed.

Do you have an example of such ?

> > > Once you have found all the keys, you may tell whether your
> > > schema is in 3NF or not (in this case, it is - provided that I interpret
> > > correctly, and I suppose I do, the meaning of the remaining attributes).
> >
> > No, I have stated that it FAILS 3NF.
>
> Look, I overall accept your line of argument and its conclusions, with a
> caveat. Let me see if we can converge.

Yes!

> You say a schema with a surrogate "key" does not enforce row uniqueness.
> I write "key" to emphasize that it does not conform to Codd's definition
> because it is a totally made up set of values without any tie to reality
> - and that's fine with me to assume that.

Ok, but that is despite my post, which details exactly why the word is false and confusing. So to acknowledge that; to acknowledge that it is not Relational; and then to continue using the term with double quotes as a modifier, removes it from the previous category of unconscious fraud, and places it in the category of conscious fraud.

Either it is a Key, key, "key", 'key', or it is not.

Or, you still think a surrogate has some properties of a key. In which case, I have not gotten through to you, and there is something we need to pursue.

At the least, since I have given you detailed reasoning, which you seem to accept, you have to stop using the term "surrogate" and the term "key" (in any form) together, because it is false, and you accept it is false. Otherwise, every sentence it is located in causes a strain and a response.

Do you want me to enumerate the properties of a Key; the properties of a surrogate; then identify why the latter has none of the former ? Keys are central to the RM. I mean, you say you know the RM, and you like it, but each time we discuss some aspect of the RM, it appears you are ignorant of that aspect.

I have already posted the definitive section from the RM that relates to Key (in my previous post, which you are responding to). If you do not accept that definition, then you are not Relational, stop pretending to be.

If you are purely theoretical, which as evidenced, results in a non-relational RFS and non-keys, just state that. For such people the pretence at Relational is fraud. It allows you to anoint the non-relational RFS as "relational", which is a serious fraud, because such RFS are devoid of the Integrity, Power, and Speed of the RM. Until someone like me comes along and takes the fraud apart.

> A logical consequence of the
> previous statement is that a surrogate "key" cannot be part of a
> relational schema.

I have specifically declared that it is not Relational, yes.

> E.g., this representation:
>
> CountryId CountryName
> 1 Australia
> 2 Australia
>
> depicts two identical tuples. Then the instance above is not a relation,
> hence the model is not a relational model. Fine.

Agreed.

> But then, that diagram is ill-formed, as it does not represent a logical
> relational model.

Agreed. Already stated by me.

If you have arrived at that conclusion now, then yes, you are correct.

> Saying that it is not in 3NF is akin to saying that an
> integer does not run at 100Mph. Not false, but not particularly
> significant either.

???

You may be running into problems by chopping up my statements, and then viewing each of them in isolation. Here is what I stated. Please do not break this up into fragments: == Quote: Fails Relational, Fails Third Normal From == -- Fail --
The model fails for the following reasons (many instances of said reasons). The ordering of the issues is mine, in that if it isn't Relational, it is not worth bothering about the specifics of a Normalisation error. It fails *Relational* mandates on two counts, it is non-relational.

1 Definition for keys, from the RM: "a Key is made up from the data" __ A surrogate (RecordId) is not made up from the data __ There are no keys on the data

2 The RM demands unique rows (data)
__ A surrogate does not provide row (data) uniqueness

3 Re Normalisation, (if we consider that outside the Relational Model), it breaks Third Normal Form, because in each case, there is no Key for the data to be Functionally Dependent upon (it is dependent, yes, but on a pork sausage, outside the data, not on a Key).

[4] This is a classic Record Filing System, anti-relational. It is nowhere near ready for Normalisation, let alone Relational Normalisation.

None of you theoreticians identified any of that. == End Quote ==

[1][2][3] I stated that it was not Relational AND that it breaks 3NF. I stated that it was not Relational for several reasons,

[1] the no-keys-on-the-data issue was just one
[2] the non-uniquesness was another
[3] I stated that it breaks 3NF issue, and I gave the reason

Now, you have drawn my attention to [2], you have laid out yet another example (no idea why), and agreed with it. Good. But then you jump over to the conclusion of [3] and make a comment about it as if the reasons for [2] applies to conclusion of [3]. I did not make that connexion, therefore I cannot defend it. It is you who is make that connexion (and I agree, yes, the connexion is absurd). To wit, the conclusion for [3] is for the reasons given under [3], the conclusion for [3] is not for the reasons given under [1] and [2].

(For the purpose of being complete, since we are examining this issue again, by "pork sausage", I am referring to your surrogate, the non-key that you are treating as "key". I have stated that such an act is not only wrong, it is self-confusing.)

That kind of mistake (fragmenting my statements; then mixing them up; then making an incorrect attribution) is typical of people who have gone to universities after 1983, the consequence of losing the war. They actually teach pharisaic argument as "logic".

Therefore this does not make any sense at all:
> Saying that it is not in 3NF is akin to saying that an
> integer does not run at 100Mph. Not false, but not particularly
> significant either.

> Btw, your argument about surrogate "keys"

I argued no such thing. Surrogates are not Keys by any stretch of the imagination. I argued surrogates.

> contradicts the business rule
> "Country is uniquely Identified by (CountryId)", stated in the document
> you have shown us. Why doesn't that statement imply that CountryId is a
> key? How would that be different from "Country is uniquely Identified by
> (Name)" (which would imply that Name is a key)? The difference is in the
> eyes of the beholder, it seems.

I will answer the whole para, then answer each sentence.

In that same post, mine, part of which I have quoted above,

I state:
> > The model fails for the following reasons (many instances of said reasons).
That means, I have not enumerated each instance.
I state that the developer has corrected the mistakes identified as per above conclusions; that he has made a second submission, Address [B]; to which I have given a link; and then I stated:

> > My corrections to his /previous/ model [A] are on page 2.

There, I have enumerated (not completely) and detailed his some of his mistakes.

It appears you have not read Address [B] page 2.

I suggest you read it.

For your convenience, I will quote the relevant part (the corrective notes, in the form of a Post-It-style note, stuck on top of his Business Rules) here: == Quote ==

*	These are idiotic Business Rules, they merely declare the Records in your Record Filing System.  Of course, every File in a RFS is independent, but not so in a Relational Database.
*	I have told you one hundred times, if you start the design process by sticking RecordId on every box, you cripple yourself and the modelling exercise.
*	IDEF1X is a Methodology, not merely a notation.  Follow it.
*	Go and discuss with the business, and determine the real Identifiers, the real Business Rules.  
*	Find out what the data is, what it means, how it relates to all other data in this cluster.

== End Quote ==

Now for each sentence.

> Btw, your argument about surrogate [] contradicts the business rule
> "Country is uniquely Identified by (CountryId)", stated in the document
> you have shown us. Why doesn't that statement imply that CountryId is a
> key?

Ok, that means you do not understand the world of implementation.

On one side, where the business gives us "business rules", they are not to be taken as implementation imperatives. If taken as such, we would be merely clerks, implementing their requirements, without using the skills that they hired us for. Eg. we would implement a "business transaction" that updated six million rows, that hung the users up for 15 minutes in the middle of the day, and we would take no responsibility, because the business "told us to do it".

1.a Obviously, we do not do that. We exercise the skills we were hired for. Part of which is to implement OLTP Standard-compliant transactions. We do not view the business requirements as imperatives, we view them as initial requirement statements. We work back and forth, such that the requirements are modified, then accepted, and then implemented, such that they do not crash the system; such that the database does not have circular references; etc; etc; etc.

1.b So the example "business transaction" would be converted into a batch job that runs in a loop and executes six million OLTP Standard-compliant single-row transactions. The batch job keeps track of its position; is restartable; etc. SO they business gets the requirement they want, but not in the METHOD that they initially stated it. Ie. Just tell me what you want, don't tell me how to do it.

1.c On this one side, in no case is a business rule to be taken as an imperative.

2. On another side, I was merely answering a request from people here who evidently could not read an IDEF1X data model; who evidently could not understand that the predicates were in the diagram, and I asked the developer to fill them in.

2.a The way we do that is this:
___ All the predicates in the data model, *as well as predicates that are unknown to you people*, are covered under a formal documentary section "Business Rules". The predicates are included, precisely because we do not expect the users and auditors to fully understand the notation in the data model, so we spell it out for them.

2.b The developers had better know the predicates (a) from the business requirements given, that they have actually implemented, in the model, as well as (b) derived from the model, as a validation of it.

2.c In fact, the exercise of modelling, is to go back-and-forth with the users, using the data model as a communication tool, such that:

___ The model progresses incrementally
___ Understanding of the data, is progressively increased with every iteration
___ by *both* parties
___ The data model used for communication with the users, is incomplete without that "Business Rule" section
___ the discussions that are had are specifically to get them to agree that the stated Business Rules (which an IT person understands to be predicates) are correct.  Ie. it is a formal method of validating the model, and obtaining a signature.
___ The primary purpose of the data modelling exercise (with the model used as a subject tool) is to *understand the data*.  It is not to design a database.  After the model has achieved some maturity; after the users have agreed to it; then yes, the secondary purpose may be commenced, it may be ready for designing a database.  If you get those primary and secondary purpose mixed up, the resulting database will be a gross failure.

2.d This, btw, is exactly what is going on in this thread, with the model progressing incrementally A, B, and so on. ___ With the exception that you guys are more like users, who need everything spelled out, because you are not used to the models that we have been using since 1985, and you have little understanding of the modelling exercise ___ instead of data modellers, who can read models, forwards, backwards, side-ways, and determine the faults in the model, in short order.

2.e So the developer was just filling in the gap as requested by posters here, using the formal structures that are in place.

3. On the third side,
3.a It is plainly obvious to me (unfortunately not to you people), that he has:

- not understood the data
- therefore not understood how to Identify the data (the Keys required)
- therefore not correctly worked the FDs out
- erected a model starting with certain boxes in mind (a severe mistake, if one is trying to understand the data)
- stuck an ID column on every box that moves (which cripples his ability to understand the data, as data, and nothing but data; cripples his ability to perceive the data Relationally; cripples his ability to determine the required entities) 
- which is the classic hallmark of a Record Filing System
- which has none of the Integrity, Power, and Speed of a Relational Database
- which you people use, each and every time

Therefore he is screwed before he starts.

Hence my annotations to him in my quote above.

Therefore:
> Btw, your argument about surrogate [] contradicts the business rule
> "Country is uniquely Identified by (CountryId)", stated in the document
> you have shown us.

is plainly incorrect. CountryId uniquely identifies precisely NOTHING. Instead of recognising that the stated BR is an error, a false and impossible statement, you are taking it as fact.

As detailed above, and in my annotations in Address [B] page 2, the idiot worked backwards, and merely wrote BRs from the model, which is an RFS. Since the model is incorrect, the BRs are therefore incorrect. Separate to it being false, for any human being.

> Why doesn't that statement imply that CountryId is a
> key?

That statement taken alone, ignoring the fact that it is a severe error, does carry that implication, yes.

Which is one of the reasons (here, exemplary) that viewing surrogates as "keys" is so dangerous. You mess with the meaning of Key, and you mess your own self up.

Screwed, the same as you people, crippled, same as you people, because you take the same RFS approach, draw rectangles without having a clue about the entities that are *actually*required; without understanding the data; without determining the Keys.

Actually, he is less screwed than you, by one recognisable increment, because he understands that surrogates are not Keys.

But that is ok, we can approach the issues in the sequence that you are following (no, Derek, surrogates are ok; we have our magical mysterious ways of ensuring the {keys|fds|dependencies|fks|relationship|etc} re correct), the result (for me) is the same, I will determine various faults such as: - Non-compliance with the RM
- Beaks {1NF|2NF|3NF}
that you will fail to determine. And do so in five minutes.

I had better identify the reasons for that. The difference is, I am working with a deep understanding of: - the RM
- 3NF
___ which contains your BCNF, 4NF, 5NF fragments ___ and any NF fragments that you might declare in the future

- data modelling, specifically the standard for it, IDEF1X
- your 17 NFs fragments, carefully inspected and dismissed
- your level of abstraction, carefully examined and determined to be too abstract, due to loss of meaning, and dismissed
- knowledge of RFSs vs RDBs, and knowledge of the specific failures of RFSs (Loss of Integrity, Power, Speed that is in the RM)
- the specific issues concerning surrogates

Whereas you are working with:
- zero understanding of the RM
- a tiny understanding of 3NF
___ due to your 17 fragments taking precedence over 3NF ___ thus denying 3NF

- zero understanding of data modelling
- a small ability to read and understand a standard data model
- your 17 NF fragments
- your level of abstraction
- zero knowledge of RFSs vs RDBs, and zero knowledge of the specific failures of RFSs (Loss of Integrity, Power, Speed that is in the RM)
- thus zero knowledge that you are creating an RFS that is nowhere near an RDB, but declaring it to be an "RDB"
- which relies on surrogates as "keys"

The difference, in sum, is that I determine errors and failures in five minutes, in five words, whereas you take the long and winding trail through the forest of your mysteries; fail to determine those errors; and have six hundred words to justify your mysteries.

> > (There is an interaction going on with Jan, where we have teased out that you
> > people have a fragment of 3NF which you fraudulently call "3NF". Result
> > being, as evidenced here, you accept a total failure, a non-relational Record
> > Filing System, to be satisfactory.)
>
> No. We are making a formal argument. In other words, we *interpret*
>
> CountryId CountryName
> 1 Australia
> 2 Australia
>
> as having *two* distinct tuples. And we are *assuming* that CountryId ->
> CountryName (as per the stated business rule). Formally, this does not
> look any different from
>
> X Y
> 1 A
> 2 A
> FDs: {X -> Y}
>
> where I don't need to care what those attribute mean any longer to tell
> that this is a relation, and its schema happens to be in 3NF (and 5NF,
> too).

This could be an excellent vehicle to Identify and Determine why the vehicle that you people use is broken, bankrupt. Thank you for expressing it clearly, in technical English, so that the 99% can deal with it, and not in the gibberish that is so beloved of the 1%.

It remains an error, irrelevant to the physical universe, relevant only in the interaction amongst yourselves.

Your formalism simply means that you can declare a non-relational RFS that breaks 3NF, as "relationaL", and "satisfying 5NF" (and now "satisfying your "3NF" "), and rely on the fact that you have arrived at such declaration on the basis of formal argument. You are missing the point that the argument (with or without the formal basis) is wrong; the determination that you made using that argument are wrong.

I have every respect for formalism. I have no respect for a formalism that is sooo isolated from reality (facts in the physical universe); sooo ignorant of other sciences (ie. established truths); sooo abstracted such as to lose the meaning of the very thing it is abstracting, such that it "proves" something that is patently false; devoid of meaning.

Eg. there is no problem at all to "prove" in an abstract, isolated, ignorant sense (formal argument), that pigs can fly. But that has to be very abstract, very isolated, very ignorant. And if you tell anyone who has not lost his mind that you have proved theoretically that pigs can fly, he will split his sides, because the extent of his laughter is physically damaging. But you make that statement with a straight face.

And if you make that statement to someone who is an authority, he will lock you up, in order to protect society for such insanity.

Eg. here you have "proved" that the heap of rubbish that the developer submitted "satisfies 5NF", with a straight face, despite the physically evidenced fact that it breaks 3NF for the reasons given.

I laugh in your face. ROTFLMAO, etc, etc.

I won't be addressing any of your formal argument, because that might give it some credibility, or suggest it has some value, where it has none right now. It remains an incredible mental abstraction that is wrong, wrong, wrong. Pigs can't fly. The data model breaks 3NF. Five minutes, not five days.

I am happy for you to continue this line of reasoning (I will note, and ignore, your formal arguments), if you are interested in: - determining the delta (why it fails Codd's and my 3NF vs why it passes you "5NF") - determining the delta (why it fails the RM vs why you think it is "relational").

But that the formal argument is devoid of credibility, has the value of toilet paper, is already proved.

> Since we are discussing upon different *assumptions*, we are not going
> anywhere with this game of "it is/it is not 3NF".

I have already proved it breaks 3NF, there is no argument to be had.
It is not a game, it is established science, which you are in denial of, a denial that you must maintain in order to maintain the relevance of your 17 "NF" fragments.
It does become silly, when you hold the yes-but-but-but position that 3NF is not valid (denying established science), that you will use only your 17 "NF" fragments, and your isolated, abstracted processes, whilst denying physical facts, to prove that it "satisfies "5NF" ". For you. Not for me, I have already shot it down, the pig is in flames.
You are maintaining your unreal universe (the validity thereof), but the task called for dealing with the model in the physical universe. Epic Fail.

> If you want to
> continue arguing on the irrationality or dumbness of the formal position
> above, feel free to do that, but mathematics is mathematics, and
> modeling reality is a totally different business (I have already
> expressed my almost complete support for you point of view, see above).

The request was to accept/reject the data model (modelling reality) on specific criteria (established science), in the physical universe, and you (plural) failed miserably, twice. Instead, you have give reasons (a second set of six hundred words) as to why your formal arguments might be valid in the unreal, abstract universe.

> > I think the pivotal difference is, Codd and I require a Key, before
> > Functional Dependency can be worked with, and you people have non-FDs, and on
> > non-keys.
>
> The formal definition of a key depends on the notion of FD: FDs come
> first, keys can be derived from them. I don't know what you mean by
> "non-FDs", but FDs on "non-keys" exist, because the reality to be
> modeled mandates it.

Totally rejected.

0. A non-FD is your theoretical notion of the FD, which is abstracted to the point where the FD has lost its meaning. Therefore, a fragment of the FD. Therefore it is a fraud to label that fragment "FD". Since I will not participate in your fraud, and in the confusion that fraudulent labels cause, I call it what it is: non-FD.

The established definition for Key is in the RM. The requirement was to comply with the RM, not a fractured notion of "key" from outer space, thus I couldn't care less about any other definition of Key, it does not apply. This again, proves you are not Relational.
The established definition for Key does not depend on the "notion of FD".
The Key comes first, the FD comes second, and it is used to validate the key. (Goin back-and-forth during the modelling exercise.)
- Aside --
I am quite aware of the abstract notion that non-FDs can be used to determine the key (or non-key because you assent to surrogates). But that is an exercise in solving a puzzle, much like solving Sudoku or a crossword on the train. That is relevant only when you have a Record Filing system, devoid of Keys with meaning, focusing on non-keys that have no meaning, to determine that such non-keys are somehow "valid". What I have labelled MMM, and again it is evidenced here.

All that [4] is totally irrelevant because it is not necessary for the requested task, because the Keys have not been correctly determined. Therefore to embark on evaluating the FDs (given, not your puzzle) is premature. And to evaluate non-FDs is hysterically funny. ROTFL, etc, etc.

Ok, I accept, MMM is addictive, once the long and winding trail through the mysterious forest is commenced, there is no stopping the train, it must reach its climax. -- End Aside --

5. FDs do not come first. Your non-FDs may well come first. I couldn't care less, it has nothing to do with the task, it is a rumble in the jungle. After you return, if you have some statement of value (that relates to the task), sure, I will listen. I don't need the details of the latest rumble. In this case, no statement of value has been made.

6. Repeat [3]. Your non-FDs on your non-keys may well exist. Repeat [5].

7. The reality to be modelled scoffs at it (the notions in your para, destroyed by my points above). It takes a massive amount of time and energy, and comes up with totally incorrect determinations. Null and void in the reality (physical universe).

7. The reality to be modelled, due the efforts of others (ie. not due to you people, who have delivered nothing of value since 1970), have tools and methods, that are unfortunately unknown to you, that can be used to determine the correctness and completeness (that is what a standard does) of a model, in simple scientific steps, and in a tiny fraction of the time it takes you.

8. Your notion of what the reality mandates is false. It is transparent that your mandates are totally self-serving (the abstract, unreal universe), and it has nothing at all to do with reality. It is yet another fraud to state that your self-serving, self-imposed mandates are "mandated by the reality".

Stated otherwise, that means, your model is bankrupt, isolated from the reality is alleges to model. And you are ignorant of models that do serve the reality, that reality does use [7].

> Finally, I have summarized in a few words the problem with that model:
> "it does not capture the real constraints"

Very good. That is a correct, although less-than-perfect, conclusion about the aspect of constraints (the 3NF and RM issues remain separate aspects, in which you failed). One sentence as to why "it does not capture the real constraints" would have made it a perfect conclusion.

Hint (re Address A): there are no keys on the data. Hint (re Address B): the keys have not been determined correctly.

I am grateful, more than you can imagine, that you did not give me six hundred words, that detailed your ramble through the forest, which exercise gave you the reasoning to form that conclusion.

> (sorry if it's more that
> six). I don't think my posts will ever be longer than yours :)

I have no problem with length. I do have a problem with length that does not contain substance, yes. Or length that "proves" the opposite of an established fact, or that fails to determine faults that are determined by other science.

Cheers
Derek Received on Sun Feb 08 2015 - 02:46:38 CET

This message: [ Message body ]
Next message: Nicola: "Re: Fails Relational, Fails Third Normal Form"
Previous message: James K. Lowden: "Re: EAV - again"
In reply to: Nicola: "Re: Fails Relational, Fails Third Normal Form"
In reply to Nicola: "Re: Fails Relational, Fails Third Normal Form"
Next in thread: James K. Lowden: "Re: EAV - again"
Reply: Nicola: "Re: Fails Relational, Fails Third Normal Form"
Reply: James K. Lowden: "Re: Fails Relational, Fails Third Normal Form"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message