Re: Base Normal Form

From: dawn <dawnwolthuis_at_gmail.com>
Date: 13 Jul 2005 21:20:31 -0700
Message-ID: <1121314831.938920.140150_at_o13g2000cwo.googlegroups.com>


Jan Hidders wrote:
> dawn wrote:
> > Jan Hidders wrote:
> > <snip>
> >>>A candidate key of a relation could be modeled as the
> >>>domain of a function -- gotta love language, eh?
> >>
> >>That's still way too sloppy and I think you can do better than that.
> >
> > That was purposely sloppy -- many words with multiple meanings in
> > multiple contexts.
>
> Ah, my apologies, I think I misunderstood what you wanted to say with
> that sentence.
>
> > [...] I suspect I'm working in a much simpler context
> > than you are, however. For example, I might be explaining to a web
> > page developer who has never had a database behind their pages before,
> > how to model data. I would not bring in the term "relation" very
> > quickly, where I would use the term "function" right away.
>
> Hmm. I think I understand where you are coming from, but this has
> probably more to do with you

dang

> and the way you think about data modelling

like a programmer, right?

> than with some inherent difficulty in understanding of the relational
> model.

While the average comp sci major can figure out relations from the way they are typically explained, I don't like the way we have carved out the DBMS from the rest of the application, modeling stored, persisted, remembered data; validating such data; naming such data; etc., decidedly differently than other data that will hang around for less time. The current language does nothing to promote a holistic approach to software development unless you use mountain man's approach of hauling everything into the DBMS. I think that is the next best thing to hauling it all out of "there" (the typical SQL-DBMS). (The last statement was for your amusement only, mostly, sort of.)

> I have taught at university level and sub-university level, also
> to business students and chemistry students and I would *never* explain
> things in terms of functions. The first time the term would show up
> would probably at the point where I would discuss normalization.

I'm sure. So we disagree. You are much more knowledgeable about the theory than I, and almost all of my concerns are those of an "old" (seasoned?) practitioner. I've watched the software development industry devolve in some areas, and I'd like to see us take another look at the partitioning of the discipline and our software products. When I led back-to-back projects with a team doing work in a "pre-relational" database and then in a SQL-DBMS, a light went on, even if the words, theory, and experiments to back it up aren't yet there -- a "Blink" moment, if you have read that book.

Trimming down the terms so we can see where we have like items and where we don't by looking at the basics -- input to functions, processing by functions, output from functions -- is one small step in trying to show that databases need not seem altogether different than the rest of the work in developing software. It's all about data and it's all about functions. Input-processing-output. End of mystery.

> One of the dangers with viewing relations as functions is that it also
> tends to promote a view of the database where its only purpose is to
> retrieve data given a certain key.

OK, so you are, indeed, tapping into my brain (pretty scary). One of the dangers of declaring "stored" database functions as "relations" is that they seem so distant, as if we cannot access them. Set operators are fine and dandy, but people understand single transactions handily. They pay at the grocery store, get money from the bank, and type their name into a bazillion web pages that require it. Start with individual "records" (as in "the doctor's office keeps a paper copy of my CT scan record") and show that you have an API for it, then go to sets. How do we collect data? One record at a time. (I seem to be in a mood as if trying to entice Fabian to quote me, eh?)
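
To make the "records first, then sets" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the names `patients`, `put`, and `get` and the sample records are made up, and an in-memory dict stands in for whatever would actually persist the data.

```python
# Hypothetical illustration: a "stored function" from a key to a record.
# The dict plays the role of the stored data; the key is its domain.
patients = {}  # maps patient_id (the key) -> record (a dict of attributes)


def put(key, record):
    """Store one record under its key, one transaction at a time."""
    patients[key] = dict(record)


def get(key):
    """Look up one record by key; returns None if the key is absent."""
    return patients.get(key)


# Data is collected one record at a time.
put(1, {"name": "Ann", "scan": "CT"})
put(2, {"name": "Bob", "scan": "MRI"})
put(3, {"name": "Cy", "scan": "CT"})

# Single-record access: apply the function to one key.
one = get(2)

# Set-level access: the same stored data viewed as a set of records,
# filtered with a set comprehension instead of a key lookup.
ct_names = {rec["name"] for rec in patients.values() if rec["scan"] == "CT"}
```

The same stored data answers both kinds of question; the record-at-a-time API is simply the first one a newcomer meets.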

> That much too narrow view

it is just as narrow as describing data only in terms of sets and set operators. I would move to aggregate (by some defs) data soon enough and introduce the functions that can be applied to these "stored" functions.

> is exactly
> the reason why so many web-applications have incredible crappy
> database-programming behind it,

NOW we are addressing a similar issue, but with a somewhat different diagnosis. A lot of the LAMP, uh, crap, that is going into production is written by people who have learned what a relation is and how to put it in 3NF, without a clue how to develop quality, maintainable, flexible, scalable, etc., software. Programming languages are taught in one undergrad course and data modeling in another, with different words, concepts, pictures, tools, naming standards, ...

> where they forget you can also have
> other indexes and use the table "backwards" or where joins are not done
> in SQL but in PHP.

MySQL is setting us back more than a little, methinks. But, yes, all other functions need to be brought into the discussion, but I like starting with put and get -- lookUp.
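
For readers who have not run into the pattern Jan describes, a rough sketch follows, using Python's standard `sqlite3` module rather than PHP and MySQL. The `author` and `post` tables, the index name, and the sample rows are all assumptions invented for this example; it only contrasts a key lookup, a "backwards" lookup through a secondary index, and a join done in SQL versus the same join done by hand in application code.

```python
import sqlite3

# Hypothetical schema and data, held in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        title TEXT NOT NULL
    );
    CREATE INDEX post_by_author ON post(author_id);
    INSERT INTO author VALUES (1, 'jan'), (2, 'dawn');
    INSERT INTO post VALUES
        (10, 1, 'Base Normal Form'),
        (11, 2, 'Re: Base Normal Form'),
        (12, 2, 'Functions first');
""")

# "Forwards": the get / lookUp case, from key to record.
title = conn.execute("SELECT title FROM post WHERE id = ?", (10,)).fetchone()[0]

# "Backwards": from a non-key attribute to the matching keys.
# The secondary index post_by_author is available for this lookup.
dawn_posts = [row[0] for row in conn.execute(
    "SELECT id FROM post WHERE author_id = ? ORDER BY id", (2,))]

# Join done in SQL: one statement, the DBMS picks the strategy.
joined = conn.execute("""
    SELECT a.name, p.title
    FROM post AS p JOIN author AS a ON a.id = p.author_id
    ORDER BY p.id
""").fetchall()

# Join done in application code: one extra query per post row.
app_joined = []
for _pid, author_id, ptitle in conn.execute(
        "SELECT id, author_id, title FROM post ORDER BY id").fetchall():
    name = conn.execute(
        "SELECT name FROM author WHERE id = ?", (author_id,)).fetchone()[0]
    app_joined.append((name, ptitle))
```

Both joins return the same rows here; the difference is that the hand-written loop issues a query per row and hides the relationship from the DBMS.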

I'm starting to dabble with AJAX (like thousands of others, but I am the first girl on my block to give it a spin) and I can cope with XHTML, CSS, and even JavaScript and the dog-ugly way you have to instantiate an HTTP request object, but every time I think I'll try to figure out more than the basics of PHP, I gag. Now put that together with a stripped-down SQL-RDBMS and a person armed with the definition of a relation and rules for normalization and I am just really starting to sound like an old person, sheesh!

> If I would teach web-developers I would see it as one
> of my biggest tasks to prevent such misuse of the DBMS.

I'm thinking I'll prevent misuse by preventing use. From the file system and indexed sequential files to the DBMS and back to the file system with a semantic web or .xml files. Waddaya think? (Don't answer.)

> So I'm afraid I'm still very very wary of your terminology.

Golly gee, I haven't a clue why ;-) But you gotta admit that we have to mix something up here as software development in general and data modeling in particular are becoming a DIY (do it yourself) hobby with a backlash against more traditional approaches. Since the SQL-RDBMS is in need of a significant face lift anyway, I'm thinking there will be a breakthrough there with a not-only-SQL/relational tool going neck and neck with MySQL, reaping the benefits of XQuery for gets and puts, for example. I'd like to write a database app with no SQL in it (of course I can do that with 1970s tools today, but I'd like to do it with new tools).

> -- Jan Hidders

Apologies for being too long and wordy, but I'm turning in rather than proofreading and trimming back. Hopefully it was entertaining enough and not too stupid. Cheers! --dawn
Received on Thu Jul 14 2005 - 06:20:31 CEST
