Re: A database theory resource - ideas

From: JOG <jog_at_cs.nott.ac.uk>
Date: 17 Mar 2007 07:55:08 -0700
Message-ID: <1174143308.711462.179560_at_y66g2000hsf.googlegroups.com>


On Mar 16, 3:43 pm, "Marshall" <marshall.spi..._at_gmail.com> wrote:
> On Mar 16, 6:58 am, "JOG" <j..._at_cs.nott.ac.uk> wrote:
>
>
>
> > So my question to cdt is to ask what /you/ believe the priorities for
> > such a resource would be?
> > - which pivotal questions are most misunderstood?
> > - where does most ignorance lie in our field?
> > - are there any crucial topics that you believe it would be useful
> > to address that I have not listed?
>
> My suspicion is that such a body of information would be
> able to grow for a long time. That is, I wouldn't think it
> would be particularly important to try to identify all
> the topics up front.
>
> That said, here are some of my thoughts:
>
> The purpose of normalization is the elimination of update
> anomalies, not space savings. (Dammit!)
>
> A Guide to the Normal Forms. These were hard for me; when
> I first started reading about them, the best information I could
> find generally plops you down in the middle of a bunch of
> fairly abstract statements about functional dependencies etc.
> and I had a really hard time linking it to anything familiar.
> It would be cool to have something that lists what all the
> functional dependencies are, identifies what the important
> ones are, and focuses on them. (BCNF, anyone?)
>
> > That Data models involve not just structure, but also manipulation
> > and integrity.
>
> I'd go further and emphasize that structure is in fact the least
> and the easiest of these.
>
> And how about an explanation of the term "query bias?" That
> one really took me a long time. In fact in general I find that,
> lacking the long-term exposure to data management and lacking
> the educational background in same, that a lot of the terms in
> the data management field are opaque. And like in most fields
> the terms tend to get used with the assumption that the
> listener already understands the terms and their long history!
>
> In fact I came to my first deep understanding of the fact that
> hierarchies can only organize around one dimension after
> wrestling with a particular tree in a particular app for a long
> time. I think this could be explained, with examples, in a way
> that was quite useful.
>
> A quick thought: it might be desirable to use examples that
> aren't customers/invoices or suppliers/parts. I think this
> contributes to the overall impression that RM is only for
> accounting applications. There might be some advantage
> to playing against type, like when Robin Williams plays the
> bad guy. Maybe a database of MP3s or something?
>
> Also: crappy writing is easy; good writing is hard. I think
> making this worthwhile will require some nontrivial investment.
> I am imagining a closed-edit wiki, maybe?
>
> Marshall

From these suggestions, the summary seems to be:

  • Normalization - its purpose is to prevent update, insert and delete anomalies, not just to reduce redundancy
  • Normalization - BCNF (which I believe can be described more incisively than 1-3NF)
  • Normalization - simple but relevant examples (I like the idea of MP3s)
  • Query bias - define and discuss (so that it may be used as a basis to explain the possible weaknesses of hierarchical/network/object approaches, and the protection query neutrality offers from changing requirements)
  • That this will be a gradual process. Perhaps MediaWiki would be a good canvas.
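
As a rough illustration of the first point, here is a minimal sketch of an update anomaly using the MP3 idea. The table layout, the names (Band X, Song A) and the `limit` parameter that simulates an incomplete update are all made up for the example; it is not meant as a definitive treatment of the normal forms.

```python
# Hypothetical MP3 catalogue; every name and value here is invented.

# Unnormalized: the artist's country is repeated on every track row.
tracks_flat = [
    {"track": "Song A", "artist": "Band X", "artist_country": "UK"},
    {"track": "Song B", "artist": "Band X", "artist_country": "UK"},
]


def update_country_flat(rows, artist, country, limit=None):
    """Set artist_country on matching rows.

    `limit` is an assumption of this sketch: it simulates an update
    that only reaches some of the rows it should have changed.
    """
    changed = 0
    for row in rows:
        if row["artist"] == artist and (limit is None or changed < limit):
            row["artist_country"] = country
            changed += 1
    return changed


def countries_for(rows, artist):
    """Return every country recorded for the artist."""
    return {row["artist_country"] for row in rows if row["artist"] == artist}


# A partial update leaves the data contradicting itself: an update anomaly.
update_country_flat(tracks_flat, "Band X", "US", limit=1)
flat_result = countries_for(tracks_flat, "Band X")

# Normalized: the fact "artist -> country" is stored exactly once,
# and tracks only reference the artist.
artists = {"Band X": "UK"}
tracks = [("Song A", "Band X"), ("Song B", "Band X")]

artists["Band X"] = "US"  # one fact, one place to change it
normalized_result = {artists[artist] for _, artist in tracks if artist == "Band X"}

print(flat_result)        # two conflicting answers for one artist
print(normalized_result)  # a single answer
```

The same flat layout also shows the other two anomalies: deleting Band X's last track loses the country altogether, and a country cannot be recorded for an artist who has no tracks yet.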

I appreciate this feedback. If I get this project off the ground, hopefully everyone here will have some vested interest in its content. It seems the one piece of common ground everyone here shares is that we want people outside of cdt to be more knowledgeable about data fundamentals - and that can only be a good thing.

Received on Sat Mar 17 2007 - 15:55:08 CET
