Re: Fails Relational, Fails Third Normal Form
Date: Sun, 15 Feb 2015 20:08:02 -0500
Message-Id: <20150215200802.3c8c8bae.jklowden_at_speakeasy.net>
On Thu, 12 Feb 2015 10:52:54 +0100
Nicola <nvitacolonna_at_gmail.com> wrote:
> In article <20150212012341.d8329e51.jklowden_at_speakeasy.net>,
> "James K. Lowden" <jklowden_at_speakeasy.net> wrote:
>
> > Quoth Nicola on Thu, 05 Feb 2015:
> > > A few years ago I implemented a few algorithms from Koehler's PhD
> > > thesis in a Ruby script. Given a set of FDs, the script finds all
> > > the keys and all the minimal covers....
> >
> > My first reaction is a little unkind. I think this is what lawyers
> > call "assuming facts not in evidence". *Given* a set of FDs, the
> > program generated a 3NF database design. Hurray! Now, where to
> > find those pesky FDs for input?
>
> From the requirements? I'm not sure I grasp what your point is.
> Do you mean that turning some requirements into FDs is difficult (you
> would not be alone: Stonebraker says that "mere mortals do not
> understand FDS")? Or that FDs (and even MVDs, JDs, etc) comprise only
> a tiny fraction of real system's set of requirements, so it's not
> worth bothering?
> Computationally, database design problems are (highly) intractable in
> general. But in practice many problems can be solved. For example,
> finding the keys of a schema (given the FDs) is an inherently hard
> problem, but for schemas with up to a few tens of attributes or so,
> only in few cases you are not able to find them in a reasonable
> amount of time. On the contrary, finding all the minimal covers is
> much less practical, even when there are not many of them.
Even if the problem is NP hard in general, I suspect there are ways to prune the problem space such that more than "tens" of attributes is feasible. (Unfortunately, not many people interested in that kind of work are addressing themselves to this particular problem.) If human beings are able to design normalized databases with hundreds of attributes (1000 is a lot, 10,000 I've never heard of), they must be doing something the machine could also do. To me that's prima facia evidence the algorithms we're using are naïve.
Granted, that's a hand-wave, and you'd be justified to answer that if it's so easy, maybe I should just show everyone how it's done. I'm not saying it's easy. I'm saying we're making progress on other computationally hard problems by limiting the problem space, and that there's evidence this field could profit from the same work.
> FDs are a special type of first-order formulas. I can imagine
> defining an English-like syntax for them.
I'm sure, but the problem is not as simple as you suggest.
Notational convenience is a big deal, and more art than science. If I may say so, the field of databases hasn't produced much in the way of compilable languages. We got SQL (not from a lab) and stopped.
I had to learn the value of notation the way I learn everything: the hard way. Some years ago I took up an interest in dependency tracking for software packages. To represent packages and their dependencies in a relational database design is no big deal, maybe 10 tables and 50 attributes IIRC. But, how was the user of my database going to present the dependencies to it? With a sequence of INSERT statements? Hardly. Thus did I find myself in the world of language design when I thought the problem was only one of satisfying dependencies.
--jkl Received on Mon Feb 16 2015 - 02:08:02 CET