Re: Agility and Data Design (was: Dreaming About Redesigning SQL)

From: Dawn M. Wolthuis <dwolt_at_iserv.net>
Date: 19 Oct 2003 19:19:28 -0700
Message-ID: <6db906b2.0310191819.4e374303_at_posting.google.com>


topmind_at_technologist.com (Topmind) wrote in message news:<4e705869.0310191137.57cbf1c7_at_posting.google.com>...
> > I held a dialog (that reads more like a monologue) with Pascal and it is
> > reproduced in total if you scroll to the bottom of the
> > http://store.tincat-group.com page and click on the Dick Pick / Ted
> > Codd Blue Brothers parody picture. I had not been reading this news
> > group until lately, but it strikes me that this is a group that might
> > be very entertained by that dialog.....
>
> Regarding that Acrobat page of yours, I would like to make a few
> comments.
>
> First off, I don't think relational dictates fixed sized variables.
> It is true that current implementations do this, but I don't
> see it a "rule of logic" that dynamically-typed, or type-free
> "cells" cannot be part of a RDBMS. You may want to take a look at:

You are absolutely correct that it does not dictate fixed lengths. That seems to be an implementation approach taken by most RDBMS vendors, however.

>
> http://www.c2.com/cgi/wiki?MultiParadigmDatabase
>

I'll take a look, but have not yet.

> But, your bicycle example bothered me a bit. It seems to be an
> IMS-like hierarchical model where an "all the things that X owns"
> sub-list is made for each person. In practice, we would probably get
> "Schwinn" spelled 30 different ways (I probably even mispelled it).
> Or, if the vendor changes names or merges, that information is not
> automatically "propogated" to all the child nodes. Plus, we have to go
> thru the parent to query about things people own. We now have to
> "navigate" a path not related to our query.

Storing forms/documents/propositions (or any other language-based objects) without piecing them out does not preclude data edits (validation checks). I did not mean to suggest that it did. Yes, where appropriate, there should be 'code files' (drop-down lists, or whatever name you like for them) to help ensure the integrity of the data through edits, including persisting the various codes used in such edits.
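A minimal sketch of that idea, assuming a hypothetical in-memory 'code file' of accepted brand names and a person record that keeps its bicycles as a nested list. The names BRAND_CODES, normalize_brand and add_bicycle are made up for illustration and do not come from any PICK product:

```python
# Hypothetical 'code file': the accepted brand spellings, keyed by a
# normalized form of whatever the user typed.
BRAND_CODES = {"SCHWINN": "Schwinn", "TREK": "Trek", "HUFFY": "Huffy"}


def normalize_brand(raw):
    """Return the canonical spelling, or reject a brand not in the code file."""
    key = raw.strip().upper()
    if key not in BRAND_CODES:
        raise ValueError(f"unknown brand: {raw!r}")
    return BRAND_CODES[key]


def add_bicycle(person, brand, model):
    """Append a bicycle to the person's nested list after validating the brand."""
    bicycle = {"brand": normalize_brand(brand), "model": model}
    person.setdefault("bicycles", []).append(bicycle)
    return person


person = {"name": "Pat"}
add_bicycle(person, " schwinn ", "Varsity")
```

The record stays in one piece, but a misspelling such as "Schwin" is rejected at entry time instead of being stored 30 different ways.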

> Doesn't XP also dictate factoring so that an algorithm or piece of
> info is in one and only place? In that case, you are not XP-compatible
> there because bicycle brand names and perhaps bicycle attributes would
> be *replicated* on several nodes, since multiple people own the same
> models/brands. Relational would have a "productID" or "modelID" most
> likely. The ID number is "dumb" such that changes in product names or
> attributes are not reflected in the keys. That is why, for example,
> that employee numbers make a better key than peoples names: names are
> often misspelled and people get married or divorce and change their
> names. Easier to fix it in one place than on multiple child nodes.

I'm sure there are many interpretations of XP, but one thing that I believe is a common theme is that one should not implement infrastructure & architecture for their own sake or for the sake of possible future features. The idea is that you look at your current requirements (or user stories) -- those you plan to put into the next iteration of the software -- and include what is required to meet those requirements, refactoring as required if there are previous features that were implemented in ways that need to change. So, yes, refactoring is definitely important for going in and factoring out common code to meet existing requirements, but not for meeting possible future requirements that might never occur. I'm not an across-the-board XP proponent, but the idea of aiming for agility in our software development processes makes sense to me.

> Yes, your approach might be more "agile" since it satisfies
> *immediate* requirements, but shanty towns are also more "agile", but
> are not practical in the longer run. In that sense, I reject pure
> agile approaches. Navigational (non-relational) databases do tend to
> have just such a shanty-town feel in my experience. The "paths" are
> built up based on the first requirements, but become awkward for tasks
> not tightly related to the first requirements.

They don't "feel" that way to me. I'm sure you will agree that we will often have different intuitions, and that is fine to state (I do so often), but it is not exactly a SOLID argument for your point, right?

> It is a matter of being agile for the short term or agile for the
> longer term.

I don't think so -- I think you can select tools, techniques, and methods that run lean and mean (agile) for both the short & long term. We have also certainly seen tools, techniques, and methods that bog down the efforts from the start, whether short or long-term. Again, my intuition does not square with yours -- nor does my experience.

> Relational tends to describe things independent of how they are used,
> and this is one of its best benefits. Sure, it does not always
> perfectly satisfy that goal, but comes closer than any existing
> database model.

I agree that the relational model attempts to persist data in a way that is independent of how it is used. Not only does it "not always perfectly satisfy that goal" but, because we are talking about storing data in a particular language and language is not fixed through all time, the goal isn't even possible to achieve. Also, I would contend that even trying to do so is not as useful as one might hope. What is holy or even all that useful about storing data independently of how it is used? It seems to me that the reason for trying to do that is to make the data persistence plus data constraints independent of, for example, the languages used to access the data, and that the reason for that is to help control the integrity of the data. I have no doubt that there is some good in doing this, but the cost, from my perspective, is not worth the benefit. That is not to say that it is not important to protect the data, but that there are other means of doing that. It can be protected with quality assurance (also flawed, of course) on the sum total of all applications that maintain it.

> Further, your complaints about relational rows and not being able to
> get the output report format you want are apples and oranges. You need
> a report generator to get various output formats. Relational queries
> are not meant to replace report writers when choosey about format
> (sure, it could get you a quick-and-dirty result). If we hard-wire
> information management to fit one report, it will be less likely to
> fit another. The same piece of info often shows up in multiple reports
> in practice. Why favor just the first round of requirements at the
> expense of others?

I'm not actually talking about the end-user -- I'm talking about the tool or app developer that is setting up whatever tools are used for reporting the data. There is a cost to exploding propositions that then need to be pasted back together -- you pay when you take the propositions apart to store them and again when you try to put them back together to report on them.
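To make the two costs concrete, here is a rough sketch, assuming a nested person record with a list of bicycles. The functions explode and reassemble and the field names are hypothetical; they stand in for the work a developer does around a normalized store and are not any particular product's API:

```python
from collections import defaultdict


def explode(people):
    """Take nested records apart into two flat row sets (the storing cost)."""
    person_rows, bike_rows = [], []
    for person_id, person in enumerate(people, start=1):
        person_rows.append({"person_id": person_id, "name": person["name"]})
        for bike in person["bicycles"]:
            bike_rows.append(
                {"person_id": person_id, "brand": bike["brand"], "model": bike["model"]}
            )
    return person_rows, bike_rows


def reassemble(person_rows, bike_rows):
    """Paste the rows back into nested records (the reporting cost)."""
    bikes_by_person = defaultdict(list)
    for row in bike_rows:
        bikes_by_person[row["person_id"]].append(
            {"brand": row["brand"], "model": row["model"]}
        )
    return [
        {"name": row["name"], "bicycles": bikes_by_person[row["person_id"]]}
        for row in person_rows
    ]


people = [
    {"name": "Pat", "bicycles": [{"brand": "Schwinn", "model": "Varsity"}]},
    {"name": "Lee", "bicycles": []},
]
person_rows, bike_rows = explode(people)
round_trip = reassemble(person_rows, bike_rows)
```

Both functions exist only to undo each other; a store that keeps the nested record as-is needs neither, which is the cost I am pointing at.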

> In short, I would rather improve upon relational rather than toss it
> to get some of the "dynamic" features you describe. I agree that the
> "Oracle model" of relational is a bit stodgy at times, but a return to
> navigational databases is not a fix.

Don't you find it interesting that XML is one of the TLAs and corresponding technologies that has taken off? It is a model for communicating data that is very similar to what MV/PICK is for persisting it (at the logical level). From my experience with various data models, it looks to me like a good bandwagon from a number of perspectives. So, my intuition -- again different from yours -- is that we would be well-served as an industry if we take the relational model and write up the history on it -- rise & fall. Then we can perhaps focus on writing quality applications with lower costs. Just an opinion -- I definitely don't have a full proof, but I am defending the PICK model from the standpoint of a futurist (self-declared, but I did do a talk on Mosaic in 1992 or '93, before there was Netscape; and on data marts about a decade ago), not as someone trying to cling to what is old.

Cheers! --dawn

> -T-
Received on Mon Oct 20 2003 - 04:19:28 CEST
