Re: Concurrency in an RDB

From: Marshall <marshall.spight_at_gmail.com>
Date: 23 Dec 2006 17:58:32 -0800
Message-ID: <1166925512.486987.10450_at_i12g2000cwa.googlegroups.com>


On Dec 23, 5:08 pm, "David" <davi..._at_iinet.net.au> wrote:
> Marshall wrote:
>
> > I'm not sure if this is a trivial rephrasing of what you just
> > said or an actual disagreement, but no, I would not agree
> > that they "shouldn't" be enforced on every update.
> > However I would agree that there may be practical limitations
> > on so doing, such as the amount of time necessary to check
> > the constraint. Any constraint that can practically be checked
> > on every update should be.
>
> You suggest that performance is the only issue at stake. In examples
> like the above, a verification failure often points to an error in
> *previously* committed changes. Software development is a good example
> of a "non-monotonic" process. Sometimes you need to commit a change
> that will temporarily break the integrity of the system.
>
> Now you could argue that the user should be forced to make all the
> changes necessary for the DB to atomically change from one valid state
> to the next. However in some domains that could lead to long running
> transactions that take hours, days or even months to complete.

How is that not exactly and precisely a performance argument?

I will admit that I have seen examples of commits that break the system. For example, Perforce doesn't allow one to both move and modify a Java source file in a single transaction. Since Java source code specifies the package that it's in, and since the package must correspond to a directory, this is necessarily a compile-breaking checkin. But that's a limitation of the tool, not of the process; if Perforce simply supported move-and-modify, it would become a non-issue.

And in fact, over time I've become more and more aware of just how much pain in the software world is caused by dbms implementations that don't realize they're dbmss, and hence fail to get right things that were solved problems decades ago. This is so prevalent that I've called it Spight's Law, which in essence says you have a dbms whether you realize it or not. (And if you don't realize it, you have a crappy, broken one.)

> > > I have the impression (please correct me if I'm wrong) that your
> > > assumption that a DB should always be in a valid state is coloured by
> > > the (relational) problems that you have expertise in.
>
> > Certainly this is always true for everyone. However I am having
> > a hard time seeing the value of your approach given how much
> > less it lets us count on the dbms.
> Is it really a problem? A workflow can easily force a user to run the
> verification as part of the process of using the DB within a system.
>
> As an example, good software companies have a release engineering
> workflow that ensures that the release has passed various unit tests,
> regression tests etc before it can be released. It goes without saying
> that it must compile successfully.

I suppose it could be considered a matter of perspective. If the system requires checkins to compile, then we can say we have a database of valid source code. If the system allows checkins which don't compile, then we ought to say we have a database of text files that may or may not be valid in some particular language. By and large I've seen that people are happy with the latter.

In my experience, constraints hold if and only if they are centrally enforced. Those constraints that everyone knows of and which are supposed to be enforced in application code are a distant memory, victims of broken windows syndrome.
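
To make that concrete, here is a minimal sketch of what I mean by central enforcement (Python with sqlite3; the table and the rule are invented for illustration): a constraint declared in the schema is checked on every write, no matter which code path issues it, whereas a rule that lives only in application code holds only as long as every call site remembers it.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        salary INTEGER NOT NULL CHECK (salary > 0)  -- enforced by the dbms on every write
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 50000)")   # fine

try:
    # No application code has to remember the rule; the dbms rejects the row.
    conn.execute("INSERT INTO employee VALUES (2, -10)")
except sqlite3.IntegrityError as e:
    print("rejected by the schema constraint:", e)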

> > I'm also unclear on how much
> > I have to give up in the way of integrity enforcement. I'm having
> > a hard time building a mental model for that. Your intent only to
> > speak at a high level somewhat exacerbates this difficulty.
>
> > Hmm, I just had an interesting idea. Perhaps the issues your
> > idea raises could be dealt with as a "quality of service" issue.
> > Where one needs strict durability, one could so specify externally
> > to the application.
>
> > This is a bit tricky because of the question of guarantees of
> > desirable properties. One area I'm interested in is
> > static analysis, and that's entirely dependent on building
> > theorems from known properties of a system. Weakening
> > those properties might render some analysis techniques
> > unsound.
>
> Examples of that would be relevant to this discussion.

How ironic that now you are asking me to be specific.

Hrm. Well, my understanding of the limits of your proposal remains sketchy. However, if I understand you correctly: if transaction A and transaction B each issue an update in an incompatible way, the system picks one and discards the other, and the code that issued the discarded update is not notified of this in a timely manner. Is that correct? Can we characterize precisely what kinds of constraints are still possible, and which are lost? I didn't see that clarified; I apologize if I missed it. (I recall the phrase "complex constraints" but I don't recall a specific definition of what that was.)
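
To pin down what I'm asking about, here is a toy sketch of that reading (and it is only my reading, not your design): two writers race, the store keeps one value, and the losing writer gets no indication it lost. A plain Python dict stands in for the dbms.

store = {"balance": 100}
audit = []

def update(txn_name, key, new_value):
    # Last writer wins; the earlier write is silently overwritten and its issuer never told.
    store[key] = new_value
    audit.append((txn_name, key, new_value))

update("txn A", "balance", 150)   # A believes the balance is now 150
update("txn B", "balance", 80)    # B's write discards A's, with no notification to A

print(store["balance"])   # 80 -- A's picture of the database is now wrong
print(audit)              # both writers appear to have succeeded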

Suppose we have a constraint that a graph will not contain loops, software that traverses the graph, and an update that temporarily allows a loop. Static analysis will tell us that the graph traversal logic doesn't need to check for duplicate nodes, because there are no loops, so this code is omitted. However, if the traversal encounters a loop, the program will diverge where we thought we could count on it to converge.
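
As a rough sketch of that failure mode (in Python, with made-up graphs): the unchecked traversal is the one static analysis would let us write if acyclicity were guaranteed, and it never terminates the moment the constraint is violated, while the defensive version pays for the very check the constraint was supposed to make unnecessary.

def reachable_unchecked(graph, node, acc=None):
    # Written as if "no loops" were guaranteed: no duplicate-node check.
    acc = [] if acc is None else acc
    acc.append(node)
    for child in graph.get(node, []):
        reachable_unchecked(graph, child, acc)   # diverges if a loop exists
    return acc

def reachable_checked(graph, node, seen=None):
    # Defensive version: carries the bookkeeping the constraint was meant to eliminate.
    seen = set() if seen is None else seen
    if node in seen:
        return seen
    seen.add(node)
    for child in graph.get(node, []):
        reachable_checked(graph, child, seen)
    return seen

tree = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
print(reachable_unchecked(tree, "a"))            # fine while the constraint holds

looped = {"a": ["b"], "b": ["a"]}                # an update has temporarily allowed a loop
print(sorted(reachable_checked(looped, "a")))    # terminates: ['a', 'b']
# reachable_unchecked(looped, "a")               # recurses until RecursionError -- never converges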

My strong feeling is that software needs to move towards *more* correctness, not less, and that the way to get there is runtime constraints and static analysis. I don't see that anything else has a chance of working. (Unit tests sure ain't gonna get us there.) And I can't help but feel that any software engineer who gives up correctness for some performance has sold his birthright for a mess of pottage.

Marshall
