Re: "Transactions are bad, real bad" - discuss

From: Paul Vernon <paul.vernon_at_ukk.ibmm.comm>
Date: Fri, 2 May 2003 15:51:17 +0100
Message-ID: <b8u0ov$2108$1_at_gazette.almaden.ibm.com>


"Costin Cozianu" <c_cozianu_at_hotmail.com> wrote in message news:b8pec1$ces2s$1_at_ID-152540.news.dfncis.de...
> Paul, I'm willing to give you a pass to almost all what you claim,
> although you have to provide more substance, so I'll give you a pass "in
> principle it can be done, the results are unclear".

Cool.

Could I paraphrase that, as

   "It is quite possible that in principle everything that can be done with transactions in the logical level of the Relational Data Model, can also be done without transactions"

and you would not disagree with such a statement?

Then I might want to add

     "... and without excessive circumlocution"

But I'd agree that I need to show more evidence that living without transactions is not too arduous. Of course, what I really want to show is that life without them is actually very much better indeed!

> Do you have a
> published paper somewhere ?

Nope. I've a draft that is in need of some rework. Once that is done, I'm open to offers from willing reviewers.

> But to my pertinent and very general example, you haven't responded

I was busy, but also, I realise now why I asked for a specific example. The problem is that the detailed semantics of transactions are not AFAIK generally accepted. Certainly different implementations have different semantics (and ones that can vary under settings such as Repeatable Read and Cursor Stability etc).

I think I might need to define (or find defns of) the 'transaction model' (or models) before I can properly attack them all. In the meantime I'll note my semantic assumptions below as I go

> instead raised a different example, which by the way can be resolved
> trivially, even with the current DB2 and transaction.
>
> >>Here's a deal: I reserve a sum to be debited against a client account
> >>(the sum is not debited yet, just the available balance reflects the
> >>reservation - i.e. the client has committed to pay). Based on some other
> >>happenings in the client software (let's say I cannot performa the
> >>service, or I can, etc), I want either to go ahead with the debit, or
> >>cancel the transaction (ROLLBACK). Compose several of these happening

I'll leave your example for a moment. That 'reserve' brings me to relational locking methods and I don't want to go there here just yet.

BEGIN
db->DB_STATEMENT1
client->ACTION1
db->DB_STATEMENT2
client->ACTION2
db->DB_STATEMENT3
END To take the easy way out, lets say that the semantics of the above is that there are no side effects and the 'now' of the transaction is the COMMIT time. By no side effects, I mean that none of the DB_STATEMENTs have the side effect of locking any rows in the database. I know that this is probably not reasonable to you, but as I said, this is the easy case.

This would be one generic txn-less translation of your generic txn.

  client->ACTION1 against (DB_STATEMENT1(dbvar))

  client->ACTION2 against (DB_STATEMENT2(DB_STATEMENT1(dbvar)))

  db-> dbvar := DB_STATEMENT3(DB_STATEMENT2(DB_STATEMENT1(dbvar)))

Do you follow?
Remember the final assignment to the database variable is just my logical way of finally applying the three DB_STATEMENTS to the database. Physically it would just be the few INSERT/DELETEs etc that the statements comprise of in that instance.

> >>Where each consecutive action depends on some results of the previous
> >>action, and at each point in time client should be able to decide a
> >>rollback.

> > It seductive isn't it, this rollback ability. A counter example.
>
> It is not seductive but very efficient in terms of development cost and
> in terms of factoring out functionality that in your model will have to
> be shifted to be a burden for your users.

So a mere matter of convenience then?

> What about the mere commit statements ? Not only rollback, but just
> commit, how do you make the client computation part of the single
> database statement ?

Depends if the client computation can be coded as (part of) a relational query.

> As for your example, that's a trivial problem to solve:
>
> >
> > db->DB_STATEMENT0 INSERT INTO 3G_Auction_Bid ('Network2', $11Billion')
> > BEGIN TXN
> > db->DB_STATEMENT1 INSERT INTO 3G_Auction_Bid ('Network2', $12Billion')
> > client->ACTION1 Wait for 2 mins // end of the auction //
> > db->DB_STATEMENT2 SELECT winning_bid FROM 3G_Auction_Result
> > client->ACTION2 If winning_bid > $11Billion THEN
> > db->DB_STATEMENT3 COMMIT ELSE ROLLBACK END
> > END TXN
> >
> > Would you allow that in your DBMS? If not how to avoid it? Vet all
> > applications using your DBMS! What an overhead.
> >
>
> First fo all, the code above is incorrect with regards to probable
> business rules.

> To protect against bad developers, I doubt there is any
> model in the world that could do that.

Well that is my goal, and I believe that my model could and should do exactly that. In the way that a web browser is meant to be protected against bad Java applet coders, by the inherent robustness of limiting execution inside a Java virtual machine, I want my databases to be robust in the face of stupid and even malicious users. All business rules are in the database, and the database does not allow anything to happen that breaks any of those rules.

> The second the insert if itb gets
> the lock, will prevent anybody else from submitting bids, and so on, so
> forth

I was not clear in my example. Every bid by each bidder is a separate row in the table. If a bidder bids twice, two rows get stored and the one with the latest timestamp is that bidder's final bid. I did not specify the locking semantics as they are purposely not an issue in this example. Assume optimistic locking if you wish.

> Second you can make the client automatically rollback if he violates
> business rules,

Which business rule? Making the assumption that each statment is individually timestamped, in this case his second bid was valid - it's timestamp met the bid deadline, its only the commit time that did not. I guess I also am making the assumptinon that the INSERT does not get re-applied at commit time with the COMMIT timestamp. If this is your suggestion that it is a shame that an INSERT that has already been accepted, now get rejected at COMMIT.

BTW one big advantage of my txn-less model is that proven statements never get rolled back (short of a system error)

> by using triggers, you can design a special column to
> get the timestamp of the transaction,

Indeed, but we need to define what the semantics of the 'timestamp of the transaction' are.
First, is there one transaction timestamp, or one per transaction component statement?
If there is just one is it the transactions start time, commit/rollback time, or something like 'as close to the commit time as embedded references to the actual value during the transaction allow'?

> create special rules to autocommit
> for particular transactions, and so on so forth.

In an application yes, maybe, but not (at least not very easily at all) in the database. I know of no database that can say, mandate 'auto-commit' on the modification of data in a table or an arbitrary database subset.

> >>>My short answer is, if the 'client programmer' want to freeze time then
> >>>tough - that is not allowed in this model of reality.
> >>>
> >>
> >>Tough luck indeed. So we are back to using better models of reality.
> >>Like transactions.
> >
> >
> > One of my key points is that
> >
> > ** Transactions are not a good model of reality **
> >
> > Rollbacks are not a good model of the world. I can think of no example
where
> > I can perform some actions in the real world and then decide that I did
not
> > actually want to do them and so roll them back. I cannot do things and
expect
> > others to not be able to see them until I'm happy and commit them. Heck I
> > wish I could, but we cannot freeze time, nor wind time back and undo what
we
> > have done. In reality we reverse things by taking compensating actions. I
if
> > withdraw too much money from my bank, I cannot 'roll back' my action,
rather I
> > need to deposit some of it back.
> >
>
>
> Yes, so most of the time these compensating transactions can be taken
> automatically by the database, while also taking care of concurrency and
> isolation issues.

I disagree that compensating transactions are a matter for automation by the database. They are a business modelling issue. Often there is more than one way to compensate for a previous action. For example if the database records who makes an action, then business rules about who is allowed to make a compensating action should not be decided by the DMBS.

> That's why we have so much theory on the subject.

Any advice on a good high level overview of this theory (my paper is seriously lacking good references). I don't particularly want to wade into Jim Grey's stuff (prob should though....)

> Now you want me to write code to do all these ? Forget about it.

I want your database to fully describe your business rules. Nothing of that can be delegated to the DBMS unless you want your DBMS to dictate some of your business rules.

> When I need a special case, I'll commit my transactions sooner rather
> than later, or even have the autocommit turned on, and I shall write
> code for special cases.

You keep thinking from the point of view of an applications developer, you need to think from the point of view of a user of the database.

[snipped for brevity some of the stuff on distributed systems]

> Yes, but I'll not be able to integrate that system with your
> transactionless database.

I don't think that is true. My claim is that my model can do every thing that transactions can do, so therefore I should be able to 'simulate' transactions in my model if that was needed for such an integration (assuming such an integration was desirable).

> You know transaction theory is not confined to
> the universe of databases.

I know. For example here are Web Services wanting transactions

   http://www.w3.org/2001/03/WSWS-popa/paper50

> They are an essential ingredient in the
> construction of many distributed algorithms.

A distributed algorithm works on a logically distributed system. A logically distributed database might be a useful concept. I don't know. It might even be a genuine 'extension' of the relational model (i.e. something that is based on the model, but has extra, complicating concepts). A 'distributed relational database' is, at the logical level and it's name notwithstanding, not a 'logically distributed system'.

[snip]
> Those mreley implementation issues, should lead you down the path that
> some implementation issues in design of advanced programming languages
> proved to be provably unsolvable.

??

[snip]
> >>Yes. You just shifted the burden to the client programmer.
> >
> >
> > Which burden? Although I probably agree with you. My model does, in
places,
> > require more work by users (and removes work from them in other places),
but
> > all the extra work is work to fully define the business rules of their
> > database, rather than delegating some of the rules (such as choose of
locking
> > protocol) to the DBMS and it's often arbitrary and inflexible defaults and
> > options.
> >
> >
>
> I think you seriously need to contemplate and quantify what this "extra
> work" means in a systematic way.

Agreed and I hope to get round to it...
Although I will say here that this "extra burden" is needed anyway if you want your databases to be a complete description of all your business rules.

> I hope it was a typo error when you implied we should not delegate
> things like choosing the locking protocol.

It was no typo ;-)

> As inflexible as that is in
> the DBMSes of today, it is correct and it works.

It is not correct IMO and works 'after a fashion', but I don't want to degenerate this into a debate about say the correctness or not of Oracle's multi-version read consistency.

> Delegate that to the user and then you'll see troubles.
>
>
> >>No, you made a claim, you have to provide at least as good mechanism as
> >>transactions do.
> >
> >
> > Agreed. I do provide a good as mechanism, well it's a lot better actually.
>
> Well, if you do provide that mechanism all is left to do is present that
> to us. What you presented so far is extremely unconvincing and rather
> flawed in my opinion.

It's a big topic and will take time to present all the aspects...

Regards
Paul Vernon
Business Intelligence, IBM Global Services Received on Fri May 02 2003 - 16:51:17 CEST

Original text of this message