Re: "Transactions are bad, real bad" - discuss

From: Costin Cozianu <c_cozianu_at_hotmail.com>
Date: Tue, 29 Apr 2003 14:31:36 -0700
Message-ID: <b8mqpa$bf0u1$1_at_ID-152540.news.dfncis.de>

Paul Vernon wrote:
> "Costin Cozianu" <c_cozianu_at_hotmail.com> wrote in message
> news:b8mecc$bnbun$1_at_ID-152540.news.dfncis.de...
>

>>Ok, let's say you want only single statement updates,

>
>
> Agreed.
>
>

>>with single
>>statement supporting multiple database operations.

>
>
> No. A single statement performs a single 'update' - a 'update' to the database
> value. Strictly speaking we are replacing the current value of the database
> variable (dbvar) with a new value. If you wish to take the current dbvar
> value, delete a few tuples, insert a few tuples and update a few tuples all in
> many relations, then assign the result to the dbvar to be the new dabase
> value, you have still only performed one database operation.
>

That's a useful abstraction for you to claim that your model can be sustained, it's not a useful abstraction for me at all as the user of the database.

How you're going to transport or refer to the new database value in a client server protocol ? It's not making sense at all.

What I do with INSERT , UPDATE, DELETE SELECT is that I apply a transformation to the old database value. I couldn't care less most of the times what was the old value and what is the new value, only some properties thereof.

As a consequence I'm not receiving database values nor am I sending database values, I am applying operations. Some operations are a composition of many simpler operations. This composition should obey the properties of transactions, and should support some concurrency control features.

Otherwise, I won't be using your model at all, no thank you very much.

> Remember INSET, DELETE and UPDATE are not primitives, only assignment is.
>

Quite. +,-,*,/ are not primitive only *succ* is succ(0)=1, succ(1)=2 ... , + n 0 = n, + n succ(k) = succ( + n k), ....

Let's set aside programming languages for a while and let's joyfully program in Peanno's arithmetic, with a little ambition we can do it :)

>
>

>>Now, how do you design a client server protocol, in which the client
>>programmer wants to decide later in the host language some based on the
>>results from database and other possibly external conditions, what is
>>the next database update to be executed (of course in a transaction-like
>>mode if you claim you don't support tansaction).

>
>
> I think it would be most productive if you could give an example of such a
> scenario.
>

I'm sorry it is you who make claims here. If you don't have a systematic approach to address my needs, and claim you have a model that covers everything you're on the loosing side.

Here's a deal: I reserve a sum to be debited against a client account (the sum is not debited yet, just the available balance reflects the reservation - i.e. the client has committed to pay). Based on some other happenings in the client software (let's say I cannot performa the service, or I can, etc), I want either to go ahead with the debit, or cancel the transaction (ROLLBACK). Compose several of these happening

BEGIN
db->DB_STATEMENT1
client->ACTION1
db->DB_STATEMENT2
client->ACTION2
db->STATEMENT3
END Where each consecutive action depends on some results of the previous action, and at each point in time client should be able to decide a rollback.

> My short answer is, if the 'client programmer' want to freeze time then
> tough - that is not allowed in this model of reality.
>

Tough luck indeed. So we are back to using better models of reality. Like transactions.

>
>

>>How do you integrate databases and others in a distributed transaction ?

>
>
> The job of integrating databases and of supporting distributed databases is a
> systems issue, its at the *physical level* of the model. Obviously I'm not
> saying that the general concept of a transaction is not needed at such levels,
> I'm just saying that is real bad to allow such things to poke through to the
> logical, model level.
>
> If you want a distributed databases, then the DMBS will need to support
> transactions under the covers, but the users don't need to see this, they just
> update their database value as they would with a single database.
>

Well, another reality check for you: not all participants in distributed systems are databases. Even if some are databases not all need to support relational algebra. Therefore you cannot say

BIG_VALUE_OFTHE_WHOLE WORLD := NEW_VALUE_OF_THE_WHOLE_WORLD Imagine the scenario above with DB and client, compounded by several participating resource managers.

>
>

>>How do you allow for rollback, possibly even in the middle of an update
>>operation ?

>
>
> At what level? As far as users are concerned, there is no 'middle of an update
> operation'. If the DBMS runs out of disk space mid update, then it's business
> as usual; the update gets backed out and an error message returned to the
> users.
>

The middle of an update means exactly what I said the middle of an update. I perform an update, I get some results back, do a computation in the client, depending on my results, I'll issue some more updates, or I want to issue a rollback.

>
>

>>Related to the above, what is your model for concurrency control ? Do
>>you only allow for fully serializable executions ?

>
>
> Well (at the logical level) obviously. At the physical level, obviously not -
> just as long as the illusion, i.e. the actuality as far as the users are
> concerned, is full serializability. That the dbvar moves atomically from one
> value to the next in the same sequence that the CURRENT TIMESTAMP given for
> each user's query or update dictates.
>
>

>>How is database going
>>to decide that, if instead of a peacemeal statement by statement
>>transaction, it will have to wait somehow for the full single statement
>>behemoth ?

>
>
> This is no different from today. Well it is different as many existing DBMS
> systems do not respect the rule that transaction timestamp order needs to
> agrees with transaction serialization order. Banning transactions makes
> keeping to this rule much easier to implement.

Yes. You just shifted the burden to the client programmer.

> Not ignoring this rule does
> deprive us of a possible hack to improve performance. but because we don't
> have transactions waiting on external code (or worse, external events) we
> don't have such serious concurrency issues. Anyhow is it better to be fast or
> correct?
>
> To be fair, I've not investigated this aspect all that thoroughly, but I only
> see benefits here. For example, the DBMS knows the full extent of the database
> that any statement will touch. With a transaction, the full scope is usually
> unknown at the beginning of the transaction. Second, transactions encourage
> users to alter data in place and in a piecemeal fashion, banning transactions
> encourages users to build up data elsewhere then apply it in one go. This
> frees the existing data from contention and helps concurrency. Third, for
> updates that can be proved to be correct (e.g. divide by zero not possible,
> duplicate violation not possible, ...), long running update statements (say
> DELETE FROM BigTable WHERE Date + 1 Month < TODAY) can actually be *included*
> in subsequent queries/updates. So SELECT COUNT(*) FROM BigTable becomes SELECT
> COUNT(*) FROM BigTable WHERE Date + 1 Month < TODAY, while the big delete is
> still running...
>
> BTW If you could come up with an example of where you think a 'large' single
> statement will under perform, or provide worse concurrency than many
> statements within a transaction, then that could be useful to our discussions.
>

No, you made a claim, you have to provide at least as good mechanism as transactions do. You have to do that in a consistent and systematic fashion, it's not like: "give me this particular use-case, I'll program something ad-hock,and see, it can be done without transactions". I know it can be done without transactions, by reinventing some wheels, that is.

>>This is just to begin with, depending on your responses we'll see what
>>follows...

>
>
> Rubbing my hands ;-)
>
> Regards
> Paul Vernon
> Business Intelligence, IBM Global Services
>
>

Best,
Costin Cozianu Received on Tue Apr 29 2003 - 23:31:36 CEST

This message: [ Message body ]
Next message: Mikito Harakiri: "Re: "Transactions are bad, real bad" - discuss"
In reply to Paul Vernon: "Re: "Transactions are bad, real bad" - discuss"
Next in thread: Paul Vernon: "Re: Do Data Models Need to built on a Mathematical Concept?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message