Path: news.easynews.com!core-easynews!newsfeed1.easynews.com!easynews.com!easynews!newsfeed.frii.net!newsfeed.frii.net!news.compaq.com!uunet!sac.uu.net!ash.uu.net!an02.austin.ibm.com!ausnews.austin.ibm.com!newsfeed.btv.ibm.com!gazette.almaden.ibm.com!not-for-mail
From: "Paul Vernon" <paul.vernon@ukk.ibmm.comm>
Newsgroups: comp.databases.theory
Subject: Re: "Transactions are bad, real bad" - discuss
Date: Wed, 30 Apr 2003 19:17:05 +0100
Organization: IBM Almaden Research Center
Lines: 138
Message-ID: <b8p42r$1ltu$1@gazette.almaden.ibm.com>
References: <3EAE16E6.2070803@earthlink.net> <b8ljgd$383i$3@gazette.almaden.ibm.com> <e4330f45.0304300221.855dfca@posting.google.com>
NNTP-Posting-Host: 9.169.203.28
X-Trace: gazette.almaden.ibm.com 1051726749 55230 9.169.203.28 (30 Apr 2003 18:19:09 GMT)
X-Complaints-To: postmaster@almaden.ibm.com
NNTP-Posting-Date: Wed, 30 Apr 2003 18:19:09 +0000 (UTC)
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2800.1158
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165
Xref: core-easynews comp.databases.theory:26062
X-Received-Date: Wed, 30 Apr 2003 12:36:02 MST (news.easynews.com)

"Alfredo Novoa" <alfredo@ncs.es> wrote in message
news:e4330f45.0304300221.855dfca@posting.google.com...
> "Paul Vernon" <paul.vernon@ukk.ibmm.comm> wrote in message
news:<b8ljgd$383i$3@gazette.almaden.ibm.com>...
>
> > Well that is exacly what I am arguing. We do not require transations in
the
> > (logical level of) the relational model. In fact they are positively
harmful.
>
> The only problem I see is if you are in a client server enviroment,
> how can you check if the partial updates are correct?
>
> Imagine the operator is typing a long invoice. He could like to know
> if each item line is correct before typing the next line.
>
> Of course even with transactions he can not be 100% sure, but It will
> work in lots of cases, and it is practical.
>

Good one Alfredo. Another sneaky thing that transactions are supposed to be
good for. One of the original justifications for transactions I would imagine.

You know, it occurs to me that one reason transactions achieved such a mind
share is that they do seem to offer a solution to many diverse problems. The
weight of all these answers has caused people to accept that the concept
itself must be a good one, must be fundamental somehow.  However, I know of no
proof that transactions are fundamental; that the relational model is
incomplete without them. They are just taken as granted, and any problems that
they cause are just accepted as the way things have to be.  No wonder that the
database community has been off in a ghetto all this time. How can other
computer scientists take us seriously when we promote inessential and harmful
concepts is such a manner?

Anyhow, lets see what is really going on in this case and see if transactions
really are good for something (but no surprises for guessing the answer
already).

I want to offer you some alternate solutions.

A) Invoice lines are persistent

Each invoice line is stored in the database in a set of tables set aside for
'invoices in progress'. These tables have the relevant constraints on them and
as each line is inserted (and committed) into these tables, the constraints
are checked. Once an invoice is complete - defined by say a view over the
'invoices in progress' it can be moved into the 'main' database tables.

Frankly this is the best design. It avoids issues such as not defining the
constraint
    'Every invoice must have at least one line'
in the main database. This kind of omission happens all too frequently in
current databases. It is omitted to allow an Invoice to be build
'incrementally', with initially only a invoice header and no invoice lines.
The problem is that the constraints on an Invoice are actually more
restrictive than the constraint on an Invoice-in-progress

One proposed (I would say hacked) answer to this constraint omission, is the
suggestion that tuple and table constraints be enforced at each statement, but
database constraints should be checked at transaction commit. Chris Date has
written on dbdebunk.com on the fallacy of not checking all constraints at each
statement (reduction of  semantic optimisation opportunities was one
argument), and will write further in his upcoming 8th edition of database
systems. I suggest those in favour of SQL style constraint checking read his
writings.


B) Invoice lines are not persistent

As above, but the tables set aside for 'invoices in progress' are local,
in-memory tables on the client. Again, the constraints on these will be
similar, but not identical to those on 'completed invoices'.

On the matter of redundancy between the constraints and table structures in
the 'invoices in progress' tables and the 'completed invoices' tables. I say
that any exploitation of this redundancy is a matter for the database design
tool - it is a data modelling matter. Alternatively, or additionally, it could
be seen as a higher-level constraint matter. E.g. no matter the specific
schema for 'completed invoices', there will be a one-to-one mapping of
attributes and tuple constraints to the schema for 'invoices in progress'.


C) Invoice lines are not persistent, but don't want to design an 'invoices in
progress'

Lets say that the constraints on an 'invoice in progress' are identical to
those on 'completed invoice'. In this case, if we don't want persistency, we
might well want to use the DBMS constraints defined for 'completed invoice',
for our 'invoices in progress'.  With transactions you can 'simply' build your
invoice incrementally in the DBMS, relying on it's statement level constraint
checking. This is fine if you accept the downsides of transactions
(highlighted elsewhere).
The problem we have is: how do we get to test the DBMS constraints against our
'invoice in progress' without actually doing an update? Now the constraints
are available in the catalog, all we need do is pull them out and test them
against our proposed update.  The quick thinkers here will see that we have a
higher-order problem here. The constraints are stored as values in the
catalog, we need to 'execute' those values against the other values in the
database.

For example, making a higher-order syntax up on the fly...

    EVALUATE (Catalog.Constraints.RelExpresion) OVER (D UNION 'invoice in
progress rows')

which would return the Boolean value TRUE, if all constraints evaluate to TRUE
when evaluated over the current value of the database D unioned with our
proposed update.


Hoping that people have followed me thus far, the above is for me just yet
another example of where a critical look at transactions shows that they are
inessential, and further, that in fully replacing their functionality within
the confines of the relational model, we highlights a number of important
aspects of the model hitherto under exposed.

My list so far (newly extended with 'higher-order' queries (albeit ones
limited to evaluating values from the catalog)) includes:
    multiple assignment / dbvar assignment
    selection-assignment
    multiple relation selection
    catalog update intermixed with relation assignment
    higher-order queries using catalog values
    'constraint locking'
    triggered external actions
    plus others....



To conclude. Transactions are not essential and not good for this 'partial
update' problem. In fact, I believe they are good for nothing (in the
relational model).

Regards
Paul Vernon
Business Intelligence, IBM Global Services


