Re: more on delete from join

From: Kevin Kirkpatrick <kvnkrkptrck_at_gmail.com>
Date: Fri, 28 Aug 2009 13:47:46 -0700 (PDT)
Message-ID: <8b751f60-779b-4cdd-8361-8ab27cdc4ea0_at_e27g2000yqm.googlegroups.com>


On Aug 28, 12:42 pm, paul c <toledobythe..._at_oohay.ac> wrote:
> Kevin Kirkpatrick wrote:

(snip)

>
> It might be interesting for defining an environment that allows the
> language features mentioned to reference an algebra or calculus (which
> don't have a notion of update in the first place), but I don't see why
> normalization needs to be introduced to implement updating of bases nor
> of views. If it does need to be, then presumably the updating of any
> relation somehow depends on how 'normal' it is.

Perhaps I'm missing something. But given EMP{ENO, ENAME, SAL}, and business rules that allow a user to assign ENAME but forbid that user from assigning SAL, it is impossible for that user to express: UPDATE EMP SET ENAME = 'JOHN DOE' WHERE ENO = 'E123'; as a relational assignment which didn't violate the security business rule.

> I don't see what
> triggers have to do with updating either, even though some
> implementations require them.

I only brought up triggers as an analogy - an example of a tool which attains ends that, IMO, can always be accomlished by preferable means (e.g. better data modeling and better constraint support of the DBMS). My point was to say that perhaps "updatable views" are similar - perhaps they are being used to accomplish tasks which might be better addressed in a manner that avoids the nonsensical notion of "asserting conclusions".

> By the way, why assume that CURRENT_USER
> is not updateable?
>

Great question, cuts to the heart of the matter: It can't be updated because it is a view. It returns an conclusion, and it is not (IMO) valid to assert conclusions. In this case, the view returns the *conclusion* that the current user of the database connection has been authenticated as <ENO>. Perhaps CURRENT_USER is based on the join of two base relvars

DBUSER {USERNAME} -- in many DBMS's, this is actually a global variable, e.g. "USER" in Oracle, but for our purposes, we'll assume it is a base relvar
EMP_USER {USERNAME, ENO} -- this would associated each username with a specific employee who is allowed to log in as that database user.

So CURRENT_USER is defined as the view {ENO} JOIN (DBUSER, EMP_USER)

So if I logged in as USERNAME = 'U53', and 'U53' is the authenticated user id of employee 'E143', a query of CURRENT_USER would return 'E143'. Now, if I suddenly wanted the database to conclude that I was employee E144, I might execute the statement:

UPDATE CURRENT_USER SET ENO = E144; Let's assume this works, and I get the message such as "1 row updated". Furthermore, I query CURRENT_USER, and I get a response 'E144'. Now, what has been asserted behind the scenes?

  1. The USERNAME with which I've logged in changed from 'U53' to 'U55' (the USERNAME assocated with 'E144') ?
  2. The USERNAME stays 'U53', but 'E144' become the ENO associated with 'U53', and 'E143' is switched to the USERNAME 'U55'?

Either of these changes are ways that the DBMS *could* modify the propositions of the database such that my desired conclusion will be reached. But I don't need to dig into the relational theory to know that, from the end-user perspective, neither is the *sole* correct approach to this - if forced to change the CURRENT_USER without an updatable view, different users might use either of these tactics.

Consider another example - imagine you have a just a single table in your model which deals with BOSS information:

BOSS {ENO, BOSS_ENO} "<ENO> works for <BOSS_ENO>"

Now, imagine an end-user says, "I want to be able to update the database and store the proposition 'ENO=E12 either works for ENO=E43 or ENO=E44'." What is the appropriate response? Hopefully, you'd not miss a beat in saying, "The data model does not support propositions of that form", perhaps followed by an helpful, "Do we need to modify the model such that it can accomodate statements like that?". Basically, it's the same response as you'd give for "Can I enter an employee's marital status?", given a data model with no notion of marital status.

Now imagine a second end-user approaches you and says, "I have a query which returns all employess who report to either E43 or E44. I'd like to see Jane (ENO='E12') to appear in the query results, but I'm not sure which of those two she works for. How can I update the database so she appears in my query, without assigning her to one or the other?" Does this end-user get a different answer? Of course not. One simply can't assert propositions that don't fit predicates of the underlying data model.

So if a third end-user says, "I created a view that shows me all employees who report to either E43 or E44 and I want to insert tuple {ENO='E12'} into this view", why is it that we suddenly have an interesting problem in need of deep theoretical analysis? After all, if nobody is talking about a "correct" way for the DBMS to handle a statement like this:

INSERT INTO (SELECT ENO FROM BOSS WHERE BOSS_ENO IN ('E43','E44')) VALUES ('E12'). And nobody is talking about a "correct" response to this: INSERT INTO (SELECT ENO FROM BOSS WHERE BOSS_ENO = 'E43' UNION SELECT ENO FROM BOSS WHERE ENO = 'E44')
VALUES ('E12'). Then why do we have endless conversations about "correctly" handling this:

CREATE VIEW A_NAME_FOR_A_QUERY AS SELECT ENO FROM BOSS WHERE BOSS_ENO = 'E43' UNION SELECT ENO FROM BOSS WHERE ENO = 'E44'; INSERT INTO A_NAME_FOR_A_QUERY VALUES ('E12'); ?

> When it comes to updating, I'd prefer to use as few concepts as
> possible, named sets of tuples, algebraically expressible constraints
> and a set of algebra operators. Design matters might determine
> particular results but a logical engine shouldn't distinguish between
> designs, that's what the old hierachical and network systems did.
> Regarding an UPDATE verb, it is probably simpler to do what TTM does and
> assume an algebra that has extend and rename operators. I suppose a
> dbms that follows a mere algebra could distinguish 6NF relations, but I
> think it would be ponderous to use.- Hide quoted text -
>
> - Show quoted text -

It's kind of an aside to this conversation, but my underlying idea here is that if a business rule differentiates between two parts of a predicate, e.g. "Some people can update this column, but not that column", then the data model should treat them as two separate predicates. I believe it's the same basic logic used by Darwen & Date when they argue for using 6th normal form when dealing with parts of predicates which may have different date ranges (if i recall correctly, they'd have you decompose EMP{ENO,ENAME, SAL} into EMP_NAME {ENO, INTERVAL, ENAME} and EMP_SAL {ENO, INTERVAL, SAL} if names and salaries might have different temporal inverval values).

Anyway, as I said, it's kind of an offshoot from the conversation about updatable views; perhaps we can pick it up in a different thread. Received on Fri Aug 28 2009 - 22:47:46 CEST

Original text of this message