Oracle 9i Limitation (Bug?) in Identifying Key Preserved Tables in Views

From: Dan Berkery <danberkery_at_yahoo.com>
Date: 26 May 2003 11:59:08 -0700
Message-ID: <d5831952.0305261059.4979ee86@posting.google.com>

My apologies in advance for the long post, but this issue has been nagging at me for months.

My basic issue is that I want to use an inline updateable view construction to do deletes on some very large tables. The details are described below (with an example). However, before I get started, I understand that there are other ways to do what is described below , but I think the inline view construction would be the most efficient way to do what I need to do. At the very bottom of the message are all the gory details of the real problem, but my key driver is efficienty. Since the tables I am working with are very large--up to 200 million rows for the larges, efficiency is a big concern.

To get the essentials out first, I tested my example shown below on Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production running on XP Home (I know an odd combination but it is a laptop and the OS is not likely to matter for this one). The production systems run on very large unix boxes (fully loaded p690s, superdomes, and sunfires with all the trimmings).

Below is a quick example to illustrate the issue (it is adapted from the real problem but simplified to the essence).

The system has a set of objects that are the fundamental work unit of the system; let's call them foos with a schema and use the following table creation command:

CREATE TABLE foo (
foo_id NUMBER NOT NULL PRIMARY KEY)
/

The way the system is used is to run jobs that process a subset of the foos; using a table job_foos that maps foos to jobs (this is a clsssic many to many resolution table since a foo will be part of zero of more jobs and each job includes one of more foos):

CREATE TABLE job_foos (
  job_id NUMBER NOT NULL,
  foo_id NUMBER NOT NULL,
  PRIMARY KEY(job_id, foo_id))
/

The jobs create a bunch of outputs but I just need one to illustrate the issue (I included the gratuitous sub_seq_num to help illustrate a point later):

CREATE TABLE FOO_OUTPUT (

  foo_id NUMBER NOT NULL, 
  seq_num NUMBER NOT NULL, 
  sub_seq_num NUMBER NOT NULL,

some_value NUMBER NOT NULL,
PRIMARY KEY(foo_id, seq_num, sub_seq_num))
/

Now the gripe. As part of the business process, as the first step in a job we delete an existing outputs for that job (actually it is a bit more complicated but that is largely beside the point). I thought this would be very easy using an updateable inline view construction for a delete statement like this:

DELETE FROM
(SELECT
    *
  FROM
    FOO_OUTPUT fo,
    JOB_FOOS jf
  WHERE fo.foo_id = jf.foo_id
    AND jf.job_id = 1)
/

This generates an ORA-01752: cannot delete from view without exactly one key-preserved table. This bothers me because I believe the key requirements for a key-preserved view to allow deletes are satisfied: 1) each record returned from the view corresponds to exactly one record in foo_output and 2) there is no other table referenced in the view for which this is true.

To illustrate that there is nothing fundamentally wrong with my concept the following statement is acceptable to Oracle (but makes no sense in my application)

DELETE FROM
(SELECT
    *
  FROM
    FOO_OUTPUT fo,
    JOB_FOOS jf
  WHERE fo.foo_id = jf.foo_id
    AND jf.job_id = fo.seq_num) -- changed literal for job_id to join criteria
/

Based on this statement being acceptable and the first not being acceptable, I think the SQL engine is missing the fact that specifying that a parent table's (job_foos' in this case) primary key component is equal to a constant is just as valid as joining it to the child table (foo_output in this case) in determining whether the child table is key-preserved.

Just to see if I could trick the SQL engine into seeing thing my way, I tried the following (unsuccessful) construction.

DELETE FROM
(SELECT
    *
  FROM
    (SELECT

       fo.*,
       1 job_id
     FROM FOO_OUTPUT fo) fov,

    JOB_FOOS jf
  WHERE fov.foo_id = jf.foo_id
    AND jf.job_id = fov.job_id)
/

This generates the same ORA-01752: cannot delete from view without exactly one key-preserved table.

Again, to illustrate that there is nothing fundamentally wrong with my new concept the following statement is acceptable to Oracle (but makes no sense in my application)

DELETE FROM
(SELECT
    *
  FROM
    (SELECT

       fo.*,
       1 job_id
     FROM FOO_OUTPUT fo) fov,

    JOB_FOOS jf
  WHERE fov.foo_id = jf.foo_id
    AND jf.job_id = fov.seq_num) -- changed join to column from fo
/

Am I missing something or should Oracle handle this?

Maybe Tom Kyte is reading and can point out my fundamental lack of understanding of key Oracle concepts (actually his posts are always far too nice to do that)--his posts are what got me so fired up to use key preserved views in the first place.

Thanks,
Dan Berkery

PS For those who are interested, a little more background (not that any of this is really your problem, but in case anyone is interested). The fundamental thing I am trying to do here is to selectively delete a bunch of records (but not all) from foo_output as quickly as possible without blowing out rollback segments. This is not academic as I have honestly run on systems where foo_output has 170M records (fully loaded p690s, superdomes, and sun fires with all the trimmings), so efficency matters. The best idea I have so far is to do a delete on inlined view construction with limitations on rownum (eg delete rows in 10M row chunks)--actually that is a bit of a simplification since job_foos is broken up into batches and I can use that to limit the delete's size. The delete is actually a bit hairier than in the example because I have to join to a couple of other tables to identify which records should be deleted. Fortunately, the schema design is such that none of these other tables imperil delete from inlined, key preserved view query construction. To add to the fun, in the previous version of the real system I have never actually managed to saturate the IO subsystem (multiple high end RAID arrays) so in the next version I want to try to run as many as 50 deletes in parallel by taking advantage of the fact that the foos in job_foos are mapped into a set of MECE batches (a batch contains multiple foos within a job and each foo in a job is in exactly one batch). Obviously, the odds that I blow out rollback segments and/or generate snapshot too old errors (the tables are concurrently being inserted into, selected from, and deleted frow) is large but the customer has purchased some serious hardware and expects that it be used as efficiently as possible (and I agree with them!).

If you are still reading all the way down here, thanks a lot for taking the time. At least I asked a question that has not been asked before :).

Cheers,
Dan Berkery

PPS The email is just for SPAM so please respond to the news group. Received on Mon May 26 2003 - 13:59:08 CDT