Re: Oracle 9i Limitation (Bug?) in Identifying Key Preserved Tables in Views

From: Dan Berkery <danberkery_at_yahoo.com>
Date: 27 May 2003 10:28:55 -0700
Message-ID: <d5831952.0305270928.7744c5d7@posting.google.com>

Dave, thanks for the alternate construction. In my focus on the key preserved view construction, I missed this option. An embarassing oversight as it is far more straightforward approach. In my defense, I had ruled out most subquery based approaches since I could not afford anything that executed one subquery per row in foo_output. Your query using IN, however, seems to provide a potential for a very efficient execution plan that identified the unique set of foo_ids and then used that set to do foo_output lookups and performed the delete.

Over the weekend I will run some benchmark tests to see how the IN construction performs compared to a delete from inline view construction (with the schema modified so that Oracle recognizes the foo_output table as key preserved--just a hack to see how the query might perform if my initial construction were supported by Oracle). A key criteria for me is the ability to delete in batches (say 200M rows deleted 10M at a time--with commits--so that the rollback buffers are not blown out) so I will use that as the benchmark test.

I will post a comparison of the performance of the two approaches. (I don't have access to a huge box right now so I will probably run the comparison as 20M rows deleted 1M at a time). If the IN formulation is just as fast, I will write off my concerns about this issue.

Thanks,
Dan

"Dave Hau" <davehau-no-spam-123_at_no-spam.netscape.net> wrote in message news:<V8xAa.568$Te4.79_at_newssvr19.news.prodigy.com>...
> Consider this updatable view:
>
> delete from
> (select * from emp, dept
> where emp.deptno = dept.deptno);
>
> 14 rows deleted
>
> rollback;
>
> Now consider this updatable view:
>
> delete from
> (select * from emp, dept
> where emp.deptno = 1
> and dept.deptno = 1);
>
> ORA-01752: cannot delete from view without exactly one key-preserved table
>
> If you think about how Oracle most likely implements updatable views, this
> makes sense. My guess is that Oracle looks at the join columns to make sure
> they constitute a unique key on the non-key-preserved table. In the first
> example above, emp.deptno and dept.deptno are the join columns. Because
> dept.deptno is the primary key of the dept table, and emp.deptno is not the
> primary key of the emp table, Oracle knows that emp is the key-preserved
> table and dept is the non-key-preserved table in this updatable view.
>
> Back to your example, you found that the following updatable view works:
>
> delete from
> (select * from job_foos jf, foo_output fo
> where jf.foo_id = fo.foo_id
> and jf.job_id = fo.seq_num);
>
> But this one doesn't:
>
> delete from
> (select * from job_foos jf, foo_output fo
> where jf.foo_id = fo.foo_id
> and jf.job_id = 1);
>
> This is because for the first one, Oracle sees that the join columns for
> job_foos are foo_id and job_id, and this is the composite primary key for
> the job_foos table. So the updatable view works. However, for the second
> one, the only join column for job_foos is foo_id which is not a unique key
> for the job_foos table. The join column for foo_output is foo_id which is
> not a unique key for the foo_output table either. So the updatable view
> fails.
>
> What this means is: for updatable views, you need to have join columns
> constitute unique keys for all but one of the tables joined. The one table
> where the join columns do not constitute a unique key becomes the
> key-preserved table.
>
> In your case, I suggest you do this instead:
>
> delete from foo_output
> where foo_id in
> (select foo_id from job_foos where job_id = 1);
>
> Cheers,
> Dave
>
>
>
>
>
> "Dan Berkery" <danberkery_at_yahoo.com> wrote in message
> news:d5831952.0305261059.4979ee86_at_posting.google.com...
> > My apologies in advance for the long post, but this issue has been
> > nagging at me for months.
> >
> > My basic issue is that I want to use an inline updateable view
> > construction to do deletes on some very large tables. The details are
> > described below (with an example). However, before I get started, I
> > understand that there are other ways to do what is described below ,
> > but I think the inline view construction would be the most efficient
> > way to do what I need to do. At the very bottom of the message are
> > all the gory details of the real problem, but my key driver is
> > efficienty. Since the tables I am working with are very large--up to
> > 200 million rows for the larges, efficiency is a big concern.
> >
> > To get the essentials out first, I tested my example shown below on
> > Oracle9i Enterprise Edition Release 9.2.0.1.0 - Production running on
> > XP Home (I know an odd combination but it is a laptop and the OS is
> > not likely to matter for this one). The production systems run on
> > very large unix boxes (fully loaded p690s, superdomes, and sunfires
> > with all the trimmings).
> >
> > Below is a quick example to illustrate the issue (it is adapted from
> > the real problem but simplified to the essence).
> >
> > The system has a set of objects that are the fundamental work unit of
> > the system; let's call them foos with a schema and use the following
> > table creation command:
> >
> > CREATE TABLE foo (
> > foo_id NUMBER NOT NULL PRIMARY KEY)
> > /
> >
> > The way the system is used is to run jobs that process a subset of the
> > foos; using a table job_foos that maps foos to jobs (this is a clsssic
> > many to many resolution table since a foo will be part of zero of more
> > jobs and each job includes one of more foos):
> >
> > CREATE TABLE job_foos (
> > job_id NUMBER NOT NULL,
> > foo_id NUMBER NOT NULL,
> > PRIMARY KEY(job_id, foo_id))
> > /
> >
> > The jobs create a bunch of outputs but I just need one to illustrate
> > the issue (I included the gratuitous sub_seq_num to help illustrate a
> > point later):
> >
> > CREATE TABLE FOO_OUTPUT (
> > foo_id NUMBER NOT NULL,
> > seq_num NUMBER NOT NULL,
> > sub_seq_num NUMBER NOT NULL,
> > some_value NUMBER NOT NULL,
> > PRIMARY KEY(foo_id, seq_num, sub_seq_num))
> > /
> >
> > Now the gripe. As part of the business process, as the first step in
> > a job we delete an existing outputs for that job (actually it is a bit
> > more complicated but that is largely beside the point). I thought
> > this would be very easy using an updateable inline view construction
> > for a delete statement like this:
> >
> > DELETE FROM
> > (SELECT
> > *
> > FROM
> > FOO_OUTPUT fo,
> > JOB_FOOS jf
> > WHERE fo.foo_id = jf.foo_id
> > AND jf.job_id = 1)
> > /
> >
> > This generates an ORA-01752: cannot delete from view without exactly
> > one key-preserved table. This bothers me because I believe the key
> > requirements for a key-preserved view to allow deletes are satisfied:
> > 1) each record returned from the view corresponds to exactly one
> > record in foo_output and 2) there is no other table referenced in the
> > view for which this is true.
> >
> > To illustrate that there is nothing fundamentally wrong with my
> > concept the following statement is acceptable to Oracle (but makes no
> > sense in my application)
> >
> > DELETE FROM
> > (SELECT
> > *
> > FROM
> > FOO_OUTPUT fo,
> > JOB_FOOS jf
> > WHERE fo.foo_id = jf.foo_id
> > AND jf.job_id = fo.seq_num) -- changed literal for job_id to join
> > criteria
> > /
> >
> > Based on this statement being acceptable and the first not being
> > acceptable, I think the SQL engine is missing the fact that specifying
> > that a parent table's (job_foos' in this case) primary key component
> > is equal to a constant is just as valid as joining it to the child
> > table (foo_output in this case) in determining whether the child table
> > is key-preserved.
> >
> > Just to see if I could trick the SQL engine into seeing thing my way,
> > I tried the following (unsuccessful) construction.
> >
> > DELETE FROM
> > (SELECT
> > *
> > FROM
> > (SELECT
> > fo.*,
> > 1 job_id
> > FROM FOO_OUTPUT fo) fov,
> > JOB_FOOS jf
> > WHERE fov.foo_id = jf.foo_id
> > AND jf.job_id = fov.job_id)
> > /
> >
> > This generates the same ORA-01752: cannot delete from view without
> > exactly one key-preserved table.
> >
> > Again, to illustrate that there is nothing fundamentally wrong with my
> > new concept the following statement is acceptable to Oracle (but makes
> > no sense in my application)
> >
> > DELETE FROM
> > (SELECT
> > *
> > FROM
> > (SELECT
> > fo.*,
> > 1 job_id
> > FROM FOO_OUTPUT fo) fov,
> > JOB_FOOS jf
> > WHERE fov.foo_id = jf.foo_id
> > AND jf.job_id = fov.seq_num) -- changed join to column from fo
> > /
> >
> > Am I missing something or should Oracle handle this?
> >
> > Maybe Tom Kyte is reading and can point out my fundamental lack of
> > understanding of key Oracle concepts (actually his posts are always
> > far too nice to do that)--his posts are what got me so fired up to use
> > key preserved views in the first place.
> >
> > Thanks,
> > Dan Berkery
> >
> > PS For those who are interested, a little more background (not that
> > any of this is really your problem, but in case anyone is interested).
> > The fundamental thing I am trying to do here is to selectively delete
> > a bunch of records (but not all) from foo_output as quickly as
> > possible without blowing out rollback segments. This is not academic
> > as I have honestly run on systems where foo_output has 170M records
> > (fully loaded p690s, superdomes, and sun fires with all the
> > trimmings), so efficency matters. The best idea I have so far is to
> > do a delete on inlined view construction with limitations on rownum
> > (eg delete rows in 10M row chunks)--actually that is a bit of a
> > simplification since job_foos is broken up into batches and I can use
> > that to limit the delete's size. The delete is actually a bit hairier
> > than in the example because I have to join to a couple of other tables
> > to identify which records should be deleted. Fortunately, the schema
> > design is such that none of these other tables imperil delete from
> > inlined, key preserved view query construction. To add to the fun, in
> > the previous version of the real system I have never actually managed
> > to saturate the IO subsystem (multiple high end RAID arrays) so in the
> > next version I want to try to run as many as 50 deletes in parallel by
> > taking advantage of the fact that the foos in job_foos are mapped into
> > a set of MECE batches (a batch contains multiple foos within a job and
> > each foo in a job is in exactly one batch). Obviously, the odds that
> > I blow out rollback segments and/or generate snapshot too old errors
> > (the tables are concurrently being inserted into, selected from, and
> > deleted frow) is large but the customer has purchased some serious
> > hardware and expects that it be used as efficiently as possible (and I
> > agree with them!).
> >
> > If you are still reading all the way down here, thanks a lot for
> > taking the time. At least I asked a question that has not been asked
> > before :).
> >
> > Cheers,
> > Dan Berkery
> >
> > PPS The email is just for SPAM so please respond to the news group.
Received on Tue May 27 2003 - 12:28:55 CDT