Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Removing dublicate records in table and hostarray
Brian Rasmusson <br_at_belle.dk> wrote (in
<354077AD.4037658F_at_belle.dk>)...
| Hi,
|
| Here is my detailed scenario:
|
| 1. I have table X with 1000's of transactions
| 2. I have destination table Y with millions of processed transactions
|
| I read 1000 records from table X into my hostarray.
| I want to find out which records in the hostarray already exists in the
| destination table Y.
|
This is where some of us are having a problem. Why don't you want to
read only the records which are not duplicates? It will not be a
problem.
So when you are filling your host array with transactions us a query
which only gets rows which are certain to be inserted
i.e:
select
t.rowid, t.key_col, t.col1, t.col2, 0 t.status
from
transactions_X t
where
not exists
(select null
from master_Y m m.key_col = T.key_col);
Note: this query will use the index on the Master table primary key.
| What i would like is a sql statement that can return a list of statuses,
| one for each record on my hostarray.
| I would do something like this:
|
| col1[1000]
| col2[1000]
| status[1000]
|
| col1+2 is filled by my select from table x
| status is filled by an examination in table y which records from my
| hostarray already exists.
| I imagine a statement returning a 0 if the record is not a dupe, and 1
| if it is. I can then insert all records with status 0, and know it wont
| failed because of the primary key constraint.
|
Now, insert all the rows in the host array then delete them from the
transaction table using the t.rowid as the key. Or mark them as
processed or whatever.
This process will also be quick as you are accessing them by row-id
(using the host array at a time processing - one call to Oracle for
each 1000 rows).
| I continue the steps above, until table X is empty. It may contain
| 100.000's of records, and the destination table contains millions of
| records, so there is no way i can have all records in memory at the same
| time. Thats why i read them in chunks of 1000 records.
|
At the and of the cursor you will have inserted (and removed, marked
or otherwise) all the non-duplicates.
What is left in the transaction table is the ones which already
existed in the master table. Do with these what you will.
| Table X is filled by process A, while process B reads data from it. This
| cannot be synchronized.
Has no bearing on the above processing. imo
graham Received on Fri Apr 24 1998 - 07:28:36 CDT