Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 


Re: INSTEAD OF INSERT trigger and SELECT

From: Volker Apelt <gq437x_at_yahoo.de>
Date: Mon, 17 Mar 2003 18:41:23 +0100
Message-ID: <lgptopc2m4.fsf@mu.biosolveit.local>


Galen Boyer <galenboyer_at_hotpop.com> writes:

> On Sun, 16 Mar 2003, gq437x_at_yahoo.de wrote:
>> Galen Boyer <galenboyer_at_hotpop.com> writes:
>>> On Fri, 14 Mar 2003, gq437x_at_yahoo.de wrote:
>>>>
>>>> We insert into one table from many clients via mass inserts.
>>>
>>> How does the data get there? Do the clients insert it? Do you guys
>>> get file feeds and load into a staging table? (ie, what's the steps
>>> to performing a mass insert?)

>> Thank you for asking. No, we are not importing files.
>>
>> The data is produced by the clients. It is a result of some

> Hm... Since the data source, ie client, isn't part of the key, the whole
> solution of how to handle duplicate rows should handle the fact that
> clients might send duplicate rows?

Yes, the clients can send identical rows, and the database should accept a duplicate as long as it is 100% identical to an existing row.

The SELECT part is a temporary debugging helper for us; it will not be part of the release version. The count = 1 branch catches true duplicates, while the ELSE branch handles the invalid-ID cases.
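For readers following along, the trigger logic I describe might look roughly like this. This is only a sketch: the view, table, and column names (my_view, my_table, id, data) are invented, not our real schema.

```sql
-- Sketch only: all object names are invented for illustration.
CREATE OR REPLACE TRIGGER my_view_ins
INSTEAD OF INSERT ON my_view
FOR EACH ROW
DECLARE
  l_count PLS_INTEGER;
BEGIN
  INSERT INTO my_table (id, data) VALUES (:NEW.id, :NEW.data);
EXCEPTION
  WHEN DUP_VAL_ON_INDEX THEN
    -- temporary debugging helper: true duplicate, or an id clash?
    SELECT COUNT(*) INTO l_count
      FROM my_table
     WHERE id = :NEW.id
       AND data = :NEW.data;
    IF l_count = 1 THEN
      NULL;  -- row is 100% identical to an existing one: accept silently
    ELSE
      -- same id but different data: the id calculation collided
      RAISE_APPLICATION_ERROR(-20001,
        'id collision for id=' || :NEW.id);
    END IF;
END;
/
```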

The debugging code has already caught some errors in another component, which had a dangling pointer. With that code in the trigger we could prove that the data sent and the data raising DUP_VAL_ON_INDEX were different. So the problem was neither the database nor the client data.

>> from the natural key data on the client (kind of compression). This
>> ID calculation is guaranteed to produce unique results.
>
> This looks like it might be the culprit. I would bet that two different
> rows are "hashing" to the same key (especially since the error is a PK
> violation?). Why not just use a sequence id as the PK and then use a
> unique index on the natural key? Then, check for unique index
> violations when inserting? (ie, you rolled your own "hashed natural
> key")
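For reference, the sequence-plus-unique-index scheme described above would look something like this (a sketch with invented names, not our actual DDL):

```sql
-- Sketch of the suggested alternative (all names invented).
CREATE SEQUENCE my_seq;

CREATE TABLE my_table (
  pk    NUMBER PRIMARY KEY,   -- surrogate key drawn from the sequence
  data  VARCHAR2(4000),
  CONSTRAINT my_table_data_uq UNIQUE (data)  -- natural key stays unique
);

-- On insert, a DUP_VAL_ON_INDEX from the unique constraint can only
-- mean a true duplicate of DATA, never a collision of computed ids.
INSERT INTO my_table (pk, data)
VALUES (my_seq.NEXTVAL, 'some client payload');
```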

If we drew ids from a sequence instead of the hashed key, the other clients would have to query the id values in order to use the same id for identical DATA.
>> after the calculation has started. So, the duplicate detection has to
>> be done by the database. That's where it belongs anyway, IMHO.
>
> To isolate this, you could add to the existing trigger so, on insert, it
> inserts all inserted rows to a new unprotected table and you could do
> some analysis on that table? I bet you'll find duplicate PKs with
> different attributes on that row.

The current problem is that the database is rejecting a valid duplicate. I have checked the current content of that table, recalculated the hash values, and compared them: all pairs of ID+DATA are OK. The rejected row is even already in the database.

Writing them aside into a different table may be an option for documenting them, but the utility.debug.f calls already do the same.

Two further questions:
1)
Is there any reason why a SELECT on the same table as an INSERT, within the same transaction, could cause a problem?

2)
On the day the problem started, I did some performance tests and changed the FREELISTS parameter for that table from 1 to 2. Could that cause a problem?
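For context, the FREELISTS change in question would have been made with something like the following (table name invented; FREELISTS can only be raised via the STORAGE clause, it takes effect for blocks allocated afterwards):

```sql
-- Sketch: raise the number of process freelists on the segment.
ALTER TABLE my_table STORAGE (FREELISTS 2);
```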

Thank you,

Volker

-- 
Volker Apelt                   
Received on Mon Mar 17 2003 - 11:41:23 CST

