calculating aggregates over row intervals in an update

From: Wil Simpson <akusai_at_mac.com>
Date: 6 Aug 2004 16:10:06 -0700
Message-ID: <36016a92.0408061510.687ecf92_at_posting.google.com>



what is the best way to calculate: (in the context of an update) the number of transactions that were entered in the same way as the current transaction out of the 5 last transactions by the cardholder (regardless of entry method)

or more generally stated: calculate some aggregate function over a subset of rows from an interval of rows defined around this row, i.e. the one being updated

the set-theory happy version:

CREATE TABLE cg (
cardnumber varchar2(16) not null,
entrytype char(1) not null,
chargedate date not null,
NBANAT5#DD0 numeric(7) null) ;

UPDATE cg
SET NBANAT5#DD0
= NVL(( SELECT COUNT(*)

        FROM cg t   
        WHERE   t.cardnumber = cg.cardnumber 
        AND t.entrytype = cg.entrytype  
        AND ( SELECT COUNT(*)   
              FROM cg t1   
              WHERE   t1.cardnumber = cg.cardnumber    
              AND t1.chargedate >= t.chargedate 
              AND t1.chargedate < cg.chargedate) <= 5
        AND t.chargedate <= cg.chargedate  ),0) ;

so, for every single row in the transaction history for the cardholder, it calculates how many rows exist between it and the current transctions to determine if it is in the last 5 or not. it works and when each cardholder has a sufficiently small transaction history then the performance is not TOO bad, but when the transaction history becomes long it is grotesque. (well, it's grotesque in the first place, but the response time will change from maybe a minute over a million rows with small transaction histories to being several hours over a million rows with long transaction histories)

if it were legal, then the following would perform better, but it isn't. (no correlated subquery in an in-line view, right?)

UPDATE cg1000
SET NBANAT5#DD0
= NVL(( SELECT count(*)

        FROM ( SELECT * from cg t1
               where t1.bank_account_number = cg.bank_account_number
               and t1.chargedate < cg.chargedate
               order by t1.chargedate desc) t
        WHERE ROWNUM <= 5
        and t.account_type = cg.account_type),0) ;

conceivable something like

SUM(decode(entrytype, this.entrytype, 1, 0) OVER (partition by cardnumber order by chargedate desc rows between 5 preceding and 1 preceding)

would work as well, but i don't see any means of using this in the context of an udpate statement. (can't use aggregate or analytic functions in update statements, right?) (and several customers are using 8i standard which i don't think has analytic functions, right?)

any suggestions?

thank you,
Wil Received on Sat Aug 07 2004 - 01:10:06 CEST

Original text of this message