calculating aggregates over row intervals in an update
Date: 6 Aug 2004 16:10:06 -0700
Message-ID: <36016a92.0408061510.687ecf92_at_posting.google.com>
what is the best way to calculate: (in the context of an update) the number of transactions that were entered in the same way as the current transaction out of the 5 last transactions by the cardholder (regardless of entry method)
or more generally stated: calculate some aggregate function over a subset of rows from an interval of rows defined around this row, i.e. the one being updated
the set-theory happy version:
CREATE TABLE cg (
cardnumber varchar2(16) not null,
entrytype char(1) not null,
chargedate date not null,
NBANAT5#DD0 numeric(7) null) ;
UPDATE cg
SET NBANAT5#DD0
= NVL(( SELECT COUNT(*)
FROM cg t WHERE t.cardnumber = cg.cardnumber AND t.entrytype = cg.entrytype AND ( SELECT COUNT(*) FROM cg t1 WHERE t1.cardnumber = cg.cardnumber AND t1.chargedate >= t.chargedate AND t1.chargedate < cg.chargedate) <= 5 AND t.chargedate <= cg.chargedate ),0) ;
so, for every single row in the transaction history for the cardholder, it calculates how many rows exist between it and the current transctions to determine if it is in the last 5 or not. it works and when each cardholder has a sufficiently small transaction history then the performance is not TOO bad, but when the transaction history becomes long it is grotesque. (well, it's grotesque in the first place, but the response time will change from maybe a minute over a million rows with small transaction histories to being several hours over a million rows with long transaction histories)
if it were legal, then the following would perform better, but it isn't. (no correlated subquery in an in-line view, right?)
UPDATE cg1000
SET NBANAT5#DD0
= NVL(( SELECT count(*)
FROM ( SELECT * from cg t1 where t1.bank_account_number = cg.bank_account_number and t1.chargedate < cg.chargedate order by t1.chargedate desc) t WHERE ROWNUM <= 5 and t.account_type = cg.account_type),0) ;
conceivable something like
SUM(decode(entrytype, this.entrytype, 1, 0) OVER (partition by cardnumber order by chargedate desc rows between 5 preceding and 1 preceding)
would work as well, but i don't see any means of using this in the context of an udpate statement. (can't use aggregate or analytic functions in update statements, right?) (and several customers are using 8i standard which i don't think has analytic functions, right?)
any suggestions?
thank you,
Wil
Received on Sat Aug 07 2004 - 01:10:06 CEST