Re: How do you like this SQL?

From: John Hurley <johnthehurley_at_gmail.com>
Date: Sat, 15 Dec 2012 16:20:51 -0800 (PST)
Message-ID: <e3233e32-eb04-4987-92d2-633a29af0eab_at_a2g2000yqh.googlegroups.com>



On Dec 15, 2:57 pm, Mladen Gogala <gogala.mla..._at_gmail.com> wrote:
> On Sat, 15 Dec 2012 12:19:44 +0000, Jonathan Lewis wrote:
> > It's one of the standard methods for "find the most recent occurrence"
>
> This can be generalized to find the most numerous item as well, something
> also frequently needed in applications. This generic method inevitably
> leads to performance problems once the data grows past a threshold: the
> problem with the method is the self join, which needs to scan the table
> twice. A first stab at solving the problem would, in my opinion, look
> something like this:
>
> WITH INNER AS
>      (SELECT PO_ID, LINE_ITEM_PO, COMMENT_PO, DATE_ADD_COMMENT,
>       ROW_TIMESTAMP,
>       MAX(ROW_TIMESTAMP) OVER (PARTITION BY PO_ID, LINE_ITEM_PO) AS REC
>       FROM PO_COMMENT
>       WHERE TYPE_PO_COMMENT='LE')
> SELECT PO_ID, LINE_ITEM_PO, COMMENT_PO, DATE_ADD_COMMENT
> FROM INNER
> WHERE ROW_TIMESTAMP=REC
>
> Generally speaking, this would go through the table only once, which
> makes it faster. Nevertheless, this too will hit the limit when the table
> outgrows the threshold.
>
> The trick that would solve the problem with the query would involve
> creating a new table, say PO_LATEST_COMMENT, containing the columns
>
> PO_ID, LINE_ITEM_PO, TYPE_PO_COMMENT and ROW_TIMESTAMP
>
> The table would be updated from an insert/update trigger on the
> PO_COMMENT table, so the most recent timestamp would always be
> maintained.
>
> Of course, this is a report, and it looks like a rather big one to me, so
> the company is probably flexible about the schedule. It can be scheduled
> to run overnight, or made into a materialized view which is accessed
> during working hours and refreshed nightly. I have seen such things
> several times, and the trigger-based solution seems to be the easiest one
> to implement.
> Of course, there is a price to pay for triggers, too: no direct load.
> That can sometimes be a problem, and the solution is to partition the big
> table and write specialized load scripts that do multithreaded bulk
> loads, one per partition. I have never run across a situation in which
> such customized load scripts would not be fast enough.
>
> --
> Mladen Gogala
> The Oracle Whisperer
> http://mgogala.byethost5.com
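A runnable sketch of the single-pass window-function approach, using SQLite (3.25 or later, for window-function support) as a stand-in for Oracle; the PO_COMMENT column types and the sample rows are assumptions based only on the column names in the post:

```python
# Keep only the latest comment per (PO_ID, LINE_ITEM_PO), in one pass:
# tag every row with its partition's MAX(ROW_TIMESTAMP), then filter.
# Schema and data are invented for illustration; SQLite stands in for Oracle.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE po_comment (
    po_id            INTEGER,
    line_item_po     INTEGER,
    comment_po       TEXT,
    type_po_comment  TEXT,
    date_add_comment TEXT,
    row_timestamp    TEXT
);
INSERT INTO po_comment VALUES
 (1, 1, 'first',      'LE', '2012-12-01', '2012-12-01 10:00'),
 (1, 1, 'second',     'LE', '2012-12-02', '2012-12-02 10:00'),
 (1, 2, 'only',       'LE', '2012-12-03', '2012-12-03 10:00'),
 (2, 1, 'other type', 'XX', '2012-12-04', '2012-12-04 10:00');
""")

rows = conn.execute("""
WITH latest AS (
    SELECT po_id, line_item_po, comment_po, date_add_comment,
           row_timestamp,
           MAX(row_timestamp) OVER (PARTITION BY po_id, line_item_po) AS rec
    FROM po_comment
    WHERE type_po_comment = 'LE'
)
SELECT po_id, line_item_po, comment_po, date_add_comment
FROM latest
WHERE row_timestamp = rec
ORDER BY po_id, line_item_po
""").fetchall()

for r in rows:
    print(r)   # one "latest" row per (po_id, line_item_po)
```

One detail worth noting: if two rows in the same partition share the maximum timestamp, this query returns both of them, exactly as the self-join version would.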

Thanks Mladen, I will take a look at your suggestion for the "WITH INNER AS" ... next week ...

Received on Sun Dec 16 2012 - 01:20:51 CET
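The trigger-maintained PO_LATEST_COMMENT idea from the quoted post can be sketched like this, again with SQLite standing in for Oracle (a real implementation would be a PL/SQL row-level trigger, and the table shapes here are assumed):

```python
# Sketch of a summary table kept current by an insert trigger, so reports
# read PO_LATEST_COMMENT instead of scanning the big PO_COMMENT table.
# SQLite trigger syntax is used for illustration; Oracle PL/SQL differs.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE po_comment (
    po_id INTEGER, line_item_po INTEGER,
    type_po_comment TEXT, row_timestamp TEXT
);
CREATE TABLE po_latest_comment (
    po_id INTEGER, line_item_po INTEGER, type_po_comment TEXT,
    row_timestamp TEXT,
    PRIMARY KEY (po_id, line_item_po, type_po_comment)
);
-- On every insert: create the summary row if missing, then advance its
-- timestamp only when the new row is more recent.
CREATE TRIGGER po_comment_ai AFTER INSERT ON po_comment
BEGIN
    INSERT OR IGNORE INTO po_latest_comment
        VALUES (NEW.po_id, NEW.line_item_po,
                NEW.type_po_comment, NEW.row_timestamp);
    UPDATE po_latest_comment
       SET row_timestamp = NEW.row_timestamp
     WHERE po_id = NEW.po_id
       AND line_item_po = NEW.line_item_po
       AND type_po_comment = NEW.type_po_comment
       AND row_timestamp < NEW.row_timestamp;
END;
""")

conn.executemany("INSERT INTO po_comment VALUES (?,?,?,?)",
                 [(1, 1, 'LE', '2012-12-01'),
                  (1, 1, 'LE', '2012-12-02'),
                  (1, 2, 'LE', '2012-12-03')])

latest = conn.execute(
    "SELECT * FROM po_latest_comment ORDER BY po_id, line_item_po").fetchall()
print(latest)
```

This trades a small per-insert cost for O(1)-sized report reads, which matches the post's caveat: the trigger fires per row, so direct-path loads that bypass triggers would leave the summary table stale.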
