RE: Estimations are as good as it can be, what else can be done here?

From: Clay Jackson <clay.jackson_at_quest.com>
Date: Wed, 22 Nov 2023 23:50:09 +0000
Message-ID: <CO1PR19MB4984D7DA8F758F282AC474A89BBAA_at_CO1PR19MB4984.namprd19.prod.outlook.com>





One of my most favorite customers froze a copy of a standby clone weekly after the process “generate receivables” was complete Friday night. Many aggregations were then performed on that frozen set of detail data the rest of the weekend and used for analysis the following week.

In at least two places I worked, we did this or some variation on the theme; and I’ve recommended it MANY times.

Thanks, again, Mark!

Get creative, think out of the box and experiment! There are MANY ways to solve almost all problems!

Clay Jackson
Database Solutions Sales Engineer
<https://www.quest.com/solutions/database-performance-monitoring/> clay.jackson_at_quest.com  office 949-754-1203  mobile 425-802-9603

From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> On Behalf Of Mark W. Farnham
Sent: Wednesday, November 22, 2023 9:51 AM
To: learnerdatabase99_at_gmail.com
Cc: 'Pap' <oracle.developer35_at_gmail.com>; 'Jonathan Lewis' <jlewisoracle_at_gmail.com>; 'Oracle L' <oracle-l_at_freelists.org>
Subject: RE: Estimations are as good as it can be, what else can be done here?


Mark,

I understood, as Jonathan pointed out, that the best strategy might be to flatten the required data into one table and then fetch the PK or rowid with minimal sorting overhead (using an index that fully covers all the columns in the same sort order), and then visit the table data through those selected rowids.
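
As a minimal sketch of that two-step approach, assuming a hypothetical table BIG_TAB with ordering columns ORD_COL1 and ORD_COL2 covered by an index in the same descending order (all names invented for illustration; FETCH FIRST needs 12c or later):

-- Step 1: drive the top-n from the covering index alone, keeping only rowids.
-- Step 2: revisit the table by those rowids for the remaining columns.
SELECT t.*
FROM  (SELECT rowid AS rid
       FROM   big_tab
       ORDER  BY ord_col1 DESC, ord_col2 DESC
       FETCH FIRST 10 ROWS ONLY) top_n
JOIN   big_tab t
  ON   t.rowid = top_n.rid;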

The strategy you pointed out below went over my head. Do you mean adding some additional predicate to the query to minimize the amount of sorting? What are 'x' and 'y' here? Could you please explain a bit more?

If a small set of columns (like sometimes one column) can distinctly provide the top n ordering, it may be useful to provide that additional predicate, as an and x => y gathering up the data to present.

For slowly changing data and some “n”, it may even be useful to maintain a table containing the top “m” rows, where “m” is at least as big as the biggest “n” your users will use, and to update that table if a new combination should be in it. Two situations:
Case 1: IF your users are running top-n reports repeatedly with selections on less slowly changing data. In addition to what JL and others already suggested, it MAY be cheaper to yank out just the ordering columns, take the lowest value of each column across those top n rows, and then filter at the source on greater than or equal to that lowest value for each column. Notice you can’t just take the lowest row if you’re using all the columns as filters, because (with two columns) a first row of Z A would get filtered out by the second column’s filter if the last row of the top n was Y Y, even though Z A actually sorts ahead of Y Y. You CAN safely take greater than or equal to the first column of the ORDER BY alone. (But I will repeat: this is not valid for hierarchical queries.)
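
For example, a hedged sketch with invented names (BIG_TAB, ORD_COL1, OTHER_TAB) and n = 10, taking the gating value only from the leading ORDER BY column per the caveat above:

-- Cheap pass: lowest value of the leading ordering column among the current
-- top 10 (index-only if ord_col1 is indexed).
SELECT MIN(ord_col1) AS low_gate
FROM   (SELECT ord_col1
        FROM   big_tab
        ORDER  BY ord_col1 DESC
        FETCH FIRST 10 ROWS ONLY);

-- Expensive report: the gating predicate prunes rows before the big joins.
SELECT t.id, t.ord_col1, o.detail_col
FROM   big_tab t
JOIN   other_tab o ON o.big_tab_id = t.id
WHERE  t.ord_col1 >= :low_gate
ORDER  BY t.ord_col1 DESC
FETCH FIRST 10 ROWS ONLY;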

IF the top N is driven by the value of a GROUP BY aggregation, this also won’t help, because you can’t toss an individual row away for having a value below the cutoff when it is part of an aggregate that may still qualify.

Case 2: IF your users are running top-n reports repeatedly with selections on slowly changing data, where n may vary but never exceed m.

With the same caveats as case 1, it may actually be useful in some cases to keep a single-row table holding the lowest value that should be included, and to update that table if a transaction changes the low values to be included.
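
One possible shape for that, again with made-up names and m = 100 (a sketch, not a drop-in solution; the refresh is shown as a simple re-derivation rather than trigger logic):

CREATE TABLE top_n_gate (low_gate NUMBER NOT NULL);

-- Seed the gate from the top m = 100 rows (at least as big as any n used).
INSERT INTO top_n_gate (low_gate)
SELECT MIN(ord_col1)
FROM   (SELECT ord_col1
        FROM   big_tab
        ORDER  BY ord_col1 DESC
        FETCH FIRST 100 ROWS ONLY);

-- Any top-n report with n <= 100 can then prune early:
SELECT t.id, t.ord_col1
FROM   big_tab t
WHERE  t.ord_col1 >= (SELECT low_gate FROM top_n_gate)
ORDER  BY t.ord_col1 DESC
FETCH FIRST 10 ROWS ONLY;

-- When a transaction changes which low values should be included,
-- refresh the gate (here simply re-derived):
UPDATE top_n_gate
SET    low_gate = (SELECT MIN(ord_col1)
                   FROM   (SELECT ord_col1
                           FROM   big_tab
                           ORDER  BY ord_col1 DESC
                           FETCH FIRST 100 ROWS ONLY));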

Lots of top-n reports are either hierarchical or aggregations and are unsuited for these approaches. But quite a few top-n determinations are also quite simple, where as few as one ordering column on something could be the gating predicate and huge relational assemblies are tacked together only to be pruned down at the end. IF that gating value (or set of values) can be easily looked up or kept updated, then expensive reports can become quick.

IF you are operating on a frozen standby clone (such as when a daily, weekly, monthly, quarterly, or annual trigger event is used to keep analysis on a static, as-was basis), aggregations and filtering aggregations are much more frequently useful to instantiate and use. Denormalizations are quite often perishable, but they are not when you are operating on a frozen set of underlying data.
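
On such a frozen clone, instantiating an aggregation can be as simple as the following (purely illustrative, with invented names):

-- Build a summary once against the frozen detail data and reuse it all week.
CREATE TABLE weekly_ar_by_customer AS
SELECT customer_id,
       SUM(open_amount) AS total_open,
       COUNT(*)         AS invoice_count
FROM   receivables_detail
GROUP  BY customer_id;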

One of my most favorite customers froze a copy of a standby clone weekly after the process “generate receivables” was complete Friday night. Many aggregations were then performed on that frozen set of detail data the rest of the weekend and used for analysis the following week.

Good luck,

mwf

<snip>



--
http://www.freelists.org/webpage/oracle-l



Received on Thu Nov 23 2023 - 00:50:09 CET
