Re: Improving query performance further

From: Andy Sayer <andysayer_at_gmail.com>
Date: Tue, 11 Oct 2022 18:05:17 -0700
Message-ID: <CACj1VR7SqicHSAbVbVdfPcsCcJtPdXKAHjqFMLJ8e_1QUsTRfA_at_mail.gmail.com>



I have a demo for a domain index which allows this sort of query to be very fast. Essentially it creates an index entry for each value (with your desired granularity) between your min and max columns and does look ups based on that. The domain index knows what the granularity is so as well as :bind between x and y it queries new_domain_column between :bind - granularity and :bind + granularity.

Unfortunately, the demo is on the pc that I left in the UK when I moved to Canada. This might be enough information for you to get going with, otherwise I was planning on publishing my work when I visit the UK in December.

The other, easier but less guaranteed, way of combining two range predicates on different columns is to partition the index by one of the columns. Eg your index would be on some_key, start_date and your partitioning would be on end_date. Oracle would then need to do a range scan using your some_key and start_date filters as access predicates once per partition which matches your end_date filter. You’d have to decide which column is the partition key and how wide to make the partitions, and you might not gain a lot.

Thanks,
Andy

On Tue, Oct 11, 2022 at 5:38 PM, Sayan Malakshinov <xt.and.r_at_gmail.com> wrote:

> I tuned a lot of similar queries: tried different methods (even spatial
> RTree indexes) and made a lot of tests, so I'll try to describe and
> summarize them all later, when I have more time, but for now just in short:
>
> Let''s simplify it just to:
> - table T(A,B)
> - index (A,B)
> - select * from T where :X between A and B
> - ":X between A and B" is very selective
>
> The main problem in this case is that you need to check all index entries
> A<:X, for example you have:
>
> A | B
> --------
> 1 1
> 1 2
> 1 3
> .....
> 1 100
> 2 107
> 2 111
> ....
> 2 215
> 3 204
> 3 206
> ....
> 3 299
> ...
> 998 99799
> 999 99801
> ...
> 999 99900
>
> where each A has about 100 different B,
> and your :X = 700, so, though there are just about 100 rows satisfying
> both conditions, INDEX RANGE SCAN with access predicate A<:X will scan
> 700*100 index entries, and filter out 699*100 of them by filter predicate
> (B>:X).
>
> So we have just a few different situations and key points:
>
> ===================================================================================
>
> *0. There is a limitation of max(B-A) and it's quite small, ie
> max(B[x]-A[x]) << max(B) - min(A)*In this case, you can simply add a
> couple of additional predicates:
> select *
> from T
> where A between :X-:MAX_DIFF and :X
> and B between :X-:MAX_DIFF and :X
> ;
> So you''ll get a short search range for A (ACCESS predicates for the first
> column of the index).
> //For example, it may be short intervals or start/end dates of short time
> events, with check constraints like (b-a<N)
>
> ===================================================================================
>
>
> *1. There is a perfect positive correlation between A and B, ie if a2>a1,
> must be b2>b1*
> -----------------------------------------------------------------------------------
>
> *1.1 There is no intersection of any 2 intervals (Ax - Bx), (Ay - By)*
> This means that for any :X you need to find just 1 row, so you can stop to
> scan the index after the first found row.
> We can rewrite our query for that like this:
>
> select *
> from
> (select * from T where A<=:X order by A desc fetch first 1 row only) --
> to force IRS DESCENDING (IRS - index range scan) and stop after 1 row
> where
> B>:X -- our second predicate is one level upper, so we check just 1 row
> maximum
>
> -- IIRC there were some problems with IRS DESC in case of "fetch first",
> so I'll rewrite it to make it more reliable:
> select *
> from
> (select *
> from
> (select * from T where A<=:X order by A desc) -- to force IRS
> DESCENDING
> where rownum=1) -- and stop after 1 row
> where
> B>:X -- our second predicate
>
>
> -----------------------------------------------------------------------------------
>
> *1.2 There may be intersections:*This means that you can't stop after the
> first found row satisfying our conditions. We need to find all of them.
> Unfortunately, there is no a documented method to make a dynamic SQL
> condition like "please stop, I don't need other rows", but we can simply
> use PL/SQL for that:
>
> declare
> X number := 34;
>
> cursor c_1(p number) is
> select * from T where A<=p order by A desc;
> R c_1%rowtype;
> begin
> open c_1(X);
> <<LL>>
> loop
> fetch c_1 into R;
> if R.B>=X then
> dbms_output.put_line(R.A ||' '||R.B); -- or pipe row or collect
> ROWIDS and return them as a collection, etc
> else
> exit LL;
> end if;
> end loop;
> end;
> /
>
> As you can see, we stop fetching cursor C_1 when R.B becomes lower than X.
> Of course to make it more convenient you can wrap it into a pipelined
> pl/sql table function or even use it with INLINE pl/sql functions.
>
>
> ===================================================================================
>
> *2. There is no correlation between A and B, ie if a2>a1, it doesn't mean
> that b2>b1 :*
> This one is the most difficult problem, because in case of unknown X we
> need to check all rows that have A<=X (or all rows with B>=X).
> Obviously, it would be great if we know some common patterns and
> distribution of A/B/X, for example:
> - A and B are dates -- like table DT (start_date date, end_date date...)
> - min(A) and min(B) ~ 10-20 years ago
> - max(A) and max(B) ~ sysdate
> - and X usually is in the range of [sysdate-10; sysdate]
> - number of rows satisfying our conditions are pretty small
> In this case we could create index (B,A), just because we have less rows
> satisfying B>=X than A<=X.
>
> Or sometimes it maybe useful to create 2 indexes (A) and (B) and use
> bitmap_and operation like this:
>
> select/*+ index_combine(t) */ * from t where :X between a and b;
> ------------------------------------------------------------
> | Id | Operation | Name |
> ------------------------------------------------------------
> | 0 | SELECT STATEMENT | |
> | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| T |
> | 2 | BITMAP CONVERSION TO ROWIDS | |
> | 3 | BITMAP AND | |
> | 4 | BITMAP CONVERSION FROM ROWIDS | |
> | 5 | SORT ORDER BY | |
> |* 6 | INDEX RANGE SCAN | IX_B |
> | 7 | BITMAP CONVERSION FROM ROWIDS | |
> | 8 | SORT ORDER BY | |
> |* 9 | INDEX RANGE SCAN | IX_A |
> ------------------------------------------------------------
>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
>
> 6 - access("B">=TO_NUMBER(:X))
> filter("B">=TO_NUMBER(:X))
> 9 - access("A"<=TO_NUMBER(:X))
> filter("A"<=TO_NUMBER(:X))
>
> ===================================================================================
> But in general, there is no silver bullet for such cases...
>
> On Tue, Oct 11, 2022 at 10:58 PM Tim Gorman <tim.evdbt_at_gmail.com> wrote:
>
>> Have you considered making TABLE1 an IOT? No, not an "internet of
>> things" but an "index-organized table"...
>>
>> If the primary method of access is the primary key, and TABLE1 is has
>> "narrow" rows (i.e. AVG_ROW_LEN less than 20-25 bytes or so), then an IOT
>> could save on the number of logical reads. There's a lot of "if"s, but the
>> best approach is not to think about it, but just go ahead and test it,
>> side-by-side with the existing table. After all, it's only about ~100MB in
>> size, right?
>>
>> But, at the very least, it shouldn't be difficult to just put the table
>> and the PK index together into the KEEP pool of the Buffer Cache? After
>> all, although the ideal is to size the KEEP pool to accommodate the entire
>> objects assigned to it, you certainly aren't required to size it that way.
>> You just want to size it so that buffers flush out far more slowly than
>> they do in the DEFAULT pool.
>>
>> <rant>Too many DBAs think that the SGA should only use about 50% of the
>> available physical memory on a host, which is nonsense. The Linux/UNIX
>> operating systems only need a few GB of memory, and AWR can tell you
>> unequivocally how much space is needed for PGA, so the SGA should be sized
>> closer to the Oracle-imposed maximum of 90% of host physical memory. It's
>> there. It's not being used. Use it. If I had a nickel for every unused
>> GB of RAM on Oracle database servers, I could buy my own Hawaiian
>> island.</rant>
>>
>> Hope this helps!
>>
>> Enjoy!
>>
>>
>>
>> On 10/11/2022 2:04 PM, yudhi s wrote:
>>
>> Hello Listers, We have a customer database on Oracle version 19C. We have
>> a simple query as below. and as per current design is executing ~200 to
>> 300 times per second and it's part of a bigger process and thus is one of
>> the top consumers in that. Now as we are working to change the design to
>> make the number of execution of this query lesser to help the process. But
>> that needs much more impact analysis, so we were thinking of any possible
>> easy way to make the individual execution of this query faster? Or say any
>> structural change(new index etc.) which can further drop the IO/CPU
>> requirement for individual execution of this query?
>>
>> Currently this query is accessing table TABLE1 through a primary key
>> which is on three columns (PART_COL,MIN_VALUE,MAX_VAL). The table is
>> partitioned on column PART_COL. This table contains ~400K rows and is
>> ~100MB in size. It's a master data kind of table.
>>
>> SELECT column1 FROM TABLE1 WHERE PART_COL = :B2 AND :B1 BETWEEN MIN_VAL
>> AND MAX_VALUE
>>
>> Global Information
>> ------------------------------
>> Status : DONE (ALL ROWS)
>> Instance ID : 1
>> SQL Execution ID : 16777216
>> Execution Started : 10/11/2022 09:36:48
>> First Refresh Time : 10/11/2022 09:36:48
>> Last Refresh Time : 10/11/2022 09:36:48
>> Duration : .06173s
>> Module/Action : SQL*Plus/-
>> Program : sqlplus.exe
>> Fetch Calls : 1
>>
>> Binds
>>
>> ========================================================================================================================
>> | Name | Position | Type | Value
>> |
>>
>> ========================================================================================================================
>> | :B2 | 1 | NUMBER | 2 |
>> | :B1 | 2 | VARCHAR2(4000) | XXXXXXXXXXX
>> |
>>
>> ========================================================================================================================
>>
>> Global Stats
>>
>> =========================================================================================
>> | Elapsed | Cpu | IO | Concurrency | Cluster | Fetch | Buffer | Read
>> | Read |
>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Calls | Gets |
>> Reqs | Bytes |
>>
>> =========================================================================================
>> | 0.06 | 0.04 | 0.02 | 0.00 | 0.00 | 1 | 911 | 778 | 6MB |
>>
>> =========================================================================================
>>
>> SQL Plan Monitoring Details (Plan Hash Value=692467662)
>>
>> ======================================================================================================================================================================================
>> | Id | Operation | Name | Rows | Cost | Time
>> | Start | Execs | Rows | Read | Read | Activity | Activity Detail |
>> | | | | (Estim) | | Active(s) |
>> Active | | (Actual) | Reqs | Bytes | (%) | (# samples) |
>>
>> ======================================================================================================================================================================================
>> | 0 | SELECT STATEMENT | | | | |
>> | 1 | | | | | |
>> | 1 | PARTITION RANGE SINGLE | | 10610 | 928 |
>> | | 1 | | | | | |
>> | 2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | TABLE1 |
>> 10610 | 928 | | | 1 | | | | | |
>> | 3 | INDEX RANGE SCAN | PK_TABLE1 | 10610 | 771 |
>> | | 1 | | 770 | 6MB | | |
>>
>> ======================================================================================================================================================================================
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>> 3 - access("PART_COL"=TO_NUMBER(:B2) AND "MAX_VALUE">=:B1 AND
>> "MIN_VAL"<=:B1)
>> filter("MAX_VALUE">=:B1)
>>
>> Statistics
>> ----------------------------------------------------------
>> 37 recursive calls
>> 0 db block gets
>> 911 consistent gets
>> 778 physical reads
>> 41076 redo size
>> 260 bytes sent via SQL*Net to client
>> 489 bytes received via SQL*Net from client
>> 1 SQL*Net roundtrips to/from client
>> 28 sorts (memory)
>> 0 sorts (disk)
>> 0 rows processed
>>
>>
>>
>
> --
> Best regards,
> Sayan Malakshinov
> Oracle performance tuning engineer
> Oracle ACE
> http://orasql.org
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 12 2022 - 03:05:17 CEST

Original text of this message