RE: scalar subquery costing

From: Noveljic Nenad <nenad.noveljic_at_vontobel.com>
Date: Thu, 30 May 2019 09:49:42 +0000
Message-ID: <130489_1559209792_5CEFA740_130489_5297_1_72ae632d93824b34b40a3d89985bcb30_at_vontobel.com>



Jonathan‘s post that hasn‘t arrived.

Von: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk<mailto:jonathan_at_jlcomp.demon.co.uk>> Datum Mittwoch, 29. Mai 2019, 1:28 PM
An: Noveljic Nenad <nenad.noveljic_at_vontobel.com<mailto:nenad.noveljic_at_vontobel.com>>, ORACLE-L <oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org>> Betreff: Re: scalar subquery costing

Nenad,

I've run a few tests based on your model and have got perfect matches (plus or minus 1 due to rounding errors only) across the board. A critical piece of the arithmetic is the size Oracle derives for each element in the cache. For this the number of values assumed cached is actually non-integral so, for example, I have used the following formula in a pl/sql loop where I have 100,000 rows in your table t_100k and set the num_distinct to the value in m_distcnt

m_cost := round(33 + 17 * (10922.667 + 100000 * (1 - (10922.667 / m_distcnt))));

33 is the IO cost of the tablescan
17 is the IO cost of executing the subquery once Oracle derived the cache unit as 12, so the number of values cached is 10922.66666666 100,000 rows in the table.

The figures match perfectly (allowing rounding) as I vary m_distcnt from 10923 to 500,000.

Similarly. using lpad(rownum,10)) for the defining column, Oracle uses 24 as the cache unit, so 5461.333333 for the number of cached results m_cost := round(33 + 17 * ( 5461.333 + 100000 * (1 - ( 5461.333 / m_distcnt))));

Finally using lpad(rownum,14) - so the tablescan cost goes up to 48, and the cache unit to 32 so an exact 4096 for results cached. m_cost := round(48 + 17 * ( 4096.000 + 100000 * (1 - ( 4096.000 / m_distcnt))));

It's the avg_col_len from the t_100K table that affects the cache unit size - Oracle seems to round it up to an even number then double it. (At least, that's the appearance in this test - I can't think of any good reason why this should be the chosen strategy).

Regards
Jonathan Lewis



From: Noveljic Nenad <nenad.noveljic_at_vontobel.com> Sent: 28 May 2019 20:46:15
To: Jonathan Lewis; ORACLE-L
Subject: RE: scalar subquery costing

Jonathan,

What do you think it would be lambda if the cost follows the Poisson distribution?

As an alternative, I'd like to offer a formula which approximately matches the data from your experiment.

Please find my hypothesis below:

The linearity for NDV <= 10923 suggests that 10923 values returned by the scalar subquery can be cached simultaneously. Therefore, there isn't any extra cost when matching the cached value in this case.

For NDV > 10923 the probability p of finding the value in the cache can be calculated as follows:

p = 10923/NDV

Consequently, the probability of not finding the result in the cache is:

1-p = 1 - 10923/NDV

The total number of cache misses is:

cache_misses = num_rows * (1 - p) = num_rows * ( 1 - 10923 / NDV )

The total cost of the cache misses can be calculated as follows:

cache_misses_cost = cache_misses * cost_of_scalar_subquery_single_execution.

The total cost is approximately:

total_cost ~ cache_misses_cost + cost_of_filling_the_cache

The cost of filling the cache is the cost to fetch 10923 NDVs. Since we don't have the exact value I'll use the cost of 11000 NDVs from your experiment.

cost_of_filling_the_cache ~ cache_misses_cost + cost(11000)

In our case:

num_rows = 100000
cost_of_scalar_subquery_single_execution = 3

=>

total_cost(NDV) ~ 100000 * ( 1 - 10923/NDV ) * 3 + 34945 = 300000 * ( 1 - 10923/NDV ) + 34945

Let's calculate the cost for different values accross the range:

total_cost(12) ~ 61870 vs. 59769
total_cost(24) ~ 198407 vs. 196303
total_cost(36) ~ 243920 vs. 241814

As we can see, this formula calculates the cost with ~ 97% precision.

I can think of the following refinements:

1. First and foremost, try out "selection with replacement" expected value calculation. I think "with" replacement would be appropriate here as the values come back after have been evicted from cache.
2. Use cost(10923) instead of the approximation cost(11000).
3. Use the value of cost(10923) from the optimizer trace, to increase the precision.

Furthermore, I'd need to automate these calculation to verify the hypothesis across different datasets.

I'm, though, still bothered by the fact that the cost remains constant for NDV > num_rows/2 , as I couldn't think of any plausible explanation

What do you think?

Best regards,

Nenad

https://nenadnoveljic.com/blog/

-----Original Message-----
From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> Sent: Dienstag, 28. Mai 2019 17:32
To: Noveljic Nenad <nenad.noveljic_at_vontobel.com>; ORACLE-L <oracle-l_at_freelists.org> Subject: Re: scalar subquery costing

With a couple more tests is looks like

  1. There's a magic value of 10,922 rather than 10% where the linear increase in cost stops - this seemed to be fixed for a few tests where the number of rows in table varied from 100,000 to 500,000 - there may be other limits for other ranges of table sizes.
  2. From 10,923 to 50% of the rows in the table the change in cost looked roughly Poisson - sudden sharp increase for a small increase in num_distinct then long slow tail.
  3. The cost stops growing after num_distinct reaches 50% of num_rows.

Regards
JOnathan Lewis



From: Noveljic Nenad <nenad.noveljic_at_vontobel.com> Sent: 28 May 2019 11:02:14
To: Jonathan Lewis; ORACLE-L
Subject: RE: scalar subquery costing

If we consider cost as a function of the number of distinct values (NDV), we can clearly identify three different subsets of its domain:

  1. NDV <= 0.1 * rows_in_table. A purely linear function, probably under the assumption that *all* of the scalar subquery values are being cached.
  2. 0.1 * rows_in_table < NDV <= 0.5 * rows_in_table. It's a logarithmically-shaped function with the asymptote ~ f(rows_in_table). The rational is probably: with large NDVs it will be much less likely to find its scalar subquery result in the cache. A pitfall here is that the function seems too steep at the beginning - a small increase in NDV will lead to a huge leap in cost.
  3. NDV > 0.5 * rows_in_table. f(NDV) = 267300. A possible assumption in this case: no cache hits at all. Just the value seems a bit odd as the real cost should be by around 10% higher if the results aren't cached.

Best regards,

Nenad

-----Original Message-----
From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> Sent: Dienstag, 28. Mai 2019 10:22
To: ORACLE-L (oracle-l_at_freelists.org) <oracle-l_at_freelists.org>; Noveljic Nenad <nenad.noveljic_at_vontobel.com> Subject: Re: scalar subquery costing

FYI - a quick test reporting costs for your query as I fake the column stats on t_100K to vary Statement N corresponds to N * 1,000 distinct values, with low_value 1 and high_value N * 1,000. I've picked the cost, cpu_cost, and io_cost for the top line of the execution plan in each case.

Interesting point(a) somewhere between 10 and 12 thousand distinct values, and another when num_distinct = 0.5 * rows in table.

STATEMENT_ID                      IO_COST    IO_DIFF   CPU_COST   CPU_DIFF       COST
------------------------------ ---------- ---------- ---------- ---------- ----------
 1                                   3068             252344831                  3094
 2                                   6068       3000  487952031  235607200       6119
 3                                   9068       3000  723559231  235607200       9143
 4                                  12068       3000  959166431  235607200      12167
 5                                  15068       3000 1194773631  235607200      15192
 6                                  18068       3000 1430380831  235607200      18216
 7                                  21068       3000 1665988031  235607200      21240
 8                                  24068       3000 1901595231  235607200      24265
 9                                  27068       3000 2137202431  235607200      27289
 10                                 30068       3000 2372809631  235607200      30314
 11                                 34945       4877 2755835542  383025911      35230
 12                                 59769      24824 4705425626 1949590084      60256
 13                                 80774      21005 6355078773 1649653147      81432
 14                                 98779      18005 7769067185 1413988412      99583
 15                                114383      15604 8994523809 1225456624     115314
 16                                128036      13653 1.0067E+10 1072274546     129078
 17                                140083      12047 1.1013E+10  946124599     141223
 18                                150792      10709 1.1854E+10  840999644     152019
 19                                160373       9581 1.2606E+10  752473365     161678
 20                                168996       8623 1.3284E+10  677226029     170371
 21                                176798       7802 1.3896E+10  612728312     178236
 22                                183891       7093 1.4453E+10  557025738     185387
 23                                190366       6475 1.4962E+10  508588717     191915
 24                                196303       5937 1.5428E+10  466206324     197900
 25                                201764       5461 1.5857E+10  428909819     203406
 26                                206805       5041 1.6253E+10  395916755     208488
 27                                211473       4668 1.6620E+10  366589588     213193
 28                                215807       4334 1.6960E+10  340404618     217563
 29                                219843       4036 1.7277E+10  316928437     221631
 30                                223609       3766 1.7573E+10  295799875     225428
 31                                227133       3524 1.7849E+10  276716012     228981
 32                                230436       3303 1.8109E+10  259421261     232311
 33                                233539       3103 1.8353E+10  243698760     235439
 34                                236460       2921 1.8582E+10  229363540     238383
 35                                239213       2753 1.8798E+10  216257051     241159
 36                                241814       2601 1.9002E+10  204242770     243781
 37                                244274       2460 1.9196E+10  193202621     246261
 38                                246604       2330 1.9379E+10  183034062     248610
 39                                248815       2211 1.9552E+10  173647700     250840
 40                                250916       2101 1.9717E+10  164965315     252957
 41                                252914       1998 1.9874E+10  156918226     254971
 42                                254817       1903 2.0024E+10  149445929     256890
 43                                256631       1814 2.0166E+10  142494957     258719
 44                                258363       1732 2.0302E+10  136017913     260465
 45                                260018       1655 2.0432E+10  129972672     262133
 46                                261601       1583 2.0556E+10  124321686     263729
 47                                263117       1516 2.0675E+10  119031402     265257
 48                                264569       1452 2.0790E+10  114071760     266721
 49                                265963       1394 2.0899E+10  109415770     268126
 50                                267300       1337 2.1004E+10  105039139     269474
 51                                267300          0 2.1004E+10          0     269474
 52                                267300          0 2.1004E+10          0     269474
...
 98                                267300          0 2.1004E+10          0     269474
 99                                267300          0 2.1004E+10          0     269474
 100                               267300          0 2.1004E+10          0     269474


Regards
Jonathan Lewis



From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on behalf of Noveljic Nenad <nenad.noveljic_at_vontobel.com> Sent: 24 May 2019 16:46:30
To: ORACLE-L (oracle-l_at_freelists.org) Subject: scalar subquery costing

The calculated cost for the following query is 262K (on Oracle 12.2):

select /*+ qb_name(QB_MAIN) */
(
  select /*+ qb_name(QB_SUBQ) */ count(*)

              from t_1k
              where t_1k.n1 = t_100k.n1

)
from t_100k ;

Plan Table
--------------------------------------+-----------------------------------+
| Id  | Operation           | Name    | Rows  | Bytes | Cost  | Time      |
--------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT    |         |       |       |  262K |           |
| 1   |  SORT AGGREGATE     |         |     1 |     4 |       |           |
| 2   |   TABLE ACCESS FULL | T_1K    |     1 |     4 |     3 |  00:00:01 |
| 3 | TABLE ACCESS FULL | T_100K | 98K | 488K | 69 | 00:00:01 |
--------------------------------------+-----------------------------------+
Predicate Information:

2 - filter("T_1K"."N1"=:B1)

From the optimizer trace we can get more precise cardinalities and costs:

Final cost for query block QB_SUBQ (#0) - All Rows Plan:   Best join order: 1
  Cost: 3.009811 Degree: 1 Card: 1.000000 Bytes: 4.000000

Final cost for query block QB_MAIN (#0) - All Rows Plan:   Best join order: 1
  Cost: 268174.664510 Degree: 1 Card: 100000.000000 Bytes: 500000.000000

  Table: T_100K Alias: T_100K
    Card: Original: 100000.000000 Rounded: 100000 Computed: 100000.000000 Non Adjusted: 100000.000000 ...

            Access Path: TableScan
    Cost: 68.697001 Resp: 68.697001 Degree: 0       Cost_io: 68.000000 Cost_cpu: 16737631

In theory, the total cost should be caculated as follows. cardinality(QB_MAIN) * cost(QB_SUBQ) + cost(full table scan T_100K) = 100000 * 3.009811 + 68.69701 = 301049.79701

How the optimizer came up with 268174.664510?

Best regards,

Nenad



Please consider the environment before printing this e-mail. Bitte denken Sie an die Umwelt, bevor Sie dieses E-Mail drucken.

Important Notice

This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system. Without prejudice to any contractual agreements between you and us which shall prevail in any case, we take it as your authorization to correspond with you by e-mail if you send us messages by e-mail. However, we reserve the right not to execute orders and instructions transmitted by e-mail at any time and without further explanation. E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively referred to as "Vontobel Group") for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version. Please note that all e-mail communications to and from the Vontobel Group are subject to electronic storage and review by Vontobel Group. Unless stated to the contrary and without prejudice to any contractual agreements between you and Vontobel Group which shall prevail in any case, e-mail-communication is for informational purposes only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction. The legal basis for the processing of your personal data is the legitimate interest to develop a commercial relationship with you, as well as your consent to forward you commercial communications. You can exercise, at any time and under the terms established under current regulation, your rights. If you prefer not to receive any further communications, please contact your client relationship manager if you are a client of Vontobel Group or notify the sender. Please note for an exact reference to the affected group entity the corporate e-mail signature. For further information about data privacy at Vontobel Group please consult www.vontobel.com<https://www.vontobel.com>.

Important Notice

 This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system.  Without prejudice to any contractual agreements between you and us which shall prevail in any case, we take it as your authorization to correspond with you by e-mail if you send us messages by e-mail. However, we reserve the right not to execute orders and instructions transmitted by e-mail at any time and without further explanation.  E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively referred to as "Vontobel Group") for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version.  Please note that all e-mail communications to and from the Vontobel Group are subject to electronic storage and review by Vontobel Group. Unless stated to the contrary and without prejudice to any contractual agreements between you and Vontobel Group which shall prevail in any case, e-mail-communication is for informational purposes only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction.  The legal basis for the processing of your personal data is the legitimate interest to develop a commercial relationship with you, as well as your consent to forward you commercial communications. You can exercise, at any time and under the terms established under current regulation, your rights. If you prefer not to receive any further communications, please contact your client relationship manager if you are a client of Vontobel Group or notify the sender.  Please note for an exact reference to the affected group entity the corporate e-mail signature.  For further information about data privacy at Vontobel Group please consult www.vontobel.com<http://www.vontobel.com>.

Important Notice

 This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system.  Without prejudice to any contractual agreements between you and us which shall prevail in any case, we take it as your authorization to correspond with you by e-mail if you send us messages by e-mail. However, we reserve the right not to execute orders and instructions transmitted by e-mail at any time and without further explanation.  E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively referred to as "Vontobel Group") for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version.  Please note that all e-mail communications to and from the Vontobel Group are subject to electronic storage and review by Vontobel Group. Unless stated to the contrary and without prejudice to any contractual agreements between you and Vontobel Group which shall prevail in any case, e-mail-communication is for informational purposes only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction.  The legal basis for the processing of your personal data is the legitimate interest to develop a commercial relationship with you, as well as your consent to forward you commercial communications. You can exercise, at any time and under the terms established under current regulation, your rights. If you prefer not to receive any further communications, please contact your client relationship manager if you are a client of Vontobel Group or notify the sender.  Please note for an exact reference to the affected group entity the corporate e-mail signature.  For further information about data privacy at Vontobel Group please consult www.vontobel.com<http://www.vontobel.com>.

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css">p { font-family: Arial;font-size:9pt }</style>
</head>
<body>
<p>
<br>Important Notice</br>
<br />

This message is intended only for the individual named. It may contain confidential or privileged information. If you are not the named addressee you should in particular not disseminate, distribute, modify or copy this e-mail. Please notify the sender immediately by e-mail, if you have received this message by mistake and delete it from your system.<br /> Without prejudice to any contractual agreements between you and us which shall prevail in any case, we take it as your authorization to correspond with you by e-mail if you send us messages by e-mail. However, we reserve the right not to execute orders and instructions transmitted by e-mail at any time and without further explanation.<br /> E-mail transmission may not be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also processing of incoming e-mails cannot be guaranteed. All liability of Vontobel Holding Ltd. and any of its affiliates (hereinafter collectively referred to as "Vontobel Group") for any damages resulting from e-mail use is excluded. You are advised that urgent and time sensitive messages should not be sent by e-mail and if verification is required please request a printed version.</br> Please note that all e-mail communications to and from the Vontobel Group are subject to electronic storage and review by Vontobel Group. Unless stated to the contrary and without prejudice to any contractual agreements between you and Vontobel Group which shall prevail in any case, e-mail-communication is for informational purposes only and is not intended as an offer or solicitation for the purchase or sale of any financial instrument or as an official confirmation of any transaction.<br /> The legal basis for the processing of your personal data is the legitimate interest to develop a commercial relationship with you, as well as your consent to forward you commercial communications. You can exercise, at any time and under the terms established under current regulation, your rights. If you prefer not to receive any further communications, please contact your client relationship manager if you are a client of Vontobel Group or notify the sender. Please note for an exact reference to the affected group entity the corporate e-mail signature. For further information about data privacy at Vontobel Group please consult <a href="https://www.vontobel.com">www.vontobel.com</a>.<br />
</p>
</body>
</html>


--
http://www.freelists.org/webpage/oracle-l
Received on Thu May 30 2019 - 11:49:42 CEST

Original text of this message