Home » SQL & PL/SQL » SQL & PL/SQL » Adding analytic function-based column changes another column result (Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production)
Adding analytic function-based column changes another column result [message #634697] Fri, 13 March 2015 07:37 Go to next message
lewap02
Messages: 5
Registered: March 2015
Junior Member
Hello everybody. While working on a query I have encountered situation which I cannot explain nor understand. Given below are DDL and the query. Data (inserts) are in the attachment.

Case is that I have a query to which I want to add new analytic function-based column. This new column is independent of the already existing columns. The problem is that adding this column - uncommeting min_id - actually changes the result of another analytic column - RevMatchingPair, . The only thing that they have "in common" is that part of the ordering in both of them is middle_qry.something_id. Indeed, if I would change ordering in min_id to id it would work fine.

Is it possible that one analytic function changes order of rows on which another such column is operating? My intuition is that every analytic column should operate on some kind of "local" order. If it is not some kind of my misuse/misunderstanding then such behaviuor of the database is very dangerous in my opinion. By adding just one new independent (from a logical point of view) value I can destroy report results.

Could you please take a look and help me understand this case?

CREATE TABLE TEST_TABLE
   (	ID NUMBER, 
	SOMETHING_ID VARCHAR2(255), 
	SOMETHING_NAME VARCHAR2(255)
   );

select id,
       something_id,
       something_name,
       RowNumber,
       RevRowNumber,
       RevMatchingCount,
       RevMatchingPair
  from (select id,
               something_id,
               something_name,
               RowNumber,
               RevRowNumber,
               RevMatchingCount,
               
               CASE
                 WHEN ((
                       middle_qry.RevMatchingCount = LAG(middle_qry.RevMatchingCount, 1, 0)
                       OVER(ORDER BY middle_qry.something_id DESC,
                             middle_qry.something_name DESC)) OR
                      (
                       middle_qry.RevMatchingCount = LEAD(middle_qry.RevMatchingCount, 1, 0)
                       OVER(ORDER BY middle_qry.something_id DESC,
                             middle_qry.something_name DESC))) AND
                      Mod(middle_qry.RevMatchingCount, 2) = 0 THEN
                  0
                 ELSE
                  1
               END as RevMatchingPair
               --,min(id) over (ORDER BY middle_qry.something_id ) min_id
          from (select
                 id,
                 something_id,
                 something_name,
                 RowNumber,
                 RevRowNumber,
                 CASE
                   WHEN inner_qry.RevRowNumber = FIRST_VALUE(inner_qry.RevRowNumber)
                    OVER(PARTITION BY inner_qry.something_id ORDER BY inner_qry.RevRowNumber DESC) OR
                        MOD(inner_qry.RevRowNumber, 2) = 0 THEN
                    inner_qry.RevRowNumber
                   ELSE
                    inner_qry.RevRowNumber + 1
                 END AS RevMatchingCount
                
                  from (select 
                               id,
                               something_id,
                               something_name,
                               ROW_NUMBER() OVER(PARTITION BY something_id ORDER BY something_id desc, something_name) as RowNumber,
                               ROW_NUMBER() OVER(PARTITION BY something_id ORDER BY something_id, something_name desc) as RevRowNumber
                        
                          From test_table

                         order by id) inner_qry
                 order by id) middle_qry
         order by id)
where id = 237;


Re: Adding analytic function-based column changes another column result [message #634707 is a reply to message #634697] Fri, 13 March 2015 10:48 Go to previous messageGo to next message
Michel Cadot
Messages: 68643
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
Is it possible that one analytic function changes order of rows on which another such column is operating?


It should not.

Show us the result you have with and without the columns for the data you gave, so we can compare with what we get in other versions.

Re: Adding analytic function-based column changes another column result [message #634788 is a reply to message #634707] Mon, 16 March 2015 03:19 Go to previous messageGo to next message
lewap02
Messages: 5
Registered: March 2015
Junior Member
Results below.

Without new column:

"ID";"SOMETHING_ID";"SOMETHING_NAME";"ROWNUMBER";"REVROWNUMBER";"REVMATCHINGCOUNT";"REVMATCHINGPAIR"
"237";"TEST_VALUE";"TOM AND JERRY";"42";"1";"2";"0"

With new column:

"ID";"SOMETHING_ID";"SOMETHING_NAME";"ROWNUMBER";"REVROWNUMBER";"REVMATCHINGCOUNT";"REVMATCHINGPAIR"
"237";"TEST_VALUE";"TOM AND JERRY";"42";"1";"2";"1"

Diffrence is seen in the last column: REVMATCHINGPAIR.
Re: Adding analytic function-based column changes another column result [message #634800 is a reply to message #634788] Mon, 16 March 2015 06:22 Go to previous messageGo to next message
Lalit Kumar B
Messages: 3174
Registered: May 2013
Location: World Wide on the Web
Senior Member
Hi lewap02,

Welcome to the forum!

Please read and follow the OraFAQ Forum Guide and How to use [code] tags, to enable us to help you.

lewap02 wrote on Mon, 16 March 2015 13:49

Diffrence is seen in the last column: REVMATCHINGPAIR.


The values for REVMATCHINGPAIR as 0 or 1 depends on the output of the following case statement:


CASE
                 WHEN ((
                       middle_qry.RevMatchingCount = LAG(middle_qry.RevMatchingCount, 1, 0)
                       OVER(ORDER BY middle_qry.something_id DESC,
                             middle_qry.something_name DESC)) OR
                      (
                       middle_qry.RevMatchingCount = LEAD(middle_qry.RevMatchingCount, 1, 0)
                       OVER(ORDER BY middle_qry.something_id DESC,
                             middle_qry.something_name DESC))) AND
                      Mod(middle_qry.RevMatchingCount, 2) = 0 THEN
                  0
                 ELSE
                  1
               END as RevMatchingPair


As you have not provided any test case, only you could know the reason.

[Updated on: Mon, 16 March 2015 06:26]

Report message to a moderator

Re: Adding analytic function-based column changes another column result [message #634803 is a reply to message #634800] Mon, 16 March 2015 06:43 Go to previous messageGo to next message
Michel Cadot
Messages: 68643
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

The test case is attached to the first post.
And OP has really formatted his big SQL and provided complete Oracle version.
I think he is one of the best new posters we had.


[Updated on: Mon, 16 March 2015 06:46]

Report message to a moderator

Re: Adding analytic function-based column changes another column result [message #634805 is a reply to message #634803] Mon, 16 March 2015 07:11 Go to previous messageGo to next message
Lalit Kumar B
Messages: 3174
Registered: May 2013
Location: World Wide on the Web
Senior Member
Ah, yes. There is an attachment with insert statements. I got confused with the combination of text and attachment. Natural instinct, I just looked at the posted code.

@OP, I have tested it on 12.1 version, I get the following outputs with and without the MIN(id):

With:

        ID SOMETHING_ID         SOMETHING_NAME                  ROWNUMBER REVROWNUMBER REVMATCHINGCOUNT REVMATCHINGPAIR
---------- -------------------- ------------------------------ ---------- ------------ ---------------- ---------------
       237 TEST_VALUE           TOM AND JERRY                          21           10               10               1

SQL>


Without:

        ID SOMETHING_ID         SOMETHING_NAME                  ROWNUMBER REVROWNUMBER REVMATCHINGCOUNT REVMATCHINGPAIR
---------- -------------------- ------------------------------ ---------- ------------ ---------------- ---------------
       237 TEST_VALUE           TOM AND JERRY                          21           10               10               0

SQL>


Breaking the code into smaller pieces, I found the issue with this:

ORDER BY middle_qry.something_id DESC


Your case expression using LAG and LEAD is sorting the rows in descending order, however, in your MIN(id) you have not specified DESC, so the ORDER BY middle_qry.something_id for the min_id column is sorted in ascending order by default. All you need to do is, add DESC to the min_id as well:

SQL>  SELECT id,
  2         something_id,
  3         something_name,
  4         rownumber,
  5         revrownumber,
  6         revmatchingcount,
  7         revmatchingpair
  8  FROM   (SELECT id,
  9                 something_id,
 10                 something_name,
 11                 rownumber,
 12                 revrownumber,
 13                 revmatchingcount,
 14                 CASE
 15                   WHEN ( ( middle_qry.revmatchingcount =
 16                            Lag(middle_qry.revmatchingcount, 1, 0)
 17                              over(
 18                                ORDER BY middle_qry.something_id
 19                              DESC,
 20                              middle_qry.something_name DESC) )
 21                           OR ( middle_qry.revmatchingcount = Lead(
 22                                middle_qry.revmatchingcount, 1
 23                                                              , 0)
 24                                over(
 25                                  ORDER BY middle_qry.something_id DESC,
 26                                middle_qry.something_name DESC) ) )
 27                        AND MOD(middle_qry.revmatchingcount, 2) = 0 THEN 0
 28                   ELSE 1
 29                 END                                        AS RevMatchingPair,
 30                 Min(id)
 31                   over (
 32                     ORDER BY middle_qry.something_id DESC) min_id
 33          --> Add DESC to sort something_id in descending order as well
 34          FROM   (SELECT id,
 35                         something_id,
 36                         something_name,
 37                         rownumber,
 38                         revrownumber,
 39                         CASE
 40                           WHEN inner_qry.revrownumber = First_value(
 41                                                         inner_qry.revrownumber)
 42                                over(
 43                                  PARTITION BY inner_qry.something_id
 44                                  ORDER BY inner_qry.revrownumber DESC
 45                                )
 46                                 OR MOD(inner_qry.revrownumber, 2) = 0 THEN
 47                           inner_qry.revrownumber
 48                           ELSE inner_qry.revrownumber + 1
 49                         END AS RevMatchingCount
 50                  FROM   (SELECT id,
 51                                 something_id,
 52                                 something_name,
 53                                 Row_number()
 54                                   over(
 55                                     PARTITION BY something_id
 56                                     ORDER BY something_id DESC, something_name)
 57                                 AS
 58                                 RowNumber,
 59                                 Row_number()
 60                                   over(
 61                                     PARTITION BY something_id
 62                                     ORDER BY something_id, something_name DESC)
 63                                 AS
 64                                 RevRowNumber
 65                          FROM   test_table
 66                          ORDER  BY id) inner_qry
 67                  ORDER  BY id) middle_qry
 68          ORDER  BY id)
 69  WHERE  id = 237;

        ID SOMETHING_ID         SOMETHING_NAME                  ROWNUMBER REVROWNUMBER REVMATCHINGCOUNT REVMATCHINGPAIR
---------- -------------------- ------------------------------ ---------- ------------ ---------------- ---------------
       237 TEST_VALUE           TOM AND JERRY                          21           10               10               0

SQL>



P.S. : Thanks for the test case by the way.

[Updated on: Mon, 16 March 2015 07:11]

Report message to a moderator

Re: Adding analytic function-based column changes another column result [message #634806 is a reply to message #634697] Mon, 16 March 2015 08:15 Go to previous messageGo to next message
lewap02
Messages: 5
Registered: March 2015
Junior Member
At first: sorry for the test case in the attachment but there were a few of these inserts and I didn't want to mess up the post and make it unreadibly big. Also thank you for investigating the problem.

And now to business Smile Indeed, adding DESC to the ordering of min_id prevents changing of the result but it is not satisfactory solution nor it explains the main problem. The questions are:


    1. Why does it change anything? What is the Oracle theory behind it?
    2. Is it a documented rule/feature for analytic functions that each of them must have the same direction of ordering by the given column? What if business requiers me to calculate this min_id in the ASC order? Why adding of a new column which implements some totally diffrent purpose would affect another?
    3. Interesting thing that I have noticed today, which additionally negates the idea that the problem is non-DESC ordering of min_id: replacing
    min(id) over (ORDER BY middle_qry.something_id) min_id

    with
    last_value(id) over (ORDER BY middle_qry.something_id) min_id

    also "fixes" the result and without changing sorting direction. Basically this point answers point 2. - such a rule/feature cannot exist.

So the problem is still open and I'm looking forward to any suggestions.

[Updated on: Mon, 16 March 2015 08:17]

Report message to a moderator

Re: Adding analytic function-based column changes another column result [message #634807 is a reply to message #634806] Mon, 16 March 2015 08:26 Go to previous messageGo to next message
Michel Cadot
Messages: 68643
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

1. It should not change, if it does it is a bug
2. No, they do not need to be the same ORDER BY clause, of course otherwise it would decrease the interest of them

I think there is an unfortunate shortcut in the optimizer.
I remember in a previous version if you used ORDER BY in an analytic function and ORDER BY <same thing> DESC at the end of the query this latter one was ignored due an abusive optimization which checked only the expression and not the direction.

As you can see in the following you can have different order and the "cst" columns show the result is consistent:
SQL> select ename, sal,
  2         row_number() over (order by sal) sal_asc,
  3         row_number() over (order by sal desc) sal_desc,
  4         row_number() over (order by sal) + row_number() over (order by sal desc) sal_cst,
  5         row_number() over (order by ename) name_asc,
  6         row_number() over (order by ename desc) name_desc,
  7         row_number() over (order by ename) + row_number() over (order by ename desc) name_cst
  8  from emp
  9  order by 1
 10  /
ENAME             SAL    SAL_ASC   SAL_DESC    SAL_CST   NAME_ASC  NAME_DESC   NAME_CST
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
ADAMS            1100          3         12         15          1         14         15
ALLEN            1600          8          7         15          2         13         15
BLAKE            2850         10          5         15          3         12         15
CLARK            2450          9          6         15          4         11         15
FORD             3000         13          2         15          5         10         15
JAMES             950          2         13         15          6          9         15
JONES            2975         11          4         15          7          8         15
KING             5000         14          1         15          8          7         15
MARTIN           1250          5         10         15          9          6         15
MILLER           1300          6          9         15         10          5         15
SCOTT            3000         12          3         15         11          4         15
SMITH             800          1         14         15         12          3         15
TURNER           1500          7          8         15         13          2         15
WARD             1250          4         11         15         14          1         15

14 rows selected.

Re: Adding analytic function-based column changes another column result [message #634863 is a reply to message #634807] Tue, 17 March 2015 02:34 Go to previous messageGo to next message
lewap02
Messages: 5
Registered: March 2015
Junior Member
I see. Thank You.

Would You have any suggestion except rewriting the query without using analytic functions? Originally the query had a lot of repeating subqueries (slow version) which were replaced by analytic functions. Maybe I will try to rewrite them as some pipelined procedures.

And secondly: would You bother reporting this to Oracle? Bug is still present even in 12.1 and, in my opinion, it is nasty...
Re: Adding analytic function-based column changes another column result [message #634866 is a reply to message #634863] Tue, 17 March 2015 03:31 Go to previous messageGo to next message
Michel Cadot
Messages: 68643
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

11.1 is out of support, so it is useless to open a SR this for this version.
But you can try to report it with the test case for all versions, maybe you will not have any direct answer but the bug will be fixed in a future patch for the current versions.

Re: Adding analytic function-based column changes another column result [message #634926 is a reply to message #634866] Wed, 18 March 2015 02:35 Go to previous messageGo to next message
lewap02
Messages: 5
Registered: March 2015
Junior Member
We've given the case second, third and fourth thought and now I don't think it is an Oracle bug anymore. Please review explanation given below.

Inside inner_query we are using row_number analytic function which has an order by clause defined on non-unique set of columns within a group. I presume that in such case row_number is assigned randomly and it is fine and understandable. Without min_id this order once randomized seems to remain static. After adding min_id column order changes and it is still clear because execution plan changes and we have "the golden rule" that Oracle does not guarantee any order unless "order by" is used. But now this changed order affected values that are randomly assigned by row_number - diffrent numbers have been assigned within the group. It appears to be no matter that min_id is in the middle_query - it probably is calculated before row_number in the inner_query.

What do you think of that? If I would have been assigning row_number using unique column in the "order by" clause, all would've been fine.

[Updated on: Wed, 18 March 2015 02:37]

Report message to a moderator

Re: Adding analytic function-based column changes another column result [message #634928 is a reply to message #634926] Wed, 18 March 2015 02:45 Go to previous message
Michel Cadot
Messages: 68643
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
Inside inner_query we are using row_number analytic function which has an order by clause defined on non-unique set of columns within a group. I presume that in such case row_number is assigned randomly and it is fine and understandable. Without min_id this order once randomized seems to remain static. After adding min_id column order changes and it is still clear because execution plan changes and we have "the golden rule" that Oracle does not guarantee any order unless "order by" is used. But now this changed order affected values that are randomly assigned by row_number - diffrent numbers have been assigned within the group.


I think this is a correct analysis.

Quote:
If I would have been assigning row_number using unique column in the "order by" clause, all would've been fine.


This is expected.

Thanks for this feedback.

Previous Topic: OCI8 ERROR
Next Topic: CASE in WHERE Clause
Goto Forum:
  


Current Time: Tue Apr 23 06:09:16 CDT 2024