Home » SQL & PL/SQL » SQL & PL/SQL » Adding analytic function-based column changes another column result (Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production)
Adding analytic function-based column changes another column result [message #634697] |
Fri, 13 March 2015 07:37 |
|
lewap02
Messages: 5 Registered: March 2015
|
Junior Member |
|
|
Hello everybody. While working on a query I have encountered situation which I cannot explain nor understand. Given below are DDL and the query. Data (inserts) are in the attachment.
Case is that I have a query to which I want to add new analytic function-based column. This new column is independent of the already existing columns. The problem is that adding this column - uncommeting min_id - actually changes the result of another analytic column - RevMatchingPair, . The only thing that they have "in common" is that part of the ordering in both of them is middle_qry.something_id. Indeed, if I would change ordering in min_id to id it would work fine.
Is it possible that one analytic function changes order of rows on which another such column is operating? My intuition is that every analytic column should operate on some kind of "local" order. If it is not some kind of my misuse/misunderstanding then such behaviuor of the database is very dangerous in my opinion. By adding just one new independent (from a logical point of view) value I can destroy report results.
Could you please take a look and help me understand this case?
CREATE TABLE TEST_TABLE
( ID NUMBER,
SOMETHING_ID VARCHAR2(255),
SOMETHING_NAME VARCHAR2(255)
);
select id,
something_id,
something_name,
RowNumber,
RevRowNumber,
RevMatchingCount,
RevMatchingPair
from (select id,
something_id,
something_name,
RowNumber,
RevRowNumber,
RevMatchingCount,
CASE
WHEN ((
middle_qry.RevMatchingCount = LAG(middle_qry.RevMatchingCount, 1, 0)
OVER(ORDER BY middle_qry.something_id DESC,
middle_qry.something_name DESC)) OR
(
middle_qry.RevMatchingCount = LEAD(middle_qry.RevMatchingCount, 1, 0)
OVER(ORDER BY middle_qry.something_id DESC,
middle_qry.something_name DESC))) AND
Mod(middle_qry.RevMatchingCount, 2) = 0 THEN
0
ELSE
1
END as RevMatchingPair
--,min(id) over (ORDER BY middle_qry.something_id ) min_id
from (select
id,
something_id,
something_name,
RowNumber,
RevRowNumber,
CASE
WHEN inner_qry.RevRowNumber = FIRST_VALUE(inner_qry.RevRowNumber)
OVER(PARTITION BY inner_qry.something_id ORDER BY inner_qry.RevRowNumber DESC) OR
MOD(inner_qry.RevRowNumber, 2) = 0 THEN
inner_qry.RevRowNumber
ELSE
inner_qry.RevRowNumber + 1
END AS RevMatchingCount
from (select
id,
something_id,
something_name,
ROW_NUMBER() OVER(PARTITION BY something_id ORDER BY something_id desc, something_name) as RowNumber,
ROW_NUMBER() OVER(PARTITION BY something_id ORDER BY something_id, something_name desc) as RevRowNumber
From test_table
order by id) inner_qry
order by id) middle_qry
order by id)
where id = 237;
|
|
|
|
|
Re: Adding analytic function-based column changes another column result [message #634800 is a reply to message #634788] |
Mon, 16 March 2015 06:22 |
Lalit Kumar B
Messages: 3174 Registered: May 2013 Location: World Wide on the Web
|
Senior Member |
|
|
Hi lewap02,
Welcome to the forum!
Please read and follow the OraFAQ Forum Guide and How to use [code] tags, to enable us to help you.
lewap02 wrote on Mon, 16 March 2015 13:49
Diffrence is seen in the last column: REVMATCHINGPAIR.
The values for REVMATCHINGPAIR as 0 or 1 depends on the output of the following case statement:
CASE
WHEN ((
middle_qry.RevMatchingCount = LAG(middle_qry.RevMatchingCount, 1, 0)
OVER(ORDER BY middle_qry.something_id DESC,
middle_qry.something_name DESC)) OR
(
middle_qry.RevMatchingCount = LEAD(middle_qry.RevMatchingCount, 1, 0)
OVER(ORDER BY middle_qry.something_id DESC,
middle_qry.something_name DESC))) AND
Mod(middle_qry.RevMatchingCount, 2) = 0 THEN
0
ELSE
1
END as RevMatchingPair
As you have not provided any test case, only you could know the reason.
[Updated on: Mon, 16 March 2015 06:26] Report message to a moderator
|
|
|
|
Re: Adding analytic function-based column changes another column result [message #634805 is a reply to message #634803] |
Mon, 16 March 2015 07:11 |
Lalit Kumar B
Messages: 3174 Registered: May 2013 Location: World Wide on the Web
|
Senior Member |
|
|
Ah, yes. There is an attachment with insert statements. I got confused with the combination of text and attachment. Natural instinct, I just looked at the posted code.
@OP, I have tested it on 12.1 version, I get the following outputs with and without the MIN(id):
With:
ID SOMETHING_ID SOMETHING_NAME ROWNUMBER REVROWNUMBER REVMATCHINGCOUNT REVMATCHINGPAIR
---------- -------------------- ------------------------------ ---------- ------------ ---------------- ---------------
237 TEST_VALUE TOM AND JERRY 21 10 10 1
SQL>
Without:
ID SOMETHING_ID SOMETHING_NAME ROWNUMBER REVROWNUMBER REVMATCHINGCOUNT REVMATCHINGPAIR
---------- -------------------- ------------------------------ ---------- ------------ ---------------- ---------------
237 TEST_VALUE TOM AND JERRY 21 10 10 0
SQL>
Breaking the code into smaller pieces, I found the issue with this:
ORDER BY middle_qry.something_id DESC
Your case expression using LAG and LEAD is sorting the rows in descending order, however, in your MIN(id) you have not specified DESC, so the ORDER BY middle_qry.something_id for the min_id column is sorted in ascending order by default. All you need to do is, add DESC to the min_id as well:
SQL> SELECT id,
2 something_id,
3 something_name,
4 rownumber,
5 revrownumber,
6 revmatchingcount,
7 revmatchingpair
8 FROM (SELECT id,
9 something_id,
10 something_name,
11 rownumber,
12 revrownumber,
13 revmatchingcount,
14 CASE
15 WHEN ( ( middle_qry.revmatchingcount =
16 Lag(middle_qry.revmatchingcount, 1, 0)
17 over(
18 ORDER BY middle_qry.something_id
19 DESC,
20 middle_qry.something_name DESC) )
21 OR ( middle_qry.revmatchingcount = Lead(
22 middle_qry.revmatchingcount, 1
23 , 0)
24 over(
25 ORDER BY middle_qry.something_id DESC,
26 middle_qry.something_name DESC) ) )
27 AND MOD(middle_qry.revmatchingcount, 2) = 0 THEN 0
28 ELSE 1
29 END AS RevMatchingPair,
30 Min(id)
31 over (
32 ORDER BY middle_qry.something_id DESC) min_id
33 --> Add DESC to sort something_id in descending order as well
34 FROM (SELECT id,
35 something_id,
36 something_name,
37 rownumber,
38 revrownumber,
39 CASE
40 WHEN inner_qry.revrownumber = First_value(
41 inner_qry.revrownumber)
42 over(
43 PARTITION BY inner_qry.something_id
44 ORDER BY inner_qry.revrownumber DESC
45 )
46 OR MOD(inner_qry.revrownumber, 2) = 0 THEN
47 inner_qry.revrownumber
48 ELSE inner_qry.revrownumber + 1
49 END AS RevMatchingCount
50 FROM (SELECT id,
51 something_id,
52 something_name,
53 Row_number()
54 over(
55 PARTITION BY something_id
56 ORDER BY something_id DESC, something_name)
57 AS
58 RowNumber,
59 Row_number()
60 over(
61 PARTITION BY something_id
62 ORDER BY something_id, something_name DESC)
63 AS
64 RevRowNumber
65 FROM test_table
66 ORDER BY id) inner_qry
67 ORDER BY id) middle_qry
68 ORDER BY id)
69 WHERE id = 237;
ID SOMETHING_ID SOMETHING_NAME ROWNUMBER REVROWNUMBER REVMATCHINGCOUNT REVMATCHINGPAIR
---------- -------------------- ------------------------------ ---------- ------------ ---------------- ---------------
237 TEST_VALUE TOM AND JERRY 21 10 10 0
SQL>
P.S. : Thanks for the test case by the way.
[Updated on: Mon, 16 March 2015 07:11] Report message to a moderator
|
|
|
Re: Adding analytic function-based column changes another column result [message #634806 is a reply to message #634697] |
Mon, 16 March 2015 08:15 |
|
lewap02
Messages: 5 Registered: March 2015
|
Junior Member |
|
|
At first: sorry for the test case in the attachment but there were a few of these inserts and I didn't want to mess up the post and make it unreadibly big. Also thank you for investigating the problem.
And now to business Indeed, adding DESC to the ordering of min_id prevents changing of the result but it is not satisfactory solution nor it explains the main problem. The questions are:
1. Why does it change anything? What is the Oracle theory behind it?
2. Is it a documented rule/feature for analytic functions that each of them must have the same direction of ordering by the given column? What if business requiers me to calculate this min_id in the ASC order? Why adding of a new column which implements some totally diffrent purpose would affect another?
3. Interesting thing that I have noticed today, which additionally negates the idea that the problem is non-DESC ordering of min_id: replacing
min(id) over (ORDER BY middle_qry.something_id) min_id
with
last_value(id) over (ORDER BY middle_qry.something_id) min_id
also "fixes" the result and without changing sorting direction. Basically this point answers point 2. - such a rule/feature cannot exist.
So the problem is still open and I'm looking forward to any suggestions.
[Updated on: Mon, 16 March 2015 08:17] Report message to a moderator
|
|
|
Re: Adding analytic function-based column changes another column result [message #634807 is a reply to message #634806] |
Mon, 16 March 2015 08:26 |
|
Michel Cadot
Messages: 68643 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
1. It should not change, if it does it is a bug
2. No, they do not need to be the same ORDER BY clause, of course otherwise it would decrease the interest of them
I think there is an unfortunate shortcut in the optimizer.
I remember in a previous version if you used ORDER BY in an analytic function and ORDER BY <same thing> DESC at the end of the query this latter one was ignored due an abusive optimization which checked only the expression and not the direction.
As you can see in the following you can have different order and the "cst" columns show the result is consistent:
SQL> select ename, sal,
2 row_number() over (order by sal) sal_asc,
3 row_number() over (order by sal desc) sal_desc,
4 row_number() over (order by sal) + row_number() over (order by sal desc) sal_cst,
5 row_number() over (order by ename) name_asc,
6 row_number() over (order by ename desc) name_desc,
7 row_number() over (order by ename) + row_number() over (order by ename desc) name_cst
8 from emp
9 order by 1
10 /
ENAME SAL SAL_ASC SAL_DESC SAL_CST NAME_ASC NAME_DESC NAME_CST
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
ADAMS 1100 3 12 15 1 14 15
ALLEN 1600 8 7 15 2 13 15
BLAKE 2850 10 5 15 3 12 15
CLARK 2450 9 6 15 4 11 15
FORD 3000 13 2 15 5 10 15
JAMES 950 2 13 15 6 9 15
JONES 2975 11 4 15 7 8 15
KING 5000 14 1 15 8 7 15
MARTIN 1250 5 10 15 9 6 15
MILLER 1300 6 9 15 10 5 15
SCOTT 3000 12 3 15 11 4 15
SMITH 800 1 14 15 12 3 15
TURNER 1500 7 8 15 13 2 15
WARD 1250 4 11 15 14 1 15
14 rows selected.
|
|
|
|
|
Re: Adding analytic function-based column changes another column result [message #634926 is a reply to message #634866] |
Wed, 18 March 2015 02:35 |
|
lewap02
Messages: 5 Registered: March 2015
|
Junior Member |
|
|
We've given the case second, third and fourth thought and now I don't think it is an Oracle bug anymore. Please review explanation given below.
Inside inner_query we are using row_number analytic function which has an order by clause defined on non-unique set of columns within a group. I presume that in such case row_number is assigned randomly and it is fine and understandable. Without min_id this order once randomized seems to remain static. After adding min_id column order changes and it is still clear because execution plan changes and we have "the golden rule" that Oracle does not guarantee any order unless "order by" is used. But now this changed order affected values that are randomly assigned by row_number - diffrent numbers have been assigned within the group. It appears to be no matter that min_id is in the middle_query - it probably is calculated before row_number in the inner_query.
What do you think of that? If I would have been assigning row_number using unique column in the "order by" clause, all would've been fine.
[Updated on: Wed, 18 March 2015 02:37] Report message to a moderator
|
|
|
Re: Adding analytic function-based column changes another column result [message #634928 is a reply to message #634926] |
Wed, 18 March 2015 02:45 |
|
Michel Cadot
Messages: 68643 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
Quote:Inside inner_query we are using row_number analytic function which has an order by clause defined on non-unique set of columns within a group. I presume that in such case row_number is assigned randomly and it is fine and understandable. Without min_id this order once randomized seems to remain static. After adding min_id column order changes and it is still clear because execution plan changes and we have "the golden rule" that Oracle does not guarantee any order unless "order by" is used. But now this changed order affected values that are randomly assigned by row_number - diffrent numbers have been assigned within the group.
I think this is a correct analysis.
Quote:If I would have been assigning row_number using unique column in the "order by" clause, all would've been fine.
This is expected.
Thanks for this feedback.
|
|
|
Goto Forum:
Current Time: Tue Apr 23 06:09:16 CDT 2024
|