Home » SQL & PL/SQL » SQL & PL/SQL » sql query (19c)
sql query [message #687540] 
Wed, 29 March 2023 17:34 
ora9a
Messages: 42 Registered: June 2010

Member 


Hi
I have this data set in table ‘students’
Id term module
123 2023 2100
123 2023 2101
123 2023 1001
999 2022 2400
999 2022 2401
999 2022 2402
I want a query that flags up the modules (Y or N) where the first digit is different to the others in that term. There will only be 1 module which is different. This may occur in some terms but not all. So in this case it would be 1(001) module because the other two modules in 2023 start with a 2.
So the result would be:
Id term module mod_flag
123 2023 2100 N
123 2023 2101 N
123 2023 1001 Y
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N



Re: sql query [message #687542 is a reply to message #687540] 
Wed, 29 March 2023 19:55 

Barbara Boehmer
Messages: 9005 Registered: November 2002 Location: California, USA

Senior Member 


SCOTT@orcl_12.1.0.2.0> COLUMN mod_flag FORMAT A8
SCOTT@orcl_12.1.0.2.0>  test data you provided:
SCOTT@orcl_12.1.0.2.0> WITH
2 students (id, term, module) AS
3 (SELECT 123, 2023, 2100 FROM DUAL UNION ALL
4 SELECT 123, 2023, 2101 FROM DUAL UNION ALL
5 SELECT 123, 2023, 1001 FROM DUAL UNION ALL
6 SELECT 999, 2022, 2400 FROM DUAL UNION ALL
7 SELECT 999, 2022, 2401 FROM DUAL UNION ALL
8 SELECT 999, 2022, 2402 FROM DUAL)
9  query:
10 SELECT id, term, module,
11 CASE
12 WHEN COUNT (*) OVER (PARTITION BY id, term, SUBSTR (module, 1, 1)) = 1
13 AND COUNT (*) OVER (PARTITION BY id, term, SUBSTR (module, 1, 1)) <
14 COUNT (*) OVER (PARTITION BY id, term)
15 THEN 'Y'
16 ELSE 'N'
17 END mod_flag
18 FROM students
19 ORDER BY id, term, mod_flag, module
20 /
ID TERM MODULE MOD_FLAG
   
123 2023 2100 N
123 2023 2101 N
123 2023 1001 Y
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N
6 rows selected.
[Updated on: Wed, 29 March 2023 19:57] Report message to a moderator



Re: sql query [message #687543 is a reply to message #687542] 
Thu, 30 March 2023 06:12 
Solomon Yakobson
Messages: 3227 Registered: January 2010 Location: Connecticut, USA

Senior Member 


Question is can there be only two rows per ID, TERM. If it can, then we can't tell which to flag Y and which to flag N. So all we cans do if flag them arbitrarily  one N other one Y. Barbara's solution will not work in case of two rows per ID. It will mark both Y:
WITH students(id, term, module)
AS (
 SELECT 123, 2023, 2100 FROM DUAL UNION ALL
SELECT 123, 2023, 2101 FROM DUAL UNION ALL
SELECT 123, 2023, 1001 FROM DUAL UNION ALL
SELECT 999, 2022, 2400 FROM DUAL UNION ALL
SELECT 999, 2022, 2401 FROM DUAL UNION ALL
SELECT 999, 2022, 2402 FROM DUAL
)
SELECT id,
term,
module,
CASE
WHEN COUNT(*) OVER(PARTITION BY id,term,SUBSTR(module,1,1)) = 1
AND COUNT(*) OVER(PARTITION BY id, term, SUBSTR(module,1,1)) < COUNT(*) OVER(PARTITION BY id,term) THEN 'Y'
ELSE 'N'
END mod_flag
FROM students
ORDER BY id,
term,
mod_flag,
module
/
ID TERM MODULE M
   
123 2023 1001 Y
123 2023 2101 Y
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N
SQL>
So if there can be only two rows per ID, TERM:
WITH STUDENTS(ID, TERM, MODULE)
AS (
 SELECT 123, 2023, 2100 FROM DUAL UNION ALL
SELECT 123, 2023, 2101 FROM DUAL UNION ALL
SELECT 123, 2023, 1001 FROM DUAL UNION ALL
SELECT 999, 2022, 2400 FROM DUAL UNION ALL
SELECT 999, 2022, 2401 FROM DUAL UNION ALL
SELECT 999, 2022, 2402 FROM DUAL
)
SELECT ID,
TERM,
MODULE,
CASE
WHEN COUNT(*) OVER(PARTITION BY ID,TERM) <= 2
THEN CASE ROW_NUMBER() OVER(PARTITION BY ID,TERM ORDER BY SUBSTR(MODULE,1,1))
WHEN 1 THEN 'N'
ELSE 'Y'
END
WHEN COUNT(*) OVER(PARTITION BY ID,TERM,SUBSTR(MODULE,1,1)) > 1 THEN 'N'
ELSE 'Y'
END MOD_FLAG
FROM STUDENTS
ORDER BY ID,
TERM,
MOD_FLAG,
MODULE
/
ID TERM MODULE M
   
123 2023 1001 N
123 2023 2101 Y
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N
SQL>
And if there can't be only two rows per ID:
WITH STUDENTS(ID, TERM, MODULE)
AS (
SELECT 123, 2023, 2100 FROM DUAL UNION ALL
SELECT 123, 2023, 2101 FROM DUAL UNION ALL
SELECT 123, 2023, 1001 FROM DUAL UNION ALL
SELECT 999, 2022, 2400 FROM DUAL UNION ALL
SELECT 999, 2022, 2401 FROM DUAL UNION ALL
SELECT 999, 2022, 2402 FROM DUAL
)
SELECT ID,
TERM,
MODULE,
CASE
WHEN COUNT(*) OVER(PARTITION BY ID,TERM,SUBSTR(MODULE,1,1)) > 1 THEN 'N'
ELSE 'Y'
END MOD_FLAG
FROM STUDENTS
ORDER BY ID,
TERM,
MOD_FLAG,
MODULE
/
ID TERM MODULE M
   
123 2023 2100 N
123 2023 2101 N
123 2023 1001 Y
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N
6 rows selected.
SQL>
SY.
[Updated on: Thu, 30 March 2023 06:16] Report message to a moderator



Re: sql query [message #687544 is a reply to message #687543] 
Thu, 30 March 2023 09:56 
ora9a
Messages: 42 Registered: June 2010

Member 


Thanks for the responses.
Asked the business, and yes  even though very rare  it is possible to have only one or two modules. In the case of two, the lowest number would be flagged as 'Y'. In the case of one then that defaults to N.



Re: sql query [message #687545 is a reply to message #687544] 
Thu, 30 March 2023 10:45 
ora9a
Messages: 42 Registered: June 2010

Member 


In fact, it is always the lowest number, regardless of the number of records. e.g.
1 record only:
2001 N

2 records:
2001 N
1002 Y

2 +
3001 N
3002 N
3003 N
2001 Y



Re: sql query [message #687546 is a reply to message #687544] 
Thu, 30 March 2023 10:51 

mathguy
Messages: 54 Registered: January 2023

Member 


OK, so this sounds like it can be reformulated as follows: For a group of rows as classified by (ID, TERM), inspect the first digit of MODULE on each row, and also note the maximum of all these firstdigits in the group. If the first digit on a particular row is < than the max, mark the flag as 'Y', otherwise as 'N'.
Then the query is trivial to write:
with
students (id, term, module) as (
select 123, 2023, 2100 from dual union all
select 123, 2023, 2101 from dual union all
select 123, 2023, 1001 from dual union all
select 999, 2022, 2400 from dual union all
select 999, 2022, 2401 from dual union all
select 999, 2022, 2402 from dual union all
select 200, 2023, 1000 from dual union all
select 550, 2021, 1833 from dual union all
select 550, 2021, 2340 from dual union all
select 550, 2022, 3230 from dual union all
select 550, 2022, 3231 from dual
)
select id, term, module,
case when substr(module, 1, 1) < max(substr(module, 1, 1)) over (partition by id, term)
then 'Y' else 'N' end as flag
from students
order by id, term, module  or whatever is needed
;
ID TERM MODULE FLAG
   
123 2023 1001 Y
123 2023 2100 N
123 2023 2101 N
200 2023 1000 N
550 2021 1833 Y
550 2021 2340 N
550 2022 3230 N
550 2022 3231 N
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N
[Updated on: Thu, 30 March 2023 10:51] Report message to a moderator



Re: sql query [message #687547 is a reply to message #687546] 
Thu, 30 March 2023 13:25 

mathguy
Messages: 54 Registered: January 2023

Member 


Or ... actually my reformulation is not equivalent to your problem statement, because a group by (ID, TERM) may have three or more rows, and in that case you said that at most one MODULE has a different first digit from the other MODULEs in the same group; you didn't say that in that case (which is the most common), the 'Y' flag will have the first digit lower than the other MODULEs in the group. It will only be different. Correct?
If so, then the query must be modified  perhaps something like this:
with
students (id, term, module) as (
select 123, 2023, 2100 from dual union all
select 123, 2023, 2101 from dual union all
select 123, 2023, 1001 from dual union all
select 123, 2024, 1100 from dual union all
select 123, 2024, 1200 from dual union all
select 123, 2024, 2304 from dual union all
select 999, 2022, 2400 from dual union all
select 999, 2022, 2401 from dual union all
select 999, 2022, 2402 from dual union all
select 200, 2023, 1000 from dual union all
select 550, 2021, 1833 from dual union all
select 550, 2021, 2340 from dual union all
select 550, 2022, 3230 from dual union all
select 550, 2022, 3231 from dual
)
select id, term, module,
case when count(*) over (partition by id, term) >= 3
and count(*) over (partition by id, term, substr(module, 1, 1)) = 1
then 'Y'
when count(*) over (partition by id, term) = 2
and substr(module, 1, 1) < max(substr(module, 1, 1)) over (partition by id, term)
then 'Y'
else 'N' end as flag
from students
order by id, term, module  or whatever is needed
;
ID TERM MODULE FLAG
   
123 2023 1001 Y
123 2023 2100 N
123 2023 2101 N
123 2024 1100 N
123 2024 1200 N
123 2024 2304 Y
200 2023 1000 N
550 2021 1833 Y
550 2021 2340 N
550 2022 3230 N
550 2022 3231 N
999 2022 2400 N
999 2022 2401 N
999 2022 2402 N



Re: sql query [message #687549 is a reply to message #687547] 
Fri, 31 March 2023 09:40 
ora9a
Messages: 42 Registered: June 2010

Member 


Hi, the first one was ok  i.e. the lowest digit. However, there are some other anomalies currently discussing with the business..




Re: sql query [message #687600 is a reply to message #687550] 
Wed, 12 April 2023 11:54 
ora9a
Messages: 42 Registered: June 2010

Member 


Hi again,
So, have this scenario where there is no one distinct first digit, so result is:
ID TERM MODULE FLAG
   
123 2023 3001 N
123 2023 2100 Y
123 2023 2101 Y
123 2023 3100 N
In this type of situation, all should be N.
The original scenario remains the same:
ID TERM MODULE FLAG
   
123 2023 3001 N
123 2023 3100 N
123 2023 2101 Y
123 2023 3200 N
Thanks



Re: sql query [message #687603 is a reply to message #687600] 
Wed, 12 April 2023 13:18 

mathguy
Messages: 54 Registered: January 2023

Member 


You can add a condition, like I was doing in my "other" answer (where I didn't assume the 'Y' row is necessarily the one with the lowest first digit of MODULE value), something like this:
select id, term, module,
case when substr(module, 1, 1) < max(substr(module, 1, 1)) over (partition by id, term)
and count(*) over (partition by id, term, substr(module, 1, 1)) = 1
then 'Y' else 'N' end as flag
from students
order by id, term, module  or whatever is needed
;
If this becomes more and more complicated (for example, if there may be three different firstdigits, etc.), you could write a more compact solution using MATCH_RECOGNIZE. If you aren't familiar with it (it was new in Oracle 12.1), it is absolutely worth the effort to learn it, since it can do a lot of things that simply can't be done any other way in SQL (and it can do a lot of things in a simple way, that can be done in other ways but only in a very complex manner).
select id, term, module, cls
from students
match_recognize (
partition by id, term
order by module
measures classifier() as cls
all rows per match
pattern ( ^ (YN) N* )
define Y as module < trunc(next(module), 3)
)
;
In the DEFINE clause of MATCH_RECOGNIZE, notice the different way I wrote the condition on the "first digit". This assumes (like all the earlier discussion) that all "modules" are exactly fourdigit integers; there are no "modules" like 320 or 114303.
The PATTERN clause says that only the first row in a partition (the minimum value of MODULE) can be classified as Y, while everything else is N (also the first row, if it doesn't satisfy the condition for Y). DEFINE says that the condition for Y (which again, remember, can only be for the first row, having the minimum value of MODULE) is that the MODULE value be strictly less than the next MODULE value, rounded down to the thousand; TRUNC(2843, 3), for example, is 2000. The numerical inequality comparing to that TRUNC is equivalent to the condition on first digits (if all values are fourdigit integers).



Goto Forum:
Current Time: Sun May 28 04:08:22 CDT 2023
