Sequence reduction [message #400806] |
Wed, 29 April 2009 13:01  |
tmcallister
Messages: 107 Registered: December 2007
|
Senior Member |
|
|
I'm tasked with coming up with a translation from our current system (with a 4 digit sequence code which must be unique), to a older system (with a 2 digit sequence code which is not required to be unique).
Of course I want to try and encapsulate as much of the largest divisions as possible.
I've come up with a method to mark the largest seperations (i.e. where I'd like the sequence to increment), but can't quite seem to make the leap to getting the actual sequence.
Here is a trivial example (2->1 digits as opposed to 4->2):
1 0 0
5 1 1
8 0 1
12 1 2
15 0 2
30 1 3
33 0 3
37 1 4
40 0 4
44 1 5
50 1 6
60 1 7
61 0 7
62 0 7
64 0 7
67 0 7
70 0 7
71 0 7
80 1 8
99 1 9
I have the first column, can make the second, and want the third.
But if there is a better way to do it, I'm open to it. Just keep in mind the sequence is still important.
|
|
|
Re: Sequence reduction [message #400811 is a reply to message #400806] |
Wed, 29 April 2009 13:08   |
 |
Michel Cadot
Messages: 68737 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
As usual:
Post a working Test case: create table and insert statements along with the result you want with these data.
Also always post your Oracle version (4 decimals).
Quote: | I'm tasked with coming up with a translation from our current system (with a 4 digit sequence code which must be unique), to a older system (with a 2 digit sequence code which is not required to be unique).
|
Just take the last or first 2 digits.
Quote: | I have the first column, can make the second, and want the third.
|
Completly clear as mud.
Regards
Michel
[Updated on: Wed, 29 April 2009 13:10] Report message to a moderator
|
|
|
|
Re: Sequence reduction [message #400816 is a reply to message #400806] |
Wed, 29 April 2009 13:59   |
tmcallister
Messages: 107 Registered: December 2007
|
Senior Member |
|
|
Sorry about the improper format, I think I got it now.
SQLPlus version: 11.1.0.6.0
Oracle version: 11.1.0.6.0
Here is a test case
CREATE TABLE TEST
(
old_seq VARCHAR2(8)
);
insert into TEST VALUES ('1');
insert into TEST VALUES ('5');
insert into TEST VALUES ('8');
insert into TEST VALUES ('12');
insert into TEST VALUES ('15');
insert into TEST VALUES ('30');
insert into TEST VALUES ('33');
insert into TEST VALUES ('37');
insert into TEST VALUES ('40');
insert into TEST VALUES ('44');
insert into TEST VALUES ('50');
insert into TEST VALUES ('60');
insert into TEST VALUES ('61');
insert into TEST VALUES ('62');
insert into TEST VALUES ('64');
insert into TEST VALUES ('67');
insert into TEST VALUES ('70');
insert into TEST VALUES ('71');
insert into TEST VALUES ('80');
insert into TEST VALUES ('99');
Desired output
old_seq new_seq
------- -------
1 0
5 1
8 1
12 2
15 2
30 3
33 3
37 4
40 4
44 5
50 6
60 7
61 7
62 7
64 7
67 7
70 7
71 7
80 8
99 9
The journey so far (I'm embarrassed to post this monstrosity, it makes me look like a bad first year student)...
SELECT old_seq, seq, diff, inc,
CASE
WHEN inc = 0
THEN 0
ELSE ROW_NUMBER () OVER (PARTITION BY inc ORDER BY seq)
END almost_new_seq
FROM (SELECT old_seq, seq, diff, CASE
WHEN diff_order > 10
THEN 0
ELSE 1
END inc
FROM (SELECT old_seq, seq, diff,
ROW_NUMBER () OVER (ORDER BY diff DESC) diff_order
FROM (SELECT a.old_seq, a.seq, a.old_seq - b.old_seq diff
FROM (SELECT old_seq,
ROW_NUMBER () OVER (ORDER BY LENGTH
(old_seq),
old_seq) seq
FROM TEST) a,
(SELECT old_seq,
ROW_NUMBER () OVER (ORDER BY LENGTH
(old_seq),
old_seq) seq
FROM TEST) b
WHERE a.seq - 1 = b.seq
OR (a.seq = 1 AND b.seq = 1))))
ORDER BY seq
[Updated on: Wed, 29 April 2009 14:21] Report message to a moderator
|
|
|
|
Re: Sequence reduction [message #400818 is a reply to message #400806] |
Wed, 29 April 2009 14:24   |
tmcallister
Messages: 107 Registered: December 2007
|
Senior Member |
|
|
continue from previous: results of the current query:
OLD_SEQ SEQ DIFF INC ALMOST_NEW_SEQ
1 1 0 0 0
5 2 4 1 1
8 3 3 0 0
12 4 4 1 2
15 5 3 0 0
30 6 15 1 3
33 7 3 0 0
37 8 4 1 4
40 9 3 1 5
44 10 4 1 6
50 11 6 1 7
60 12 10 1 8
61 13 1 0 0
62 14 1 0 0
64 15 2 0 0
67 16 3 0 0
70 17 3 0 0
71 18 1 0 0
80 19 9 1 9
99 20 19 1 10
Now just those pesky extra 0's to take care of.
I just noticed that my almost_new_seq departs from my previous specified desired sequence, due to the non deterministic nature of row_number(). Differences of this sort can be ignored.
But I feel like I'm barking up the completely wrong tree and there is a much more trivial way to do this (in SQL).
Quote: |
What is the algorithm or logic that transforms old to new values?
|
It's based on the largest difference, this isn't necessarily the best or the required method, but it is my current line of thought.
[Updated on: Wed, 29 April 2009 14:32] Report message to a moderator
|
|
|
|
Re: Sequence reduction [message #400822 is a reply to message #400821] |
Wed, 29 April 2009 14:38   |
tmcallister
Messages: 107 Registered: December 2007
|
Senior Member |
|
|
BlackSwan wrote on Wed, 29 April 2009 13:31 | >Sorry, but I am not a mind reader.
The "difference" between what & what?
|
The current OLD_SEQ value and the previous OLD_SEQ value with order determined from the ascending sort of the OLD_SEQ values and denoted by SEQ
BlackSwan wrote on Wed, 29 April 2009 13:31 | Cute names "OLD_SEQ SEQ DIFF INC ALMOST_NEW_SEQ" but what do they mean & why do they contain the values they have?
|
OLD_SEQ: the 2 digit seqence that I'm attempting to reduce to 1 digit.
SEQ: the ascending sort of OLD_SEQ values
DIFF: The difference between an OLD_SEQ value and the previous OLD_SEQ value with order determined from the ascending sort of the OLD_SEQ values.
INC: Indicates where the desired sequence should be incremented, chosen from the 10 largest DIFF values.
ALMOST_NEW_SEQ: aside from some extra zeroes, the desired output.
|
|
|
|
Re: Sequence reduction [message #400827 is a reply to message #400825] |
Wed, 29 April 2009 14:55   |
tmcallister
Messages: 107 Registered: December 2007
|
Senior Member |
|
|
BlackSwan wrote on Wed, 29 April 2009 13:47 | >INC: Indicates where the desired sequence should be incremented,
"indicated" based upon what logic or decision criteria?
>chosen from the 10 largest DIFF values.
Chosen based upon what logic or decision criteria?
Sorry to ask this, but why don't you just use an Oracle SEQUENCE starting at 1000?
|
... Perhaps I'm misunderstanding, but are you asking me how I determine when one number is larger than another?
Because
(1) in this example I'm working with 20 numbers that need to fit in 10 buckets, a sequence will only assign 1 number per bucket and I need at least 2, and would like some intelligence about when the number is incremented as opposed to doing it blindly.
(2) In my real world case I can about 100 difference groups of 300-400 numbers that need to fit into 100 buckets each.
|
|
|
|
Re: Sequence reduction [message #400829 is a reply to message #400828] |
Wed, 29 April 2009 15:10   |
tmcallister
Messages: 107 Registered: December 2007
|
Senior Member |
|
|
The OLD_SEQ value with the SEQ number X, is compared with the OLD_SEQ value with the SEQ number X-1. If X=1 then DIFF is 0 by definition.
So as an example
OLD_SEQ 5 has a SEQ value of 2, this is compared with the OLD_SEQ which has the corresponding SEQ value of 1, which is also 1.
5-1=4 which is were the DIFF values comes from.
Carry this out over all values and you get:
[code]
OLD_SEQ SEQ DIFF
1 1 0
5 2 4
8 3 3
12 4 4
15 5 3
30 6 15
33 7 3
37 8 4
40 9 3
44 10 4
50 11 6
60 12 10
61 13 1
62 14 1
64 15 2
67 16 3
70 17 3
71 18 1
80 19 9
99 20 19
[code]
Now pick the 10 largest. This will be every DIFF value of 4 and over, and 1 DIFF value of 3, this is picked randomly from all 3 values, but so we agree lets pick the same value (OLD_SEQ 40)
This yields
1 1 0 0
5 2 4 1
8 3 3 0
12 4 4 1
15 5 3 0
30 6 15 1
33 7 3 0
37 8 4 1
40 9 3 1
44 10 4 1
50 11 6 1
60 12 10 1
61 13 1 0
62 14 1 0
64 15 2 0
67 16 3 0
70 17 3 0
71 18 1 0
80 19 9 1
99 20 19 1
[Updated on: Wed, 29 April 2009 15:15] Report message to a moderator
|
|
|
Re: Sequence reduction [message #400830 is a reply to message #400828] |
Wed, 29 April 2009 15:10   |
andrew again
Messages: 2577 Registered: March 2000
|
Senior Member |
|
|
4 is the magic answer.
SQL> set pagesize 56
SQL>
SQL> CREATE TABLE TEST (old_seq number);
Table created.
SQL> insert into TEST VALUES (1);
1 row created.
SQL> insert into TEST VALUES (5);
1 row created.
SQL> insert into TEST VALUES (8);
1 row created.
SQL> insert into TEST VALUES (12);
1 row created.
SQL> insert into TEST VALUES (15);
1 row created.
SQL> insert into TEST VALUES (30);
1 row created.
SQL> insert into TEST VALUES (33);
1 row created.
SQL> insert into TEST VALUES (37);
1 row created.
SQL> insert into TEST VALUES (40);
1 row created.
SQL> insert into TEST VALUES (44);
1 row created.
SQL> insert into TEST VALUES (50);
1 row created.
SQL> insert into TEST VALUES (60);
1 row created.
SQL> insert into TEST VALUES (61);
1 row created.
SQL> insert into TEST VALUES (62);
1 row created.
SQL> insert into TEST VALUES (64);
1 row created.
SQL> insert into TEST VALUES (67);
1 row created.
SQL> insert into TEST VALUES (70);
1 row created.
SQL> insert into TEST VALUES (71);
1 row created.
SQL> insert into TEST VALUES (80);
1 row created.
SQL> insert into TEST VALUES (99);
1 row created.
SQL> commit;
Commit complete.
SQL>
SQL> SELECT old_seq, prev_seq, gap1, SUM (gap1) OVER (ORDER BY old_seq) new_seq
2 FROM (SELECT old_seq, prev_seq, old_seq - prev_seq gap,
3 CASE
4 WHEN (old_seq - prev_seq) < 4 THEN 0
5 ELSE 1
6 END gap1
7 FROM (SELECT old_seq,
8 LAG (old_seq, 1, old_seq) OVER (ORDER BY old_seq) prev_seq
9 FROM TEST));
OLD_SEQ PREV_SEQ GAP1 NEW_SEQ
---------- ---------- ---------- ----------
1 1 0 0
5 1 1 1
8 5 0 1
12 8 1 2
15 12 0 2
30 15 1 3
33 30 0 3
37 33 1 4
40 37 0 4
44 40 1 5
50 44 1 6
60 50 1 7
61 60 0 7
62 61 0 7
64 62 0 7
67 64 0 7
70 67 0 7
71 70 0 7
80 71 1 8
99 80 1 9
20 rows selected.
SQL>
|
|
|
|