Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Usenet -> c.d.o.server -> Re: Identifying pairs as groups
Daniel Morgan <damorgan_at_x.washington.edu> wrote in message news:<1079973490.55117_at_yasure>...
>
> Someone might be able to if you could clearly state the rules: Something
> you have not done. Examples are fine ... but rules stated in clear,
> descriptive, declarative sentences, are essential. I have no idea from
> what you've written what exactly defines a pair or a group. I could
> guess ... but that is not sufficient.
OK, point taken. I thought I was simplifying things, but obviously not (I bet you'd guess right, though :-)). I'll start at the beginning...
Some legacy data defines processes and the tasks that make up the procedures. So a process might be "Log Complaint" and the activities within it might be "Answer phone", "Enter customer details", "Log details", "Inform supervisor"
So the actual data of the child table looks like this:
PROCESS_CODE TASK_CODE TASK_COUNT
------------ --------- ----------
4268 231 1 4268 244 1 4268 257 1 4268 268 2 4268 272 1 4268 276 1 4270 231 1 4270 244 1 4270 248 1 4270 256 1 4270 272 1 4270 276 1 4273 231 1 4273 244 1 4273 257 1 4273 268 2 4273 272 1 4273 276 1 4285 231 1 4285 256 1 4285 269 1 4285 270 1 4285 272 1
Now over a period of time, "new" proceeses have been created that have actually been duplicates of existing one. The defintion of a duplicate is if each process in a pair has identical tasks and task count. So in the above example process 4268 and 4273 are duplicates.
So in order to identify duplicates (which is what I've been asked to do), I wrote the following SQL to identify "pairs" of duplicates:
select x.process_code code1,
y.process_code code2
from
(select b.process_count,
a.process_code, a.task_code, a.task_count from ag a, (select process_code,count(*) process_count from ag group by process_code) bwhere a.process_code = b.process_code) x, (select b.process_count,
a.process_code, a.task_code, a.task_count from ag a, (select process_code,count(*) process_count from ag group by process_code) bwhere a.process_code = b.process_code) y where x.process_code <> y.process_code
and x.process_count = y.process_count and x.task_code = y.task_code and x.task_count = y.task_count group by x.process_code, y.process_code
This identifies each duplicate process and the process(es) that it is a duplicate of. So in the above example, I'd get
CODE1 CODE2
--------- ---------
4268 4273 4273 4268
However, I found that in many cases there were more than 2 processes that had identical tasks. So again, in the above example, if 4300 had the same tasks as 4268 and 4273, I'd get the following:
CODE1 CODE2
--------- ---------
4268 4273 4268 4300 4273 4268 4273 4300 4300 4268 4300 4273
So this brings me to my original problem. I've identified that 4268 is a duplicate of 4273 and that 4268 is a duplicate of 4300 (and also that 4273 is a duplicate of 4300, obviously). But what I want to do is say that 4268, 4273 and 4300 are all part of the same group of duplicates.
Of course maybe I'm barking up the wrong tree with my idea of identifying pairs, but any suggestions would be welcome.
Thanks
CE Received on Tue Mar 23 2004 - 06:10:05 CST