Home » SQL & PL/SQL » SQL & PL/SQL » Group together data from multiple columns
Group together data from multiple columns [message #292400] Tue, 08 January 2008 21:30 Go to next message
stevekerver
Messages: 19
Registered: January 2008
Junior Member
Assume the following result set, which comes from a SELECT statement that joins the ID values of several highly normalized tables together:
--------------------------------------------------

DEVICE_ID DEVICE_SCREEN_ID EVENT_ID MEDIA_TEMPLATE_ID
---------- ---------------- ------------- -----------------
1 | 1 | 1 | 2
1 | 1 | 2 | 3
1 | 1 | 3 | 5
1 | 1 | 4 | 7
1 | 1 | 5 | 9
1 | 1 | 6 | 2
1 | 1 | 7 | 3
---------- ---------------- ------------- -----------------
2 | 1 | 1 | 2
2 | 1 | 2 | 3
2 | 1 | 3 | 5
2 | 1 | 4 | 7
2 | 1 | 5 | 9
2 | 1 | 6 | 2
2 | 1 | 7 | 3
---------- ---------------- ------------- -----------------
3 | 1 | 1 | 2
3 | 1 | 2 | 3
3 | 1 | 3 | 5
3 | 1 | 4 | 7
3 | 1 | 5 | 9
3 | 1 | 6 | 2
3 | 1 | 7 | 3
3 | 2 | 1 | 1
3 | 2 | 2 | 2
3 | 2 | 3 | 3
3 | 2 | 4 | 4
3 | 2 | 5 | 5
3 | 2 | 6 | 3
3 | 2 | 7 | 4
---------- ---------------- ------------- -----------------
4 | 1 | 1 | 6
4 | 1 | 2 | 3
4 | 1 | 3 | 8
4 | 1 | 4 | 12
4 | 1 | 5 | 14
4 | 1 | 6 | 11
4 | 1 | 7 | 18
---------- ---------------- ------------- -----------------
5 | 1 | 1 | 2
5 | 1 | 2 | 3
5 | 1 | 3 | 5
5 | 1 | 4 | 7
5 | 1 | 5 | 9
5 | 1 | 6 | 2
----------------------------------------------------------

The question is:
In an elegant and scalable way- How do I use a SELECT statement to to create a view that groups together the DEVICE_ID's that have the exact same values for the columns DEVICE_SCREEN_ID, EVENT_ID, and MEDIA_TEMPLATE_ID??
For example, note that DEVICE_ID's #1 & #2 share the exact same data. The SELECT statement should therefore show them to be in the same DEVICE_GROUP.
But also notice that while device_id #5 is similar to #1 and #2, it is missing EVENT_ID #7, and therefore should be in its own DEVICE_GROUP.
That is, unless EVENT_ID #7 is added to DEVICE_ID #5, and with 3 as the value for MEDIA_TEMPLATE_ID. If this ever becomes the case, the SELECT statement should then automatically show that DEVICE_ID #5 now belongs to the same DEVICE_GROUP as DEVICE_ID #1 and DEVICE_ID #2.
Also notice that DEVICE_ID #4 is essentially doing its own thing- it should be in its own group.
DEVICE_ID #3 should also be in its own group, because while it does share the same MEDIA_TEMPLATE_ID's and EVENT_ID's as DEVICE_ID's #1 and #2, it has a DEVICE_SCREEN_ID that the other DEVICE_ID's don't have.

The final query results should look something like this:


DEVICE_ID DEVICE_GROUP
---------- ----------------
1 | 1
2 | 1
3 | 2
4 | 3
5 | 4


On this one, I'm truly, truly stumped, and any help you could provide would be greatly appreciated.
Ok ORACLE EXPERTS, have at it...
Re: Group together data from multiple columns [message #292414 is a reply to message #292400] Tue, 08 January 2008 22:09 Go to previous messageGo to next message
BlackSwan
Messages: 26766
Registered: January 2009
Location: SoCal
Senior Member
Please read & follow posting guideline as stated in URL below:
http://www.orafaq.com/forum/t/88153/0/
Re: Group together data from multiple columns [message #292484 is a reply to message #292400] Wed, 09 January 2008 01:08 Go to previous messageGo to next message
Michel Cadot
Messages: 68733
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
First, please read and follow OraFAQ Forum Guide, especially "How to format your post?" section.
Make sure that lines of code do not exceed 80 characters when you format. Use the "Preview Message" button to verify.
Post a full test case: create table and insert statements.

Then it is possible if you create/imagine a function that generates a uniq number from the data columns then another one to group each value in something like a bitmap and the compare each bitmap.

Regards
Michel
Re: Group together data from multiple columns [message #292537 is a reply to message #292400] Wed, 09 January 2008 03:07 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9104
Registered: November 2002
Location: California, USA
Senior Member

SCOTT@orcl_11g> SELECT * FROM result_set
  2  /

 DEVICE_ID DEVICE_SCREEN_ID   EVENT_ID MEDIA_TEMPLATE_ID
---------- ---------------- ---------- -----------------
         1                1          1                 2
         1                1          2                 3
         1                1          3                 5
         1                1          4                 7
         1                1          5                 9
         1                1          6                 2
         1                1          7                 3
         2                1          1                 2
         2                1          2                 3
         2                1          3                 5
         2                1          4                 7
         2                1          5                 9
         2                1          6                 2
         2                1          7                 3
         3                1          1                 2
         3                1          2                 3
         3                1          3                 5
         3                1          4                 7
         3                1          5                 9
         3                1          6                 2
         3                1          7                 3
         3                2          1                 1
         3                2          2                 2
         3                2          3                 3
         3                2          4                 4
         3                2          5                 5
         3                2          6                 3
         3                2          7                 4
         4                1          1                 6
         4                1          2                 3
         4                1          3                 8
         4                1          4                12
         4                1          5                14
         4                1          6                11
         4                1          7                18
         5                1          1                 2
         5                1          2                 3
         5                1          3                 5
         5                1          4                 7
         5                1          5                 9
         5                1          6                 2

41 rows selected.

SCOTT@orcl_11g> CREATE OR REPLACE TYPE other_ids AS OBJECT
  2    (device_screen_id  NUMBER,
  3  	event_id	  NUMBER,
  4  	media_template_id NUMBER);
  5  /

Type created.

SCOTT@orcl_11g> CREATE OR REPLACE TYPE other_ids_tab AS TABLE OF other_ids
  2  /

Type created.

SCOTT@orcl_11g> SELECT rso.device_id,
  2  	    CAST
  3  	      (MULTISET
  4  		(SELECT rsi.device_screen_id,
  5  			rsi.event_id,
  6  			rsi.media_template_id
  7  		 FROM	result_set rsi
  8  		 WHERE	rsi.device_id = rso.device_id)
  9  	       AS other_ids_tab)
 10  	      AS others
 11  FROM   (SELECT DISTINCT device_id
 12  	     FROM   result_set) rso
 13  /

 DEVICE_ID
----------
OTHERS(DEVICE_SCREEN_ID, EVENT_ID, MEDIA_TEMPLATE_ID)
--------------------------------------------------------------------------------
         1
OTHER_IDS_TAB(OTHER_IDS(1, 1, 2), OTHER_IDS(1, 2, 3), OTHER_IDS(1, 3, 5), OTHER_
IDS(1, 4, 7), OTHER_IDS(1, 5, 9), OTHER_IDS(1, 6, 2), OTHER_IDS(1, 7, 3))

         2
OTHER_IDS_TAB(OTHER_IDS(1, 1, 2), OTHER_IDS(1, 2, 3), OTHER_IDS(1, 3, 5), OTHER_
IDS(1, 4, 7), OTHER_IDS(1, 5, 9), OTHER_IDS(1, 6, 2), OTHER_IDS(1, 7, 3))

         4
OTHER_IDS_TAB(OTHER_IDS(1, 1, 6), OTHER_IDS(1, 2, 3), OTHER_IDS(1, 3, 8), OTHER_
IDS(1, 4, 12), OTHER_IDS(1, 5, 14), OTHER_IDS(1, 6, 11), OTHER_IDS(1, 7, 18))

         5
OTHER_IDS_TAB(OTHER_IDS(1, 1, 2), OTHER_IDS(1, 2, 3), OTHER_IDS(1, 3, 5), OTHER_
IDS(1, 4, 7), OTHER_IDS(1, 5, 9), OTHER_IDS(1, 6, 2))

         3
OTHER_IDS_TAB(OTHER_IDS(1, 1, 2), OTHER_IDS(1, 2, 3), OTHER_IDS(1, 3, 5), OTHER_
IDS(1, 4, 7), OTHER_IDS(1, 5, 9), OTHER_IDS(1, 6, 2), OTHER_IDS(1, 7, 3), OTHER_
IDS(2, 1, 1), OTHER_IDS(2, 2, 2), OTHER_IDS(2, 3, 3), OTHER_IDS(2, 4, 4), OTHER_
IDS(2, 5, 5), OTHER_IDS(2, 6, 3), OTHER_IDS(2, 7, 4))


SCOTT@orcl_11g> WITH subquery AS
  2    (SELECT rso.device_id,
  3  	       CAST
  4  		 (MULTISET
  5  		   (SELECT rsi.device_screen_id,
  6  			   rsi.event_id,
  7  			   rsi.media_template_id
  8  		    FROM   result_set rsi
  9  		    WHERE  rsi.device_id = rso.device_id)
 10  		  AS other_ids_tab)
 11  		 AS others
 12  	FROM   (SELECT DISTINCT device_id
 13  		FROM   result_set) rso)
 14  SELECT s1.device_id AS device_id1, s2.device_id AS device_id2
 15  FROM   subquery s1, subquery s2
 16  WHERE  s1.others SUBMULTISET OF s2.others
 17  AND    s2.others SUBMULTISET OF s1.others
 18  AND    s1.device_id <= s2.device_id
 19  /

DEVICE_ID1 DEVICE_ID2
---------- ----------
         1          1
         1          2
         2          2
         3          3
         4          4
         5          5

6 rows selected.

SCOTT@orcl_11g> SELECT device_id,
  2  	    DENSE_RANK () OVER (ORDER BY group_id) AS device_group
  3  FROM   (SELECT device_id2 AS device_id, MIN (device_id1) AS group_id
  4  	     FROM   (WITH subquery AS
  5  		       (SELECT rso.device_id,
  6  			       CAST
  7  				 (MULTISET
  8  				   (SELECT rsi.device_screen_id,
  9  					   rsi.event_id,
 10  					   rsi.media_template_id
 11  				    FROM   result_set rsi
 12  				    WHERE  rsi.device_id = rso.device_id)
 13  				  AS other_ids_tab)
 14  				 AS others
 15  			FROM   (SELECT DISTINCT device_id
 16  				FROM   result_set) rso)
 17  		     SELECT s1.device_id AS device_id1, s2.device_id AS device_id2
 18  		     FROM   subquery s1, subquery s2
 19  		     WHERE  s1.others SUBMULTISET OF s2.others
 20  		     AND    s2.others SUBMULTISET OF s1.others
 21  		     AND    s1.device_id <= s2.device_id)
 22  	     GROUP  BY device_id2)
 23  ORDER  BY device_group, device_id
 24  /

 DEVICE_ID DEVICE_GROUP
---------- ------------
         1            1
         2            1
         3            2
         4            3
         5            4

SCOTT@orcl_11g> 

Re: Group together data from multiple columns [message #292546 is a reply to message #292537] Wed, 09 January 2008 03:25 Go to previous messageGo to next message
Michel Cadot
Messages: 68733
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Barbara,

As you created the statements to build the test case, it would be nice if you posted it.

It's a nice piece of code.

For reference: Multiset Conditions

Regards
Michel
Re: Group together data from multiple columns [message #292701 is a reply to message #292400] Wed, 09 January 2008 12:26 Go to previous messageGo to next message
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member

When it comes to SQL coding , Barabara is Perfectionist .
Hats Off !!!

Thumbs Up
Rajuvan

[Updated on: Wed, 09 January 2008 12:39]

Report message to a moderator

Re: Group together data from multiple columns [message #292719 is a reply to message #292546] Wed, 09 January 2008 13:35 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9104
Registered: November 2002
Location: California, USA
Senior Member
Michel Cadot wrote on Wed, 09 January 2008 01:25

Barbara,

As you created the statements to build the test case, it would be nice if you posted it.

It's a nice piece of code.

For reference: Multiset Conditions

Regards
Michel




Originally, I copied and pasted the data into a text file and loaded it with SQL*Loader, but here is a version with standard insert statements:

 
CREATE TABLE result_set
  (device_id         NUMBER,
   device_screen_id  NUMBER,
   event_id          NUMBER,
   media_template_id NUMBER)
/

INSERT ALL
INTO result_set VALUES (1 , 1 , 1 , 2)
INTO result_set VALUES (1 , 1 , 2 , 3)
INTO result_set VALUES (1 , 1 , 3 , 5)
INTO result_set VALUES (1 , 1 , 4 , 7)
INTO result_set VALUES (1 , 1 , 5 , 9)
INTO result_set VALUES (1 , 1 , 6 , 2)
INTO result_set VALUES (1 , 1 , 7 , 3)
INTO result_set VALUES (2 , 1 , 1 , 2)
INTO result_set VALUES (2 , 1 , 2 , 3)
INTO result_set VALUES (2 , 1 , 3 , 5)
INTO result_set VALUES (2 , 1 , 4 , 7)
INTO result_set VALUES (2 , 1 , 5 , 9)
INTO result_set VALUES (2 , 1 , 6 , 2)
INTO result_set VALUES (2 , 1 , 7 , 3)
INTO result_set VALUES (3 , 1 , 1 , 2)
INTO result_set VALUES (3 , 1 , 2 , 3)
INTO result_set VALUES (3 , 1 , 3 , 5)
INTO result_set VALUES (3 , 1 , 4 , 7)
INTO result_set VALUES (3 , 1 , 5 , 9)
INTO result_set VALUES (3 , 1 , 6 , 2)
INTO result_set VALUES (3 , 1 , 7 , 3)
INTO result_set VALUES (3 , 2 , 1 , 1)
INTO result_set VALUES (3 , 2 , 2 , 2)
INTO result_set VALUES (3 , 2 , 3 , 3)
INTO result_set VALUES (3 , 2 , 4 , 4)
INTO result_set VALUES (3 , 2 , 5 , 5)
INTO result_set VALUES (3 , 2 , 6 , 3)
INTO result_set VALUES (3 , 2 , 7 , 4)
INTO result_set VALUES (4 , 1 , 1 , 6)
INTO result_set VALUES (4 , 1 , 2 , 3)
INTO result_set VALUES (4 , 1 , 3 , 8)
INTO result_set VALUES (4 , 1 , 4 , 12)
INTO result_set VALUES (4 , 1 , 5 , 14)
INTO result_set VALUES (4 , 1 , 6 , 11)
INTO result_set VALUES (4 , 1 , 7 , 18)
INTO result_set VALUES (5 , 1 , 1 , 2)
INTO result_set VALUES (5 , 1 , 2 , 3)
INTO result_set VALUES (5 , 1 , 3 , 5)
INTO result_set VALUES (5 , 1 , 4 , 7)
INTO result_set VALUES (5 , 1 , 5 , 9)
INTO result_set VALUES (5 , 1 , 6 , 2)
SELECT * FROM DUAL
/

SELECT * FROM result_set
/

CREATE OR REPLACE TYPE other_ids AS OBJECT
  (device_screen_id  NUMBER,
   event_id          NUMBER,
   media_template_id NUMBER);
/
CREATE OR REPLACE TYPE other_ids_tab AS TABLE OF other_ids
/

SELECT rso.device_id, 
       CAST 
         (MULTISET 
           (SELECT rsi.device_screen_id, 
                   rsi.event_id,
                   rsi.media_template_id
            FROM   result_set rsi
            WHERE  rsi.device_id = rso.device_id)
          AS other_ids_tab)
         AS others
FROM   (SELECT DISTINCT device_id 
        FROM   result_set) rso
/

WITH subquery AS
  (SELECT rso.device_id, 
          CAST 
            (MULTISET 
              (SELECT rsi.device_screen_id, 
                      rsi.event_id,
                      rsi.media_template_id
               FROM   result_set rsi
               WHERE  rsi.device_id = rso.device_id)
             AS other_ids_tab) 
            AS others
   FROM   (SELECT DISTINCT device_id 
           FROM   result_set) rso)
SELECT s1.device_id AS device_id1, s2.device_id AS device_id2
FROM   subquery s1, subquery s2
WHERE  s1.others SUBMULTISET OF s2.others
AND    s2.others SUBMULTISET OF s1.others
AND    s1.device_id <= s2.device_id
/   

SELECT device_id, 
       DENSE_RANK () OVER (ORDER BY group_id) AS device_group
FROM   (SELECT device_id2 AS device_id, MIN (device_id1) AS group_id
        FROM   (WITH subquery AS
                  (SELECT rso.device_id, 
                          CAST 
                            (MULTISET 
                              (SELECT rsi.device_screen_id, 
                                      rsi.event_id,
                                      rsi.media_template_id
                               FROM   result_set rsi
                               WHERE  rsi.device_id = rso.device_id)
                             AS other_ids_tab) 
                            AS others
                   FROM   (SELECT DISTINCT device_id 
                           FROM   result_set) rso)
                SELECT s1.device_id AS device_id1, s2.device_id AS device_id2
                FROM   subquery s1, subquery s2         
                WHERE  s1.others SUBMULTISET OF s2.others
                AND    s2.others SUBMULTISET OF s1.others
                AND    s1.device_id <= s2.device_id)
        GROUP  BY device_id2)
ORDER  BY device_group, device_id
/   

DROP TYPE other_ids_tab
/
DROP TYPE other_ids
/
DROP TABLE result_set
/
 
 

Re: Group together data from multiple columns [message #292722 is a reply to message #292719] Wed, 09 January 2008 14:08 Go to previous messageGo to next message
Michel Cadot
Messages: 68733
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Many thanks and above all for this demonstration on very little known multiset operators.

Regards
Michel
Re: Group together data from multiple columns [message #292758 is a reply to message #292537] Wed, 09 January 2008 15:53 Go to previous messageGo to next message
stevekerver
Messages: 19
Registered: January 2008
Junior Member
No Message Body

[Updated on: Wed, 09 January 2008 16:25]

Report message to a moderator

Re: Group together data from multiple columns [message #293078 is a reply to message #292758] Thu, 10 January 2008 16:28 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9104
Registered: November 2002
Location: California, USA
Senior Member
I don't know why you deleted your message body, which was,"

"That definitely looks like it has a lot of potential. (If it works, I will bless you, and hope that you live to be a thousand...)

...Would it also be possible to offer an explanation of what's going on here??

Assuming the result set could be obtained by a standard select that joins several tables together, how exactly would the solution be implemented using a view?
...Would it be better to use a standard wiew, or a materialized view?

...I've been doing DBA work for a long time, and I've never heard of the MULTISET. Fascinating. ...Oracle always has that strange power to surprise you..."

As to the explanation as to what is going on, it uses cast and multiset to put the rows for each device_id into a collection that can then be compared using the submultiset operator. If each is a subset of the other, then they are equal. Because you are comparing collections or nested tables, you cannot just compare them with an = operator. You can search the online documentation for multiset and submultiset for additional information and examples. Michel provided one link.

As to implementation, just use your select that joins several tables together as an inline view in place of the result_set table that I used. You could then create your view or materialized view as 'create ... view ... as ' followed by the entire select statement, including your inline view that replaces the result_set table. As to view versus materialized view, not sure. It may depend on your needs.

I hope you did not delete the body of your last post because you thought I was ignoring you. I was just busy doing other things, including responding to follow-ups on other forums and hadn't gotten around to you yet. If you deleted it because you searched the documentation and answered your own questions, that's good.
Re: Group together data from multiple columns [message #293255 is a reply to message #292400] Fri, 11 January 2008 07:24 Go to previous messageGo to next message
Michel Cadot
Messages: 68733
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
Here's another way to do it that follows the algorithm I gave in my first post.

First, create a unique identifier for "device_screen_id, event_id, media_template_id". At the same time, number and count the rows in each device_id category (to limit the output size I add a restriction on number of rows returned). I assumed there are less than 100 numbers of distinct id in each column:
SQL>     select device_id, 
  2             100*100*device_screen_id
  3               + 100*event_id
  4               + media_template_id rest_id,
  5             row_number() 
  6               over (partition by device_id
  7                     order by device_screen_id, event_id, media_template_id)
  8               rn,
  9             count(*) over (partition by device_id) cnt
 10      from result_set
 11      where rownum < 10
 12  /
 DEVICE_ID    REST_ID         RN        CNT
---------- ---------- ---------- ----------
         1      10102          1          7
         1      10203          2          7
         1      10305          3          7
         1      10407          4          7
         1      10509          5          7
         1      10602          6          7
         1      10703          7          7
         2      10102          1          2
         2      10203          2          2

Then we will need to number and know the number of these new ids:
SQL> with 
  2    data as (
  3      select device_id, 
  4             100*100*device_screen_id
  5               + 100*event_id
  6               + media_template_id rest_id,
  7             row_number() 
  8               over (partition by device_id
  9                     order by device_screen_id, event_id, media_template_id)
 10               rn,
 11             count(*) over (partition by device_id) cnt
 12      from result_set
 13      where rownum > 0
 14    )
 15      select rest_id, 
 16             dense_rank() over (order by rest_id) rk_id,
 17             count(*) over () nb_id
 18      from ( select distinct rest_id from data )
 19      where rownum < 10
 20  /
   REST_ID      RK_ID      NB_ID
---------- ---------- ----------
     10102          1          9
     10106          2          9
     10203          3          9
     10305          4          9
     10308          5          9
     10407          6          9
     10412          7          9
     10509          8          9
     10514          9          9

Now put all these information in the first query, this is our new data set. We no more need to know the initial values, only the rank of the new rest_id is useful:
SQL> with 
  2    data as (
  3      select device_id, 
  4             100*100*device_screen_id
  5               + 100*event_id
  6               + media_template_id rest_id,
  7             row_number() 
  8               over (partition by device_id
  9                     order by device_screen_id, event_id, media_template_id)
 10               rn,
 11             count(*) over (partition by device_id) cnt
 12      from result_set
 13      where rownum > 0
 14    ),
 15    ref_ids as (
 16      select rest_id, 
 17             dense_rank() over (order by rest_id) rk_id,
 18             count(*) over () nb_id
 19      from ( select distinct rest_id from data )
 20    )
 21      select a.device_id, a.rn, a.cnt, b.rk_id, b.nb_id
 22      from data a, ref_ids b
 23      where b.rest_id = a.rest_id
 24        and rownum < 10
 25  /
 DEVICE_ID         RN        CNT      RK_ID      NB_ID
---------- ---------- ---------- ---------- ----------
         1          1          7          1         20
         1          2          7          3         20
         1          3          7          4         20
         1          4          7          6         20
         1          5          7          8         20
         1          6          7         10         20
         1          7          7         12         20
         2          1          7          1         20
         2          2          7          3         20

Now create the bitmap for each device_id, here we assume there are less than 100 distinct "rest_id" (so 2 digits in number).
As we have number both rows and rest_id in the same order, we can use hierarchical query to build these bitmaps:
SQL> col bitmap format a40
SQL> with 
  2    data as (
  3      select device_id, 
  4             100*100*device_screen_id
  5               + 100*event_id
  6               + media_template_id rest_id,
  7             row_number() 
  8               over (partition by device_id
  9                     order by device_screen_id, event_id, media_template_id)
 10               rn,
 11             count(*) over (partition by device_id) cnt
 12      from result_set
 13      where rownum > 0
 14    ),
 15    ref_ids as (
 16      select rest_id, 
 17             dense_rank() over (order by rest_id) rk_id,
 18             count(*) over () nb_id
 19      from ( select distinct rest_id from data )
 20    ),
 21    new_data as (
 22      select a.device_id, a.rn, a.cnt, b.rk_id, b.nb_id
 23      from data a, ref_ids b
 24      where b.rest_id = a.rest_id
 25    )
 26      select device_id, 
 27             replace(sys_connect_by_path(
 28                       lpad(to_char(rk_id,'fm00'), 
 29                            2*(rk_id-nvl(prior rk_id,0)),
 30                            '0'),
 31                       '/'),
 32                     '/', '') bitmap
 33      from new_data
 34      where rn = cnt
 35      connect by prior device_id = device_id
 36             and prior rn = rn-1
 37      start with rn = 1
 38  /
 DEVICE_ID BITMAP
---------- ----------------------------------------
         1 010003040006000800100012
         2 010003040006000800100012
         3 0100030400060008001000120014151617181920
         4 00020300050007000900110013
         5 01000304000600080010

5 rows selected.

Note that the bitmaps are not filled up to their full size. It is easy to modify the query using nb_id value (which is not used here) to fill them with 0 but this is useless for our purpose.

Now, the final step is to number the bitmaps (and call it device group):
SQL> with 
  2    data as (
  3      select device_id, 
  4             100*100*device_screen_id
  5               + 100*event_id
  6               + media_template_id rest_id,
  7             row_number() 
  8               over (partition by device_id
  9                     order by device_screen_id, event_id, media_template_id)
 10               rn,
 11             count(*) over (partition by device_id) cnt
 12      from result_set
 13      where rownum > 0
 14    ),
 15    ref_ids as (
 16      select rest_id, 
 17             dense_rank() over (order by rest_id) rk_id,
 18             count(*) over () nb_id
 19      from ( select distinct rest_id from data )
 20    ),
 21    new_data as (
 22      select a.device_id, a.rn, a.cnt, b.rk_id, b.nb_id
 23      from data a, ref_ids b
 24      where b.rest_id = a.rest_id
 25    ),
 26    bitmaps as (
 27      select device_id, 
 28             replace(sys_connect_by_path(
 29                       lpad(to_char(rk_id,'fm00'), 
 30                            2*(rk_id-nvl(prior rk_id,0)),
 31                            '0'),
 32                       '/'),
 33                     '/', '') bitmap
 34      from new_data
 35      where rn = cnt
 36      connect by prior device_id = device_id
 37             and prior rn = rn-1
 38      start with rn = 1
 39    )
 40  select device_id,
 41         dense_rank() over (order by bitmap) device_group
 42  from bitmaps
 43  order by 2
 44  /
 DEVICE_ID DEVICE_GROUP
---------- ------------
         4            1
         5            2
         1            3
         2            3
         3            4

5 rows selected.

Regards
Michel
Re: Group together data from multiple columns [message #293326 is a reply to message #293255] Fri, 11 January 2008 12:51 Go to previous messageGo to next message
stevekerver
Messages: 19
Registered: January 2008
Junior Member
I deleted my last post because I wanted to verify that the proposed solution would actually work before claiming victory.

Ok- so here's the problem we've uncovered that is inherent in Barbara's solution: As the data changes for each device_id, the group_id numbers don't stay constant. That is, a group_id cannot be used as something that will always uniquely identify *that* group.

To demonstrate this: Suppose, for example, we add device_id's 6 and 7, and each of these device_id's vary enough in their data to be assigned to their own seperate device_groups. ...So- let's assume that the device_id's #6 and #7 get device_groups 5 and 6, respectively.
Now, suppose device_id #5 has a record added so that it will now belong to device_group 1. (That is, device_id #5 suddenly gets a new row with device_screen_id 1, event_id 7, and media_template 3).
...While the change will correctly move device_id 5 to group 1, the query results will then also state that device_id's 6 and 7 now belong to group_id's 4 and 5, respectively- which would be incorrect. (Device_id's 6 and 7 should still belong to device_groups 5 and 6, respectively.)
...The data for the device_id's 6 and 7 hasn't changed, so it shouldn't auto-reassign them different groups.

...Let me tinker around with Michel's solution, and see if proves to be fruitful. ...A lot of the suggestions we're getting around here are of the flavor: "Just concactenate everything together and compare the strings". But, to me that smells of unscalable- especially with thousands of rows of data.
Re: Group together data from multiple columns [message #293329 is a reply to message #293326] Fri, 11 January 2008 13:07 Go to previous messageGo to next message
Michel Cadot
Messages: 68733
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
How could a query knows that before your insert/update the result was something that must be left unchanged?
If you find an answer to this, I would be glad to know it.
What happens to device_group device_id=3? Do you now have device_groups 1, 3 and 4 (in your example)? How a query that now see the new data (without device_id=3) must not return a device_group=2?

In addition, what you want to do is far from relational model, SQL is a relational language (at least as its roots), so there is little chance you get a very scalable answer in SQL (nor, in my opinion, in another language).

Regards
Michel
Re: Group together data from multiple columns [message #293393 is a reply to message #293326] Sat, 12 January 2008 01:05 Go to previous message
Barbara Boehmer
Messages: 9104
Registered: November 2002
Location: California, USA
Senior Member
Your requirements may be unattainable, so you may need to rethink your design. If you are willing to accept that each time a row is added for a device_id, that changes which group it belongs to, otherwise it retains the same group id, then the following example may work. It creates a groups table to store unique groups to which a unique group_id is automatically assigned by a trigger that uses a sequence. Any new data must be inserted through a procedure that checks if the group is already in the groups table and inserts it if it is not. Then you can easily select the device_id and group_id by comparing the two tables using submultiset.


SCOTT@orcl_11g> CREATE TABLE result_set
  2    (device_id	  NUMBER,
  3  	device_screen_id  NUMBER,
  4  	event_id	  NUMBER,
  5  	media_template_id NUMBER)
  6  /

Table created.

SCOTT@orcl_11g> CREATE OR REPLACE TYPE other_ids AS OBJECT
  2    (device_screen_id  NUMBER,
  3  	event_id	  NUMBER,
  4  	media_template_id NUMBER);
  5  /

Type created.

SCOTT@orcl_11g> CREATE OR REPLACE TYPE other_ids_tab AS TABLE OF other_ids
  2  /

Type created.

SCOTT@orcl_11g> CREATE TABLE groups
  2    (group_id	  NUMBER,
  3  	others		  other_ids_tab)
  4    NESTED TABLE others STORE AS others_nt
  5  /

Table created.

SCOTT@orcl_11g> CREATE SEQUENCE group_id_seq
  2  /

Sequence created.

SCOTT@orcl_11g> CREATE OR REPLACE TRIGGER groups_trigger
  2    BEFORE INSERT ON groups
  3    FOR EACH ROW
  4  BEGIN
  5    SELECT group_id_seq.NEXTVAL
  6    INTO   :NEW.group_id
  7    FROM   DUAL;
  8  END groups_trigger;
  9  /

Trigger created.

SCOTT@orcl_11g> SHOW ERRORS
No errors.
SCOTT@orcl_11g> CREATE OR REPLACE PROCEDURE insert_data
  2    (p_device_id	     result_set.device_id%TYPE,
  3  	p_device_screen_id   result_set.device_screen_id%TYPE,
  4  	p_event_id	     result_set.event_id%TYPE,
  5  	p_media_template_id  result_set.media_template_id%TYPE)
  6  AS
  7    v_count NUMBER;
  8  BEGIN
  9    INSERT INTO result_set (device_id, device_screen_id, event_id, media_template_id)
 10    VALUES (p_device_id, p_device_screen_id, p_event_id, p_media_template_id);
 11    COMMIT;
 12    SELECT COUNT (*)
 13    INTO   v_count
 14    FROM   groups g,
 15  	      (SELECT CAST
 16  			(MULTISET
 17  			  (SELECT device_screen_id,
 18  				  event_id,
 19  				  media_template_id
 20  			   FROM   result_set
 21  			   WHERE  device_id = p_device_id)
 22  			 AS other_ids_tab) AS others
 23  	       FROM   DUAL) rs
 24    WHERE  g.others SUBMULTISET OF rs.others
 25    AND    rs.others SUBMULTISET OF g.others;
 26    IF v_count = 0 THEN
 27  	 INSERT INTO groups (others)
 28  	 SELECT CAST
 29  		  (MULTISET
 30  		    (SELECT device_screen_id,
 31  			    event_id,
 32  			    media_template_id
 33  		     FROM   result_set
 34  		     WHERE  device_id = p_device_id)
 35  		   AS other_ids_tab)
 36  	 FROM	DUAL;
 37    END IF;
 38  END insert_data;
 39  /

Procedure created.

SCOTT@orcl_11g> SHOW ERRORS
No errors.
SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 1 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 2 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 3 , 5)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 4 , 7)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 5 , 9)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 6 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (1 , 1 , 7 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 1 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 2 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 3 , 5)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 4 , 7)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 5 , 9)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 6 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (2 , 1 , 7 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 1 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 2 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 3 , 5)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 4 , 7)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 5 , 9)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 6 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 1 , 7 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 1 , 1)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 2 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 3 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 4 , 4)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 5 , 5)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 6 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (3 , 2 , 7 , 4)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 1 , 6)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 2 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 3 , 8)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 4 , 12)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 5 , 14)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 6 , 11)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (4 , 1 , 7 , 18)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 1 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 2 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 3 , 5)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 4 , 7)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 5 , 9)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 6 , 2)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> SELECT rs.device_id, g.group_id
  2  FROM   groups g,
  3  	    (SELECT rso.device_id,
  4  		    CAST
  5  		      (MULTISET
  6  			(SELECT rsi.device_screen_id,
  7  				rsi.event_id,
  8  				rsi.media_template_id
  9  			 FROM	result_set rsi
 10  			 WHERE	rsi.device_id = rso.device_id)
 11  		       AS other_ids_tab)
 12  		      AS others
 13  	     FROM   (SELECT DISTINCT device_id
 14  		     FROM   result_set) rso) rs
 15  WHERE  g.others SUBMULTISET OF rs.others
 16  AND    rs.others SUBMULTISET OF g.others
 17  ORDER  BY device_id
 18  /

 DEVICE_ID   GROUP_ID
---------- ----------
         1          7
         2          7
         3         14
         4         21
         5          6

SCOTT@orcl_11g> EXEC insert_data (6 , 88 , 88 , 88)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (7 , 99 , 99 , 99)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> EXEC insert_data (5 , 1 , 7 , 3)

PL/SQL procedure successfully completed.

SCOTT@orcl_11g> SELECT rs.device_id, g.group_id
  2  FROM   groups g,
  3  	    (SELECT rso.device_id,
  4  		    CAST
  5  		      (MULTISET
  6  			(SELECT rsi.device_screen_id,
  7  				rsi.event_id,
  8  				rsi.media_template_id
  9  			 FROM	result_set rsi
 10  			 WHERE	rsi.device_id = rso.device_id)
 11  		       AS other_ids_tab)
 12  		      AS others
 13  	     FROM   (SELECT DISTINCT device_id
 14  		     FROM   result_set) rso) rs
 15  WHERE  g.others SUBMULTISET OF rs.others
 16  AND    rs.others SUBMULTISET OF g.others
 17  ORDER  BY device_id
 18  /

 DEVICE_ID   GROUP_ID
---------- ----------
         1          7
         2          7
         3         14
         4         21
         5          7
         6         22
         7         23

7 rows selected.

SCOTT@orcl_11g> 




Previous Topic: Complex String Query
Next Topic: Complex Days Count In One SQL
Goto Forum:
  


Current Time: Tue Feb 11 11:59:15 CST 2025