Home » SQL & PL/SQL » SQL & PL/SQL » repeating items in a string (Oracle 11G R2)
repeating items in a string [message #607155] |
Mon, 03 February 2014 06:02 |
vesile_taskiran
Messages: 66 Registered: August 2008 Location: Turkey
|
Member |
|
|
Hi,
Is there any analytic function which i can find the repeating items in a string?
Quote:'green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'
result should be
Quote:
green
blue
yellow
purple
William
|
|
|
|
Re: repeating items in a string [message #607166 is a reply to message #607158] |
Mon, 03 February 2014 06:30 |
|
Littlefoot
Messages: 21808 Registered: June 2005 Location: Croatia, Europe
|
Senior Member Account Moderator |
|
|
However, you can write your own code, such as
SQL> WITH test
2 AS (SELECT 'green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'
3 col
4 FROM DUAL),
5 pieces AS ( SELECT TRIM (REGEXP_SUBSTR (col,
6 '[^,]+',
7 1,
8 LEVEL))
9 just_one
10 FROM test
11 CONNECT BY REGEXP_SUBSTR (col,
12 '[^,]+',
13 1,
14 LEVEL)
15 IS NOT NULL)
16 SELECT just_one, COUNT (*) cnt
17 FROM pieces
18 GROUP BY just_One
19 HAVING COUNT (*) > 1;
JUST_ONE CNT
-------------------- ----------
green 3
William 2
blue 2
yellow 2
purple 3
SQL>
|
|
|
|
|
Re: repeating items in a string [message #607170 is a reply to message #607169] |
Mon, 03 February 2014 07:26 |
|
Michel Cadot
Messages: 68647 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
It depends what you mean by "better".
I can give you a query using the old functions:
SQL> col just_one format a15
SQL> WITH
2 test AS (
3 SELECT 'green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'
4 col
5 FROM DUAL
6 ),
7 data as ( select ','||col||',' col from test ),
8 pieces as (
9 select trim(substr(col,
10 instr(col, ',', 1, level) + 1,
11 instr(col, ',', 1, level+1)
12 - instr(col, ',', 1, level) - 1)) just_one
13 from data
14 connect by level <= length(col)-length(replace(col,',',''))
15 )
16 SELECT just_one, COUNT (*) cnt
17 FROM pieces
18 GROUP BY just_One
19 HAVING COUNT (*) > 1
20 /
JUST_ONE CNT
--------------- ----------
green 3
William 2
blue 2
yellow 2
purple 3
5 rows selected.
|
|
|
Re: repeating items in a string [message #607253 is a reply to message #607170] |
Tue, 04 February 2014 04:37 |
|
bluefred
Messages: 16 Registered: June 2012 Location: Qatar
|
Junior Member |
|
|
you could also use:
select REGxp
from (
SELECT ExtractValue(column_value, '/element') REGxp
FROM TABLE
(XMLSEQUENCE
(XMLTYPE
('<list><element>'
|| replace ('GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM',
',', '</element><element>')
|| '</element></list>').extract ('/list/*')))
) group by REGxp
having count(1)>1;
|
|
|
Re: repeating items in a string [message #607296 is a reply to message #607253] |
Tue, 04 February 2014 14:29 |
Solomon Yakobson
Messages: 3273 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
Or much simpler:
with t as (
select 'GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM' words from dual
)
select word
from t,
xmltable(
'
for $word in distinct-values(ora:tokenize($words,","))
where ora:matches($words,fn:concat(",",$word,",(.*,)?",$word,","))
return $word
'
passing ',' || words || ',' as "words"
columns
word varchar2(30) path '.'
)
/
WORD
------------------------------
GREEN
PURPLE
BLUE
YELLOW
WILLIAM
SY.
|
|
|
|
Re: repeating items in a string [message #607319 is a reply to message #607297] |
Tue, 04 February 2014 19:25 |
Solomon Yakobson
Messages: 3273 Registered: January 2010 Location: Connecticut, USA
|
Senior Member |
|
|
I see what you mean. My reply was just to bluefred's XML solution, so I should have said "Or much simpoler XML solution":
SQL> set timing on
SQL> begin
2 for i in 1..10000 loop
3 for rec in (select REGxp
4 from (
5 SELECT ExtractValue(column_value, '/element') REGxp
6 FROM TABLE
7 (XMLSEQUENCE
8 (XMLTYPE
9 ('<list><element>'
10 || replace ('GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM',
11 ',', '</element><element>')
12 || '</element></list>').extract ('/list/*')))
13 ) group by REGxp
14 having count(1)>1) loop
15 null;
16 end loop;
17 end loop;
18 end;
19 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:36.38
SQL> begin
2 for i in 1..10000 loop
3 for rec in (with t as (
4 select 'GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM' words from dual
5 )
6 select word
7 from t,
8 xmltable(
9 '
10 for $word in distinct-values(ora:tokenize($words,","))
11 where ora:matches($words,fn:concat(",",$word,",(.*,)?",$word,","))
12 return $word
13 '
14 passing ',' || words || ',' as "words"
15 columns
16 word varchar2(30) path '.'
17 )
18 ) loop
19 null;
20 end loop;
21 end loop;
22 end;
23 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:07.54
SQL>
Obviously non-XML connect by + substr is much more efficient.
SY.
|
|
|
Re: repeating items in a string [message #607329 is a reply to message #607319] |
Tue, 04 February 2014 23:30 |
|
bluefred
Messages: 16 Registered: June 2012 Location: Qatar
|
Junior Member |
|
|
pipelined function "should be" much efficient:
CREATE OR REPLACE
PACKAGE DUP_COL IS
TYPE DUP_COL_REC IS RECORD (DUP_COL VARCHAR2(2000));
TYPE DUP_COL_TBL IS TABLE OF DUP_COL_REC;
FUNCTION PIPE_DUP_COL(STR_COL IN VARCHAR2) RETURN DUP_COL_TBL PIPELINED;
END DUP_COL;
/
------------------------------------------------------------------------
CREATE OR REPLACE
PACKAGE BODY DUP_COL IS
FUNCTION PIPE_DUP_COL(STR_COL IN VARCHAR2) RETURN DUP_COL_TBL PIPELINED AS
STR VARCHAR2(2000);
COMMA_POS NUMBER:=0;
START_POS NUMBER:=1;
I NUMBER:=0;
DUP_COL_ROW DUP_COL_REC;
BEGIN
STR:=STR_COL||',';
LOOP
COMMA_POS := INSTR(STR, ',', START_POS);
EXIT WHEN COMMA_POS = 0;
DUP_COL_ROW.DUP_COL:=UPPER(SUBSTR(STR, START_POS, COMMA_POS - START_POS));
START_POS := COMMA_POS + 1;
PIPE ROW(DUP_COL_ROW);
I:=I+1;
END LOOP;
END PIPE_DUP_COL;
END DUP_COL;
/
------------------------------------------------------------------------
--Call the pipe function
SELECT DUP_COL FROM
TABLE(DUP_COL.PIPE_DUP_COL('green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'))
GROUP BY DUP_COL
HAVING COUNT(1) >1;
------------------------------------------------------------------------
DUP_COL
---------
WILLIAM
GREEN
PURPLE
BLUE
YELLOW
Elapsed: 00:00:00.016
[Edit MC: trim long lines]
[Updated on: Wed, 05 February 2014 00:26] by Moderator Report message to a moderator
|
|
|
|
|
Re: repeating items in a string [message #607337 is a reply to message #607335] |
Wed, 05 February 2014 00:39 |
|
bluefred
Messages: 16 Registered: June 2012 Location: Qatar
|
Junior Member |
|
|
Yes, it is:
set timing on
BEGIN
for i in 1..10000 loop
FOR REC IN (
SELECT DUP_COL FROM
TABLE(DUP_COL.PIPE_DUP_COL('green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'))
GROUP BY DUP_COL
HAVING COUNT(1) >1
) loop
null;
end loop;
end loop;
END;
/
------------------------------------------------------------------------
anonymous block completed
Elapsed: 00:00:03.203
|
|
|
|
|
Re: repeating items in a string [message #607342 is a reply to message #607340] |
Wed, 05 February 2014 01:10 |
|
bluefred
Messages: 16 Registered: June 2012 Location: Qatar
|
Junior Member |
|
|
Quote:
Your time and others time cannot be compare, you have to do it from ALL solutions.
You are absolutely right:
SET TIMING ON
BEGIN
for i in 1..10000 loop
for rec in (with t as (
select 'GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM' words from dual
)
select word
from t,
xmltable(
'
for $word in distinct-values(ora:tokenize($words,","))
where ora:matches($words,fn:concat(",",$word,",(.*,)?",$word,","))
return $word
'
passing ',' || words || ',' as "words"
columns
word varchar2(30) path '.'
)
) loop
null;
end loop;
end loop;
end;
/
------------------------------------------------------------------------
BEGIN
for i in 1..10000 loop
FOR REC IN (
SELECT DUP_COL FROM
TABLE(DUP_COL.PIPE_DUP_COL('green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'))
GROUP BY DUP_COL
HAVING COUNT(1) >1
) loop
null;
end loop;
end loop;
END;
/
------------------------------------------------------------------------
anonymous block completed
Elapsed: 00:00:08.703
anonymous block completed
Elapsed: 00:00:02.672
Quote:
I found another solution to use it within select statement.
Could you please share your findings!
|
|
|
|
Re: repeating items in a string [message #607345 is a reply to message #607340] |
Wed, 05 February 2014 01:22 |
|
Michel Cadot
Messages: 68647 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
vesile_taskiran wrote on Wed, 05 February 2014 08:03Dear All ,
I found another solution to use it within select statement.
Thanks for all your help.
I agree with bluefred, share it with us (and all people who will read this topic).
[Updated on: Wed, 05 February 2014 01:22] Report message to a moderator
|
|
|
Re: repeating items in a string [message #607349 is a reply to message #607344] |
Wed, 05 February 2014 01:54 |
|
Michel Cadot
Messages: 68647 Registered: March 2007 Location: Nanterre, France, http://...
|
Senior Member Account Moderator |
|
|
Here's what I got on my laptop (3 executions):
SQL> declare
2 start_time pls_integer;
3 end_time pls_integer;
4 begin
5 -- Littlefoot's query
6 start_time := sys.dbms_utility.get_time();
7 for i in 1..10000 loop
8 for rec in (
9 WITH test
10 AS (SELECT 'green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'
11 col
12 FROM DUAL),
13 pieces AS ( SELECT TRIM (REGEXP_SUBSTR (col,
14 '[^,]+',
15 1,
16 LEVEL))
17 just_one
18 FROM test
19 CONNECT BY REGEXP_SUBSTR (col,
20 '[^,]+',
21 1,
22 LEVEL)
23 IS NOT NULL)
24 SELECT just_one, COUNT (*) cnt
25 FROM pieces
26 GROUP BY just_One
27 HAVING COUNT (*) > 1
28 ) loop
29 null;
30 end loop;
31 end loop;
32 end_time := sys.dbms_utility.get_time();
33 dbms_output.put_line ('Littlefoot''s solution: '||((end_time-start_time)/100)||' seconds');
34 -- Michel's query
35 start_time := sys.dbms_utility.get_time();
36 for i in 1..10000 loop
37 for rec in (
38 WITH
39 test AS (
40 SELECT 'green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'
41 col
42 FROM DUAL
43 ),
44 data as ( select ','||col||',' col from test ),
45 pieces as (
46 select trim(substr(col,
47 instr(col, ',', 1, level) + 1,
48 instr(col, ',', 1, level+1)
49 - instr(col, ',', 1, level) - 1)) just_one
50 from data
51 connect by level <= length(col)-length(replace(col,',',''))
52 )
53 SELECT just_one, COUNT (*) cnt
54 FROM pieces
55 GROUP BY just_One
56 HAVING COUNT (*) > 1
57 ) loop
58 null;
59 end loop;
60 end loop;
61 end_time := sys.dbms_utility.get_time();
62 dbms_output.put_line ('Michel''s solution: '||((end_time-start_time)/100)||' seconds');
63 -- bluefred 's query 1
64 start_time := sys.dbms_utility.get_time();
65 for i in 1..10000 loop
66 for rec in (
67 select REGxp
68 from (
69 SELECT ExtractValue(column_value, '/element') REGxp
70 FROM TABLE
71 (XMLSEQUENCE
72 (XMLTYPE
73 ('<list><element>'
74 || replace ('GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM',
75 ',', '</element><element>')
76 || '</element></list>').extract ('/list/*')))
77 ) group by REGxp
78 having count(1)>1
79 ) loop
80 null;
81 end loop;
82 end loop;
83 end_time := sys.dbms_utility.get_time();
84 dbms_output.put_line ('bluefred''s solution 1: '||((end_time-start_time)/100)||' seconds');
85 -- Solomon's query
86 start_time := sys.dbms_utility.get_time();
87 for i in 1..10000 loop
88 for rec in (
89 with t as (
90 select 'GREEN,BLUE,YELLOW,PURPLE,GREEN,YELLOW,PURPLE,GEORGE,WILLIAM,GREEN,PURPLE,BLUE,WILLIAM' words from dual
91 )
92 select word
93 from t,
94 xmltable(
95 '
96 for $word in distinct-values(ora:tokenize($words,","))
97 where ora:matches($words,fn:concat(",",$word,",(.*,)?",$word,","))
98 return $word
99 '
100 passing ',' || words || ',' as "words"
101 columns
102 word varchar2(30) path '.'
103 )
104 ) loop
105 null;
106 end loop;
107 end loop;
108 end_time := sys.dbms_utility.get_time();
109 dbms_output.put_line ('Solomon''s solution: '||((end_time-start_time)/100)||' seconds');
110 -- bluefred's query 2
111 start_time := sys.dbms_utility.get_time();
112 for i in 1..10000 loop
113 for rec in (
114 SELECT DUP_COL FROM
115 TABLE(DUP_COL.PIPE_DUP_COL('green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William'))
116 GROUP BY DUP_COL
117 HAVING COUNT(1) >1
118 ) loop
119 null;
120 end loop;
121 end loop;
122 end_time := sys.dbms_utility.get_time();
123 dbms_output.put_line ('bluefred''s solution 2: '||((end_time-start_time)/100)||' seconds');
124 end;
125 /
Littlefoot's solution: 8.64 seconds
Michel's solution: 4.08 seconds
bluefred's solution 1: 83.64 seconds
Solomon's solution: 8.3 seconds
bluefred's solution: 2 8.36 seconds
PL/SQL procedure successfully completed.
SQL> /
Littlefoot's solution: 11.81 seconds
Michel's solution: 7.83 seconds
bluefred's solution 1: 84.09 seconds
Solomon's solution: 8.47 seconds
bluefred's solution 2: 7.93 seconds
PL/SQL procedure successfully completed.
SQL> /
Littlefoot's solution: 12.03 seconds
Michel's solution: 7.95 seconds
bluefred's solution 1: 84.83 seconds
Solomon's solution: 8.28 seconds
bluefred's solution 2: 8.03 seconds
PL/SQL procedure successfully completed.
|
|
|
|
Re: repeating items in a string [message #607353 is a reply to message #607352] |
Wed, 05 February 2014 02:10 |
vesile_taskiran
Messages: 66 Registered: August 2008 Location: Turkey
|
Member |
|
|
Here is my solution.
SELECT CELL_NAME
FROM
(
SELECT CELL_NAME CELL_NAME,COUNT(*) CNT
FROM
(
SELECT trim(regexp_substr('green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William', '[^,]+', 1, level)) CELL_NAME
FROM dual
CONNECT BY instr('green,blue,yellow, purple,green,yellow,purple,George,William,green,purple,blue,William', ',', 1, level - 1) > 0
)
GROUP BY CELL_NAME
)
WHERE CNT>=2
|
|
|
|
|
|
|
|
Re: repeating items in a string [message #607362 is a reply to message #607361] |
Wed, 05 February 2014 02:37 |
|
bluefred
Messages: 16 Registered: June 2012 Location: Qatar
|
Junior Member |
|
|
now, how do you intend to use this?
1. is it simple query: once-off?
2. against a table? then how? since you are getting a multi-row result?!
Fred.
there's no point in knowledge if you don't share it...
|
|
|
Goto Forum:
Current Time: Fri Apr 26 13:10:03 CDT 2024
|