Home » SQL & PL/SQL » SQL & PL/SQL » REGEXP_REPLACE DISTINCT Mixed results (ORACLE 11)
REGEXP_REPLACE DISTINCT Mixed results [message #638857] Tue, 23 June 2015 10:28 Go to next message
skyschrei
Messages: 2
Registered: June 2015
Junior Member
Tried to do a distinct with REGEXP_REPLACE. Works in most cases, but I got some unexpected returns.
Is there someone how can explain why and how to do correct?

WITH DATA AS
 (SELECT '5045, 5057, 5057, 5078, 8073' VALUES1,
         '5045, 5045, 5057, 5057, 5073' VALUES2
    FROM DUAL)
SELECT VALUES1,
       REGEXP_REPLACE(VALUES1, '([^,]+)(, \1)+', '\1') DISTINCT_VALUES1,
       '5045, 5057, 5078, 8073' DISTINCT_VALUES1_EXPECTED,
       '   ',
       VALUES2,
       regexp_replace(VALUES2, '([^,]+)(,\1)+', '\1') DISTINCT_VALUES2,
       '5045, 5057, 5073' DISTINCT_VALUES2_EXPECTED
  FROM DATA
/


Return values

VALUES1:                   5045, 5057, 5057, 5078, 8073                                                              
DISTINCT_VALUES1:          5045057, 5057, 5078073                                                       
DISTINCT_VALUES1_EXPECTED: 5045, 5057, 5078, 8073

VALUES2:                   5045, 5045, 5057, 5057, 5073
DISTINCT_VALUES2:          5045, 5045, 5057, 5073                                                       
DISTINCT_VALUES2_EXPECTED: 5045, 5057, 5073


In values1 numbers join if the last digit is the same as the first digit of next number!?!
Why??????

In values2 distinct works fine except with the first number??
Why???

Found a solution with a grouped subquery, but like to understand why this is not working.


Re: REGEXP_REPLACE DISTINCT Mixed results [message #638882 is a reply to message #638857] Wed, 24 June 2015 07:43 Go to previous messageGo to next message
mnitu
Messages: 159
Registered: February 2008
Location: Reims
Senior Member
I'm not an expert in regular expressions but if you want to understand what goes wrong in your example see :
Regular Expression
Confronted with your demand I'll naturally tend to transform the string of values in a set of values and use distinct keyword or set operator on a nested collection. I do not believe that the regular expression approach is a good one.

Re: REGEXP_REPLACE DISTINCT Mixed results [message #638895 is a reply to message #638857] Wed, 24 June 2015 23:47 Go to previous messageGo to next message
bugfox
Messages: 18
Registered: October 2010
Junior Member
try this:
WITH DATA AS
 (SELECT '5045, 5057, 5057, 5078, 8073' VALUES1,
         '5045, 5045, 5057, 5057, 5073' VALUES2
    FROM DUAL)
SELECT VALUES1,
       REGEXP_REPLACE(VALUES1, '([^,]*)(, \1)+($|,)', '\1\3') DISTINCT_VALUES1,
       '5045, 5057, 5078, 8073' DISTINCT_VALUES1_EXPECTED,
       '   ',
       VALUES2,
       regexp_replace(VALUES2, '([^,]*)(, \1)+($|,)', '\1\3') DISTINCT_VALUES2,
       '5045, 5057, 5073' DISTINCT_VALUES2_EXPECTED
  FROM DATA
icon7.gif  Re: REGEXP_REPLACE DISTINCT Mixed results [message #638937 is a reply to message #638895] Thu, 25 June 2015 12:15 Go to previous messageGo to next message
skyschrei
Messages: 2
Registered: June 2015
Junior Member
Thanks a lot!!!!!!
This is a very good solution and I understand this function a bit better.
Re: REGEXP_REPLACE DISTINCT Mixed results [message #638943 is a reply to message #638937] Fri, 26 June 2015 01:50 Go to previous messageGo to next message
mnitu
Messages: 159
Registered: February 2008
Location: Reims
Senior Member
Seems to work only on a special case
WITH DATA AS
 (SELECT '5045, 5057, 5057, 5078, 5057, 8073' VALUES1,
         '5073, 5045, 5045, 5057, 5057, 5073' VALUES2
    FROM DUAL)
SELECT VALUES1,
       REGEXP_REPLACE(VALUES1, '([^,]*)(, \1)+($|,)', '\1\3') DISTINCT_VALUES1,
       '5045, 5057, 5078, 8073' DISTINCT_VALUES1_EXPECTED,
       '   ',
       VALUES2,
       regexp_replace(VALUES2, '([^,]*)(, \1)+($|,)', '\1\3') DISTINCT_VALUES2,
       '5045, 5057, 5073' DISTINCT_VALUES2_EXPECTED
  FROM DATA

Re: REGEXP_REPLACE DISTINCT Mixed results [message #638962 is a reply to message #638943] Fri, 26 June 2015 16:51 Go to previous messageGo to next message
Solomon Yakobson
Messages: 3273
Registered: January 2010
Location: Connecticut, USA
Senior Member

WITH DATA AS
 (SELECT '5045, 5057, 5057, 5078, 5057, 8073' VALUES1,
         '5073, 5045, 5045, 5057, 5057, 5073' VALUES2
    FROM DUAL)
select  values1,
        replace(x.column_value,' ',', ') distinct_values1,
        values2,
        replace(y.column_value,' ',', ') distinct_values2
  from  data,
        xmltable(('xs:string(distinct-values(("' || replace(values1,', ','","') || '")))')) x,
        xmltable(('xs:string(distinct-values(("' || replace(values2,', ','","') || '")))')) y
/

VALUES1                            DISTINCT_VALUES1               VALUES2                            DISTINCT_VALUES2
---------------------------------- ------------------------------ ---------------------------------- -----------------
5045, 5057, 5057, 5078, 5057, 8073 5045, 5057, 5078, 8073         5073, 5045, 5045, 5057, 5057, 5073 5073, 5045, 5057

SQL> 


SY.
Re: REGEXP_REPLACE DISTINCT Mixed results [message #639053 is a reply to message #638962] Tue, 30 June 2015 07:57 Go to previous messageGo to next message
mnitu
Messages: 159
Registered: February 2008
Location: Reims
Senior Member
Nice! Learned something new.
Thx
Re: REGEXP_REPLACE DISTINCT Mixed results [message #639054 is a reply to message #639053] Tue, 30 June 2015 08:41 Go to previous messageGo to next message
mnitu
Messages: 159
Registered: February 2008
Location: Reims
Senior Member
But seems not to work before 12c.
Re: REGEXP_REPLACE DISTINCT Mixed results [message #639055 is a reply to message #639054] Tue, 30 June 2015 08:52 Go to previous messageGo to next message
Michel Cadot
Messages: 68645
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

SQL> WITH DATA AS
  2   (SELECT '5045, 5057, 5057, 5078, 5057, 8073' VALUES1,
  3           '5073, 5045, 5045, 5057, 5057, 5073' VALUES2
  4      FROM DUAL)
  5  select  values1,
  6          replace(x.column_value,' ',', ') distinct_values1,
  7          values2,
  8          replace(y.column_value,' ',', ') distinct_values2
  9    from  data,
 10          xmltable(('xs:string(distinct-values(("' || replace(values1,', ','","') || '")))')) x,
 11          xmltable(('xs:string(distinct-values(("' || replace(values2,', ','","') || '")))')) y
 12  /
VALUES1
----------------------------------
DISTINCT_VALUES1
-----------------------------------------------------------------------------------------------------
VALUES2
----------------------------------
DISTINCT_VALUES2
-----------------------------------------------------------------------------------------------------
5045, 5057, 5057, 5078, 5057, 8073
5045, 5057, 5078, 8073
5073, 5045, 5045, 5057, 5057, 5073
5073, 5045, 5057

1 row selected.

SQL> @v

Version Oracle : 11.2.0.3.0

Re: REGEXP_REPLACE DISTINCT Mixed results [message #639056 is a reply to message #638962] Tue, 30 June 2015 09:37 Go to previous messageGo to next message
_jum
Messages: 577
Registered: February 2008
Senior Member
Nice, didn't know this option Shocked
Thanks @SY for sharing it!
Re: REGEXP_REPLACE DISTINCT Mixed results [message #639057 is a reply to message #639056] Tue, 30 June 2015 10:12 Go to previous messageGo to next message
Michel Cadot
Messages: 68645
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Not the first he shows us this but it is valuable to repeat it. Wink

Re: REGEXP_REPLACE DISTINCT Mixed results [message #639058 is a reply to message #639057] Tue, 30 June 2015 10:39 Go to previous messageGo to next message
mnitu
Messages: 159
Registered: February 2008
Location: Reims
Senior Member
SQL> WITH DATA AS
  2   (SELECT '5045, 5057, 5057, 5078, 5057, 8073' VALUES1,
  3           '5073, 5045, 5045, 5057, 5057, 5073' VALUES2
  4      FROM DUAL)
  5  select  values1,
  6          replace(x.column_value,' ',', ') distinct_values1,
  7          values2,
  8          replace(y.column_value,' ',', ') distinct_values2
  9    from  data,
 10          xmltable(('xs:string(distinct-values(("' || replace(values1,', ','","') || '")))')) x,
 11          xmltable(('xs:string(distinct-values(("' || replace(values2,', ','","') || '")))')) y
 12  /
        xmltable(('xs:string(distinct-values(("' || replace(values1,', ','","') || '")))')) x,
                                                 *
ERREUR à la ligne 10 :
ORA-02000: mot-clé COLUMNS absent


SQL> SELECT version
  2  FROM V$INSTANCE
  3  /

VERSION
-----------------
11.2.0.3.0


Re: REGEXP_REPLACE DISTINCT Mixed results [message #639059 is a reply to message #639058] Tue, 30 June 2015 11:53 Go to previous messageGo to next message
Michel Cadot
Messages: 68645
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

select comp_id, version, status from dba_registry order by 1;
Re: REGEXP_REPLACE DISTINCT Mixed results [message #639060 is a reply to message #639058] Tue, 30 June 2015 13:13 Go to previous messageGo to next message
Solomon Yakobson
Messages: 3273
Registered: January 2010
Location: Connecticut, USA
Senior Member
You can use:

WITH DATA AS
 (SELECT '5045, 5057, 5057, 5078, 5057, 8073' VALUES1,
         '5073, 5045, 5045, 5057, 5057, 5073' VALUES2
    FROM DUAL)
select  values1,
        distinct_values1,
        values2,
        distinct_values2
  from  data,
        xmltable(
                 'string-join(distinct-values(ora:tokenize($str,", ")),", ")'
                 passing ', ' || values1 as "str"
                 columns distinct_values1 varchar2(2000) path '.'
                ) x,
        xmltable(
                 'string-join(distinct-values(ora:tokenize($str,", ")),", ")'
                 passing ', ' || values2 as "str"
                 columns distinct_values2 varchar2(2000) path '.'
                ) y
/

VALUES1                            DISTINCT_VALUES1               VALUES2                            DISTINCT_VALUES2
---------------------------------- ------------------------------ ---------------------------------- ----------------
5045, 5057, 5057, 5078, 5057, 8073 5045, 5057, 5078, 8073         5073, 5045, 5045, 5057, 5057, 5073 5045, 5057, 5073

SQL> 


SY.
Re: REGEXP_REPLACE DISTINCT Mixed results [message #639077 is a reply to message #639060] Wed, 01 July 2015 04:56 Go to previous message
mnitu
Messages: 159
Registered: February 2008
Location: Reims
Senior Member
@Michel
Thanks for your interest but actually it does not matter to me why it does not work in my 11g database.
SQL> select comp_id, version, status from dba_registry order by 1;

COMP_ID                        VERSION                        STATUS
------------------------------ ------------------------------ -----------
APEX                           3.2.1.00.12                    INVALID
CATALOG                        11.2.0.3.0                     VALID
CATJAVA                        11.2.0.3.0                     VALID
CATPROC                        11.2.0.3.0                     VALID
EM                             11.2.0.3.0                     VALID
EXF                            11.2.0.3.0                     VALID
JAVAVM                         11.2.0.3.0                     VALID
ORDIM                          11.2.0.3.0                     VALID
OWM                            11.2.0.3.0                     VALID
RUL                            11.2.0.3.0                     VALID
XDB                            11.2.0.3.0                     VALID

COMP_ID                        VERSION                        STATUS
------------------------------ ------------------------------ -----------
XML                            11.2.0.3.0                     VALID



@SY
It worked
Thanks
Previous Topic: Optimize PL/SQL code and get rid of one by one If then else statement check
Next Topic: REGEXP_REPLACE for different name variations
Goto Forum:
  


Current Time: Thu Apr 25 04:46:08 CDT 2024