Re: Regular expressions: splitting with REGEXP_SUBSTR and "null"

From: Mark D Powell <Mark.Powell_at_eds.com>
Date: Sat, 1 Mar 2008 06:21:35 -0800 (PST)
Message-ID: <a80a3006-5f97-4bce-bfc6-eafbadd1627e@f47g2000hsd.googlegroups.com>


On Feb 29, 1:11 pm, spamb..._at_milmac.com (Doug Miller) wrote:
> In article <248f35ac-1e8c-424a-80cd-1da3edc66..._at_e6g2000prf.googlegroups.com>, Mark D Powell <Mark.Pow..._at_eds.com> wrote:
>
>
>
>
>
> >On Feb 28, 11:02=A0am, citte <nicola.ame..._at_gmail.com> wrote:
> >> Hi,
> >> I have a problem, I found on internet a way to split strings separated
> >> by pipe | (for example) with regexp, something like that:
>
> >> SELECT REGEXP_SUBSTR('One|Two|Three|Four','[^|]+', 1, 3)
> >> FROM dual;
>
> >> which gives the result:
> >> Three
>
> >> the problem comes when I have one of the words separated by the pipe
> >> is actually a Null, so the string becomes
> >> 'One|Two||Four|'
>
> >> the query
>
> >> SELECT REGEXP_SUBSTR('One|Two||Four','[^|]+', 1, 3)
> >> FROM dual;
>
> >> result in:
> >> Four
>
> >> I needed a Null (or empty string).
> >> How is this possible?
>
> >> Thank you in advance!
>
> >Will this help?
>
> >SQL> l
> >  1  select regexp_substr(replace('one|two||four','||','|NULL|'),
> >  2*  '[^|]+',1,3) from dual
> >SQL> /
>
> >REGE
> >----
> >NULL
>
> Doesn't work in a more general case, e.g. attempting to select the fourth
> string from 'one|two|||five' -- the desired result is NULL, but this method
> returns 'five'. Conversely, trying to retrieve the fifth string should return
> 'five' but instead returns NULL.
>
>
>
> >Also to return an actual NULL value
> >SQL> select regexp_substr(replace('one|two||four','||',''),
> >  2   '[^|]+',1,3) from dual;
>
> >R
> >-
>
> Also doesn't work for more general cases:
> - specifying the second substring should return 'two' but instead returns
> 'twofour'
> - specifying the fourth substring should return 'four' but instead returns a
> null string
> - specifying the third substring in 'one|two||four|five' should return a null
> string, but instead returns 'five'
>
> In general, when seeking the i-th substring, if the omitted substring is in
> the n-th position, this method is guaranteed to produce correct results only
> when 0 < i < (n - 1), or when i = n AND there are at most n non-null
> substrings. If there are multiple omitted substrings, correct results may also
> be obtained for other values of i, but only by coincidence.- Hide quoted text -
>
> - Show quoted text -

Doug, valid point. I was not thinking about general cases only the posted one and one should always be aware of what will break what you wrote.

  • Mark D Powell --
Received on Sat Mar 01 2008 - 08:21:35 CST

Original text of this message