Re: REGEXP_REPLACE help

From: George Rypysc <georgeryp+oracle_at_gmail.com>
Date: Sat, 28 Mar 2009 21:23:32 -0400
Message-ID: <op.uri4tipenjbbs0_at_t2042>



On Fri, 27 Mar 2009 03:43:51 -0400, Laurenz Albe wrote:
>
> I would use regexp_replace(name, ' .*','') for the last name and
> regexp_replace(name, '[^ ]* *','') for the first name.
>
>

But what if the last or first name has a space in it? Part of the last name is lost:

SQL> SELECT regexp_replace('Mc Coy, Leonard', ' .*','') FROM dual;

RE

--
Mc

or the first part of the first name is lost:
SQL> SELECT regexp_replace('Smith, Mary Anne', '[^ ]*  *','') FROM dual;

REGE
----
Anne


The question doesn't explicitly say if the new format (only LAST and  
FIRST), is comma delimited or not.  I'd guess it would need a comma in  
order to distinguish two-part last names from two-part first names ("Ann  
Margaret", "Mary Ann", etc)

Since I'm not that good with regular expressions, I'll make this somewhat  
unwieldy suggestion to add an alternation to the original regex to handle  
the new format:

SQL> set head off
SQL> SELECT regexp_replace('Mc Coy, Leonard', '((.+), (.+) (.+))|((.+),  
(.+))','\2\6') FROM dual;

Mc Coy

SQL> SELECT regexp_replace('Mc Coy, Leonard Herschel', '((.+), (.+)  
(.+))|((.+), (.+))','\2\6') FROM dual;

Mc Coy

...and just change the backreferences "\2\6" to "\3\7" to get the first  
name in both formats:
SQL> SELECT regexp_replace('Mc Coy, Leonard Herschel', '((.+), (.+)  
(.+))|((.+), (.+))','\3\7') FROM dual;

Leonard

SQL> SELECT regexp_replace('Mc Coy, Leonard', '((.+), (.+) (.+))|((.+),  
(.+))','\3\7') FROM dual;

Leonard

But if a comma really is used in the new format, without any further  
information, there's no way to tell if "Smith, Mary Ann" is the old format  
("Ann" being a middle name) or new format ("Mary Anne" is the first name).


-George
Received on Sat Mar 28 2009 - 20:23:32 CDT

Original text of this message