Home » SQL & PL/SQL » SQL & PL/SQL » Newbie questions about regular expressions. (oracle 10.2.0.3.0)
Newbie questions about regular expressions. [message #357793] Thu, 06 November 2008 10:50 Go to next message
genericmeme
Messages: 1
Registered: November 2008
Location: Edinburgh
Junior Member
Hi guys & gals, I'm a student on a year-long work based learning placement for university. I'm pretty much just a few days into learning SQL, and I'm currently trying to get my poor frazzled brain around regular expressions.

The thing I'm trying to do at the moment is to clean up a whole load of data that's been inputted wrongly. Specifically I'm trying to take all instances where the user has put in abbreviations, and take out the spaces between the letters.

I.E: I'd like "THE B B C CORPORATION"
To become "THE BBC CORPORATION"

Now what I've got at the moment is this:
(NOTE: when i previewed this, it was trying to add all sorts of wiki links. so the "@@" symbols represent "[[" and the "##" symbols represent "]]".

SELECT 'THE B B C CORPORATION',
       Regexp_replace('THE B B C CORPORATION',
'(\.|@@:space:##)(@@:alpha:##)(\.|@@:space:##)+(@@:alpha:##)(\.|@@:space:##)+(@@:alpha:##)(\.|@@:space:##)+',
                      '\1\2\4\6 ')
FROM   Dual;


Now this does the job, but I can't imagine it's best practice, because:
A) it's hugely long
B) it only catches abbreviations that are three letters long, none else.

Ideally I'd like to have some sort of repeating expression that takes "THE B B C CORPORATION", gets rid of the spaces in between the "B B C", but would also do the same thing for "A B" or "A B C D"

Any help on this would be greatly appreciated.

Thanks in advance people!

[Updated on: Thu, 06 November 2008 11:19] by Moderator

Report message to a moderator

Re: Newbie questions about regular expressions. [message #357840 is a reply to message #357793] Thu, 06 November 2008 21:46 Go to previous message
ebrian
Messages: 2794
Registered: April 2006
Senior Member
Here are a couple options.

This one accounts for multiple spaces in the name:

SQL> select a1,
  2      regexp_replace(regexp_replace(a1, ' {1,}','  '), '(^| )(.)( |$)','\2') initials
  3  from tab1;

A1                             INITIALS
------------------------------ --------------------
A B C Co                       ABC Co
The A B C Co                   The ABC Co
A B                            AB
A B C D E                      ABCDE
X       Y     Z                XYZ
Both A B C and X Y     Z .     Both ABC and XYZ.

If there aren't multiple spaces, you can simplify the query to:

SQL> select a1,
  2      regexp_replace(replace(a1, ' ', '  '), '(^| )(.)( |$)','\2') initials
  3  from tab1;

A1                             INITIALS
------------------------------ --------------------
A B C Co                       ABC Co
The A B C Co                   The ABC Co
A B                            AB
A B C D E                      ABCDE
Both A B C and X Y Z .         Both ABC and XYZ.

Previous Topic: REGEXP_SUBSTR
Next Topic: pass values to array parameter
Goto Forum:
  


Current Time: Sat Dec 03 20:01:41 CST 2016

Total time taken to generate the page: 0.10190 seconds