Re: Find the next word using regular expressions in Oracle 10g
Date: Tue, 10 Feb 2009 09:50:07 -0800 (PST)
Message-ID: <ebc7119e-8956-4776-af97-a6fa8bc99650_at_u13g2000yqg.googlegroups.com>
On 10 Feb., 13:17, gazzag <gar..._at_jamms.org> wrote:
> On 9 Feb, 18:17, Maxim Demenko <mdeme..._at_gmail.com> wrote:
>
> > ad..._at_rbtron.com schrieb:
>
> > > Hi,
>
> > > I have the following details in a file:
>
> > > (ADDRESS =
> > > (CITY_DETAILS = (STREET# = 416)
> > > (STREET_NAME =
> > > UNITY BLVD.)
> > > (CITY = AUSTIN))
> > > (STATE_DETAILS = (STATE_NAME =
> > > TEXAS)(ZIPCODE = 12345)))
>
> > > I need to pull word after the word STATE_NAME. In this case it is
> > > TEXAS. The required word will be enclosed within '=' and ')'. These
> > > are multiple line rows.
>
> > > How to do this using regular expressions in Oracle 10g please ?
>
> > > Thanx
>
> > > -GC
>
> > Somewhat like this ?
>
> > SQL> with t as (
> > 2 select '(ADDRESS =
> > 3 (CITY_DETAILS = (STREET# = 416)
> > 4 (STREET_NAME =
> > 5 UNITY BLVD.)
> > 6 (CITY = AUSTIN))
> > 7 (STATE_DETAILS = (STATE_NAME =
> > 8 TEXAS)(ZIPCODE = 12345)))' s
> > 9 from dual)
> > 10 select
> > 11 regexp_replace(s,'.*STATE_NAME\s*=\s*(\w+).*','\1',1,1,'mn') rr
> > 12 from t
> > 13 ;
>
> > RR
> > -----
> > TEXAS
>
> > Best regards
>
> > Maxim
>
> Maxim,
>
> Would you mind explaining how the REGEXP_REPLACE function is working
> in your example, please? I am familiar with regular expressions but
> only on a fairly basic level.
>
> Cheers,
>
> -g
Sure,
the regexp_replace(s,'.*STATE_NAME\s*=\s*(\w+).*','\1',1,1,'mn')
means:
match the original string as
1) .* # anything followed by 2) STAT_NAME # literally STAT_NAME followed by 3) \s* # zero or many whitespaces followed by 4) = # literally "=" sign followed by 5) \s* # zero or many whitespaces followed by 6) \w+ # one or many word (alphanumeric or _ ) characters followed by 7) .* anything
Part 6) is the part which was interesting for OP - any word which
occurs right after "STATE_NAME = "
To be able backreference this part in the replace expression, it
should be enclosed into brackets - so you define (up to 10) groups
within your regexp, which can be then referenced later with \1, \2, \3
and so on.
In nonformal terms it sounds: match the whole string, define group
within it which consists of a word following the "STAT_NAME = " and
replace the match (again , whole string) by first backreferenced group
(\1).
The rest of parameters means - search from the 1 position in a string
and consider only the 1st match, besides that, consider the string as
multiline string ('m' parameter) and consider the "." to match the
newline symbols ( 'n' parameter) - and this is true in this example -
the leading and trailing ".*" should cover newlines as well.
Without the last two parameters, search would not be continued after
the newline.
Best regards
Maxim Received on Tue Feb 10 2009 - 11:50:07 CST