RE: question about regexp_substr

From: Jackie Brock <J.Brock_at_cablelabs.com>
Date: Tue, 16 Dec 2014 22:53:08 +0000
Message-ID: <9FA6FCA2E9AD2E4E82FCBAC5F343BCA5237581D1_at_EXCHANGE.cablelabs.com>



After the A, the first instance of 0 or more non-numeric characters is 0 characters immediately after the 1, so an empty string. The third instance of 0 or more, including the A and the match immediately after the 1 (0 length), is B. Does that help?

From: amonte [mailto:ax.mount_at_gmail.com] Sent: Tuesday, December 16, 2014 3:04 PM To: Jackie Brock
Cc: Oracle-L Group
Subject: Re: question about regexp_substr

Hi Jackie
Not sure what do you mean, I did this:

select regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 1) occur_1,
       regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 2) occur_2,
       regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 3) occur_3,
       regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 4) occur_4,
       regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 5) occur_5
from dual;

O O O O O
- - - - -

A B C

select regexp_substr('A1B2C3D4E', '[^0-9]+', 1, 1) occur_1,
       regexp_substr('A1B2C3D4E', '[^0-9]+', 1, 2) occur_2,
       regexp_substr('A1B2C3D4E', '[^0-9]+', 1, 3) occur_3,
       regexp_substr('A1B2C3D4E', '[^0-9]+', 1, 4) occur_4,
       regexp_substr('A1B2C3D4E', '[^0-9]+', 1, 5) occur_5
from dual;

O O O O O
- - - - -

A B C D E
And I dont understand very well why * gives A, B and C whereas + gives expected output. This is how I read it,

regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 1) says start searching from position 1 for first pattern which is non-numeric and no matter if there is any pattern occurence so A is printed regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 1) says start searching from position 1 for second pattern which is non-numeric and no matter if there is any pattern occurence. If I parse the line A is non-numeric therefore is a candidate but I am looking for the second occurence so I keep on searching, immediately see number 1 so the pattern is not matched so keep on searching, we now read B which is non-numeric, satisfies the pattern so it should be printed but it's not?

2014-12-16 22:42 GMT+01:00 Jackie Brock <J.Brock_at_cablelabs.com<mailto:J.Brock_at_cablelabs.com>>: ☺ Run it with various values where the three is – get the first occurrence, then the second, then the third, then the fourth – you’ll easily see what it’s doing.

From: amonte [mailto:ax.mount_at_gmail.com<mailto:ax.mount_at_gmail.com>] Sent: Tuesday, December 16, 2014 2:30 PM To: Jackie Brock
Cc: Oracle-L Group
Subject: Re: question about regexp_substr

Hello Jackie
I know + means > 1 and * > 0 occurence. But I dont see why they give different results in my example. I understand that what query is asking with * is

"find in the string any non-numeric character pattern, no matter if the there is character or not in the third occurence". I dont see why B satisfies such condition?

Thanks in advance

Alex

2014-12-16 22:23 GMT+01:00 Jackie Brock <J.Brock_at_cablelabs.com<mailto:J.Brock_at_cablelabs.com>>: The plus sign indicates that it expects at least 1 digit (1 or more). The * means 0 or more.

From: oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org> [mailto:oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org>] On Behalf Of amonte Sent: Tuesday, December 16, 2014 2:07 PM To: Oracle-L Group
Subject: question about regexp_substr

Hi people
I have some difficulty understanding applying an operator to the pattren in regexp_substr. Not sure if anyone can help ? The question is, what is the difference between these two queries:

select regexp_substr('A1B2C3D4E', '[^0-9]+', 1, 3) from dual;

R
-

C

select regexp_substr('A1B2C3D4E', '[^0-9]*', 1, 3) from dual;

R
-

B
Why * and + gives different answers?
Thanks in advance

Alex
--

http://www.freelists.org/webpage/oracle-l Received on Tue Dec 16 2014 - 23:53:08 CET

Original text of this message