Home » SQL & PL/SQL » SQL & PL/SQL » Regular expression (Oracle 11g)
Regular expression [message #578402] Wed, 27 February 2013 09:57 Go to next message
rajivn786
Messages: 136
Registered: January 2010
Senior Member
Hi,

Have 2 small tasks for regexp..where we need to extract numbers from the text & multiply & get the result.

  
 Input field varchar2 :   5x a day x 10 days
 Output  :  5 * 10 = 50

Select regexp_replace('5x a day x 10 days', '[^[:digit:]]' from dual;



The code extracts numbers ..but doesn't multiply & give the result.

Have one more scenario as well.


  
 Input field varchar2 :   take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days

 Output  :  2 * 800 * 4 * 6 * 10 = 384000

Select regexp_replace('take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days', '[^[:digit:]]' from dual;



For the above code, if it's hours ..we need to convert into day by multiplying with the required factor to make it a day.
So, since it's 4, I had to multiply it by 6 to make it a day.

Re: Regular expression [message #578404 is a reply to message #578402] Wed, 27 February 2013 11:24 Go to previous messageGo to next message
Michel Cadot
Messages: 59142
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Here are 3 ways to get the result of the multiplication (the 9 first lines are always the same ones).

DBMS_XMLGEN (also works in 10g):
SQL> with
  2    data as (
  3      select 'take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days' val
  4      from dual
  5    ),
  6    inter as ( -- just to show the result of regexp_replace
  7      select trim(both '*' from regexp_replace(val,'[^[:digit:]]+','*')) multiplication
  8      from data
  9    )
 10  select multiplication, 
 11         to_number(
 12           extractvalue(
 13             dbms_xmlgen.getxmltype('select '||multiplication||' from dual'),
 14             '//text()')) total
 15  from inter
 16  /

MULTIPLICA      TOTAL
---------- ----------
2*800*4*10      64000

XMLQuery:
SQL> with
  2    data as (
  3      select 'take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days' val
  4      from dual
  5    ),
  6    inter as ( -- just to show the result of regexp_replace
  7      select trim(both '*' from regexp_replace(val,'[^[:digit:]]+','*')) multiplication
  8      from data
  9    )
 10  select multiplication, xmlquery(multiplication returning content).getNumberVal() total
 11  from inter
 12  /
MULTIPLICA      TOTAL
---------- ----------
2*800*4*10      64000

If OLAP option is installed:
SQL> with
  2    data as (
  3      select 'take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days' val
  4      from dual
  5    ),
  6    inter as ( -- just to show the result of regexp_replace
  7      select trim(both '*' from regexp_replace(val,'[^[:digit:]]+','*')) multiplication
  8      from data
  9    )
 10  select multiplication, dbms_aw.eval_number(multiplication) total
 11  from inter
 12  /
MULTIPLICA      TOTAL
---------- ----------
2*800*4*10      64000

The rest is not clear. Why "2 * 800 * 4 * 6 * 10"?

Regards
Michel

[Edit: modify the query to show the multiplication text]

[Updated on: Wed, 27 February 2013 11:32]

Report message to a moderator

Re: Regular expression [message #578405 is a reply to message #578404] Wed, 27 February 2013 11:37 Go to previous messageGo to next message
Littlefoot
Messages: 19619
Registered: June 2005
Location: Croatia, Europe
Senior Member
Account Moderator
And here's a straightforward way to do that calculation:
SQL> with test as
  2    (select '5x a day x 10 days' col
  3     from dual
  4    )
  5  select regexp_substr(col, '\d+', 1, 1) *
  6         regexp_substr(col, '\d+', 1, 2) result_1
  7  from test;

  RESULT_1
----------
        50

SQL>
SQL> with test as
  2    (select 'take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days' col
  3     from dual
  4    )
  5  select regexp_substr(col, '\d+', 1, 1) *
  6         regexp_substr(col, '\d+', 1, 2) *
  7        regexp_substr(col, '\d+', 1, 3) * 24 / regexp_substr(col, '\d+', 1, 3) *
  8        regexp_substr(col, '\d+', 1, 4) result_2
  9  from test;

  RESULT_2
----------
    384000

SQL>
Re: Regular expression [message #578406 is a reply to message #578404] Wed, 27 February 2013 11:39 Go to previous messageGo to next message
rajivn786
Messages: 136
Registered: January 2010
Senior Member
Thanks Mike..That's helpful.
The reason , I'm multiplying it by 6 is to convert number of hours into a day.
Since it's 4 hours, it's 4*6..which makes it a day.
That means the patient took 6 pills per day.
Re: Regular expression [message #578408 is a reply to message #578406] Wed, 27 February 2013 11:47 Go to previous messageGo to next message
Michel Cadot
Messages: 59142
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Quote:
That means the patient took 6 pills per day.


OK but why the "* 4" then?

If you have not a list of the possible forms (x a day, every x hours...) then you have to write an AI module to understand your sentence.

Regards
Michel
Re: Regular expression [message #578409 is a reply to message #578408] Wed, 27 February 2013 11:55 Go to previous messageGo to next message
rajivn786
Messages: 136
Registered: January 2010
Senior Member
Usually it will be one of the following for hours (multiples of 24) -- 2,3,4,6,8 ..
Can we modify the below query to include all the above combinations if the text has hours in it??
Ex : If it's 2 multiply by 12,
3 multiply by 8 etc...
Re: Regular expression [message #578410 is a reply to message #578409] Wed, 27 February 2013 12:38 Go to previous messageGo to next message
Michel Cadot
Messages: 59142
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
Assuming there is always "X hours":
SQL> with
  2    data as (
  3      select 'take 2 tablets (800 mg) by oral route every 4 hours while awake for 10 days' val
  4      from dual
  5    ),
  6    fixing as (
  7      select regexp_replace(val, '([[:digit:]]+ hours)', '24/\1') val
  8      from data
  9    ),
 10    inter as ( -- just to show the result of regexp_replace
 11      select trim(both '*' from regexp_replace(val,'[^[:digit:]/]+','*')) multiplication
 12      from fixing
 13    )
 14  select multiplication, dbms_aw.eval_number(multiplication) total
 15  from inter
 16  /
MULTIPLICATIO      TOTAL
------------- ----------
2*800*24/4*10      96000

Regards
Michel
Re: Regular expression [message #578411 is a reply to message #578410] Wed, 27 February 2013 13:07 Go to previous messageGo to next message
rajivn786
Messages: 136
Registered: January 2010
Senior Member
Thanks.
Not all of them have it..so I will be using the below query when I have hours in the text field.
Thanks for your time.
Re: Regular expression [message #578412 is a reply to message #578411] Wed, 27 February 2013 13:18 Go to previous messageGo to next message
Michel Cadot
Messages: 59142
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
If there is no "X hours" the regexp_replace I used does nothing, so you can keep it for all cases.
You may have to nest other regexp_replace if you have other cases to take into account.

Regards
Michel
Re: Regular expression [message #578413 is a reply to message #578402] Wed, 27 February 2013 13:57 Go to previous message
rajivn786
Messages: 136
Registered: January 2010
Senior Member
Mike,

I'm using dbms_xmlgen method. There are some cases in my data where it's completely text.
I can keep a where condition to filter the records.
But may I know what should be added to the query to handle text as it gives me error.
Previous Topic: Sysdate between Two date columns
Next Topic: Data Type to return decimal values in function
Goto Forum:
  


Current Time: Fri Sep 19 09:39:56 CDT 2014

Total time taken to generate the page: 0.09670 seconds