Separate Parts of English Sentence (Oracle, 9i 10g, Windows)
Separate Parts of English Sentence [message #362287] Mon, 01 December 2008 12:47
danish_fsd@yahoo.com
Messages: 38
Registered: February 2008
Location: Pakistan
Member
Hi All.

I want to separate the verbs, subjects, and objects (the parts of an English sentence). The sentences are stored in a column of type VARCHAR2. Is it possible to write a query or stored procedure that first finds only the verbs, subjects, objects, etc. and then splits them out? An example follows:

Sentences
How to write a query?
I am like to drink water.
.....
....

I want to separate the verbs, i.e. ("write", "drink"),
or the objects, i.e. ("query", "water").

Can anyone help me?

Regards
Danish
Re: Separate Parts of English Sentence [message #362294 is a reply to message #362287] Mon, 01 December 2008 13:13
Michel Cadot
Messages: 68733
Registered: March 2007
Location: Saint-Maur, France, https...
Senior Member
Account Moderator
"Time flies like an arrow"

What are the verb, subject, object, adverb?

Regards
Michel
Re: Separate Parts of English Sentence [message #362337 is a reply to message #362294] Mon, 01 December 2008 21:40
danish_fsd@yahoo.com
"Time flies like an arrow"

Hello Michel

Thanks for the reply. As I understand it, "flies" is the verb and "Time" is the subject (I may be wrong). I have no hard and fast rule to distinguish these. It should be done the way an English dictionary does it. If a sentence does not meet the criteria, the result may be null. The solution does not have to be exact; an approximate one is acceptable.

Regards
Danish
Re: Separate Parts of English Sentence [message #362339 is a reply to message #362337] Mon, 01 December 2008 22:10
Barbara Boehmer
Messages: 9104
Registered: November 2002
Location: California, USA
Senior Member
"Time flies like an arrow."

"Fruit flies like a banana."

In one sentence "flies" is a verb and in the other "flies" is a noun. Many other words can be one part of speech or another depending on usage, and devising an automated way to decide which is correct requires a complex algorithm. As far as I know, Oracle does not have any built-in method to do this. There are various online dictionaries and thesauri that can be downloaded. If you download one of these and load it into an Oracle table, with one column for each word and another for its potential parts of speech, then you can join that to your data table and at least provide the various possibilities. What is the purpose of this? What do you plan to do with the phrases after they have been parsed into parts of speech? Perhaps there is something else that can get you what you want.
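
The lookup idea above can be sketched roughly as follows. This is only an illustration in Python rather than SQL, not something from the thread: the `LEXICON` entries and both function names are made up, and a real setup would load the word list from a downloaded dictionary into a table and join against it. Note that it only reports which parts of speech a word could be; it makes no attempt to decide which one applies in context.

```python
import re

# Hypothetical lexicon: word -> set of possible parts of speech.
# These few entries are invented for illustration; a real one would be
# loaded from a downloaded dictionary.
LEXICON = {
    "time": {"noun", "verb"},
    "fruit": {"noun"},
    "flies": {"noun", "verb"},
    "like": {"verb", "preposition"},
    "an": {"article"},
    "a": {"article"},
    "arrow": {"noun"},
    "banana": {"noun"},
}


def candidate_tags(sentence):
    """Return (word, sorted possible tags) pairs; unknown words get []."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [(w, sorted(LEXICON.get(w, ()))) for w in words]


def words_with_tag(sentence, tag):
    """Words in the sentence that could be the given part of speech."""
    return [w for w, tags in candidate_tags(sentence) if tag in tags]


if __name__ == "__main__":
    for s in ("Time flies like an arrow.", "Fruit flies like a banana."):
        print(s, "->", words_with_tag(s, "verb"))
```

With this toy lexicon both sentences list "flies" and "like" as possible verbs, which shows why a plain dictionary join yields candidates rather than an answer.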

Re: Separate Parts of English Sentence [message #362347 is a reply to message #362287] Mon, 01 December 2008 23:45
rajavu1
Messages: 1574
Registered: May 2005
Location: Bangalore , India
Senior Member


"OP wants to work with a database with Artificial Intelligence"

:)
Rajuvan.
Re: Separate Parts of English Sentence [message #362534 is a reply to message #362339] Tue, 02 December 2008 21:51
danish_fsd@yahoo.com
Hello Barbara Boehmer

Thanks for your detailed reply. I know it is very difficult to distinguish these. What I actually want is to rank sentences (or parts of sentences) that have the same meaning, and then find the ones that occur most frequently. For example, in a web database application where people post their questions, I want to find the questions that are asked most often. All questions are stored in a single column. I am looking for an idea or an algorithm, to be implemented as a PL/SQL procedure, that comes approximately close to this.

Regards
Danish
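
One very rough way to approximate the ranking described in this post is to reduce each question to its set of content words and count how often each set occurs. The sketch below is an illustration only, in Python rather than PL/SQL; the `STOPWORDS` list and the function names are assumptions, and it only groups questions that share the same words. It does not detect questions that mean the same thing in different words.

```python
import re
from collections import Counter

# Assumed stopword list, chosen only for this illustration.
STOPWORDS = {"a", "an", "the", "to", "how", "do", "i", "is", "can", "in"}


def normalize(question):
    """Reduce a question to a sorted tuple of its distinct content words."""
    words = re.findall(r"[a-z0-9']+", question.lower())
    return tuple(sorted(set(words) - STOPWORDS))


def most_asked(questions, n=3):
    """Group questions by normalized key and return the n largest groups."""
    counts = Counter(normalize(q) for q in questions)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = [
        "How to write a query?",
        "How do I write a query?",
        "Write a query",
        "How to drink water?",
    ]
    print(most_asked(sample, 2))
```

Here the first three sample questions collapse to the same key, so that group is ranked first. The same grouping could be done in SQL with a GROUP BY on a normalized column.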
Re: Separate Parts of English Sentence [message #362565 is a reply to message #362534] Wed, 03 December 2008 00:58
Michel Cadot
If you find an algorithm, it is "easy" (no more difficult) to implement it in any language.
This is not a PL/SQL question, and I don't think this is the most appropriate forum for it. Try to find a forum that discusses English language syntax and semantics, automatic spell checking, or that kind of thing.

Regards
Michel
Re: Separate Parts of English Sentence [message #362858 is a reply to message #362565] Thu, 04 December 2008 05:50
rleishman
Messages: 3728
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Here's something you could read: http://en.wikipedia.org/wiki/Natural_language_processing
Re: Separate Parts of English Sentence [message #362861 is a reply to message #362858] Thu, 04 December 2008 06:05
Frank Naude
Messages: 4590
Registered: April 1998
Senior Member
What you want is called a POS (Part of Speech) Tagger. Various open source taggers are available on the Internet. Some of them are listed at http://www-nlp.stanford.edu/links/statnlp.html#Taggers

The problem is that none of them is implemented on Oracle, so you will have to do that part yourself. If I were you, I'd pick a C/C++ open source solution that works fairly well and try to wrap it in Oracle external procedure calls.

Best of luck!

Frank
Re: Separate Parts of English Sentence [message #363014 is a reply to message #362861] Thu, 04 December 2008 15:41
Barbara Boehmer
Oracle's new auto_lexer:

http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/cdatadic.htm#BHCGJHDH

uses parts of speech tagging:

http://download.oracle.com/docs/cd/B28359_01/text.111/b28304/aautoref.htm#CHECEDIJ

but it is used internally for tagged stemming and such, and I could not find a way to locate and extract the information. I suspect it is stored in binary format in the token_info blob column of the dr$...$i domain index table. I tried some combinations of auto_lexer and auto_section_group and queries using the within clause, but that didn't work either.

However, it sounds like what you are really looking for is document classification and Oracle can do that in various ways:

http://download.oracle.com/docs/cd/B28359_01/text.111/b28303/classify.htm#CCAPP9214

Re: Separate Parts of English Sentence [message #363036 is a reply to message #362287] Thu, 04 December 2008 22:02
danish_fsd@yahoo.com
Hi Dears.

Thanks for all your replies and suggestions. I think these will help me a lot in finding a solution. Now I have a direction in which to start working.

Remember me in your prayers. :)

Regards
Danish