comp.databases.theory: Re: fast wildcard searching on millions of strings
Thanks for your post. It's just a large list of domain names and URLs that I need to search. You can think of it as a bunch of strings with an average length of 18 characters. I am not searching or doing anything with the documents at the URLs. I would like to be able to put wildcards (* or ?) anywhere in the query string.
I tried to find information on OpenText but haven't found any yet. Do you have any links that could point me in the right direction?
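To make the wildcard semantics concrete, here is a minimal sketch, assuming shell-style matching where * matches any run of characters (including none) and ? matches exactly one character, and where the pattern must match the whole string. The function name wildcard_search and the sample list are hypothetical, made up only for illustration. It is a plain linear scan using Python's standard fnmatch module, so it shows what a query should return rather than how to index 33 million strings for sub-second lookups.

```python
from fnmatch import fnmatchcase


def wildcard_search(pattern, strings):
    """Return every string that matches the whole pattern.

    '*' matches any run of characters (including none) and
    '?' matches exactly one character. Note that fnmatch also
    treats '[...]' as a character class, which this sketch ignores.
    """
    # Linear scan: every string is tested against the pattern.
    return [s for s in strings if fnmatchcase(s, pattern)]


# Hypothetical sample data standing in for the real list of URLs.
urls = [
    "example.com",
    "example.org",
    "www.example.com/index.html",
    "sample.net",
]

# Leading '*' plus three '?' for a three-letter top-level domain.
print(wildcard_search("*ample.???", urls))

# Wildcards on both sides of a fixed middle part.
print(wildcard_search("*.example.*", urls))
```

A leading * is the hard case for any prefix-based index, which is why the second query is worth testing against whatever architecture ends up being used.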
Thanks a lot,
Ryan Sit
Kai.Grossjohann_at_CS.Uni-Dortmund.DE (Kai Großjohann) wrote in message news:<vaf66buz3v2.fsf_at_lucy.cs.uni-dortmund.de>...
> rsit_at_ucsd.edu (rsit) writes:
>
> > Hey I was wondering if some of you guys could help me on a problem I
> > am having. I am working on a search engine that will be primarily
> > doing wildcard searching on a set of at least 33 million URLs. The
> > problem I am having is figuring out an underlying architecture that
> > would support wildcard searching on this set in hopefully less than
> > one second.
>
> Are you going to search in the URLs or in the documents that are
> behind the URLs? If it's the documents, then I think wildcard
> searching is not the way to go. Rather, you want stemming or maybe
> phonetic similarity search. Sounds like Information Retrieval.
>
> There is OpenText which uses the Pat algorithm for regexp searching in
> large corpora.
>
> kai
Received on Sat Aug 11 2001 - 19:35:26 CDT