NEED OPTIMAL SOLUTION FOR COMMON WILDCARD SEARCH IN ORACLE DATABASE AGAINST 2-4 MILLION RECORDS [message #224640] Wed, 14 March 2007 20:48
cdmsenthil
Messages: 2
Registered: March 2007
Junior Member
I am currently developing a touch panel application [C#.Net, Oracle 10g] for a leading DVD/VHS/Books/Games rental chain store whose data is persisted in an Oracle DB. Store users use this touch panel application [client-server application] to search for a DVD/VHS by TITLE, DIRECTOR, GAME NAME, BOOK TITLE, etc. It is a simple search against 2+ million records [a DSS database updated every day by a batch job]. The search uses right-truncated wildcards, EXAMPLE: TITLE LIKE 'Star%'. The required query response time is < 2 seconds.

I created the required PL/SQL query, with two tables involved (sample data sizes 200000 and 400000 rows), and ran some tests to measure the query response time with different sample search tokens.

For narrow search terms, EXAMPLE: LIKE 'STAR WARS %', it responded within the stipulated time, but for queries like LIKE 'A%' or 'C%' the response time is quite slow.

I created the following indexes on the target columns:

1. B-TREE INDEX
2. CONTEXT Index [Oracle Text Search]
3. BITMAP index [though bitmap indexes are normally meant only for low-cardinality columns, which these are not]
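For reference, the three index types above might be created roughly like this (the table and column names `titles`/`title` are assumed for illustration, the actual schema is not shown; note also that the B-tree and bitmap indexes cannot both exist on the same column list at the same time):

```sql
-- Assumed names (titles / title); not the poster's actual schema.
-- 1. B-tree index
CREATE INDEX titles_title_ix ON titles (title);

-- 2. CONTEXT index (Oracle Text)
CREATE INDEX titles_title_ctx ON titles (title)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- 3. Bitmap index (would have to replace the B-tree index above;
--    Oracle raises ORA-01408 for a second index on the same column list)
-- CREATE BITMAP INDEX titles_title_bix ON titles (title);

-- Right-truncated wildcard search, plain LIKE vs. Oracle Text:
SELECT title FROM titles WHERE title LIKE 'STAR WARS%';
SELECT title FROM titles WHERE CONTAINS(title, 'STAR WARS%') > 0;
```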

All three approaches performed well within the stipulated time for narrow queries; in fact the plain LIKE query outperformed the Oracle Text CONTAINS query. But for queries like 'A%' or 'B%', the index is not used and the response time is also slow. With Oracle Text [CONTEXT] the index is used, but it is slower than a FULL TABLE SCAN.

I need help / direction on how to handle such queries. I hope this is a common issue that many have faced and overcome.
Test Environment:

Oracle 10g Database: WIN 2003 Server
C# Client Application: Running in WIN 2000 Professional [My Development PC, 1 GB RAM]
Test Data: 200000 [Table 1] & 400000 [Table 2] rows [in the real DB it is 2000000 & 400000 rows maximum, and growing]

My test results are attached in the image.
  • Attachment: result.gif
    (Size: 20.73KB)

[Updated on: Wed, 14 March 2007 20:51]


Re: NEED OPTIMAL SOLUTION FOR COMMON WILDCARD SEARCH IN ORACLE DATABASE AGAINST 2-4 MILLION RECORDS [message #224661 is a reply to message #224640] Thu, 15 March 2007 00:56
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
Oracle thinks you want the fastest response to return ALL MATCHING ROWS, and Full Table Scan is the fastest way to do that.

It sounds like you just want the fastest way to return the first page of results, which would be to use the index even for non-selective queries like 'A%'.

You can do this most easily by setting the initialisation parameter OPTIMIZER_MODE to one of the following:
- FIRST_ROWS
- FIRST_ROWS_1
- FIRST_ROWS_10
- FIRST_ROWS_100
- FIRST_ROWS_1000

I would recommend one of the last two. The first one - FIRST_ROWS - is deprecated in 10g. Play around until you find the best balance.

You can achieve the same functionality by adding a hint to your SQL: eg. SELECT /*+ FIRST_ROWS(100) */ ....
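In SQL terms, the two ways to apply this look like the following (the table and column names are assumed for illustration):

```sql
-- Session-wide: make the optimizer favour a fast first page everywhere.
ALTER SESSION SET optimizer_mode = first_rows_100;

-- Or per statement, via a hint (assumed table/column names):
SELECT /*+ FIRST_ROWS(100) */ title
FROM   titles
WHERE  title LIKE 'A%';
```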

Now this is incredibly important: this will not work if your application fetches the entire result set into a collection before any data is displayed. If you select 1M rows before you display the first 100, it's going to take a while. It MUST select the first "n" rows, present them to the screen, and then either close the cursor or leave it open for another fetch to get the next page later on.
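A minimal PL/SQL sketch of that fetch pattern (names assumed; the point is that the loop stops after one page instead of draining the cursor):

```sql
DECLARE
  CURSOR c IS
    SELECT /*+ FIRST_ROWS(10) */ title
    FROM   titles                  -- assumed table/column names
    WHERE  title LIKE 'A%';
  v_title titles.title%TYPE;
BEGIN
  OPEN c;
  FOR i IN 1 .. 10 LOOP            -- first page only
    FETCH c INTO v_title;
    EXIT WHEN c%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(v_title);
  END LOOP;
  CLOSE c;                         -- do NOT drain the rest of the result set
END;
/
```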

Ross Leishman
Re: NEED OPTIMAL SOLUTION FOR COMMON WILDCARD SEARCH IN ORACLE DATABASE AGAINST 2-4 MILLION RECORDS [message #224757 is a reply to message #224661] Thu, 15 March 2007 06:14
cdmsenthil
Messages: 2
Registered: March 2007
Junior Member
Thanks rleishman,
I appreciate your suggestion and help, and I have applied it. I would like to share how I am retrieving the data (please look at the query in the attached HTML document; the red-marked area is in question). The query has two parts: the first filters the records using the actual search token [it acts like a temp table]. In the second part I get the required number of rows (always 10), which varies with the page index, i.e. startRowIndex and endRowIndex.


After applying your suggestion:
Case 1: FIRST_ROWS
Oracle utilizes the index, but it is still slow.
Case 2: FIRST_ROWS(10) ---- FIRST_ROWS(1000)
Full table scan, but the response time improved slightly over before. It looks like OPTIMIZER_MODE helps.

But my inference is that my as-is query always fetches the whole record set in its first part. I am not clear whether OPTIMIZER_MODE will help there. I appreciate your comments; please also correct me if I am misunderstanding your suggestion.

Regards,
Senthilnathan



  • Attachment: wild-card.htm
    (Size: 12.08KB)
Re: NEED OPTIMAL SOLUTION FOR COMMON WILDCARD SEARCH IN ORACLE DATABASE AGAINST 2-4 MILLION RECORDS [message #224894 is a reply to message #224757] Thu, 15 March 2007 18:50
rleishman
Messages: 3724
Registered: October 2005
Location: Melbourne, Australia
Senior Member
The ROW_NUMBER() is going to be a problem. In order to number the first 10 rows, Oracle must identify every matching row AND perform a sort.

There is ONLY ONE WAY to make this fast for any wildcard input: you CANNOT sort the results. You must return them in the order they are read from the table. ANY other option will require every row to be read from disk. Without the sort, fetching 1-10 will be quick, 11-20 a little slower, 1000000-1000010 a lot slower.

So where does this leave us?
- Either you return 10 unsorted rows at a time.
- Use trickiness to get Oracle to use an index to sort for you. It is possible to get ordered results without Oracle having to perform a SORT step.

Take a look at this article that briefly discusses sorting with indexes.
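The idea, sketched (index, table, and column names assumed): when the ORDER BY column list exactly matches a B-tree index and the optimizer is in FIRST_ROWS mode, Oracle can walk the index in key order, so the execution plan shows no SORT ORDER BY step:

```sql
SELECT /*+ FIRST_ROWS(10) INDEX(t titles_title_ix) */ title
FROM   titles t                  -- assumed names; verify with the plan
WHERE  title LIKE 'A%'
ORDER  BY title;                 -- satisfied by the index range scan, no sort
```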

The other problem is how to pick rows n through n+9 now that you are not using ROW_NUMBER().

I think the problem may be intractable. On the one hand, you want to drive the query from M_TG_RENSHOHIN.SHOHINMEI_NY LIKE 'A%' and use an index on M_TG_RENSHOHIN.NBSHOUHINCD so that we don't waste time scanning non-matching rows. But then the rows need to be ordered by M_TG_RENSHOHIN.SHOHINMEI, dictating an indexed access on that column to avoid a SORT.

These two requirements are mutually exclusive: one of them has to go. Either you return unsorted results, or you scan thousands (or millions!) of non-matching rows.

If you choose to return unsorted results, it can be made to run quickly with something like:
SELECT /*+FIRST_ROWS(10)*/ *
FROM (
    SELECT tbl.*, rownum AS rn
    FROM   tbl
    WHERE  EXISTS (...driving sub-query...)
    AND    rownum <= :upper_bound
)
WHERE rn >= :lower_bound


Ross Leishman
Re: NEED OPTIMAL SOLUTION FOR COMMON WILDCARD SEARCH IN ORACLE DATABASE AGAINST 2-4 MILLION RECORDS [message #225144 is a reply to message #224894] Sun, 18 March 2007 07:49
S.Rajaram
Messages: 1027
Registered: October 2006
Location: United Kingdom
Senior Member
Just a thought.

Have you thought of partitioning the tables based on the starting character of the movie name? You are talking about 27 partitions: A - Z (26 partitions), with anything else going to the 27th. I have not tried or experimented with this. Also, if you plan to use it, you should make sure the optimizer peeks at the bind value before parsing the query, otherwise it won't be of any use. I think that is the default for Oracle 9 and Oracle 10.
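A sketch of what that partitioning might look like (all names illustrative; note that titles starting with digits or other characters below 'B' would land in the first partition unless given partitions of their own):

```sql
CREATE TABLE titles_part (
  title    VARCHAR2(200) NOT NULL,
  director VARCHAR2(100)
)
PARTITION BY RANGE (title) (
  PARTITION p_a    VALUES LESS THAN ('B'),
  PARTITION p_b    VALUES LESS THAN ('C'),
  -- ... one partition per letter through 'Y' ...
  PARTITION p_z    VALUES LESS THAN ('['),   -- '[' sorts just after 'Z' in ASCII
  PARTITION p_rest VALUES LESS THAN (MAXVALUE)
);
```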

Good Luck