SQL performance when comparing to NULL in where clause [message #356166] |
Wed, 29 October 2008 15:54  |
bpamarti
Messages: 3 Registered: October 2008
|
Junior Member |
|
|
Hi Gurus:
I have a SQL query which joins several large tables (so indexes matter here) from Oracle database. In the where condition I use IS NULL with one of the date field values. Query takes 40 sec to run and if I comment this one line...it takes 1 sec to run. This date field is an index on the table and I learnt that --
1. IS NOT NULL in where clause uses an index
2. IS NULL in where clause does not use an index
Is there any work around to make the query faster...other than changing all the NULL date values in the table to some string. In other words can I force it to use the index...
Thanks in advance,
Bala
[Updated on: Wed, 29 October 2008 15:59] Report message to a moderator
|
|
|
Re: SQL performance when comparing to NULL in where clause [message #356186 is a reply to message #356166] |
Wed, 29 October 2008 19:53   |
 |
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
I dont' know what you learned or who told you this stuff but that is way off base. Let me tell you what I know and where I get it wrong others can correct me (Michel, Frank, Ross, Anacedent, etal.).
Things have changed a lot over the last five years when it comes to how indexes work in Oracle. But last I recall:
Quote: | IS NULL will not use an index.
IS NOT NULL will not use and index.
Most indexes in Oracle are sparse indexes.
|
Sparse means that if the first column of an index for some given row is null then that row is not tracked by the index.
Consider the following:
Quote: | 1) you have a table with 10 million rows
2) all values of column DATE_OF_BIRTH in this table are null
3) column APPLICATION_DATE is mandatory
|
create index i1 on t1 (application_date);
how many rows are tracked by the above index? A: 10 million.
create index i2 on t1 (date_of_birth);
how many rows are tracked by the above index? A: 0 (this index is empty).
This sparse nature of Oracle indexes was exploited in the early days to take large tables (many rows), and use FLAG fields to create subsets that were easy to access quickly. For example, a credit card proceessor might have 10 million customers, but only maybe one in every 10,000 cards would be a stolen card. A column could be added to the table that would have a "Y" in it for any card that was known stolen and a null for all the others. Then an index created on this column. There are 10 million rows in the table, and maybe only 1000 rows tracked by this index. Working with stolen cards would be real fast.
Today there is another trick that is similar in nature that will allow you to index null values if you want to, but it requires you to change your code to exploit it. It is called FUNCTION BASED INDEXES.
Consider this:
create index i3 on T1 (case when hire_date is null then 1 else null end);
this index effectively reverses the null-ness of hire_date for indexing purposes; null column values generate a 1 in the case expression above, where as non-null column values generate null. If you are willing to change you code, you can write this in order to find all null entries using the index:
select *
from t1
where case case when hire_date is null then 1 else null end = 1
/
Hope this helps. Kevin
|
|
|
Re: SQL performance when comparing to NULL in where clause [message #356281 is a reply to message #356186] |
Thu, 30 October 2008 03:11   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Kevin's suggestion of a function-based index is probably your best best. Another good function to index might be
NVL(THIS_DATE, to_date('99991231','YYYYMMDD'))
An alternate is - if you have relatively few distinct values and low concurrent updates on the table - a BITMAP index which - if memory serves - DO work with NULLs.
Ross Leishman
|
|
|
|
Re: SQL performance when comparing to NULL in where clause [message #356356 is a reply to message #356294] |
Thu, 30 October 2008 08:41   |
 |
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
If you are having difficulty understanding the link consider the following:
SQL> drop table temp1
2 /
Table dropped.
SQL> create table temp1 (a varchar2(100),b varchar2(100))
2 /
Table created.
SQL>
SQL> insert into temp1
2 select lpad(rownum,100,rownum),null
3 from dual
4 connect by level <= 100000
5 /
100000 rows created.
SQL>
SQL> create index temp1_i1 on temp1(a)
2 /
Index created.
SQL> create index temp1_i2 on temp1(b)
2 /
Index created.
SQL> create index temp1_i3 on temp1(b,a)
2 /
Index created.
SQL>
SQL> desc user_segments
Name
-----------------------------------------------------------------------------------------------------------------------
SEGMENT_NAME
PARTITION_NAME
SEGMENT_TYPE
TABLESPACE_NAME
BYTES
BLOCKS
EXTENTS
INITIAL_EXTENT
NEXT_EXTENT
MIN_EXTENTS
MAX_EXTENTS
PCT_INCREASE
FREELISTS
FREELIST_GROUPS
BUFFER_POOL
SQL>
SQL> select segment_name,segment_type,blocks
2 from user_segments
3 where segment_name like '%TEMP1%'
4 /
SEGMENT_NAME SEGMENT_TYPE BLOCKS
--------------------------------------------------------------------------------- ------------------ ----------
TEMP1 TABLE 384
TEMP1_I1 INDEX 416
TEMP1_I2 INDEX 6
TEMP1_I3 INDEX 416
4 rows selected.
SQL>
SQL> drop table temp1
2 /
Table dropped.
As we can see, indexes take up space depending upon how much data (and rows) they track. Index TEMP1_I2 is on an all null column so all rows have no data for this index and so it does not track any of the rows. This is the trick I spoke of before.
Additionally we see where I got it wrong though "wrong" is a relative term. I am pretty sure this was not always the case. I can recall clearly in the "old days" where the Oracle gurus at the time pointed out that indexes which started with a null column value did not track that particular row, and that the IS NOT NULL did not use an index. Maybe it was the RULES BASED OPTIMIZER thing in action. I would also guess that this is a change within the last five years to facilitate the skip-scan index access method. Clearly we see that index TEMP1_I3 starts with a null column yet it is tracking all rows as it has the same space consumption as the TEMP1_I1 index.
Nice link bonker.
Kevin
|
|
|
|
|
|
|
Re: SQL performance when comparing to NULL in where clause [message #356499 is a reply to message #356461] |
Thu, 30 October 2008 22:36   |
bonker
Messages: 402 Registered: July 2005
|
Senior Member |
|
|
Quote: |
There is a field in the same table PERSON_ID (NOT NULL and INT type) which is again an index...if I add PERSON_ID > 0 at the beginning of the where clause the execution time is reduced to 15 sec.
|
In that case can you create concatenated Index of Date_of_Birth and person_id and gather statistics. Then run the query again with condition date_of_birth is null and see if it is picking the index and response time has also improved.
Updated: I did not see that you mentioned that you have no control on modifying or creating new index.
[Updated on: Thu, 30 October 2008 22:39] Report message to a moderator
|
|
|
Re :SQL performance when comparing to NULL in where clause [message #450025 is a reply to message #356499] |
Sat, 03 April 2010 10:04   |
mehrotak
Messages: 14 Registered: April 2010 Location: pune
|
Junior Member |
|
|
Hi Gurus,
I am working on oracle 10g release.
I am facing little bit similar Problem..
I have a SQL query which joins several large tables from Oracle database. In the where condition I use IS NULL with one of the date field values. Query takes 100 sec to run and if I comment this one line...it takes 7 sec to run. This date field has no index on the table .
My problem is i can not create or modified any index on database because it is production database and i dnt have the privileges on this datatbase.So we can do any thing on Query to improve performance of this Query.
Thanks for your Cooperation...
Thanks in Advance...
Regards,
Akshat
|
|
|
Re: Re :SQL performance when comparing to NULL in where clause [message #450026 is a reply to message #450025] |
Sat, 03 April 2010 10:10   |
 |
BlackSwan
Messages: 26766 Registered: January 2009 Location: SoCal
|
Senior Member |
|
|
>I have a SQL query which joins several large tables from Oracle database. In the where condition I use IS NULL with one of the date field values.
>Query takes 100 sec to run and if I comment this one line...it takes 7 sec to run. This date field has no index on the table .
>My problem is i can not create or modified any index on database because it is production database and i dnt have the privileges on this datatbase.So we can do any thing on Query to improve performance of this Query.
So what changes are you allowed to make?
So you want us to tune SQL we can not see, which access unknown tables, and we can't use or change indexes.
Right? Good Luck!
I am sorry to report my bag of pixie dust is empty.
It would be helpful if you followed Posting Guidelines - http://www.orafaq.com/forum/t/88153/0/
[Updated on: Sat, 03 April 2010 10:11] Report message to a moderator
|
|
|
|
Re: Re :SQL performance when comparing to NULL in where clause [message #450065 is a reply to message #450034] |
Sun, 04 April 2010 08:31   |
mehrotak
Messages: 14 Registered: April 2010 Location: pune
|
Junior Member |
|
|
Hi All,
Sorry for not posting proper information
Below is the whole information related to Query:-
Query without using "IS NULL" condition :-
SELECT TUP.TRIAL_NO,
TUP.COUNTRY_CODE,
TUP.UNIT_NO,
TUP.PERSONNEL_NO,
CASE
WHEN TU.CANCEL_STOP_FLAG = 'Y'
THEN CASE
WHEN TU.TRIAL_UNIT_STATUS <
9200
THEN 'Y'
ELSE 'N'
END
ELSE 'N'
END SITE_CANCEL_FLAG,
CASE
WHEN TU.CANCEL_STOP_FLAG = 'Y'
THEN CASE
WHEN TU.TRIAL_UNIT_STATUS >=
9200
THEN 'Y'
ELSE 'N'
END
ELSE 'N'
END SITE_STOP_FLAG,
TU.HOLD_FLAG SITE_HOLD_FLAG,
TU.CONFIRMED_FLAG,
P.DISCONTINUED_FLAG,
'I' AS INTERNAL_FLAG,
D.PROTOCOL_NM,
TUP.OCCUPATION_CODE AS ROLE_CODE,
SUBSTR
(TRIM (TU.TRIAL_UNIT_REFERENCE),
LENGTH (TRIM (TU.TRIAL_UNIT_REFERENCE)) - 5,
4
) SITE_NO
FROM TRIAL@CTMSPRS1 T,
TRIAL_UNIT@CTMSPRS1 TU,
PERSONNEL@CTMSPRS1 P,
TU_PERSONNEL@CTMSPRS1 TUP,
CPAC_PROVISIONING_EXCLUDE_EXT C,
D_STUDY D
WHERE C.STUDY_KEY = D.STUDY_KEY
AND C.CPAC_EXCLUSION IN ('X', 'E')
AND D.PROTOCOL_NM =
SUBSTR (TU.TRIAL_ALIAS_CODE,
1,
INSTR (TU.TRIAL_ALIAS_CODE, '-', 1, 2) - 1
)
AND T.TRIAL_NO = TU.TRIAL_NO
AND TU.TRIAL_NO = TUP.TRIAL_NO
AND TU.COUNTRY_CODE = TUP.COUNTRY_CODE
AND TU.UNIT_NO = TUP.UNIT_NO
AND P.PERSONNEL_NO = TUP.PERSONNEL_NO
AND TUP.OCCUPATION_CODE IN (
SELECT *
FROM TABLE
(CORE_CPAC_PUBLISH_PKG.FN_GET_PARAMETER_MULTI_API
('PR_USERPROTOCOL_ALERT_INSERT',
'ROLE_ABBR'
)
))
AND (TUP.END_DATE >= SYSDATE)
This Query is returning only 3 rows and taking 6 seconds to give output.
If we add ( TUP.END_DATE IS NULL OR TUP.END_DATE >= SYSDATE)
means TUP.END_DATE IS NULL in this Query it is taking 1040sec to give output.
More information :- I have attached the explain for both Query.
select count(1) from TU_PERSONNEL@CTMSPRS1 TUP gives total number of record =1562424
select count(1) from TU_PERSONNEL@CTMSPRS1 TUP WHERE TUP.END_DATE IS NULL gives number of record =1481850 , It means Most number of record has END_DATE IS NULL.
please help me to solve this problem.
Thanks for all your Cooperation.
Regards,
Akshat
|
|
|
Re: SQL performance when comparing to NULL in where clause [message #450099 is a reply to message #356166] |
Sun, 04 April 2010 22:13   |
 |
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
Since for the moment we are reduced to guessing, my first guess is that your stats are out of date.
Notice that the plan expects to get more rows when you add your clause. This is correct because adding ...IS NULL OR... lets more rows from the tested table pass the test and be feed into your query. If your stats are bad and they are giving you grossly incorrect numbers then the number of rows you get when not using the IS NULL may be 15 not 1549 reported in your query. Or the when accepting more rows into the query when using IS NULL, you might be getting 300,000 rows not 3021 as reported in your plan. Either of these would explain your results.
Indeed, you said your "faster" query is returning 3 rows. Then why does the associated query plan say you should be seeing more like 1549 rows? There is a big different between 1549 vs. 3. This is a clear indicator that the optimizer has grossly miscalculated intermediary rowset sizes. For some reason it could not figure out that 3 was a reasonable number of rows to return for this query. This is most often (though not always) because stats are stale.
My first step would be to recollect stats on all tables, then try again. Of course this is risky in a production environment so what you should really do is copy stuff to another environment, collect stats there, and then see if you can reproduce the problem. Given this is a true distributed query, that will require even more work than normal.
I suggest you get friendly with your DBAs who control this data and ask them for help.
Good luck, Kevin
[Updated on: Sun, 04 April 2010 22:20] Report message to a moderator
|
|
|
|
Re: SQL performance when comparing to NULL in where clause [message #450172 is a reply to message #356166] |
Mon, 05 April 2010 10:00  |
 |
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
You are welcome. My point with the post is that there may be no immediate performance improvement you can create if indeed you are looking at lots more rows.
It may be the query using IS NULL may be moving around lots more data. Even if both queries eventually return the same 3 rows, it may be that IS NULL creates significantly larger intermediary rowset size that pass through the middle steps of your query. For example, you my be calling that function thousands of times in the IS NULL query and only a few times in the "faster" query.
You can only figure this out by breaking the query up into steps and seeing what each step does.
Good luck. Kevin
|
|
|