Re: How to test to VARCHAR2's for equality EFFICIENTLY ?

From: mcstock <mcstockspamplug_at_spamdamenquery.com>
Date: Fri, 21 Nov 2003 17:01:05 -0500
Message-ID: <sfydnen5bI-pFiOiRVn-sw@comcast.com>

yes, '!=' is the wrong operator to use for NULL comparisons -- but that is not the point

the logic of you sample pseudo code uses '!=' to compare to values, with the assumption that either one of them could be null

if you implement it in a function, i.e.. turn

BEGIN
if (a!=b) then rv:=0 else rv:=1;
end if;
return rv;
end;

into

create or replace function compstr( a in varchar2, b in varchar2 ) return number
is
  rv number;
begin
  if (a!=b)
  then rv:=0;
  else rv:=1;
  end if;
  return rv;
end;
/

and test it you'll see the problem:

SQL> select

  2    compstr('this', 'that')
  3  , compstr('this', 'this')
  4  , compstr(null, null)
  5  , compstr('this', null)

6 from dual
7 /
...

COMPSTR('THIS','THAT') COMPSTR('THIS','THIS') COMPSTR(NULL,NULL) COMPSTR('THIS',NULL)

---------------------- ---------------------- ------------------ -----------
---------
                     0                      1                  1

your function tells me that 'this' and null are equal -- a typical logic error due to the fact that not only does NULL never compare equal, but it never compares to be not-equal either.

your test implements the logic 'if you can tell me for sure that the two values are not equal, return 0, otherwise (whether you know they are equal or don't know they are equal), return 1'

standard classroom explanation:

did you get your bonus check yet? no.
do you know how much it is? no (i.e. null, i.e., unknown) tell me how much it is? i can't
can you tell me if it is $100? cannot say for sure if it is -- the value is still unknown
can you tell me if it is not $100? cannot say for sure if it isn't -- don't know the value!
can i have 20% of your bonus? -- go away!

"Turkbear" <john.g_at_dot.spamfree.com> wrote in message news:o2usrvsa7iv61u87v8cqu8ssn1t1ne10kn_at_4ax.com...
|
| Inline comment:
| "mcstock" <mcstockspamplug_at_spamdamenquery.com> wrote:
|
| >close....
| >
| >'x' != 'x' is false, returns 0 -- ok
| >'x' != 'y' is true, return 1 -- ok
| >null != null is null, returns 1 -- ok
| >'x' != null is also null, returns 1 -- not ok
| >
|
| Sorry but
| 'x' != null ( By the way that is a bad comparison operator when using
NULL, it should be 'x' is not NULL) is
| TRUE and should return a 1 - Nothing is equal to null, so x is not equal
to null either..
|
| 1* select 1 from dual where 'x' is not Null
| SQL> /
|
| 1
| ----------
| 1
| 1* select 1 from dual where 'x' is Null
| SQL> /
|
| no rows selected
|
| SQL>
|
|
| >the most important observations in this thread are to avoid using a
| >function, but to use an expression in the SQL, since using the function
will
| >limit the optimizer's choices
| >
| >for example, with this function
| >
| >SQL> create or replace function vc2comp( a varchar2, b varchar2 )
| > 2 return number
| > 3 is
| > 4 begin
| > 5 if (a=b or a||b is null)
| > 6 then
| > 7 return 1;
| > 8 else
| > 9 return 0;
| > 10 end if;
| > 11 end vc2comp;
| > 12 /
| >
| >Function created.
| >
| >SQL> select
| > 2 vc2comp('a','a')
| > 3 , vc2comp('a','b')
| > 4 , vc2comp(null,null)
| > 5 , vc2comp('a',null)
| > 6 from dual;
| >
| >VC2COMP('A','A') VC2COMP('A','B') VC2COMP(NULL,NULL) VC2COMP('A',NULL)
| >---------------- ---------------- ------------------ -----------------
| > 1 0 1 0
| >
| >consider the following scenario (using function based indexes with
| >QUERY_REWRITE_ENABLED = TRUE)
| >
| >SQL> select e.ename, d.dname, e.status_code
| > 2 from bigemp e, dept d
| > 3 where e.deptno = :deptno
| > 4 and vc2comp(e.status_code, :status) = 1
| > 5 /
| >
| >no rows selected
| >
| >Execution Plan
| >----------------------------------------------------------
| > 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=64 Card=192 Bytes=57
| > 60)
| >
| > 1 0 MERGE JOIN (CARTESIAN) (Cost=64 Card=192 Bytes=5760)
| > 2 1 TABLE ACCESS (FULL) OF 'BIGEMP' (Cost=16 Card=48 Bytes=1
| > 008)
| >
| > 3 1 SORT (JOIN) (Cost=48 Card=4 Bytes=36)
| > 4 3 TABLE ACCESS (FULL) OF 'DEPT' (Cost=1 Card=4 Bytes=36)
| >
| >Statistics
| >----------------------------------------------------------
| > 0 recursive calls
| > 4 db block gets
| > 102 consistent gets
| > 0 physical reads
| > 0 redo size
| > 317 bytes sent via SQL*Net to client
| > 314 bytes received via SQL*Net from client
| > 1 SQL*Net roundtrips to/from client
| > 0 sorts (memory)
| > 0 sorts (disk)
| > 0 rows processed
| >
| >SQL> select e.ename, d.dname, e.status_code
| > 2 from bigemp e, dept d
| > 3 where e.deptno = :deptno
| > 4 and nvl(e.status_code,'-') = nvl(:status,'-')
| > 5 /
| >
| >no rows selected
| >
| >
| >Execution Plan
| >----------------------------------------------------------
| > 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=9 Card=192 Bytes=576
| > 0)
| >
| > 1 0 MERGE JOIN (CARTESIAN) (Cost=9 Card=192 Bytes=5760)
| > 2 1 TABLE ACCESS (FULL) OF 'DEPT' (Cost=1 Card=4 Bytes=36)
| > 3 1 SORT (JOIN) (Cost=8 Card=48 Bytes=1008)
| > 4 3 TABLE ACCESS (BY INDEX ROWID) OF 'BIGEMP' (Cost=2 Card
| > =48 Bytes=1008)
| >
| > 5 4 INDEX (RANGE SCAN) OF 'BIGEMP$DEPT$NVLSTATUS' (NON-U
| > NIQUE) (Cost=1 Card=48)
| >
| >Statistics
| >----------------------------------------------------------
| > 0 recursive calls
| > 4 db block gets
| > 1 consistent gets
| > 0 physical reads
| > 0 redo size
| > 317 bytes sent via SQL*Net to client
| > 314 bytes received via SQL*Net from client
| > 1 SQL*Net roundtrips to/from client
| > 3 sorts (memory)
| > 0 sorts (disk)
| > 0 rows processed
| >
| >of course, your results will vary, this was performed by a professional
| >driver on a closed course, etc. etc.
| >
| >-- mcs
| >
| >"Niall Litchfield" <n-litchfield_at_audit-commission.gov.uk> wrote in
message
| >news:3fbe177a$0$13348$ed9e5944_at_reading.news.pipex.net...
| >| You might improve performance marginally by changing the logic to
| >|
| >| BEGIN
| >| if (a!=b) then rv:=0 else rv:=1;
| >| end if;
| >| return rv;
| >| end;
| >|
| >| since NULL != NULL also doesn't return true.
| >|
| >| I suspect you'd see most improvement by rewriting your queries to use
the
| >| built in NVL function.
| >|
| >| --
| >| Niall Litchfield
| >| Oracle DBA
| >| Audit Commission UK
| >| "André Hartmann" <andrehartmann_at_hotmail.com> wrote in message
| >| news:3fbe1046_at_olaf.komtel.net...
| >| > Hi there,
| >| >
| >| > in my database schema there is a function that tests two VARCHAR2
| >values
| >| > for equality. The function goes like this:
| >| >
| >| > (a IN VARCHAR2, b IN VARCHAR2)
| >| > RETURN INTEGER
| >| > IS RV NUMBER;
| >| > BEGIN
| >| > IF (a IS NULL) AND (b IS NULL) THEN
| >| > rv := 1;
| >| > ELSE
| >| > IF (a = b) THEN
| >| > rv := 1;
| >| > ELSE
| >| > rv := 0;
| >| > END IF;
| >| > END IF;
| >| > RETURN rv;
| >| > END;
| >| >
| >| > Why did I do it ? Well because the usual "=" operator does not say
that
| >| two
| >| > NULLs are equal ! to check this, just try "select 1 from DUAL where
| >| > NULL=NULL".
| >| >
| >| > Now I see that my function gets called VERY often, many million times
in
| >a
| >| > great variety of SQL statements that my application fires. Any idea
how
| >i
| >| > could improve the performance of the function or maybe there is an
| >| entirely
| >| > different approach to that ?
| >| >
| >| > Thanks,
| >| > André
| >| > :)
| >| >
| >| >
| >|
| >|
| >
|
Received on Fri Nov 21 2003 - 16:01:05 CST