Re: Distinct values
From: joel garry <joel-garry_at_home.com>
Date: Wed, 4 Aug 2010 09:01:50 -0700 (PDT)
Message-ID: <ea218574-c508-49d7-9080-676e9722ecd8_at_x20g2000pro.googlegroups.com>
On Aug 4, 8:44 am, Mladen Gogala <n..._at_email.here.invalid> wrote:
> I have recently had a conversation with an Oracle support engineer who
> told me that, when analyzing VARCHAR2 columns, Oracle only counts the
> first 32 characters. I checked the statement and it is true:
>
> SQL> create table test1(col varchar2(40));
>
> Table created.
>
> Elapsed: 00:00:00.07
> SQL> insert into test1 values('01235678901234567890123456789012A');
>
> 1 row created.
>
> Elapsed: 00:00:00.07
> SQL> insert into test1 values('01235678901234567890123456789012B');
>
> 1 row created.
>
> Elapsed: 00:00:00.08
> SQL> commit;
>
> Commit complete.
>
> Elapsed: 00:00:00.06
> SQL> analyze table test1 compute statistics
> 2 for table for all columns size 254;
>
> Table analyzed.
>
> Elapsed: 00:00:00.07
> SQL> select column_name,num_distinct from user_tab_columns
> 2 where table_name='TEST1';
>
> COLUMN_NAME NUM_DISTINCT
> ------------------------------ ------------
> COL 1
>
> Elapsed: 00:00:00.11
> SQL>
>
> In other words, if you are storing FS paths into the database and
> analyzing the table, stats will show significantly smaller number of
> the distinct values than there really are. Queries may be messed up
> because of that. I tested on 10.2.0.5 and 11.2.0.1, the behavior is the
> same. Did anybody else notice this?
>
> --http://mgogala.byethost5.com
Date: Wed, 4 Aug 2010 09:01:50 -0700 (PDT)
Message-ID: <ea218574-c508-49d7-9080-676e9722ecd8_at_x20g2000pro.googlegroups.com>
On Aug 4, 8:44 am, Mladen Gogala <n..._at_email.here.invalid> wrote:
> I have recently had a conversation with an Oracle support engineer who
> told me that, when analyzing VARCHAR2 columns, Oracle only counts the
> first 32 characters. I checked the statement and it is true:
>
> SQL> create table test1(col varchar2(40));
>
> Table created.
>
> Elapsed: 00:00:00.07
> SQL> insert into test1 values('01235678901234567890123456789012A');
>
> 1 row created.
>
> Elapsed: 00:00:00.07
> SQL> insert into test1 values('01235678901234567890123456789012B');
>
> 1 row created.
>
> Elapsed: 00:00:00.08
> SQL> commit;
>
> Commit complete.
>
> Elapsed: 00:00:00.06
> SQL> analyze table test1 compute statistics
> 2 for table for all columns size 254;
>
> Table analyzed.
>
> Elapsed: 00:00:00.07
> SQL> select column_name,num_distinct from user_tab_columns
> 2 where table_name='TEST1';
>
> COLUMN_NAME NUM_DISTINCT
> ------------------------------ ------------
> COL 1
>
> Elapsed: 00:00:00.11
> SQL>
>
> In other words, if you are storing FS paths into the database and
> analyzing the table, stats will show significantly smaller number of
> the distinct values than there really are. Queries may be messed up
> because of that. I tested on 10.2.0.5 and 11.2.0.1, the behavior is the
> same. Did anybody else notice this?
>
> --http://mgogala.byethost5.com
Interesting. Tell Jože: http://joze-senegacnik.blogspot.com/2009/12/cbo-oddities-in-determing-selectivity.html
Same results as yours on 10.2.0.4
jg
-- _at_home.com is bogus. http://www.signonsandiego.com/news/2010/aug/04/intels-antitrust-tangles-and-ftc-settlement/Received on Wed Aug 04 2010 - 11:01:50 CDT