Home » SQL & PL/SQL » SQL & PL/SQL » Composite index having varchar2, number and CLOB column (Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit)
Composite index having varchar2, number and CLOB column [message #544930] |
Sat, 25 February 2012 02:30  |
manubatham20
Messages: 566 Registered: September 2010 Location: Seattle, WA, USA
|
Senior Member |

|
|
Hi,
I need to create a composite unique index on varchar2, number and CLOB column. I haven't used such index before that have the CLOB column indexing. I found the below link related to CLOB indexing...
http://www.orafaq.com/forum/t/8494/0/
Please suggest me other links from where I can get related info. Also I would like to know the impact of such index on performance. I have to store and process around 50 million records in such a way, will it be beneficial to use this index?
Thanks,
Manu
|
|
|
|
|
|
Re: Composite index having varchar2, number and CLOB column [message #544951 is a reply to message #544936] |
Sat, 25 February 2012 08:54   |
 |
Barbara Boehmer
Messages: 9106 Registered: November 2002 Location: California, USA
|
Senior Member |
|
|
Why do you think you need a unique index across these three columns? What sort of searches do you plan to do? It would be highly unusual for it to need to be unique. It would be reasonable to have some sort of unique index on perhaps the varchar2 and numeric fields, but not a unique index on a clob column. However, it is common to have a non-unique index for doing Oracle Text searches. With Oracle Text, you can create an index across multiple columns that allows rapid searching for data in any of the columns and has many features available for special searches. To do this, you create a multi_column_datastore that specifies the columns that you want to be able to search rapidly. Then you use that datastore in the index creation. The index can be created an any of the columns or a separate column, but the usage of the datastore causes it to index the data in all of the columns. This index is a ctxsys.context index and creation of the index results in creation of a domain index table that contains each token (word) within the data. You then search using contains to use the Oracle Text context domain index. There are many features in Oracle Text. You can find information in the Oracle Text Reference and Oracle Text Developer's Guide in the online documentation. We also have a separate OraFAQ sub-forum for Oracle Text. As to performance, performance of what? It can make such searches a lot faster, but it can slow down DML, depending on how you set things up. Oracle Text indexes require synchronization and optimization and can be done in various manners, similar to how a materialized view is refreshed. In order to use Oracle Text, you must have it on your system. If it is on your system, there will be a CTXSYS user. Any user using Oracle Text will need the CTXAPP role or the privileges granted individually. On a production system, Oracle Text is typically a separately purchased feature.
SCOTT@orcl_11gR2> -- table:
SCOTT@orcl_11gR2> create table t1
2 (v1 varchar2(20),
3 n1 number,
4 c1 clob,
5 all_cols varchar2(1))
6 /
Table created.
SCOTT@orcl_11gR2> -- data:
SCOTT@orcl_11gR2> insert all
2 into t1 values ('test data in v1 col', 1, 'other data in c1 column', null)
3 into t1 values ('other data in v1 col', 2, 'test data in c1 column', null)
4 into t1 values ('more data in v1 col', 3, 'more data in c1 column', null)
5 select * from dual
6 /
3 rows created.
SCOTT@orcl_11gR2> -- multi_column_datastore that uses all three columns:
SCOTT@orcl_11gR2> begin
2 ctx_ddl.create_preference ('your_ds', 'multi_column_datastore');
3 ctx_ddl.set_attribute ('your_ds', 'columns', 'v1, to_char(n1) n1, c1');
4 end;
5 /
PL/SQL procedure successfully completed.
SCOTT@orcl_11gR2> -- text index that uses multi_column_datastore:
SCOTT@orcl_11gR2> create index t1_v1_n1_c1_idx on t1 (all_cols)
2 indextype is ctxsys.context
3 parameters ('datastore your_ds')
4 /
Index created.
SCOTT@orcl_11gR2> -- tokens in domain index table resulting from indexing:
SCOTT@orcl_11gR2> select token_text from dr$t1_v1_n1_c1_idx$i
2 /
TOKEN_TEXT
----------------------------------------------------------------
1
2
3
C1
COL
COLUMN
DATA
MORE
N1
OTHER
TEST
V1
12 rows selected.
SCOTT@orcl_11gR2> -- search for 'test data' in any of the three columns:
SCOTT@orcl_11gR2> column c1 format a30
SCOTT@orcl_11gR2> set autotrace on explain
SCOTT@orcl_11gR2> select * from t1 where contains (all_cols, 'test data') > 0
2 /
V1 N1 C1 A
-------------------- ---------- ------------------------------ -
test data in v1 col 1 other data in c1 column
other data in v1 col 2 test data in c1 column
2 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 772606908
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2041 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 2041 | 4 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | T1_V1_N1_C1_IDX | | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CONTAINS"("ALL_COLS",'test data')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2> -- search for '3' in any of the three columns:
SCOTT@orcl_11gR2> select * from t1 where contains (all_cols, '3') > 0
2 /
V1 N1 C1 A
-------------------- ---------- ------------------------------ -
more data in v1 col 3 more data in c1 column
1 row selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 772606908
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2041 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 2041 | 4 (0)| 00:00:01 |
|* 2 | DOMAIN INDEX | T1_V1_N1_C1_IDX | | | 4 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CTXSYS"."CONTAINS"("ALL_COLS",'3')>0)
Note
-----
- dynamic sampling used for this statement (level=2)
SCOTT@orcl_11gR2>
|
|
|
Re: Composite index having varchar2, number and CLOB column [message #544975 is a reply to message #544951] |
Sat, 25 February 2012 12:30   |
manubatham20
Messages: 566 Registered: September 2010 Location: Seattle, WA, USA
|
Senior Member |

|
|
Hi Barbara,
In CLOB column, I actually have concatenation of varchar2(4000) key columns (from some other table)
I need to treat duplicate records inserted on the basis of these key column values as well. These columns are stored in another very large table in row format.
Records
-------
rec1 a b[k] c[k] d e[k] f g
rec2 h i[k] j[k] k l[k] m n
Stored in 2 tables like below
----------------------------
table1
------
rec1_seq checksum
rec2_seq checksum
table2
------
rec1_seq col1 a
rec1_seq col2 b
rec1_seq col3 c
rec1_seq col4 d
rec1_seq col5 e
rec1_seq col6 f
rec1_seq col7 g
rec2_seq col1 h
rec2_seq col2 i
rec2_seq col3 j
rec2_seq col4 k
rec2_seq col5 l
rec2_seq col6 m
rec2_seq col7 n
And this table has almost billion of record. So detecting duplicate values on the basis of key columns (from table2)took a lot of time, so I decided to concatenate values of key columns and store it in table1 (Number of Key columns may differ in number for different datasets, so this type of architecture)
Now I want to impose unique index because as soon as there will be any entry of duplicate record, unique index will automatically throw the error, I will catch that record and treat it separately. Now table1 will look like below
table1
------
rec1_seq checksum b,c,e
rec2_seq checksum i,j,l
Many thanks for your information, I never used domain index before... I need to learn it to make use of it.
Thank you.
|
|
|
|
|
|
Re: Composite index having varchar2, number and CLOB column [message #545017 is a reply to message #545016] |
Sun, 26 February 2012 10:41   |
manubatham20
Messages: 566 Registered: September 2010 Location: Seattle, WA, USA
|
Senior Member |

|
|
Hi Barbara,
Thanks for suggestions!!
The problem is that I have different datasets and they contain different number of columns and key columns, and maximum number of key columns are not fixed.
So I can't create my table structure like...
table1
------
rec_no checksum keycol1 keycol2 keycol3 <-- may be one dataset can have only 2 key columns, another can have 4
------ -------- ------- ------- ------- <-- so I can't create fixed number of columns in this table
rec1_seq checksum b c e
rec2_seq checksum i j l
That's why I was thinking about using clob, as clob can store all key columns concatenated data.
Now I am thinking about using nested table of varchar2(4000) datatype as table1 column, and applying unique index on table1.
Note-> Key columns are declared as varchar2[4000], but its doesn't necessarily mean that they store 4000 chars, they may store, they may not.
Thank you.
|
|
|
Re: Composite index having varchar2, number and CLOB column [message #545019 is a reply to message #545017] |
Sun, 26 February 2012 11:23   |
John Watson
Messages: 8979 Registered: January 2010 Location: Global Village
|
Senior Member |
|
|
Hi - just one suggestion. You say "I have different datasets and they contain different number of columns and key columns". This sounds as though you are trying to use one table to store what should be (in a normalized structure) several tables? Is that it? That may be a design fault, but if you must do this you could create the table with all the colums that may be needed by all the rows: there is no problem with leaving some columns null for some rows. Add one more column, and populate it with a trigger that generates an SHA-1 hash from the concatenation of all the potential key columns (some of which will be null for some rows) and use that as your unique key.
Any good?
|
|
|
Re: Composite index having varchar2, number and CLOB column [message #545022 is a reply to message #545019] |
Sun, 26 February 2012 15:02  |
manubatham20
Messages: 566 Registered: September 2010 Location: Seattle, WA, USA
|
Senior Member |

|
|
Hi John,
Quote:This sounds as though you are trying to use one table to store what should be (in a normalized structure) several tables?
Yes, you got it right. I am just supporting an application, the team who build it, fades away 3 years back. I assure you that this type of design is a needed and not a design fault. The problem is I can't change in design, I can only search for alternatives, as change in design will require a lot of data to migrate hence will take a lot of time. That's why I am thinking about using hash and storing it in table1. I have following ideas for that--
1. Concat all key columns, store it in CLOB column of table1 and then unique index --> not possible
2. Compute hash on concatenation of key columns, store it in table1 and unique index, when any hash violate the unique index, then only hit table2 for duplicate check, as there is possibility that hash for 2 different values stored in table1 can be same. --> needs to be tested.
3. Store the key columns value in a column of type nested table and then apply unique index --> not sure if possible in a way I am thinking.
Thanks for your time, and always welcome for any type of suggestion/improvement.
Thank you.
|
|
|
Goto Forum:
Current Time: Tue Jul 22 12:08:47 CDT 2025
|