Re: Remove dupes with CLOB columns involved

From: Álvaro G. Vicario <alvaro.NOSPAMTHANX_at_demogracia.com>
Date: Tue, 23 Jun 2009 09:45:18 +0200
Message-ID: <h1q19n$ch2$1_at_news.eternal-september.org>



John Hurley escribió:
> On Jun 22, 11:25 am, "Álvaro G. Vicario"
> <alvaro.NOSPAMTH..._at_demogracia.com> wrote:

>> I've created a GLOBAL TEMPORARY TABLE where one of the columns stores
>> the result of concatenating a variable (and unknown) number of
>> VARCHAR2(100) cells. I didn't want to hit the 4,000 character limit when
>> the source table has more than 40 rows so I created such temp column as
>> CLOB. But when I joined my temporary table with some other regular
>> tables I got an "ORA-00932: inconsistent datatypes: expected - got CLOB"
>> error message. The problem seems to be that it's a "SELECT DISTINCT"
>> query and I've read in docs that it's a known limit. Replacing DISTINCT
>> with GROUP BY doesn't change this.
>>
>> Looking at the documentation, I appeared to me that TO_CHAR() could be a
>> solution but it isn't: it fixes ORA-00932 but I get "ORA-22835: Buffer
>> too small for CLOB to CHAR or BLOB to RAW conversion (actual: 4001,
>> maximum: 4000)" when the CLOB contains more than 4000 bytes. Some forums
>> suggest DBMS_LOB.SUBSTR() but it also fails for 4000+ (ORA-06502:
>> PL/SQL: error : buffer de cadenas de caracteres demasiado pequeño
>> numérico o de valor ORA-06512: en línea 1).
>>
>> I just need to run a query where some of the columns are CLOB and get
>> rid of duplicated rows. Is there an easy approach I could use?
>>
>> It's a PHP/5.2+OCI8 app running on top of Oracle 10g Express Edition in
>> a Windows XP box. Thank you in advance,
> Could you give an overview of what exactly you are trying to
> accomplish?
> 
> Perhaps a redesign of the database tables involved might be a better
> idea.  While there's nothing exactly wrong with BLOBs or CLOBs ...
> they can get unwieldy and quite complicated unless planned and tested
> from the beginnings of design phases.

It's part of a module that prints custom reports (defined by users). Among (manu) other things, each reportable object has a list of attributes. A simplified example:

object:

	object_id
	object_name
	
attribute:
	attribute_id
	object_id
	attribute_name

I'm emulating MySQL's GROUP_CONCAT() function so I get one row per object with all the attributes in one cell:

object_name, attribute_list
'object_1', 'attr_1_1; attr_1_2; attr_1_2' 'object_2', 'attr_2_1; attr_2_2'

It happened that the code I'm using [1][2] implies some complex operations (CONNECT BY and the like) and was crashing Oracle Express Edition if used as subquery in the main query. So I decided to run these queries first and store the result in a temporary table. Since the attribute table can have lots of rows I created attribute_list as CLOB to prevent truncation errors.

So far I'm considering two possibilities:

  • Get over with it and just truncate large data (after all, 4,000 bytes of attributes might not be a real life possibility).
  • Fetch all rows and remove dupes in the application side (the PHP code might not be optimal but it's easy to write).

[1]
http://halisway.blogspot.com/2006/08/oracle-groupconcat-updated-again.html [2]
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:229614022562

-- 
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--
Received on Tue Jun 23 2009 - 02:45:18 CDT

Original text of this message