Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.misc -> Re: hint needed to complete a single "tokens deduplication" update statement ...
Giovanni Azua wrote:
> Hello all,
>
> I am close to have a beatiful approach to de-duplicate
> tokens from each name row in a table i.e. I have a table
> of names and I want to make a single statement update
> that removes duplicated terms within the name e.g.
>
> SULTAN HAMID SULTAN
>
> should be updated to
>
> SULTAN HAMID
>
> I already have the core approach of the single row
> de-duplicating statement:
>
> SELECT DISTINCT VALUE(term) FROM
> THE(SELECT tokenize(name_full)
> FROM T_NLM_NAME
> WHERE name_id=<current_id>) term;
>
> where "tokenize" is a PL/SQL function that does exactly
> that: returns the collection of tokens that occur in a
> VARCHAR2(X). The collection type is defined as:
> type strTable is table of VARCHAR2(X).
>
> Now that I have the result set of distinct tokens I want
> to bring it back to VARCHAR2 but I do not know how to
> do that, would be something like (this of course does
> not work otherwise I would not be asking :-))
>
> SELECT to_char(CASE WHEN RNUM=1 THEN term END)
> || to_char(CASE WHEN RNUM=2 THEN term END)
> || to_char(CASE WHEN RNUM=3 THEN term END) as deduplicated_name_full
> FROM (
> SELECT DISTINCT VALUE(term) as term
> , row_number() over (order by VALUE(term)) as RNUM
> FROM THE(SELECT tokenize(name_full)
> FROM T_NLM_NAME
> WHERE name_id=<current_id>) term);
>
> The complete thing would look like:
>
> UPDATE T_NLM_NAME
> SET name_full=
> SELECT <what_to_do_here> as name_full
> FROM (
> SELECT DISTINCT VALUE(term) as term
> , row_number() over (order by VALUE(term)) as rnum
> FROM THE(SELECT tokenize(name_full)
> FROM T_NLM_NAME
> WHERE name_id=<current_id>) term);
>
> Thanks in advance,
> Best Regards,
> Giovanni
Thanks for what? You didn't ask for anything.
-- Daniel A. Morgan University of Washington damorgan_at_x.washington.edu (replace 'x' with 'u' to respond)Received on Mon Oct 25 2004 - 22:44:48 CDT
![]() |
![]() |