Re: A problem in SQL (a real challenge)

From: <ken_hansen_at_my-deja.com>
Date: 2000/05/31
Message-ID: <8h3gim$6ja$1_at_nnrp2.deja.com>#1/1


I support, but think I can improve on, the advice of Kyle Lahnakoski.

All the previous talk of standards leaves me cold (probably heresy!) but more importantly seems to miss the point of how Teradata works and the flexibility it offers.

Teradata supports SET TABLES which do not permit a duplicate and MULTISET TABLES which do permit duplicates. Keys appear to be used for indexing and spreading data across multiple nodes and are used when data is inserted, updated and
selected. You can have a UNIQUE PRIMARY KEY which avoids duplicates by checking each row when it is inserted/updated and is therefore a heavy processing overhead, or a PRIMARY KEY which permits duplicates and is used primarily to spread the data across multiple nodes. You can create additional KEY's to optimise select queries particularly joins.

Why restrict to one sql statement - any delete is executed row by row and uses a lot of CPU and time. Whereas insert into an empty table runs much quicker and uses one.
Insert into a SET_TABLE will silently drop the duplicates.

Additionally, instead of listing all the field/attribute names use an asterisk:

I haven't come across CREATE TABLE AS...... maybe it is not on my version. I would :

rename table_A as table_A_temp;
Show table_A_temp;
/copy from that show table, most of the CREATE TABLE code altering first line only to read:

CREATE SET TABLE table_a ......

insert into table_a select * from table_a_temp; drop table_a_temp;
Collect statistics on table_a index (.........);

The dissadvantage of dropping the original table is that you will have to grant all permissions again but you cannot avoid that if you want a set_table.

If you do not want to regrant all the
permissions, copy the contents into a temp table then delete * from table_a and insert into table_a from temp grouping by evry field which you want to be unique. (I recently had a 2 Gb table with 53 attributes but identified that if 5 fields were unique then the remainder would not represent duplicates)

I'll try CREATE SET TABLE AS ... and perhaps a combination of the above and Kyle's answer will solve your problem.

Good luck.

Ken Hansen, Ivy House Associates, UK.
khansen99_at_aol.com

In article <3920BF8B.540222F0_at_arcavia.com>,   kyle_at_arcavia.com wrote:
> I do not believe there is a standard SQL method
 of doing this with one
> query. Here is a series of commands that will
 do.
>
> RENAME TABLE <TABLE NAME> TO <TEMP TABLE NAME>;
> CREATE TABLE <TABLE NAME> AS
> SELECT
> <LIST ALL COLUMNS HERE>
> FROM
> <TEMP TABLE NAME>
> GROUPBY
> <LIST ALL COLUMNS HERE>
> ;
> DROP TABLE <TEMP TABLE NAME>;
>
> There may be a CREATE OR REPLACE command in
 your version if SQL that
> replaces the need for a temp table.
>
> Why do you need only one sql statement?
>
> thought wrote:
> >
> > Suppose I have a table without a primary key
 (or for that matter
> > any key). I want to delete duplicate entries
 from the table
> > using a SINGLE SQL statement. Can anyone
 there help me out?
> >
> > * Sent from RemarQ http://www.remarq.com The
 Internet's Discussion Network *
> > The fastest and easiest way to search and
 participate in Usenet - Free!
>
> --
> ------------------------------------------------



> Kyle

Lahnakoski
Arcavia Software Ltd.
> (416) 892-7784

 http://www.arcavia.com
>

Sent via Deja.com http://www.deja.com/
Before you buy. Received on Wed May 31 2000 - 00:00:00 CEST

Original text of this message