Re: LOTS OF DATA, how do I delete it?

From: Randy Dewoolfson <randyd_at_cais3.cais.com>
Date: 1995/11/24
Message-ID: <49536o$rm6_at_zippy.cais.net>#1/1


rschick (rschick_at_eagle.wbm.ca) wrote:
: Got a couple of questions for all of you:
 

: 1) How to best delete 1 million rows from a table.
: 2) How to best SELECT data from 24 million rows.
 

: Overview
: =======
: I have built an Oracle 7.1.3 database which houses a 15 gig database.
: It's purpose is to do cellular rate plan analysis (based on a customer's cellular phone
: calling history). We get a weekly data load of 250 meg which is all the calls and summary data
: for the week. The front end is VB 3.0 and the data is loaded via SQL Loader (DIRECT).
 

: Problem
: ======
: Portions of the data in the calls table(s), say it's called T_CALLS, must be removed/replaced
: with the new calls data. Each of our proposed calls table holds 4 sets of calls (one for each
: billing period). The OLD billing cycle (say Sept 7/95) will have to be replaced with the NEW
: billing cycle (say Sept 7/96). We cannot DROP the table since there are 4 billing periods in
: each month (days 7/15/22/28). Currently we have 6 call tables (one for each month). One option
: is to DELETE the OLD cycle then SQL Load the new one. The DELETE will be from 1 to 1.5
: million rows a week. We estimated the DELETEwill take 150 minutes (which is too slow).
 

: The only other options we have is to make a table PER billing cycle (ie T_SEPT_7 table name)
: then just truncate the table and SQL Load the new load. The problem I forsee with this is that I
: will have to be joining all of these tables when doing an SQL call. The maximum join would involve
: 24 tables (4 cycles/mo > 4 tables X 6 months = 24 tables). Each of these billing cycle tables will
: hold 1 to 1.5 million rows each.
 

: The primary key for all the calls tables is the cellular phone number (two char fields : char(3)
: and char(4) for the phone number 123-4567). When querying the calls tables, I always know
: the phone number (primary key).
 

: What ever the solution, the responce time is NOT critical. However, a user can only wait for so long
: before shutting off the PC.
 

: Any ideas would be tried and tested on my lovely Unix/Win95 platforms.
 

: Thanks a million (pun pun pun),
: Rob Schick
: SHL Systemhouse
: rschick_at_shl.com

A couple of ideas:

1st make sure your date field is indexed. ( Create a seperate index on only this column even if it is part of the primary key - which you indicate that it is not ). Then the where clause that you use in your DELETE statement should have this date only

like DELETE T_CALLS WHERE call_date < :cutoff_value

This should speed things up dramatically. If you want restartability and reduced fear of blowing rollback segments or temp tablespaces - then create a PL*SQL or PRO*C routine that opens a cursor of all calls ordered by date ascending (still needs the index). Then cycle through the records deleting by rowid and committing every so often. Might take a little longer - but you can restart at any time and you wont need the entire time from the engine in one block...

Good luck
Randy :)

--
    ..uu.                                     ----------------------
  .?$" '?i     .                              I  Randy DeWoolfson  I
 .T^M  ._at_"    d9    .     f   ,.un.  b,    i  I--------------------I
 "  Z :#"    M `8   U    <  .dP"``"# `M   _at_"  I  randyd_at_cais.com   I
    &H?`    Xl _R   $5.  $  ?*    _at_   'P,#"   I--------------------I
  ,d#^*L   :RP'~$b  f`$L:M  Xf  .f'    dH`    I        ,\//.       I
    &  'M ,P    `E  M   "$  Mux~      n!`     I        |o o|       I
   dk   `h"       ' j     " y"       *~       I====oOO==(_)==Ooo===I
Received on Fri Nov 24 1995 - 00:00:00 CET

Original text of this message