Re: Best way to delete million records

From: Shakespeare <whatsin_at_xs4all.nl>
Date: Sun, 17 Aug 2008 10:38:41 +0200
Message-ID: <48a7e397$0$184$e4fe514c@news.xs4all.nl>

<paa.listas_at_gmail.com> schreef in bericht news:cf74d9a0-64b6-4d92-a415-d6823fcb527b_at_l42g2000hsc.googlegroups.com...
> On 30 jun, 05:00, fefe78 <pe..._at_puppurnazzi.com> wrote:
>> Hi all,
>>
>> I have to delete about 43 millions rows from a table, but the following
>> conditions exist:
>>
>> - The table is not partitioned and unfortunately I can't modify the
>> structure
>> - The delete procedure must be done without stopping service, so I can't
>> use CTAS, moving records and renaming table.
>> - I can't use truncate because the records affected are selected in a
>> range of dates
>>
>> Waiting your suggestions and probable problems for this procedure.
>>
>> Thanks for help.
>>
>> Bye
>
> Hello,
>
> i have a similar problem and use someting like this:
> -----------------------------
> DECLARE
>
> -- 1 month
> ITERACIONES CONSTANT NUMBER := 30;
>
> -- 2 years. your oldest record timestamp
> OLDEST_DATE_ CONSTANT NUMBER := 730;
> oldest_date DATE;
>
> KEEP_DATE CONSTANT NUMBER := 60;
>
> STEP_DATE CONSTANT NUMBER := 1;
> loop_date DATE;
>
> BEGIN
> dbms_output.put_line('.- Iniciando a las: ' || SYSDATE );
> -- Inicializamos
> -- set server outout on format wrapped;
> SELECT SYSDATE - OLDEST_DATE_ INTO oldest_date FROM DUAL;
> SELECT oldest_date INTO loop_date FROM DUAL;
>
> FOR x IN 1..ITERACIONES LOOP
> dbms_output.put_line('Iniciando procedimiento nro: ' || x);
> dbms_output.put_line('borrando registros entre Fechas ' ||
> oldest_date || ' y ' || loop_date );
>
> -- DELETE...;
> DELETE FROM iwaylogs.cm_leases
> WHERE iwaylogs.cm_leases.fecha BETWEEN oldest_date AND
> loop_date;
> -- COMMIT;
> COMMIT;
>
> -- adelanto la fecha
> oldest_date := loop_date;
> -- adelanto el loop
> loop_date := loop_date + STEP_DATE;
>
> -- Si ya llegue a borrar hasta donde quiero salgo
> IF ( loop_date >= (SYSDATE - KEEP_DATE) ) THEN
> dbms_output.put_line('Llegamos a los 60 dias,
> saliendo');
> EXIT;
> END IF;
> dbms_output.put_line('Procedimiento listo.');
> -- Esperamos XX segundos para 'descansar' la DB :)
> dbms_lock.sleep(90);
> END LOOP;
>
> dbms_output.put_line('Finalizado' );
>
> END;
>
> ------------------------------------------------------
> )

Why don't you just do a delete like
where fecha between oldest_date and oldest_date+30 ? It saves you a loop and you do it all in one transaction.... and you don't have to 'descansar' - sleep - your DB.
(I noted you'll have to keep track of KEEP_DATE as well, so you'll have to do a MIN( ) in your comparison somewhere) Furthermore, your loop sleeps 90 seconds every loop, which makes your program run at least in 30*90 seconds = 2700 seconds is 45 minutes... even if there is nothing to delete in the first 29 loops!!! Which will be the case if you run your script once a day. If you run it several times a day, it is even doing nothing at all for 45 minutes. (delete nothing, wait 90 seconds, delete nothing, wait 90 seconds and so on....)

Shakespeare Received on Sun Aug 17 2008 - 03:38:41 CDT

Original text of this message