Re: Storing "deleted" data

From: Jonathan Leffler <jleffler_at_earthlink.net>
Date: Tue, 01 Nov 2005 04:51:22 GMT
Message-ID: <evC9f.4434$yX2.3881_at_newsread2.news.pas.earthlink.net>


zeldorblat_at_gmail.com wrote:
> We are looking for a way to represent and store "deleted" or old data
> in our database. The idea is that we have some records for active
> customers/people/whatever and some for those that we no longer care
> about. We can't just delete those rows, however, because it would
> violate referential integrity. But we would like to have a way to
> exclude them so they don't clutter drop-down lists, reports, etc.
>
> I've come up with two options so far:
>
> 1. Just add a bit column to the table -- 'isDeleted' or something
> 2. Create two separate tables -- one with current data and one with old
> data
>
> #1 is the easiest to implement and preserves referential integrity.
> But the queries must now discriminate on a column with only two
> possible values (i.e. an index on the isDeleted column won't really
> help).
>
> #2 Isn't too bad -- the table containing current data is smaller and
> faster to update/select. But in this case I don't know how to define
> the integrity constraints. As far as I know you can't have a foreign
> key reference to one table OR another.
>
> How do others solve this problem? Is there an alternative that I'm
> missing? I suppose you could also use triggers but that would be kind
> of ugly.

Investigate 'temporal databases'. These give you a history of changes to the data. There are a couple of books to consider - "Temporal Data and the Relational Model" by Date, Darwen and Lorentzos, or "Developing Time-Oriented Applications in SQL" by Snodgrass. Both are tough reading in their different ways, but the subject is complex.

-- 
Jonathan Leffler                   #include <disclaimer.h>
Email: jleffler_at_earthlink.net, jleffler_at_us.ibm.com
Guardian of DBD::Informix v2005.02 -- http://dbi.perl.org/
Received on Tue Nov 01 2005 - 05:51:22 CET

Original text of this message