RE: null or value
Date: Thu, 17 Sep 2009 09:08:14 -0400
It is a semi-dw application. I get feeds once a day with many transaction. Some of the transactions are done with test cards. I may not know which the test cards are until days or weeks after the transaction is loaded. So I will do a batch update going back several weeks, to mark which transaction are done with the test cards. Most of the reports will be for only the non test transaction.
From: Tim Gorman [mailto:tim_at_evdbt.com]
Sent: Thursday, September 17, 2009 3:31 AM
To: Steiner, Randy
Subject: Re: null or value
Everyone else is going to respond to this question focusing on the issues of NULL-ness or NOT NULL-ness, etc...
I'd like to step back a bit and reflect on a different concern, one of design: Is this a data warehouse application, and the proposed ACTIVE_FLAG column an indicator of which "version" of a record is supposed to "active" when there are multiple versions, such as...
ID EFF_DT ACTIVE_FLAG
=========== ================ ================ 123456 02-Aug 2009 Y 123456 17-Jun 2007 N 123456 23-Dec 2005 N 123457 11-Mar 2009 Y 123457 08-Oct 2006 N 123458 11-Jan 2005 Y 123459 01-Jul 2008 Y 123459 12-May 2004 N
...(and so on)...
So, the row with the latest EFF_DT value for each distinct ID value has ACTIVE_FLAG = 'Y', while all the other (older) rows with the same ID value have ACTIVE_FLAG = 'N'? And each time a newer row is added for a particular ID value, then the existing row with ACTIVE_FLAG = 'Y' will need to be updated?
If this is the case, then please know that this method of using an ACTIVE_FLAG to identify the "current" or "active" row will become one of the biggest ETL performance problems in your data warehouse, and even worse it will prevent you from making use of the time-variant nature of your data in order to set tablespaces containing older data to READ ONLY. Thus, as your data warehouses grows in volume, it will continue to be necessary to backup the entire database on a regular basis, and the entire database will continue to reside on expensive "tier 1" storage, and your loins will become barren, and locusts will blanket the earth, and... sorry, got kind of carried away there...
But if this is not the case, then never mind...... :-)
consultant - Evergreen Database Technologies, Inc. P.O. Box 630791, Highlands Ranch CO 80163-0791
website = http://www.EvDBT.com/ email = Tim_at_EvDBT.com mobile = +1-303-885-4526 fax = +1-303-484-3608 Yahoo IM = tim_evdbt
Steiner, Randy wrote:
I have a table with 1 - 2 million records. I need to add a column to indicate if the records is active or not. I would guess that only 1,000 of the records would not be active. Should I make one of the values null? So I could put a Y or Null? Or put Y or N?
Would a b-tree or bitmap index do any good?
Generally I would want to see all the records that are active
RandyReceived on Thu Sep 17 2009 - 08:08:14 CDT