Re: Tables Larger When using DBMS_REDEFINITION

From: Mladen Gogala <>
Date: Mon, 23 Feb 2015 20:26:47 -0500
Message-ID: <>

On 02/23/2015 12:35 PM, wrote:
> Not really the subject of the thread… but it is quite common practise
> to have no primary key on a datawarehouse fact table. Sometimes it’s
> logically not possible to uniquely identify a row eg same customer
> buys same product from same branch on same day – business is not
> interested in the time of day, or the sales order number, but they are
> interested in the number of individual sales. Unfortunately the
> elegance of Primary Keys and Relational Database Design has been
> scuppered by those darned realities of life.
> You could include the time, but nobody’s interested. You could include
> a generated surrogate key and primary key that, but why would you
> invent work when there’s enough already. If you need to get close to a
> row to maintain or delete it then you use the foreign keys (Bitmap
> Indexed) from the dimensions.
> So however crazy a design offence this may be, we live without a
> Primary Key on the Fact and it hasn’t caused us a moments problem in 5
> years of live running.

There is only one question: your database has been running for quite some time. How do you know it's logically correct? How do you know that your ETL procedure hasn't missed a beat here and there or that it hasn't loaded the same record twice? Simple truth is that it is very hard to control that and that you would have to check ETL logs and compare them with the primary after each and every log. Second question is whether your primary tables have the primary key? Hopefully, they do. Second disagreement is that this doesn't have anything to do with the initial thread. The whole problem was created because the table being redefined by the DBMS_REDEFINITION does not have the primary key. Last, if your business is only interested in the number of daily purchases, then why give them all rows in de-normalized form? Loading just the requested data, with GROUP BY stuff done on the primary side looks much more economical solution than loading all rows, without means of distinguishing among them. This is really a philosophical question, something for the late Lex de Haan. Relational databases are based on the set theory. Set theory is based on the notion that it is possible to tell the elements of a set apart. Without the possibility of telling them apart, it is not possible to answer the question of how many elements are there in the set.
The fact that we call sets "tables" is just a convenience. This is what happens when you have a big table, full of rows which are impossible to tell apart:

I have seen such "designs" many times. If it is not possible to have a clear uniquely identifying property of each row, maybe the table itself is not worth creating? Such questions do save storage space, you know.

Mladen Gogala
Oracle DBA

Received on Tue Feb 24 2015 - 02:26:47 CET

Original text of this message