Re: Designing a database that supports versioning of its data?

From: Tim X <timx_at_spamto.devnul.com>
Date: 30 Dec 2002 14:10:32 +1100
Message-ID: <87adiokxqv.fsf_at_tiger.rapttech.com.au>


"Andrew" <theshadow330 _at_ hotmail.com.N0SPAM> writes:

> Hi all,
>
> I would like some suggestions and ideas regarding the design of a database
> that supports versioning of its information. Firstly let me explain what the
> database needs to represent - comments on this design are also welcome so
> long as they in regard to versioning :-)
>
> If there is somewhere else I could post this question, please could you let
> me know - the more comments I get the better :)
>
> So, on with the questions:
>
> I need to represent a network. The network is made up of Nodes and
> Connections. Connections link two nodes. A node may be connected to any
> number of other nodes via a single connection.
>
> Naturally, I have Connection and Node tables. To represent the connection of
> two Nodes, I also have a NodeLink table. This table has a Connection FK and
> a Node FK. So, for a single Connection, there will be two entries in the
> NodeLink table - both have the same Connection_FK but each with a different
> Node_FK.
>
> There are various types of connections, each having their own configuration
> information. To represent this, there are two additional tables - Physical
> and Virtual. Both these tables have a Connection FK. It can be said that the
> type of Connection determines which of these two tables will contain an
> entry for that connection. If one table does not contain a reference, one is
> gaurenteed to a reference in the other table. Conversely, if a reference is
> found in one table, a reference will not exist in the other table.
>
> So we have something along the following lines in terms of an ERD; the
> numbers indicate the min/max cardinality of each relationship.
>
> [Virtual]
> 0,1
> |
> 1,1
> [Connection] (1,1)---(2,2) [NodeLink] (1,*)---(1,1) [Node]
> 1,1
> |
> 0,1
> [Physical]
>
> So representing the following layout would be rather simple:
>
> A --- B --- C
> |
> D --- E
>
> Say I then reconfigure the network to look as follows:
>
> B --- C
> |
> A --- D --- E
>
> I need to keep the original layout in my database as well as represent the
> new "proposed" layout. So it could be said there are two versions of this
> network. An OLD one and a ACTIVE one. With this simple requirement, what
> would be the recommended schema change for the database so that it can
> support this versioning?
>
> With that in mind, I also have the following requirements:
>
> 1. The layout of the network may not change, but the configuration
> information for a Node or a Connection may. This change needs to be
> versioned too so that one can look back at past network layouts and
> configurations.
>
> 2. I need to be able to redraw the state of the network at any point in time
> in the past. This should produce a diagram indicating the layout and
> configuration of the Connections and Nodes at that time.
>
> NOTE: Duplicating the structure of the entire network is not an option as
> the network could be huge.
>
Hi Andrew,

I'll have a go. Hope I understand your question correctly.

I gather from what you have written, your final application should be able to reproduce your network layout (both physical and virtual) at any point in time from its inception to the present (but not the future).

I guess I'd be tempted to put a start date and end date in the connection and possibly node entities. This information in the connection entity should allow you to collect all connections which were valid before, after or during a particular date - obviously for current connections, you could just have a null in the end date field. Putting start/end dates in the node field could be useful for additional data verification (e.g. you cannot connect to a node which has a later start date than the connection entity date). and to ensure all nodes are represented at a particular point in time (e.g. noeds which at some point become isolated).

Using the date fields would also simplify the situation where you have connections which are removed, but later added again - the dates can provide the necessary key to identify it as a different connection etc.

Just my $0.02

Tim

-- 
Tim Cross
The e-mail address on this message is FALSE (obviously!). My real e-mail is
to a company in Australia called rapttech and my login is tcross - if you 
really need to send mail, you should be able to work it out!
Received on Mon Dec 30 2002 - 04:10:32 CET

Original text of this message