Re: Archiving : Need help

From: Adrian Challinor <adrian_at_a-cha.demon.co.uk>
Date: 1995/04/10
Message-ID: <ABAE840196683EB3C1_at_a-cha.demon.co.uk>


In article <rajaD6JD2o.1u8_at_netcom.com> (Archiving : Need help ), you write:

>We have been trying to build archiving /retrieval utilities for the past
>some months & have been hitting a brick wall due to one reason or another.
>

I have a Database Archiving Tool that I sell as a commercial product. It handles Oracle 6, 7.0, 7.1 versions. Runs on all known platforms that support the C compiler (ANSI, please) and OCI interface.

The distributor is Radan Business Systemns in the UK. Phone +44-1225-484449, contact Stuart Harrison.


DB*A  V1.4                                        Software Product
Description

Relational Database Management Systems (RDBMS) have become an industry standard for the storage of corporate information. However, the systems provided by the vendors are deficient in that they do not offer the facilities necessary to manage the contents of the database. Their is a tendency for databases to fill up with historical information. Though of value to the organisation, this information could often be held in off-line storage.

DataBase*Archiver (DB*A) solves this problem. It allows the database administrator to archive data on a time based criteria. DB*A understands that databases are far more than simply tables; it is aware that there are referential integrity constraints that provide the tables with the capacity to fulfil the organisations MIS needs. Though data can build up in the database over time, DB*A recognises that it is not sufficient to simply delete the data from individual tables to free up space.

DB*A allows the Database Administrator to define the relationship between the tables. It will then generate the SQL commands necessary to select data from the database and archive it to off-line storage. The selection of records ensures that the internal consistency of the database is maintained.

For example, when archiving an Employee table to remove people who no longer work for the organisation, it is necessary to also archive the salary history and job history information for these employees. DB*A supports this sort of relationship. Indeed, it can support up to an unlimited level of nested tables, which provides for all commercial databases. It has no limit on the number of tables being archived, nor does it restrict the number of referential integrity constraints supported. It does, however, restrict the selection of records to logical AND selection; the current version of the internal syntax does not support the concept of the relational union or outer-join for database archiving.

When archiving data, it is of prime concern to the database administrator that the data that has been removed from the database can be restored at a later stage. However, by their very nature, database structures do not remain static. DB*A supports the concept that the database will change. There are a number of changes that can occur:

A table might be deleted. On restoring the data, DB*A can re-create the table and populate it with the restored data, ignore the data and restore all other tables, or can abandon the restore operation. This is configurable with run-time switches. Tables re-created in this manner are simply the tables in the same form as when the data was archived; no attempt is made to restore integrity constraints of indexes.

A column might have been dropped. When restoring the archive, DB*A can recreate the column, ignore it (not restore this column), or abandon the restore. Again, this is a run-time switch.

A new column might exist in a table. In this case, DB*A can fill this field with NULLs when restoring the data, or it can abandon the restore operation. This is, of course, a run time option.

The more complex situation arises when the column has changed format, say from a date to a character string. There are a number of switches that can be applied: This include allowing the truncation of character string (say a CHAR(20) column became a CHAR(15)); automatically padding character strings with blanks, and attempting to coerce (or force a data type conversion) the field. Some coercionÕs are natural, say from a date to a character string; the field received the data in its default character format; or from a number to a character string. However, the reverse is not always so simple: not all character strings can be converted to a number - these are data content dependent. DB*A will attempt any reasonable conversion and will inform you if it is unable to succeed.

Unreasonable conversions occur when complex data types are used. In systems that support binary data types, it is unreasonable to attempt to coerce a graphical image (say a bit mapped picture or a movie) in to a character string. There is no known method by which this conversion can take place without a catastrophic loss of information. In such situations, the coercion fails before any attempt to restore data takes place.

It should be noted that DB*A internally stores numbers in an RDBMS independent form, which provides for at least 38 significant digits. However, the conversion between numbers and character strings relies on the restoring RDBMS being able to support and allow such coercionÕs.

DB*A has been designed to Ôfail safeÕ. This means that it does not delete data from the database unless the data has been safely written to disk. Nor does it restore any data back to the database unless the complete archive can be restored. There are a large number of checks during both the archive and restore operations. It is also worth noting that the archive files are protected by a series of complex, proprietary checksum operations. These ensure that the archive itself is not subject to changes, corruption or tampering. DB*A recognises that this data, whilst being outside the boundaries of the RDBMS is still corporate data and protects it against misuse. However, this data is not encrypted, so it is expected the archive is itself protected from unauthorised browsing by the database administrator.

DB*A will normally make a journal entry whenever it performs an archive or a restore. This is in addition to any script file and the internal log files. The journal is used in case the operation should fail, whether due to a machine, software, hardware or power failure. On reboot, DB*A can be instructed to check for journal files and restart any in-progress operations. The on-line journal of archive and restore operations may be queried by a simple, though powerful, command language. Facilities are provided to report on archives based on the date range or operation id number. Facilities are provided to re-run previous archives or to restore data previously archived based on the journal. Once the database archive has been completed, DB*A no longer controls the location of the archive file (that is, DB*A can not stop a suitably privileged person moving the archive from disk to tape, or to another disk), but it does provide facilities to allow a database administrator to record such an event in the internal journal.

DB*A is fully scriptable. It has a powerful facility to select records based on absolute date, relative dates, constants and referential integrity constraints. All of this can be entered as commands into a DB*A interactive session or can be written to a generic script. DB*A can write its own script files from an interactive session.

DB*A is licensed to run on selected machines in the organisations network. This is defined in a license file supplied by the distributors. This license file can be easily updated once a key is provided by the distributors; it is not necessary to re-install the product to add a new node.

DB*A is a complex product, aimed at a variety of databases on most common platforms. The product itself requires under 1mb of disk space for its executable code and support files. The amount of memory, CPU time, and disk space when executing depends on the complexity of the archive request and the internal structure of the database. The product is written in industry standard C and supports both the ANSI and the GCC compilers. Source code is not provided with then product. For performance reasons, this product does not archive and restore data in a client/server environment, it will always run on the database server.

DB*A also supports referential integrity (foreign key) constraints with the database definition language (DDL). If referential integrity constraints have been configured in to the database design, DB*A will recognise this and will generated the table definitions necessary to support the foreign key relationships.

DB*A is of interest to any Relational Database site which has time dependent data that they wish to retain, but which need not be held on-line. There are flexible licensing arrangements available, which cover the size of database and the number of concurrent users the database supports.

As an option, consultancy and training can be provided on-site to assist the database administration team understand the complexity of the product and get the best from it. Please call the distributor for more details. Received on Mon Apr 10 1995 - 00:00:00 CEST

Original text of this message