Need advice on VLDB system

From: Robert Nygren <robert.nygren_at_mbox300.swipnet.se>
Date: 27 May 1998 20:32:51 GMT
Message-ID: <896301171.386723_at_mn8>



I am working in a VLDB project with a total amount of data of some 300-400 GB. The database will contain more than 100 tables, with most of the data being stored in some 10 tables. The largest table will have an estimated total of more than 1 billion rows.

We hope(!) to be able to use MS SQL Server 7.0 due to ease of maintenance and limited cost. Oracle is however still a valid alternative. We run Windows NT on Intel platforms with some 40-50 simultaneous users, and heavy background processing.

Our data is chronologically organized, with the latest month containing the most frequently accessed and updated data. New data is added on a continous basis. Updates to older data will take place sporadically.

Our main issue is how to deal with these immense amounts of information. We are not looking for an easy solution, or an ideal solution, but for something that will work! We have four different alternatives which we are currently discussing:

  1. Store everything in a single database. This is the easiest way to go, from the developers point of view, but the sizes of the largest tables are horrifying.
  2. Store everything in a single database, but partition the chronological information in a 'hot' and a 'cold' table. This will facilitate data access, due to the fixed number of tables, and will enable heavier use of indices on the 'cold' table. The cold table will however still be VERY large.
  3. Store everything in a single database, but partition the chronological data horizontally, month by month, in different tables. The estimated data amount for one month is around 5 GB. This will keep the table sizes down, but will complicate data access. What kinds of other limiting factors there are on the database level, e g related to metadata, is still unknown to us.
  4. Partition the chronological data horizontally, month by month, in separate databases. This will keep table sizes down, and we will possibly avoid any database-level limitations encountered under alternative 3, but data access will be complicated, and we will not be able to use declarative RI.

Administrative topics such as manageable backup/restore are also important factors, of course.

We haven't got the time to evaluate the different alternatives, and assume there must be other people out there who have found themselves in a similar situation, and we look for all kinds of advice and experience. We are not altogether amused by the idea to 'go boldly where no man has ever gone before'.

We look to get in contact with persons or companies who have experience in these matters. Preferably people who have taken part in creating working solutions in similar projects using SQL Server or, more likely, Oracle.

If you have the required experience yourself, or know about other people or companies that do, please do not hesitate to contact me as soon as possible.

Best regards Received on Wed May 27 1998 - 22:32:51 CEST

Original text of this message