Re: 1 TB _a day_ at CERN (was: 21 terabytes at NYNEX)

From: Jamie Shiers <Jamie.Shiers_at_cern.ch>
Date: 1996/06/06
Message-ID: <31B6B75F.2BCC_at_cern.ch>#1/1


People looking for the RD45 web page can find it via:

CERN (http://www.cern.ch) -> Research and Development -> RD45

or http://wwwcn.cern.ch/pl/cernlib/rd45/index.html

Amongst other things, you will find some predictions as to where various technologies will go in the next 10 years. Most of the information contained therein was gleaned from the web, and is about 4 months old, so is probably obsolete by now :-)

This project is a fairly long-term one: we are trying to find data management solutions for the next generation of experiments at CERN, namely those on the Large Hadron Collider (LHC). The LHC is not scheduled to enter operation until 2004/2005, but will then generate several PB (1 PB = 10**15 bytes) of data per year.
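
To put "several PB per year" into more familiar units, here is a rough back-of-the-envelope sketch. It assumes decimal units and data arriving at a steady rate all year round, which is a simplification of mine rather than anything the project has stated; the yearly volumes in the loop are purely illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s; ignores leap years
PB = 10**15                            # bytes
MB = 10**6                             # bytes


def sustained_rate_mb_per_s(pb_per_year):
    """Average rate in MB/s needed to record pb_per_year within one year."""
    return pb_per_year * PB / SECONDS_PER_YEAR / MB


# Illustrative volumes only; the post says just "several PB" per year.
for volume in (1, 5):
    print(f"{volume} PB/year is about {sustained_rate_mb_per_s(volume):.1f} MB/s sustained")
```

In other words, each PB per year corresponds to a little over 30 MB/s averaged around the clock.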

We hope to have about 1 TB in an ODBMS by this time next year, gradually building up to n PB in ten years' time.

The argument concerning disks/tapes is based on market pressures: basically everyone on the planet is a potential customer for disks, whereas the market for high-end tape (100 GB or more per volume, high data rate and reliability) is, let's say, somewhat smaller.

Of course, one can speculate that all sorts of other exciting storage devices will come along in the next ten years, but you can't plan on it...

I'm not sure what Akmal means by "large quantities". We have several TB of disk today. My belief is that if you have less than a few TB of data, you shouldn't get involved with tapes/robots/mass storage software. This 'break-even point' will certainly increase with time: a few tens of TB, then hundreds, and finally even PB will eventually become trivial.

If you want numbers, ring up your local tape salesman and ask him how much a 10 GB Magstar cartridge costs, then multiply by a million to get the media costs for storing two years' worth of LHC data.
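
The arithmetic behind that estimate can be sketched as follows. A million 10 GB cartridges hold 10 PB, i.e. about 5 PB per year over two years. The cartridge price below is a made-up placeholder (I am deliberately not quoting a real one), and the function names are just for illustration.

```python
GB = 10**9   # bytes
PB = 10**15  # bytes


def cartridges_needed(total_bytes, cartridge_bytes):
    """Number of cartridges needed to hold total_bytes, rounded up."""
    return -(-total_bytes // cartridge_bytes)  # integer ceiling division


def media_cost(total_bytes, cartridge_bytes, price_per_cartridge):
    """Media cost only: no drives, robots, or mass storage software."""
    return cartridges_needed(total_bytes, cartridge_bytes) * price_per_cartridge


HYPOTHETICAL_PRICE = 50  # placeholder price per cartridge; ask your salesman

two_years_of_data = 10 * PB  # a million 10 GB cartridges' worth
print(cartridges_needed(two_years_of_data, 10 * GB), "cartridges")
print(media_cost(two_years_of_data, 10 * GB, HYPOTHETICAL_PRICE), "currency units for media alone")
```

Whatever the real price turns out to be, it gets multiplied by a million, which is the point.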

Cheers,

--
Jamie Shiers, 
Computing and Networks Division, 
Application Software and Databases Group,
CERN
1211 Geneva 23

e-mail: Jamie.Shiers_at_cern.ch


Juergen Schlegelmilch wrote:

>
> On Sun, 26 May 1996 11:16:50 +0100, Akmal B Chaudhri <akmal_at_sarc.city.ac.uk> wrote:
> >
> > I think the tape robots would be the standard way to put such large
> > quantities of data on-line, as you rightly suggest. I believe CERN are
> > looking into various mass-storage technologies to interface with an OODB
> > for the longer-term. I have no idea about disk requirements, though. Even
> > if anyone wanted to put such large quantities on disks, it would probably
> > cost such an enormous amount of money to do so that I guess practically
> > nobody (except maybe the US military) could afford it. Also, I don't see
> > how disk technology is really going to improve that much over the time
> > that they want to conduct these tests.
>
> Just yesterday, at an Objectivity RoadShow, I heard a talk by Jamie Shiers
> from CERN about this project (its name is rd45; sorry, there is a WWW page
> with exact info but I did not write down its URL). They plan to use clusters
> of machines running Objectivity, and believe that tapes will be obsolete by
> then, i.e. they will only use hard discs. Regarding the money: the price of
> discs has dropped significantly over the last few months, and competition
> will put further pressure on it; and compared to the cost of their physics
> equipment, hard discs are inexpensive, I believe.
>
> Regards,
> Juergen
>
> --
> +-----------------------------------------------------------------------------+
> Dipl.-Inf. Juergen Schlegelmilch                University of Rostock
> email: schlegel_at_Informatik.Uni-Rostock.de    Computer Science Department
> http://www.informatik.uni-rostock.de/~schlegel  Database Research Group
> Tel: ++49 381 498 3402                          18051 Rostock
> Fax: ++49 381 498 3404                          Germany
> +-----------------------------------------------------------------------------+
Received on Thu Jun 06 1996 - 00:00:00 CEST
