Oracle FAQ Your Portal to the Oracle Knowledge Grid
Home -> Community -> Usenet -> c.d.o.server -> Re: Max Size Datafile in 10g

Re: Max Size Datafile in 10g

From: Howard J. Rogers <hjr_at_dizwell.com>
Date: Wed, 15 Sep 2004 15:35:17 +1000
Message-ID: <4147d493$0$23893$afc38c87@news.optusnet.com.au>


Mark Townsend wrote:

> Howard J. Rogers wrote:

>> Mark Townsend wrote:
>> 
>> 
>>>Joel Garry wrote:
>>>
>>>>"Howard J. Rogers" <hjr_at_dizwell.com> wrote in message
>>>>news:<4146fc78$0$5727$afc38c87_at_news.optusnet.com.au>...
>>>>
>>>>

>
>>>>>
>>>>>If I thought for a minute that anybody on the face of the planet, apart
>>>>>from the CIA, actually *needed* 8 Exabytes, I'd care enough to look it
>>>>>up...
>> 

>
>>  He referred to some research done at UCLA on the total information
>> quantity in the world currently at 1 Exabyte, but I don't recall who
>> did it.

>
>> 
>> 
>>>Two potential areas that I know of (at least, we cite them as the reason
>>>we lifted the restrictions in 10g) that may cause data explosions in the
>>>next 5 years
>>>
>>>Life sciences (storage of 'individualized' genetic code maps ?) and CERN
>>>(the new particle accelerator and the hunt for the God particle)
>> 
>> 
>> 
>> Really?

>
> Really.
>
>> Let me do some maths for a second (warning! warning!).
>> 
>> The human genome is about 3,000,000,000 bases long.
>> Each base is just a single (English) letter, so can be represented in 1
>> byte. A human genome would thus require 375,000,000 bytes of storage,
>> which is about 375MB.
>> 
>> For the approximately 6 billion people on the planet, you would therefore
>> need 375MB*6,000,000,000 = 2,250,000,000,000MB, which is 2,250,000,000GB,
>> 2,250,000TB, 2,250 Petabytes or 2.25 Exabytes.
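
The arithmetic quoted above can be checked with a short sketch. Note that the 375,000,000-byte figure corresponds to one bit per base (3,000,000,000 / 8), not the one byte per base stated in the text. The constants come from the post; the bits-per-base options, the decimal units (1 EB = 10**18 bytes) and the function names are assumptions made for illustration, and no compression is considered.

```python
# Back-of-the-envelope check of the genome storage arithmetic quoted above.
# Assumptions (not from the thread): decimal units (1 EB = 10**18 bytes),
# no compression, and a fixed number of bits per base.

BASES_PER_GENOME = 3_000_000_000   # figure quoted in the post
POPULATION = 6_000_000_000         # figure quoted in the post
BIGFILE_LIMIT_EB = 8               # the 8 exabyte limit discussed in the thread


def genome_bytes(bits_per_base: int) -> int:
    """Bytes needed for one genome at the given encoding width."""
    return BASES_PER_GENOME * bits_per_base // 8


def population_exabytes(bits_per_base: int) -> float:
    """Exabytes needed to store one genome per person."""
    return genome_bytes(bits_per_base) * POPULATION / 10**18


# 1 bit/base reproduces the post's figures, 2 bits/base is the minimum for a
# four-letter alphabet, 8 bits/base is the "1 byte per letter" reading.
for bits in (1, 2, 8):
    eb = population_exabytes(bits)
    print(f"{bits} bit(s)/base: {genome_bytes(bits):,} bytes per genome, "
          f"{eb:.2f} EB in total, {eb / BIGFILE_LIMIT_EB:.0%} of 8 EB")
```

Under these assumptions the post's 2.25 Exabyte total holds only at one bit per base. Two bits per base gives 4.5 Exabytes, and a full byte per base gives 18 Exabytes, which would no longer fit in 8 Exabytes.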

>
> Right - and if the 1 Exabyte figure for all current information is
> correct then this would represent a 200+% increase in a very short time
> frame. Sort of what I would call a data explosion.
>
>> So you could comfortably fit an individualised genetic code map for every
>> person on the face of the Earth in less than one third of a single
>> bigfile tablespace in a single database.
>> 
>> Were Oracle corporation anticipating some sort of population explosion in
>> the next five years, or something?!

>
> Nope - obviously there is a fair amount of serendipity in the 8 Exabyte
> figure. I don't think we seriously consider that anybody will need this
> in the short term future. However, it was fairly easy to predict that
> petabyte-plus storage environments will be commonplace in the next 2-3
> years, especially as more and more digital data becomes managed online
> for longer periods of time (Oracle itself already has around 3 petabytes
> of storage in-house - including (and I joke not) over 3 terabytes of
> predominantly PowerPoint content in a single database)
>
> So we knew limits needed to be raised, it just so happens that the
> bigfile work gave us a nice high 8 exabyte watermark. And we had a couple
> of customers that would validate our thinking for us, and can see
> themselves going to multiple petabytes in the next few years.
>
>
>> Mathematical joking aside, it's an interesting idea: two totally
>> off-the-planet requirements dictating the internal workings of a
>> mass-produced RDBMS. I dare say it doesn't happen very often.

>
>
> You may be surprised - there are a number of them.
>
> CERN, for instance, will generate around 10 Petabytes of info a year
> when they fire the accelerator up. It could run for 10-20 years.
>
> A certain company that does record and backup management and electronic
> vaulting (basically for the world) predicts massive movement of content
> from paper (and glass) and from current tape and offline storage to
> online-all-the-time storage over the next few years - and these guys
> literally have mountains (hint) of content that they manage - films,
> xrays and medical records, 50 years of email, even source code :-).
> Digitizing and storing everything that Hollywood (and Bollywood) has
> produced will require huge storage.
>
> Another group (not a company) wants to maintain the family tree history
> of every person that has lived on the face of the earth to date. Combine
> this with medical history and genomic maps and the figure is very large.
>
> The real kicker - sensor data - imagine a world where EVERYTHING is
> tagged with RFIDs (including, perhaps, even money), and sensors
> everywhere detect what is currently in and out of range, going from and
> coming to. Continuously. Or every vehicle on the face of the earth
> reporting current engine statistics and mileage per gallon information.
> Or every household reporting electricity usage at the 5 second interval
> for spot buying from producers/suppliers. Or in the not too distant
> future - the human ADDM device that collects and reports on our own
> individual health statistics.
>
> And then you have the mapping guys - earth maps, sea maps, space maps,
> body maps, genomic maps, mind maps. Maps with multiple dimensions. Maps
> with temperatures, frequencies etc etc etc. The sky is literally the
> limit for this.
>
> And then this is just the raw stuff - to use all this data, it has to be
> indexed and summarized and backed up - which could increase the actual
> storage footprint by a factor of 10 or more.
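
To put the figures above together, here is a rough sketch that combines the quoted CERN rate (around 10 Petabytes a year for 10-20 years) with the quoted overhead factor of 10. It assumes a constant yearly rate, decimal units (1 EB = 1,000 PB), and that the overhead factor applies to this data at all; none of that is stated in the thread, and the function name is hypothetical.

```python
# Rough cumulative-volume estimate from the figures quoted in the thread.
# Assumptions (not from the thread): a constant yearly rate, decimal units
# (1 EB = 1,000 PB), and that the factor-of-10 overhead applies to this data.

PB_PER_YEAR = 10          # CERN estimate quoted in the thread
OVERHEAD_FACTOR = 10      # indexes, summaries and backups ("10 or more")
PB_PER_EB = 1_000


def total_exabytes(years: int, overhead: int = OVERHEAD_FACTOR) -> float:
    """Total storage in exabytes after the given number of years."""
    return PB_PER_YEAR * years * overhead / PB_PER_EB


for years in (10, 20):
    raw_pb = PB_PER_YEAR * years
    print(f"{years} years: {raw_pb} PB raw, "
          f"about {total_exabytes(years):.1f} EB with overhead")
```

On those assumptions the raw data stays in the hundreds of petabytes, and only the overhead factor pushes the total into the low exabyte range.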
>
> And the thing is that this is just what we can see now. Storage
> prices will continue to plummet, and new applications will be developed
> to take advantage of this fact - the iPod for instance. We don't even
> know what the devices of the future will require.
>
>> 
>> You *sure* it wasn't just so the marketing department could say theirs
>> was bigger than IBM's??!

>
> Nope - we will see single figure petabyte databases very soon (they are
> being built as we speak), and within 10 years we will see 100+ petabyte
> databases and some early exabyte environments. And it's not just
> marketing - Jim Gray has an interesting presentation on this - see
>
> http://www.research.microsoft.com/~Gray/talks/Gray%20IIST%20Personal%20Petabyte%20Enterprise%20Exabyte.ppt
>
>
>> </irony>
>> 
>> Regards
>> HJR
>>


It was a good answer. But I have to say, a little cynically, I doubt the world will be a better place for all this data.

Seems to me that in the rush to capture everything, we've forgotten how to precis or discriminate between the worthless and the priceless.

But that's a topic for another day.

Thanks for the link, by the way.

Regards
HJR

Received on Wed Sep 15 2004 - 00:35:17 CDT