RE: Choosing data file size for a multi TB database?

From: Parker, Matthew <matthewp_at_amazon.com>
Date: Mon, 5 Sep 2005 23:35:27 -0700
Message-ID: <F385925F530F6C4081F1659F71640EB3042D9B59@ex-mail-sea-04.ant.amazon.com>


In Oracle, a large number of datafiles does affect you administratively and in recovery.

In a 22TB database with more than 4,500 files, a simple query such as

  select tablespace_name, sum(bytes)/1024/1024
    from dba_segments
   group by tablespace_name;

takes 5 minutes, and that is not indicative of a large number of objects; it is indicative of the freespace bitmap walk across the datafiles (locally managed tablespaces). Under some circumstances or patch versions it can take much longer.
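
As a rough cross-check (a sketch, not from the original post), the allocated size per tablespace can be read from dba_data_files instead, which avoids walking the extent bitmaps in each file; comparing the timing of the two queries gives a feel for how much of the cost is the per-file walk:

  select tablespace_name, sum(bytes)/1024/1024 as allocated_mb
    from dba_data_files
   group by tablespace_name;

Note that this returns allocated space rather than used segment space, so it is a sanity check, not a replacement.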

Also, in examining this database, even under low load the time to checkpoint is 15 minutes and many times overlaps the next log switch. Someday Oracle will parallelize the CKPT process, similar to how it is done for recovery.
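
One way to see that symptom from the instance itself (a sketch, assuming the standard v$sysstat statistic names in 9i/10g) is to compare checkpoints started against checkpoints completed:

  select name, value
    from v$sysstat
   where name in ('background checkpoints started',
                  'background checkpoints completed');

If "started" runs consistently more than one ahead of "completed", checkpoints are not finishing before the next log switch asks for another one.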

As to Tim's point about a Java application with 42 million files and directories, the Java app would be non-performant if it had to update all 42 million files/directories on a regular basis.

For your 10TB database, 16GB files will give you a database with about 650 files. This still keeps the files manageable from a size perspective for backup/recovery or other types of movement requirements.
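
For reference, the arithmetic behind that figure: 10 TB is roughly 10,240 GB, and 10,240 GB / 16 GB per file is about 640 datafiles, before counting SYSTEM, UNDO and TEMP.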

As to database backup verification, we could talk for hours about backup scenarios and backup methodologies, but in its simplest terms this comes down to a question I normally ask when interviewing DBAs: how do you test a backup of a database when you do not have sufficient space to hold a secondary copy of the database? I never accept "trust the rust" as an answer. The real answer, of course, is simple: you perform recovery of a part of the database, and you walk through all the parts. It can be done; it simply takes time.
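
A minimal sketch of that idea in RMAN (the tablespace name and the seven-day window are just examples; VALIDATE reads the backup pieces without writing anything, so it needs no scratch space):

  # sketch only; tablespace name and time window are illustrative
  restore tablespace users validate;                  # prove the backup pieces are readable
  restore archivelog from time 'sysdate-7' validate;  # and the archived logs needed to recover them

Validation only proves the backups can be read, which is why the real exercise is to actually restore and recover one part of the database at a time, into scratch space or onto another host, until every part has been walked through.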

-----Original Message-----
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Zoran Martic
Sent: Saturday, September 03, 2005 7:47 AM
To: BranimirP_at_cpas.com; oracle-l_at_freelists.org
Subject: RE: Choosing data file size for a multi TB database?

Checkpointing heavily depends on the multiple DBWR processes having a good I/O subsystem behind them, one that can sustain the load, which is usually measured in random I/O operations rather than throughput. It usually does not matter whether it is 50 or 1,000 files; I believe it depends more on the physical I/O layout and on how fast the disk back end can deliver I/O operations than on whether there are 100 or 1,000 files in the database.
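
A sketch of one way to look at that, using the standard v$filestat and v$datafile views (column names as in 9i/10g): check how evenly writes are spread across the files and spindles rather than how many files there are:

  select d.name, f.phywrts, f.writetim
    from v$filestat f, v$datafile d
   where d.file# = f.file#
   order by f.phywrts desc;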

Huge databases use various ways to protect themselves. Those with money usually have some serious disk-based backup strategy, which is in any case the fastest backup and recovery. Usually all very important databases are also covered by a good disaster-recovery standby or some kind of replication.

Regards,
Zoran

> What about checkpoint against tens of thousands of
> data files, surely
> more-merrier rule holds? For that reason (or due to
> a fear factor) I
> was under may be false impression that smaller
> number (in hundreds)
> of relatively larger data files (20 GB or so) might
> be better choice.
>
> Other very real problem with 10TB database I can
> easily foresee, but
> for which I do not know proper solution, is how
> would one go about the
> business of regular verification of taped backup
> sets? Have another
> humongous hardware just for that purpose? Fully
> trust the rust? (i.e.
> examine backup logs and never try restoring, or...)
> What do people
> do to ensure multi TB monster databases are surely
> and truly safe
> and restorable/rebuildable?
>
>
> Branimir
>
> -----Original Message-----
> From: Tim Gorman [mailto:tim_at_evdbt.com]
> Sent: Friday, September 02, 2005 5:59 PM
> To: oracle-l_at_freelists.org
> Subject: Re: Choosing data file size for a multi TB
> database?
>
>
> Datafile sizing has the greatest regular impact on
> backups and restores.
> Given a large multi-processor server with 16 tape
> drives available, which
> would do a full backup or full restore fastest?
>
>
>
> * a 10-Tbyte database comprised of two 5-Tbyte
> datafiles
>
> * a 10-Tbyte database comprised of ten 1-Tbyte
> datafiles
>
> * a 10-Tbyte database comprised of two-hundred
> 50-Gbyte datafiles?
>
> * a 10-Tbyte database comprised of two-thousand
> 5-Gbyte datafiles?
>
>
>
> Be sure to consider what type of backup media are
> you using, how much
> concurrency will you be using, and the throughput of
> each device?
>
> There is nothing "unmanageable" about hundreds or
> thousands of datafiles;
> don't know why that's cited as a concern. Oracle 8.0
> and above has a
> limit of 65,535 datafiles per database, but
> otherwise large numbers
> of files are not something to be concerned about.
> Heck, the average
> distribution of a Java-based application is
> comprised of 42 million
> directories and files and nobody ever worries about
> it...
>
>
>
>
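
To put rough numbers on Tim's tape-drive question above (purely illustrative, assuming each tape drive sustains about 100 MB/s and one backup stream per datafile):

  2 x 5 TB files     -> only 2 drives busy  -> ~5 TB / 100 MB/s         = ~14.5 hours
  10 x 1 TB files    -> only 10 drives busy -> ~1 TB / 100 MB/s         = ~2.9 hours
  200 x 50 GB files  -> all 16 drives busy  -> ~10 TB / (16 x 100 MB/s) = ~1.8 hours
  2000 x 5 GB files  -> all 16 drives busy  -> similar wall clock, with far more per-file overhead

The actual figures depend entirely on the media, the concurrency and the per-device throughput Tim asks about, but the shape of the answer is the same: enough files to keep every drive streaming, and not so many that per-file overhead dominates.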



--
http://www.freelists.org/webpage/oracle-l