
Re: storing a million of small files Oracle (or other db) vs. File System

From: Darren Dunham <ddunham_at_redwood.taos.com>
Date: Sun, 22 Jul 2001 00:08:06 GMT
Message-ID: <qJvY6.126806$qv3.37849418@nnrp5-w.sbc.net>

NetComrade <andreyNSPAM_at_bookexchange.net> wrote:
> Trying to figure out pros and cons of storing a million (and growing)
> files ranging from 1k to 500K (mostly text, possibly multimedia soon)
 

> Files can get inserted on the order of 15-20/sec
 

> Oracle:
> pluses
> * storing in LOBs all in one place, don't have to worry about file
> system, don't have to worry about recovering both DB and File System
> in case of file system problems
> minuses
> * bloated database, extra redo/rollback generated, larger backups
 

> File System
> pluses
> * easy manageability for experienced Unixer
> minuses
> * not easy for inexperienced, File System OS might need to be tuned to
> accommodate millions of files in a single dir.
 

> e.g., recently a delete of 64K files took a good few hours to complete
 

> Suggestions? Previous experience?

A filesystem is not a database. If you try to turn it into one, you may be disappointed.

The Solaris UFS filesystem has no indexing within a directory. If you're only looking up a few names, then the directory name cache can help, but if you're hitting lots of different names in a directory with millions of files, then your performance will suffer greatly.

If nothing else, you may well be able to add one or two extra directory levels to act as an index:

Instead of...           Use...
/database/0000000       /database/0/0/0000000
/database/0000001       /database/0/0/0000001
...                     ...
/database/0099999       /database/0/0/0099999
/database/0100000       /database/0/1/0100000
/database/0100001       /database/0/1/0100001
...                     ...
/database/5555555       /database/5/5/5555555
/database/5555556       /database/5/5/5555556

Obviously, your "indexes" depend on the namespace that your files are using.

The first method has to search linearly through potentially 10 million entries. The second method reduces that to two lookups of at most 10 entries each, followed by a third of at most 100,000 entries. That can make a *huge* difference in performance.
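
The mapping in the table above could be sketched roughly as follows. This is only an illustration in Python, not anything from the original setup: the function names sharded_path and store are made up here, and it assumes every file name has more characters than there are directory levels, and that the leading characters are reasonably evenly distributed.

```python
import os

def sharded_path(root, name, levels=2):
    """Build root/<c1>/<c2>/name from the first `levels` characters of name.

    Hypothetical helper: each leading character becomes one directory level.
    """
    if len(name) <= levels:
        raise ValueError("name is too short to shard")
    return os.path.join(root, *name[:levels], name)

def store(root, name, data):
    """Write data (bytes) under the sharded path, creating directories as needed."""
    path = sharded_path(root, name)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With two levels, sharded_path("/database", "0100000") gives /database/0/1/0100000 on a Unix system, matching the right-hand column. If the leaf directories are still too large, raising levels trades more directories for fewer entries in each.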

Also, deleting files from a "big" directory will *not* improve creation performance. If you get rid of most of the files, you might as well create a new directory and move the remaining files into it.
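
A rough sketch of that "new directory" approach, again in Python and purely illustrative: rebuild_directory is a made-up name, and it assumes both directories sit on the same filesystem (so each move is a plain rename), that new_dir does not exist yet, and that nothing else is writing to the directory while it runs.

```python
import os

def rebuild_directory(old_dir, new_dir):
    """Move every remaining entry from old_dir into a fresh new_dir,
    then give the new directory the old name. Returns the number moved."""
    os.mkdir(new_dir)
    # Collect the names first so we are not renaming while still scanning.
    with os.scandir(old_dir) as entries:
        names = [entry.name for entry in entries]
    for name in names:
        os.rename(os.path.join(old_dir, name), os.path.join(new_dir, name))
    os.rmdir(old_dir)            # only succeeds because old_dir is now empty
    os.rename(new_dir, old_dir)  # the fresh directory takes over the old path
    return len(names)
```

There is a short window between the rmdir and the final rename where the old path does not exist, so this is something to run while the application is stopped.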

All of the above is for UFS. If you have VxFS, it may have features that improve large-directory performance.

Also, if the throughput isn't very high, turning on UFS logging may increase write performance.

-- 
Darren Dunham                                           ddunham_at_taos.com
Unix System Administrator                    Taos - The SysAdmin Company
Got some Dr Pepper?                           San Francisco, CA bay area
          < How are you gentlemen!! Take off every '.SIG'!! >
Received on Sat Jul 21 2001 - 19:08:06 CDT
