Re: Database Usage of Unix FFS

From: Pete Bentley <pete_at_tecc.co.uk>
Date: Wed, 23 Feb 1994 11:32:20 GMT
Message-ID: <PETE.94Feb23113220_at_luggage.tecc.co.uk>


>>>>> In article <CLMt3D.35o_at_world.std.com>, lparsons_at_world.std.com (Lee E Parsons) writes:

>>
>> Given the low number of unix creates/delete on these drives, please
>> comment on the following statements.
>>
>> 1) It would seem reasonable in this case to turn the minfree option
>> on the FFS down as low as possible (0%) and use as much of the
>> drive as you can (100%).
>>
>> (I/O distribution considerations aside. Of course.)
>>
Well, I/O distribution does matter...you can use 100% of the disk, but the blocks of the last few files created may end up scattered across the whole disk, hurting performance.
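If you do go that route, the reserve is just a superblock parameter: it can be set when the filesystem is built with newfs -m, or changed afterwards with tunefs -m. The device name below is only an example, and tunefs is best run on an unmounted filesystem:

    newfs -m 0 /dev/rsd0g       # build with no minfree reserve (example device)
    tunefs -m 0 /dev/rsd0g      # or lower the reserve on an existing filesystem
    tunefs -o space /dev/rsd0g  # with little free space, optimise for space rather than time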

>> 2) Since no deletes are happening you don't have to worry about holes
>> in the data. ie) you won't have 100 meg in the middle of the drive
>> you need to use.
>>
>> Even if you needed to add an additional datafile it would just get
>> slapped on the end of the last datafile in one contiguous space.
FFS doesn't work like that. Blocks aren't allocated contiguously from the start of the drive; rather, the drive is split into cylinder groups, each of which behaves a bit like a SysV filesystem (gross oversimplification). When a file is created, the system tries to put it into a cylinder group with more than average free space, and as it grows, new blocks are added from the same cylinder group, as 'near' as possible to the previous block ('near' is a complex function designed to reduce the delay between reading consecutive blocks of a file).

As the cylinder group fills, it gets harder to allocate 'near' blocks, and sooner or later the group is full and the next block gets allocated from a different cylinder group, requiring a seek to read it. The 10% default value for minfree is designed to minimise these poor allocations by ensuring there is generally enough free space to allocate a block near the previous one in the file.

The maxbpg parameter is also there to improve allocation for general-purpose filesystems...only maxbpg blocks from a file will be allocated in a single cylinder group; further blocks will be allocated from another cylinder group, involving a seek to get to the next data block. This is designed to stop one growing file in a cylinder group from causing inefficient allocation for the other growing files in the group.

For a 'static' filesystem such as you describe, I would say (completely guessing) that you might want to make maxbpg big (like 80% of the size of a cylinder group, or more). My guess is that this should (a) improve locality of reference for random access within any one file and (b) depending on how the data gets moved onto the disk, mean blocks get allocated more efficiently, with fewer files split across multiple cylinder groups (it could mean *more* split files, though, depending on the relationships between the average file size, the cylinder group size and maxbpg...).
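If you want to experiment, the current superblock parameters (cpg, minfree, maxbpg and so on) can be read back with dumpfs, and maxbpg changed with tunefs -e; the device name and the figure below are only examples, to be picked for your own geometry:

    dumpfs /dev/rsd0g | more    # inspect cpg, minfree, maxbpg etc. from the superblock
    tunefs -e 2048 /dev/rsd0g   # raise maxbpg; pick something like 80% of the
                                # data blocks in a cylinder group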

For what you are doing, putting a few big files on a big disk, it is tempting to suggest that a more traditional, SVR3-type filesystem might be more efficient...allocate a really small number of inodes, put all the files in as few directories as possible, and the SVR3 layout policy means you can get all the files contiguous if you create them on another f/s and tar or cpio them to their destination. Comments?
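For completeness, the usual one-pass way to move a tree over (so the blocks get laid down roughly in creation order) is something like this; the paths here are just placeholders:

    cd /olddisk/data
    tar cf - . | (cd /newdisk/data && tar xpf -)     # copy the whole tree with tar
    find . -depth -print | cpio -pdm /newdisk/data   # or the equivalent cpio pass-through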

Pete.
