SUMMARY: reference to really low level filesystem docs?

From: Largent, Aaron <ALargent_at_concordefs.com>
Date: Wed May 07 2003 - 15:26:16 EDT

I have found enough information at this point to summarize,
but my search for more hasn't ended.  I think a topical
breakdown best fits what I've found.  Please notify me of
any horrific errors.


------
LAYOUT OF A SINGLE DISK SLICE WITH UFS:

Label
	-at beginning of disk, not on any slice
	-is 1 sector (512 bytes) in length
	-contains the disk name, geometry, and partition table
Boot Block
	-located at Cylinder 0, Track 0, Sector 0
	-16 sectors in length (8K)
	-holds the boot program
Superblock
	-located at Cylinder 0, Track 0, Sector 16
	-16 sectors in length (8K)
	-contains information about filesystem:
	  -various offsets, block size, frag size, number of
	   data blocks, number of cylinders, last time written, etc
	-also contains tunables
	  -minfree, fs_rps, fs_maxbpg, optimization preference, etc
Cylinder Groups
	-each cylinder group has backup Superblock at beginning
	-then Inodes
	-then Data blocks

------
INODE DETAILS

An Inode is a 128-byte structure that contains all details
about a file except its name (this includes things like
permissions, atime, mtime, ctime, owner, etc).

Within the inode are 12 direct data pointers, locating the
first 12 data blocks of the file (about 96 kB of data,
if using 8kB data block size).

The 13th pointer locates an "indirect" block that is nothing
but a large array of pointers to the next 2048 data blocks.
(about 16 MB of data, if using 8kB data block size)

The 14th pointer locates another indirect block that contains
2048 pointers, each pointing to ANOTHER indirect block.
So 2048*2048*8 kB is about 34 GB, which is already above
the addressable range if the fragment size is 1024 bytes.
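
The arithmetic behind these figures can be checked with a short
sketch.  It assumes 8kB data blocks and 4-byte block pointers (the
pointer size is my assumption; it is what makes 2048 pointers fit in
one 8kB indirect block).  The function names are illustrative only.

```python
BLOCK_BYTES = 8192   # 8kB data block, as in the text
POINTER_BYTES = 4    # assumption: 4-byte block pointers (8192 / 4 = 2048)
NDIRECT = 12         # direct pointers held in the inode


def pointers_per_block(block_bytes=BLOCK_BYTES, pointer_bytes=POINTER_BYTES):
    """How many block pointers fit in one indirect block."""
    return block_bytes // pointer_bytes


def level_capacity(level, block_bytes=BLOCK_BYTES):
    """Bytes reachable through ONE pointer at a given level of indirection
    (0 = direct, 1 = single indirect, 2 = double indirect)."""
    return pointers_per_block(block_bytes) ** level * block_bytes


direct = NDIRECT * level_capacity(0)
single = level_capacity(1)
double = level_capacity(2)

for label, nbytes in [("12 direct", direct),
                      ("single indirect", single),
                      ("double indirect", double)]:
    print(f"{label:16s} {nbytes:>14,d} bytes")
```

With these assumptions the direct pointers cover 98,304 bytes, the
single indirect block covers 16,777,216 bytes, and the double
indirect block covers 34,359,738,368 bytes.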

You can see most of the details of an inode by using the various
stat() system calls, or you can cheat by using truss like this:
	
	# truss -v all -t lstat ls -la /etc/passwd
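
If you would rather not read truss output, the same lstat() fields
can be pulled out with a few lines of Python.  This is only a sketch
using the standard os and stat modules; inode_summary is a name I
made up, and it shows what lstat() reports, not the on-disk inode
itself (the block pointers are not visible this way).

```python
import os
import stat
import time


def inode_summary(path):
    """Return the inode-level fields that lstat() reports for path."""
    st = os.lstat(path)  # lstat: describe a symlink itself, do not follow it
    return {
        "inode": st.st_ino,
        "mode": stat.filemode(st.st_mode),   # e.g. "-rw-r--r--"
        "links": st.st_nlink,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "atime": time.ctime(st.st_atime),
        "mtime": time.ctime(st.st_mtime),
        "ctime": time.ctime(st.st_ctime),
    }
```

Calling inode_summary("/etc/passwd") returns a dictionary with the
permissions, owner, size and the three timestamps mentioned above.
Note that the file name is not among them; it lives in the directory
entry, not the inode.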


------
VIEWING CONTENTS OF SUPERBLOCK:

"fstyp" is the most useful tool I have found to do this

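A few of the superblock-derived values (block size, fragment size,
totals) are also visible through statvfs(2), without dumping the
superblock itself (with fstyp, the -v option is what prints the
superblock fields).  The sketch below uses Python's os.statvfs
wrapper; filesystem_summary is a hypothetical helper name, and this
shows far less than the real superblock holds (no tunables, for
instance).

```python
import os


def filesystem_summary(mount_point="/"):
    """Return a few filesystem-wide values reported by statvfs()."""
    vfs = os.statvfs(mount_point)
    return {
        "block_size": vfs.f_bsize,       # preferred block size
        "fragment_size": vfs.f_frsize,   # fundamental (fragment) size
        "total_frags": vfs.f_blocks,     # filesystem size in f_frsize units
        "free_frags": vfs.f_bfree,       # free space in f_frsize units
        "total_inodes": vfs.f_files,
        "free_inodes": vfs.f_ffree,
    }


for field, value in filesystem_summary("/").items():
    print(f"{field:14s} {value}")
```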

------
VIEWING A FILE'S LAYOUT ON THE FILESYSTEM

"filestat" is available on the internet, but Solaris
doesn't come with a tool that does this for you.  You
can read directly from the disk and do this, but I haven't
figured this part out yet.  "filestat" seems useful enough
for the curious admin.


------
UFS LAYOUT ALGORITHM

This was perhaps the most surprising information I found
in my search.  As it turns out, the default layout for a
file's data blocks is quite discontiguous if it is larger
in size than about 3 MB, and the filesystem is of respectable
size -- say 1GB+.

Assuming a basically empty filesystem, the first 12 data blocks
will be located in the first free cylinder group.  The next 2048
data blocks will be located in the following cylinder group,
and this will continue ad nauseam.
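
As a toy model of the placement just described (NOT the real kernel
allocator), the sketch below maps a file's logical block number to a
cylinder group.  It assumes an empty filesystem, 12 direct blocks,
2048 blocks per group before moving on, and a made-up count of 16
cylinder groups with wrap-around; all names are hypothetical.

```python
from collections import Counter

NDIRECT = 12    # direct blocks, placed in the starting cylinder group
MAXBPG = 2048   # blocks placed in one group before moving to the next


def cylinder_group_for_block(lbn, start_cg=0, ncg=16,
                             ndirect=NDIRECT, maxbpg=MAXBPG):
    """Toy model: cylinder group holding logical block lbn of a file."""
    if lbn < ndirect:
        return start_cg % ncg
    # Every further run of maxbpg blocks hops one group onward.
    hops = 1 + (lbn - ndirect) // maxbpg
    return (start_cg + hops) % ncg  # wrap-around is an assumption


# Spread of a 40 MB file written in 8kB blocks (5120 blocks).
blocks = (40 * 1024 * 1024) // 8192
spread = Counter(cylinder_group_for_block(b) for b in range(blocks))
for cg, count in sorted(spread.items()):
    print(f"cylinder group {cg}: {count} blocks")
```

Under this model the 40 MB file ends up in four different cylinder
groups, which is the discontiguity described above.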

The parameters that can affect this are fs_maxbpg, optimization
preference of the filesystem, and block/fragment size.

By thinking about what kinds of data reside on a given filesystem,
the above parameters can be tuned especially for higher
performance of that filesystem.  This may also affect how
one would choose to mix applications on filesystems.


------
INTERESTING MATERIAL

manpages
	inode (4)
	statvfs (2)
	fs_ufs (4)
	dir_ufs (4)

commands
	fstyp
	filestat (external utility)
	tunefs
	mkfs
	ff
	
header files
	/usr/include/sys/fs/ufs_fs.h
	/usr/include/sys/fs/ufs_inode.h
	/usr/include/sys/fs/ufs_fsdir.h

books
	Solaris 7 Performance Administration Tools
		Frank Cervone
	Solaris Internals
		Mauro & McDougall

------
VxFS vs UFS

The main differences I have found between VxFS and UFS
so far have been logging and layout strategy.  Where logging
is an option on UFS, it is a main feature of VxFS, meaning
VxFS check/recovery times are on the order of seconds for 
large filesystems.

The layout strategy is fundamentally different for VxFS, though.
As an extent-based filesystem, it allocates contiguous extents
of space and then just keeps track of the offset and length.
For large files on fairly unfragmented filesystems, this should
result in far less overhead than looking up what could be millions
of pointers in the comparable UFS filesystem.
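
To put a number on "millions of pointers", here is a rough count of
what UFS needs to map a large file, assuming 8kB blocks, 12 direct
pointers and 2048 pointers per indirect block as described earlier.
The function names are illustrative, and the sketch only handles
files that fit within the double indirect level.

```python
BLOCK_BYTES = 8192
PTRS_PER_BLOCK = 2048   # pointers held by one indirect block
NDIRECT = 12


def ufs_block_pointers(file_bytes, block_bytes=BLOCK_BYTES):
    """Data-block pointers needed to map a file of file_bytes."""
    return -(-file_bytes // block_bytes)  # ceiling division


def ufs_indirect_blocks(file_bytes, block_bytes=BLOCK_BYTES,
                        ptrs=PTRS_PER_BLOCK, ndirect=NDIRECT):
    """Indirect blocks needed, up to the double indirect level."""
    remaining = max(0, ufs_block_pointers(file_bytes, block_bytes) - ndirect)
    if remaining == 0:
        return 0
    if remaining <= ptrs:
        return 1                          # the single indirect block only
    remaining -= ptrs
    children = -(-remaining // ptrs)      # second-level indirect blocks
    if children > ptrs:
        raise ValueError("file needs more than double indirection")
    return 1 + 1 + children               # single + double root + children


size = 16 * 2**30  # a 16 GB file
print(ufs_block_pointers(size), "data block pointers")
print(ufs_indirect_blocks(size), "indirect blocks")
```

For the 16 GB example that is 2,097,152 data block pointers spread
over 1,025 indirect blocks, where a perfectly contiguous file on an
extent-based filesystem could in principle be described by a single
(offset, length) pair.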

As a general rule, however, VxFS is prone to fragmentation,
where UFS is not.

------
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Wed May 7 15:26:11 2003
