Improving UnixWare filesystem performance

1. Add the entry "delaylog=nolog" to the mount options in
/etc/vfstab, for example:

/dev/filesys /dev/rfilesys / vxfs 1 no delaylog=nolog,mincache=closesync SYS_RANGE_MAX

This will run your filesystem with no transaction logging, making
it difficult to repair in the event of a system crash. However,
it will improve performance, and if the files in this filesystem
are considered to be temporary, then this option can be used.

More information on these options is available from the command:

man mount_vxfs

2. The above online manual page refers to the various "mincache"
options. The default is "mincache=closesync", which is the slowest
but most reliable method.

However, adding the entry "mincache=tmpcache" to the mount options
in /etc/vfstab will prevent your filesystem from being mounted
unless you are using the Advanced VxFS filesystem, On-Line Data
Manager (ODM).
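
The mount options currently set for each vxfs filesystem can be listed
before and after editing /etc/vfstab. The following is a minimal sketch,
not part of the original procedure: the function name list_vxfs_opts is
hypothetical, and it assumes the usual vfstab field order (special device,
fsck device, mount point, filesystem type, fsck pass, automount, mount
options) with each entry on a single line.

```shell
# list_vxfs_opts FILE
# Prints "mount-point mount-options" for every vxfs entry in a
# vfstab-format file. Comment lines (starting with #) are skipped.
# Assumption: field 3 is the mount point, field 4 the filesystem
# type and field 7 the mount options.
list_vxfs_opts() {
    awk '$1 !~ /^#/ && $4 == "vxfs" { print $3, $7 }' "$1"
}

# Hypothetical invocation on a live system:
#   list_vxfs_opts /etc/vfstab
```

For the example entry shown in step 1, this would print the mount point
"/" followed by the comma-separated option list.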

NOTES: In order to achieve the best performance, it's helpful to look at
the type of disk configurations available:

No Raid - Disks independent, no resilience

Raid 0 - Fast as disks are striped.
1 disk min, max depends on controller.
No Fault Tolerance.
Increase disk capacity.

Raid 1 - Mirrored disks.
2 disks required.
Fault Tolerance.
Hot Spare available.

Raid 5 - Resilience but slow as parity is spread across all the disks.
3 disk min, max depends on controller.
Fault Tolerance.
Hot spare available.

Raid 10 - An entire Raid 5 configuration mirrored.
6 disks min, max depends on controller.
Fault Tolerant.
Hot Spare available.
Very slow but Excellent resilience.
Rarely used.

Raid 0+1 - Raid 0 configuration but each disk is mirrored.
2 disks min, max depends on controller.
Hot Spare available.
Costly, because if you wanted a 100GB Logical Drive
made from 10GB disks you would need 10 disks for the
100GB, and then a further 10 disks for the mirror.
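
The disk count in the Raid 0+1 example can be worked out for other sizes
in the same way. This is only an illustrative sketch: the function name
raid01_disks is hypothetical, and it assumes equal-sized disks, whole
gigabytes, and no hot spare.

```shell
# raid01_disks CAPACITY_GB DISK_GB
# Prints the number of physical disks needed for a Raid 0+1 logical
# drive of CAPACITY_GB built from disks of DISK_GB each.
raid01_disks() {
    capacity=$1
    disk=$2
    # Round up the number of striped data disks ...
    data=$(( (capacity + disk - 1) / disk ))
    # ... then double it, because every data disk is mirrored.
    echo $(( data * 2 ))
}

# The 100GB logical drive made from 10GB disks described above.
raid01_disks 100 10
```

For the 100GB example this gives 20 disks, matching the 10 data disks
plus 10 mirror disks in the text.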

Bus Speed - There are three components: the controller card, PCI bus
inside the server, and the driver for the controller.
All three need to be 64 bit to get true 64 bit,
otherwise there will be a 32 bit bottleneck.

Controller - If the controller has memory with a battery backup
then "Write Back" can be used to get the full benefit
of the on-board cache of the controller. The other
option is "Write Thru", which is slower
but more reliable.

NOTE: If the database used is transaction based, then
data can be written to the disk from the cache during
a boot up if the power had gone down earlier. If the
database is not fully opened then the transaction logs
can get confused.

You may want to have separate controllers for separate
filesystem activities.

You can also gain performance by ensuring that if the
controller is multi-channel, each logical drive uses
physical disks on the same channel.

It is recommended to have a separate controller for
tape(s) and CDROM(s), rather than use the same controller
as the disks. It is also recommended to not have too
many SCSI devices on the same bus.

Sizes - You can set the Stripe Size on the controller and the
Block Size of the filesystem to be the same to optimize
performance for large file processing with large
databases, such as Oracle, especially if these databases
are on a separate controller. This would mean increasing
the blocksize of the filesystem, when it is created, to
the same size as the Stripe Size.

With a larger block size, each file created will
consume that amount in the inode table. For example, a
block size of 8k will consume 8k in the inode table for
each file created. A typical example of this would be an
Oracle database with an 8k block size setting within Oracle's
configuration. You would make Oracle's and the filesystem's
block sizes consistent.

For some servers you may want to lower the Stripe Size if
lots of small reads and writes are taking place so the
controller is not waiting for its cache to be filled each
time. However, the more read/writes there are, the more
times an interrupt will be created for the action.

The norm though is to use the default RAID configuration
setting for the RAID Stripe Size and the default filesystem
block size for UnixWare 7, which is 1024.

Disks - In general, the more disks, the better performance
as more spindles/heads are being used. The faster
the disk revolution, the better. You should also
check the cache provided by the disk from the
disk's manufacturer.

Filesystems - Assign individual filesystems to separate tasks to
reduce activity to the inode tables. In general,
accept the number of inodes recommended by UnixWare
7 when adding the filesystem, unless you are expecting
there to be a large number of small files written there.
In that case, you would normally expect to double the number
of recommended inodes.

However, the "diskadd" command will actually give you an
inode value of unlimited.

To check these settings, run this command on the filesystem
(slice) in question:

mkfs -m /dev/rdsk/cXbXtXdXsX

Please note: During the Initial System Load (ISL), UnixWare 2
& UnixWare 7.0.x will provide an Advanced option
during the creation of the filesystems, in the
Customise Filesystems option. By default, these
"vxfs" filesystems will have a 64K inode limit but
this can be changed to "Unlimited" in this
Advanced option.

In the case of UnixWare 7, the best compromise for resilience
and performance is to use Raid 1 for the root filesystem on one
Logical Disk.

For each filesystem required, have a Logical Drive consisting of
the necessary Raid 0+1 disks required for the size. If the number
of disks is a factor, which is the norm, then have a reduced number
of Logical Drives splitting the filesystems between them.

For any server, this value may need to be altered to find the best
optimization, if the server is doing a combination of database and
i/o work.

General speed testing, other than installing and testing your
application, would be:

dd if=/dev/zero of=/<filesystem>/tmp/testfile bs=1024 count=5000000

(creates a 5GB file)

Time this command after each change to compare the effect of the
settings; use "sar -d" to monitor disk activity of each logical disk.

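The block size, block count and elapsed time of a dd run can be turned
into a throughput figure, which makes runs with different settings easier
to compare. This is a rough sketch only: the function name kb_per_sec is
hypothetical, the elapsed time has to be measured separately, and the
250 seconds used below is a made-up value, not a measurement.

```shell
# kb_per_sec BLOCK_SIZE COUNT SECONDS
# Prints the average write throughput in KB/s (integer division).
# Assumes the shell does 64-bit arithmetic, since the byte count of
# a 5GB file does not fit in 32 bits.
kb_per_sec() {
    bytes=$(( $1 * $2 ))
    echo $(( bytes / 1024 / $3 ))
}

# bs=1024 count=5000000 as in the dd command above; 250 is an
# illustrative elapsed time in seconds.
kb_per_sec 1024 5000000 250
```

With these illustrative numbers the result is 20000 KB/s.
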
Also, before the server goes into production, remove disk(s) from
the RAID configurations to ensure that the RAID controller detects
that a disk has failed and that when it is re-inserted the RAID is
successfully rebuilt.

Once in production, you may wish to look at the disk fragmentation
options available with the "fsadm" command, e.g.:

fsadm -p 1 -e -s -v /<filesystem>
fsadm -p 1 -D -d -s -v /<filesystem>

NOTE: Even if you optimally tune your vxfs filesystem, you might still
encounter performance problems when restoring a large number of
small files from tape. This overhead is due solely to the vxfs
design. For instance, while using cpio to restore around 300,000
files of 4-14k each, a drop in performance was reported from a
1-2M transfer rate (tape specification) to a 500K transfer rate.
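
To get a feel for what this drop means in elapsed time, the figures above
can be put into a rough estimate. This is a back-of-the-envelope sketch
only: the function name restore_seconds is hypothetical, the 9k average
file size is an assumed midpoint of the 4-14k range, "1M" is treated as
1000K, and cpio header overhead is ignored.

```shell
# restore_seconds FILES AVG_KB RATE_KB_PER_SEC
# Prints the estimated restore time in seconds (integer division).
restore_seconds() {
    echo $(( $1 * $2 / $3 ))
}

# 300,000 files at an assumed 9k average size:
restore_seconds 300000 9 1000   # at the low end of the tape specification
restore_seconds 300000 9 500    # at the reduced rate reported above
```

Under these assumptions the restore takes about 2700 seconds at 1M and
about 5400 seconds at 500K, that is, roughly twice as long.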
