I'm trying to get some advice on how to best set up an NFS server to use with ESXi as a datastore. I took a stab at it with CentOS 7, but the performance is abysmal. I'm hoping someone can point out some optimization that I've overlooked, but I'm open to trying another free OS as well.
I have an old Dell PowerEdge T310 with a SAS 6i/r hard drive controller. I have two 2 TB hard drives and two 1 TB hard drives. Due to the limitations of the SAS 6i/r controller, I left the drives independent and used software RAID 1 plus LVM to get 3 TB of usable space, like this:
# mdadm --create /dev/md0 --run --level=1 --raid-devices=2 /dev/sdd /dev/sde
# mdadm --create /dev/md1 --run --level=1 --raid-devices=2 /dev/sdf /dev/sdg
# vgcreate vg0 /dev/md0 /dev/md1
# lvcreate -l 100%VG -n lv0 vg0
Then I formatted the new LVM partition with XFS:
# mkfs.xfs /dev/vg0/lv0
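One thing that may be worth ruling out at this point: mdadm --create starts an initial resync on a new RAID 1 array, and that background copy competes with normal I/O until it finishes. The sketch below checks /proc/mdstat for a resync or recovery in progress. The function name md_sync_status is made up for illustration, and the optional file argument exists only so the check can be tried against a saved copy of mdstat.

```shell
#!/bin/sh
# Sketch: report whether any md array is currently resyncing or recovering.
# md_sync_status is a hypothetical helper name, not part of mdadm.
md_sync_status() {
    # Default to the live kernel status file; a path can be passed for testing.
    mdstat_file="${1:-/proc/mdstat}"
    if grep -Eq 'resync|recovery' "$mdstat_file"; then
        echo "sync in progress"
        # Show the progress lines (percentage, estimated finish, speed).
        grep -E 'resync|recovery' "$mdstat_file"
    else
        echo "no sync in progress"
    fi
}

# Only query the live file when it exists (i.e. on a Linux host with md loaded).
if [ -r /proc/mdstat ]; then
    md_sync_status
fi
```

If a resync is reported, any benchmark taken during that window probably understates what the array can do once it is idle.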
I mounted this at /var/nfs and exported it with the following options:
# cat /etc/exports
/var/nfs 192.168.10.3(rw,no_root_squash,sync)
I was able to add this to my ESXi host using the vSphere Client as a new datastore called nfs01.
I then edited my VM through the vCenter web interface, adding a new 2.73 TB disk.
The guest OS is Windows Server 2012. Through the Disk Management interface, I initialized the disk as GPT and created a new volume. This took several minutes. Then I tried quick-formatting the volume with NTFS, but cancelled after about 4 hours. I then shrank the volume to 100 MB and formatted that instead. That succeeded after several minutes, but just creating a blank text document on this drive takes about 8 seconds.
The NFS server is plugged into the same gigabit switch as the ESXi server. Here are the ping times:
~ # vmkping nfs.qc.local
PING nfs.qc.local (192.168.10.20): 56 data bytes
64 bytes from 192.168.10.20: icmp_seq=0 ttl=64 time=0.269 ms
64 bytes from 192.168.10.20: icmp_seq=1 ttl=64 time=0.407 ms
64 bytes from 192.168.10.20: icmp_seq=2 ttl=64 time=0.347 ms
I ran an I/O benchmark tool and got these results: [Imgur screenshot]
At the same time vCenter showed this performance data for the datastore: [Imgur screenshot]
I noticed that some I/O operations done locally on the NFS server are also slow. For example, "touch x" completes instantly, but "echo 'Hello World' > x" can take anywhere from 0 to 8 seconds to complete.
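To put a number on this local slowness, something like the following could be run directly on the NFS server. Because the export uses the sync option, the server commits each write to disk before replying, so the latency of small synchronous writes on the underlying filesystem is likely the figure that matters. This is only a sketch: sync_write_test and the test file name are hypothetical, and oflag=dsync assumes GNU dd (as shipped with CentOS).

```shell
#!/bin/sh
# Sketch: compare small synchronous writes against buffered writes in a directory.
# sync_write_test and sync_write_test.bin are illustrative names.
sync_write_test() {
    dir="$1"
    f="$dir/sync_write_test.bin"

    # 100 writes of 4 KiB, each forced to stable storage before the next one
    # (GNU dd oflag=dsync). dd prints elapsed time and throughput on stderr.
    dd if=/dev/zero of="$f" bs=4k count=100 oflag=dsync || return 1

    # The same amount of data through the page cache, for comparison.
    dd if=/dev/zero of="$f" bs=4k count=100 || return 1

    # Clean up the scratch file.
    rm -f "$f"
}

# Run only when a target directory is given, e.g.: sh sync_write_test.sh /var/nfs
if [ -n "${1:-}" ]; then
    sync_write_test "$1"
fi
```

A large gap between the two dd timings would suggest that the disks or the RAID/LVM stack handle flushes slowly, independent of NFS and the network.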
This is my first attempt at using NFS (my two ESXi hosts use local storage) so I'm not sure if any of this is normal.