This presentation is one that I gave at the New Mexico, New York City, and Seattle VMUG conferences (this specific deck came from the Seattle conference, as you can tell by the Twitter handle on the first slide). The topic is design considerations for running vSphere on NFS. This isn’t an attempt to bash NFS, but rather to educate users on the things to avoid if you’re going to build a rock-solid NFS infrastructure for your VMware vSphere environment. I hope that someone finds it useful.
My standard closing statement goes here–your questions, thoughts, corrections, or clarifications (always courteous, please!) are welcome in the comments below.
-
Very nice, Scott. For two years I managed a large Tier 1 VMware on NFS shop (500 hosts, 25 vCenter Servers). Designed right, with the proper care and feeding, NFS is both a viable and scalable option. That’s not to say it doesn’t have its own set of unique headaches, which range from design to implementation to troubleshooting.
-
Excellent NFS 101, Scott. In reference to Ryan, if your NAS appliances can support sub-interfaces or alias addresses on an aggregated link, connecting to different datastores via different VMkernels/subnets works a treat.
The problem with NFS has always been the lack of load balancing. That is going to change, and as soon as it does I can see iSCSI datastores becoming less common.
-
the slides are blocked by EMC IT, sigh…
-
Scott,
Very useful and timely presentation. It was great to see this presentation here in Seattle.
Regarding Jumbo Frames during the presentation, if I captured your thoughts accurately, you effectively explained:
- 10Gb CNAs typically mask performance gains from utilizing Jumbo Frames – performance benefits minimal.
- Beneficial to low IOP / Large Payload apps / instances.
- 95% of apps won’t realize a benefit from Jumbo Frames.
What are your thoughts on UCS implementations:
- 10 Gb connectivity
- All vmnics funnel through a set of uplink connections to the physical switches which effectively would require all interfaces to be configured with Jumbo frames.
- SQL Best Practices (per vBCA video) recommend Jumbo Frames.
I am attempting to define a solid Reference Architecture for UCS implementations I am involved in. Your views and input would be greatly appreciated!
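For a rough sense of scale on the Jumbo Frames points above, here is a back-of-the-envelope sketch of wire efficiency at the two common MTU sizes. It assumes plain IPv4 and TCP headers with no options and no VLAN tag, and it counts only bytes on the wire; it says nothing about CPU or interrupt load, which offload-capable adapters may handle anyway. The function name `payload_efficiency` is made up for this illustration.

```python
# Per-frame Ethernet overhead outside the MTU (assumption: no VLAN tag):
# preamble + start delimiter (8) + header (14) + FCS (4) + inter-frame gap (12)
ETH_OVERHEAD = 38

# IPv4 header (20) + TCP header (20), assuming no IP or TCP options
IP_TCP_HEADERS = 40


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-the-wire bytes that carry TCP payload for a full frame."""
    if mtu <= IP_TCP_HEADERS:
        raise ValueError("MTU must be larger than the IP and TCP headers")
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


standard = payload_efficiency(1500)
jumbo = payload_efficiency(9000)

print(f"MTU 1500: {standard:.1%} payload")
print(f"MTU 9000: {jumbo:.1%} payload")
print(f"Difference: {(jumbo - standard) * 100:.1f} percentage points")
```

Under these assumptions the gain from larger frames is a few percentage points of wire efficiency, which fits the view that most workloads will not notice a difference.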
Thanks – Jerry
-
Scott:
I’m finishing part 1 in a series on NFS deployment in SMB space and your slide deck popped up as a reference a couple of times. Your overview succinctly hits the high-points, and since I didn’t hear the preso I can assume you elaborated on a couple of IP-hash relevant issues:
1) hashing related to switches (typically layer-2);
2) hashing related to NAS (layer-2 to layer-4).
vSphere has no deterministic influence on return-path choice (other than how it advertises and withdraws its MAC and sources its vmknic IPs), but those elements listed above do; and different switch/NAS vendors use different algorithms to determine how deeply hashing goes. For instance, low-end switches will likely use (or be limited to) layer-2 hashes, yet Linux and Solaris derivatives (NAS) will use layer-3 or layer-4 hashing by default (multiplexing datastore traffic per session/TCP-data port).
Higher-end switches will often use layer-3 hashes, with more advanced ones using layer-4 (beyond layer-4 has dubious value for NFS). I have not seen any studies on the importance, risks or performance factors related to hash alignment where storage is concerned, although the biggies – out-of-order delivery and latency – can be assumed to be affected to various degrees. These factors could shift switch vendor (and/or model) selection for heavy NFS shops and could imply performance issues at the network layer based on NFS volume/loads (i.e. NFS packet rate increasing switch packet inspection overhead).
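To make the hashing discussion concrete, here is a minimal sketch of a layer-3 style hash that picks an uplink from the source and destination IP addresses. It models the XOR-and-modulo scheme commonly described for the vSphere "Route based on IP hash" teaming policy; the exact function used by any given vSphere release, switch, or NAS is an assumption here, and `ip_hash_uplink` and the addresses are invented for the example.

```python
import ipaddress


def ip_hash_uplink(src_ip: str, dst_ip: str, uplink_count: int) -> int:
    """Pick an uplink index from a source/destination IPv4 pair.

    Assumption: XOR of the two 32-bit addresses, modulo the number of
    uplinks. Real devices may hash different fields (MACs, TCP ports).
    """
    if uplink_count < 1:
        raise ValueError("uplink_count must be at least 1")
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


# One VMkernel port talking to two NAS addresses (hypothetical values)
vmk = "192.168.1.10"
for nas in ("192.168.1.50", "192.168.1.51"):
    print(f"{vmk} -> {nas}: uplink {ip_hash_uplink(vmk, nas, 2)}")
```

With two uplinks, these two NAS addresses land on different uplinks, while a single source/destination pair always maps to the same one. That is why spreading datastores across several NAS addresses or aliases matters with this kind of hash, and why the return path depends on whatever hash the switch and NAS apply on their side.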
Also, the issue of VMware Tools guest OS tuning was absent in the deck, although I came across your valuable input on Frank Denneman’s blog on the topic as well as Jason Boche’s NFS postings so I have to assume a question or two came up in the preso. This topic is a very important one to my posting – especially where the differences between Windows and Linux treatments are concerned.
I’ve yet to find a VMware source that explains why Windows timeout adjustments per VMware Tools are still 60 seconds and Linux timeouts are 180. Your recommendation was 125 seconds back in 2009, but have you shifted your guidance to agree with Jason Boche and NetApp at 190 seconds? Likewise, as an EMC’r with closer ties to VMware now, do you have any insight into the discrepancy between Linux and Windows defaults per VMware Tools?
Thanks for all you do in support of the VMware community – it might be a job now, but your passion still shows!


