Scott's Weblog The weblog of an IT pro specializing in virtualization, networking, open source, and cloud computing

VMware HA Failover Capacity Changes

This post continues the discussion of VMware HA failover capacity started in this article and continued in this follow-up article. It appears that VMware has added the ability to modify the “slot size” used in calculating VMware HA failover capacity as part of ESX Server 3.5 and VirtualCenter 2.5.

Duncan Epping of Yellow Bricks alerted me to this VMware KB article in this posting on his site. The PDF from VMware linked in that KB article discusses a new option for setting the default “slot size”. To quote from Duncan’s site:

If no VM reservations are set in a cluster VMware HA assumes cluster-wide average CPU and memory reservation sizes of 256 Mhz and 256 MB to use in admission control calculations. Alternative values can be specified instead…

Add the das.vmMemoryMinMB = <value> and das.vmCpuMinMHz = <value> option/value pairs to the cluster’s settings where <value> represents the desired values in terms of MB and MHz. Higher values will reserve more space for failovers.
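As a rough illustration of what these two options would change, here is a minimal Python sketch. It assumes a simplified model in which a host holds as many slots as its scarcer resource allows; VMware has not documented the exact calculation, so that model, the function names, and the host figures are all hypothetical. Only the 256 MHz / 256 MB defaults and the two option names come from the quote above.

```python
# Defaults from the quote above, used when no VM reservations are set.
DEFAULT_CPU_MHZ = 256
DEFAULT_MEM_MB = 256


def slot_size(advanced_options):
    """Return (cpu_mhz, mem_mb) for one slot, honoring the two overrides."""
    cpu = int(advanced_options.get("das.vmCpuMinMHz", DEFAULT_CPU_MHZ))
    mem = int(advanced_options.get("das.vmMemoryMinMB", DEFAULT_MEM_MB))
    return cpu, mem


def slots_per_host(host_cpu_mhz, host_mem_mb, advanced_options):
    """ASSUMED model: the scarcer of CPU and memory limits the slot count."""
    cpu, mem = slot_size(advanced_options)
    return min(host_cpu_mhz // cpu, host_mem_mb // mem)


if __name__ == "__main__":
    # Hypothetical host: 8000 MHz of CPU and 16384 MB of memory.
    print(slots_per_host(8000, 16384, {}))
    print(slots_per_host(8000, 16384, {
        "das.vmCpuMinMHz": "512",
        "das.vmMemoryMinMB": "1024",
    }))
```

Under this assumed model, raising either value makes each slot larger, so every host holds fewer slots. That is consistent with the statement that higher values reserve more space for failovers.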

So this looks like it’s allowing us to specify how VMware HA should calculate the default slot size, but as in so many areas of VMware HA there is precious little documentation. I like VMware HA; I really do. But VMware needs to get somebody on the ball to document the exact configuration and operation of VMware HA so that this solution becomes less of the “black box” that it is today. As it stands currently, many customers are forgoing the benefits of VMware HA because it can’t be reliably and consistently configured and debugged.

<aside>Cases in point: I was at a meeting before Christmas with a customer who is having problems with their VMware HA clusters, and we can’t find anyone, inside or outside of VMware, who can speak definitively about VMware HA, how it should be configured, or how it operates. Back in October, I blogged about problems with isolation response, and I still haven’t gotten those problems resolved. C’mon, VMware, don’t drop the ball!</aside>

If anyone can shed some light on these new settings—I plan on testing them in the lab as soon as possible—that would be very useful. In the meantime, I encourage everyone to check out the PDF linked in the VMware KB article on VMware HA best practices. And, just for fun, check out this white paper on the VMware HA VM failure monitoring functionality that’s new in ESX Server 3.5.
