With the release of VMware vSphere 4 earlier this year, VMware officially introduced VMware Fault Tolerance (VMware FT), a new mechanism for providing extremely high levels of availability to virtual machine workloads. As I’ve talked with customers, I’ve noticed that a growing number of them are unaware of the differences between the types of high availability that VMware provides (in the form of VMware HA and VMware FT) and operating system-level clustering (such as Microsoft Windows Failover Clustering). Although both types of technology are intended to increase availability and reduce downtime, they work very differently and offer different functionality.
Consider these points:
- While using VMware HA will protect you against the failure of an ESX/ESXi host, VMware HA won’t—by default—protect you against the failure of the guest operating system. An OS-level cluster, on the other hand, does protect against the failure of the guest operating system. +1 for OS-level clustering.
- VMware clusters that are using VMware HA can choose to use VM Failure Monitoring and gain some level of protection against the failure of the guest operating system, but you still won’t get protection of the specific application within the guest operating system, unlike an OS-level cluster. +1 for OS-level clustering.
- These same arguments also apply to VMware FT. VMware FT won’t protect you against guest operating system failure—a crash of the OS in the primary VM generally means a crash of the OS in the secondary VM at the same time—and it won’t protect you against application failure. +1 for OS-level clustering.
- You can’t fail over between systems using VMware HA or VMware FT in order to perform OS upgrades or apply OS patches. +1 for OS-level clustering.
- Similarly, you can’t fail over between systems using VMware HA or VMware FT in order to do a rolling upgrade of the application itself. +1 for OS-level clustering.
- Of course, the VMware technologies do have some advantages. Both VMware HA and VMware FT are far, far simpler to enable and configure than an OS-level cluster. +1 for VMware.
- Neither VMware HA nor VMware FT requires any application support in order to protect the VM and its workloads. +1 for VMware.
- Neither VMware HA nor VMware FT requires that you license specific editions of the guest operating system or application in order to use its benefits. +1 for VMware.
- VMware HA can produce higher levels of utilization within a host cluster than OS-level clustering can. +1 for VMware.
- VMware FT can provide higher levels of availability than what is available in most OS-level clustering solutions today. +1 for VMware.
This is not a knock against any of the technologies listed (VMware HA, VMware FT, or OS-level clustering) but rather an exploration of their advantages, disadvantages, similarities, and differences. Hopefully, this will help readers who might not be as familiar with these products make a more informed decision about which technologies to deploy in their data center. (Hint: You’ll probably need all of them.)
Tags: Microsoft, Virtualization, VMware, VMwareFT, VMwareHA, vSphere
-
Gabrie,
Is that really the case? I found http://communities.vmware.com/thread/206059 which doesn’t have any authoritative answer with references, but I would be inclined to think it is only one license required since the second instance is not readily usable without a failure event so it’s not unlike many other DR scenarios where you wouldn’t need multiple licenses.
cgb
-
Ah, OK, I can see how it can be parsed that way. I’d just suggest that HA doesn’t protect from an ESX failure at all. It _recovers_ from a failure by booting a VM elsewhere, which is why I’d expected you meant FT, because that really does protect you from an ESX failure by maintaining availability/uptime.
cgb
-
Scott,
By counting points on either side you make it sound like it’s an either/or choice between VMware HA/FT and OS-level clustering. I presume that this is not really the case and there is no reason why one can’t use both levels of protection?
VMware HA/FT would be there, in an N+1 configuration, to protect against physical tin failure, with OS-level clustering sitting above it to provide all the benefits you describe in your post?
-
Disclosure: I’m the product marketing manager at Stratus.
Or you could run VMware on a Stratus ftServer and eliminate all the additional hardware and software costs and get better than 5-nines uptime right out of the box on an Intel platform. Better yet, run VMware HA on the Stratus and you’ve got any unplanned hardware and OS downtime covered as well as individual VM/application. And VMotioning from a Stratus to any other x86/x64 is no different than anything else: Intel to Intel. Simpler, better SLA and better TCO. A winner all around.
-
Scott,
Great article, and very timely, as I have a customer who asked what the pluses and minuses are for clusters and VMware’s implementation of HA and FT.
But I have some general questions.
Given the way FT works, could a user turn off FT on a VM, snapshot or clone the VM, update the apps or OS, and then turn FT back on? This would provide some failback/availability value and still have the original VM running. I have not done this as a lab exercise, but shouldn’t this work in theory?
-
Well, guest OSes don’t cluster themselves by default. If HA can easily be configured to protect against guest OS failure, that should be good.
I think VMware should add an API/option through VMware Tools where a guest can declare itself to have failed at a certain point in time; VMware would then kill the VMX process and ensure that VMware HA/FT reacts accordingly.
There really should be an admin control too:
Right-click a VM and choose “Fail this VM” to ungracefully kill it (as if the host died) and allow HA to bring it back up somewhere else.
But for guest OS patching/maintenance, that’s what vMotion is for.
Failing intentionally as a routine procedure seems like a dangerous strategy, though granted, in a guest OS-level cluster it’s your only option.
-
You can already enable guest monitoring, which periodically sends a heartbeat probe to VMware Tools and listens for a response. If it doesn’t receive a response within a given threshold, it will power cycle the VM.
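The timeout logic described here can be sketched in a few lines. This is only an illustration of the general heartbeat-threshold idea, not VMware’s implementation: the class name, the `failure_interval` parameter, and the timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Toy model of heartbeat-based guest monitoring (hypothetical names)."""

    failure_interval: float      # seconds of silence tolerated before a reset
    last_heartbeat: float = 0.0  # timestamp of the most recent heartbeat

    def record_heartbeat(self, now: float) -> None:
        # Called whenever the guest answers a heartbeat probe.
        self.last_heartbeat = now

    def should_reset(self, now: float) -> bool:
        # The VM is treated as failed once the silence exceeds the threshold.
        return (now - self.last_heartbeat) > self.failure_interval


monitor = HeartbeatMonitor(failure_interval=30.0)
monitor.record_heartbeat(now=100.0)
still_healthy = not monitor.should_reset(now=120.0)  # 20 s of silence
needs_reset = monitor.should_reset(now=131.0)        # 31 s of silence
```

Note that a check like this only shows that the guest is still responding; it says nothing about whether a particular application inside the guest is healthy, which is the gap the post attributes to VM Failure Monitoring.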
-
A very clear article. Yesterday I had the same argument with someone. I disagreed that using FT creates higher uptime for an application. Take mail, for instance: my view is that you can create several levels of redundancy: (virtual) hardware, operating system, and application. As far as I can see, FT (and HA) only cover the hardware and operating system parts. These tools are not able to monitor individual mail (MS Exchange) services, like the information store.
For this, one could use MS clustering, where you can create dependent cluster resources. If the main goal is high availability of an application, then FT and HA aren’t sufficient to achieve it. Or am I missing a point?
-
Scott,
In reading VMware’s ‘Setup for Failover Clustering and Microsoft Cluster Service’, the vSphere MSCS Setup Limitations section states that none of the VMware high availability features (HA, FT, DRS or vMotion) are supported when using MSCS clustered virtual machines. Similar limitations exist with EMC AutoStart as well.
As a provider of turnkey systems that rely on having highly available applications, this seems to nullify most of the benefit of virtualization. Are there any options for providing application (Windows service) level fault tolerance that will allow for continued use of the VMware high availability features?
Thanks,
Rob.
-
Scott,
In your book, Mastering VMware vSphere 4 (pg 458-459), you seem to indicate that virtual machines in a clustered configuration are not valid candidates for VMotion, and they cannot be part of a DRS or HA cluster. True? One of your posts seems to indicate that you can use VMware HA with MSCS.
Thanks
David
-
Rob,
If you’re looking for a product that integrates with HA/FT and vMotion and that detects and automatically repairs Windows system and application failures, you might want to check out vAppHA:
http://www.neverfailgroup.com/virtualization/vapphatrial.html
-
What is the background of HA? How does it work? What are AAM, vpxa, and vmap? Please tell me.
-
Hi, I’m preux, pursuing a B.Tech. My final-year thesis is on the data stored in VMDK files.
First, here is what I have done so far. I created two different VMDK files:
1 -sparse.vmdk
2 -flat.vmdk
I opened them in a hex editor, which shows the hexadecimal structure of the VMDK files. Using the Virtual Disk Format 5.0 document, which is available on the VMware site, I learned something about the structure of a VMDK file.
The structure of the -flat VMDK is similar to that of a hard disk: I can easily see the MBR (master boot record), PBR (partition boot record), and $MFT (master file table) in it.
The sparse VMDK is quite different from the -flat one. Using the formula provided in Virtual Disk Format 5.0, I can see only the MBR and PBR; when I apply the same formula for the $MFT (master file table), I find nothing.
So my question is: how can I reach the master file table in -sparse.vmdk my way (reading byte by byte), so that I can complete my thesis and my degree?
Thanks in advance,
preux
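For readers following this question, the address translation that a sparse extent requires can be sketched as below. This is a minimal illustration under my reading of the hosted sparse extent layout (a grain directory pointing to grain tables, whose entries point to grains); it is not taken from the commenter’s work. The `grainSize` and `numGTEsPerGT` values come from the extent header, the function names are hypothetical, and the example values (128 sectors per grain, 512 entries per grain table) are assumptions to be checked against the actual header.

```python
from typing import NamedTuple, Optional


class GrainLocation(NamedTuple):
    gde_index: int        # which grain directory entry to read
    gte_index: int        # which entry inside that grain table
    sector_in_grain: int  # offset of the sector within its grain


def locate_sector(sector: int, grain_size: int, num_gtes_per_gt: int) -> GrainLocation:
    """Split a virtual sector number into directory, table, and grain offsets."""
    grain = sector // grain_size
    return GrainLocation(
        gde_index=grain // num_gtes_per_gt,
        gte_index=grain % num_gtes_per_gt,
        sector_in_grain=sector % grain_size,
    )


def physical_sector(gte_value: int, sector_in_grain: int) -> Optional[int]:
    """Map a grain table entry to a sector offset in the extent file.

    Assumption: an entry of 0 means the grain was never allocated,
    so there is nothing stored in the file for it.
    """
    if gte_value == 0:
        return None
    return gte_value + sector_in_grain


# Example with assumed header values; virtual sector 70000.
loc = locate_sector(70000, grain_size=128, num_gtes_per_gt=512)
```

If the layout really works this way, a structure that sits far into the virtual disk (as the $MFT often does) will not be found at the same byte offset as in a -flat file; each virtual sector has to be translated through the grain directory and grain table first.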
-
Doesn’t MSCS clustering require RDMs? And, so doesn’t this mean that VADP cannot be used for backups?


