everRun VM, VMware HA, and VMware FT

Having previously discussed Marathon Technologies’ everRun VM product in conjunction with XenServer HA as part of XenServer 5, I think I have some useful information to bring to the recent discussion of everRun VM vs. VMware HA vs. VMware FT.

Apparently, this discussion started with a blog entry by Marathon titled VMware FT – The Top Four Reasons it’s Kinda Sorta Fault Tolerance. Mike DiPetrillo of VMware responded from his personal blog, first tackling Marathon’s blog post and then again tackling comparisons posted on Marathon’s web site. Various others also weighed in, such as Duncan at Yellow Bricks and TechTarget’s Server Virtualization Blog.

I’ve spoken with the folks from Marathon a couple of different times about everRun and its functionality, so let me attempt to compare these three products—everRun VM, VMware HA, and VMware FT—with an eye toward understanding the differences between them.

  • Marathon everRun VM provides two levels of protection: Level 1 and Level 2. Level 1 is basic failover, and is included in XenServer 5 as XenServer HA. In this regard, it is essentially the same as VMware HA in that it will restart VMs in the event of host failure. Both products calculate available capacity for failover but do not reserve those resources in advance; hence, neither of them can provide guaranteed failover. VMware HA seems to have the upper hand here because admission control can actually prevent users from powering on a VM if there are not enough resources to provide failover for that VM. From all the information I have been able to obtain, everRun VM Level 1/XenServer HA lacks that ability, so it’s possible that users could power on more VMs than the resource pool could sustain in the event of hardware failure. Both products should be considered “best effort” as a result. Users wanting to make comparisons between Marathon everRun VM and VMware HA should constrain their comparison to everRun Level 1. Otherwise, it is not a like-for-like comparison.
  • Marathon everRun VM goes on to add Level 2 protection for component-level failure. It’s true that this level of protection exceeds anything that can be provided via VMware HA today. With component-level protection, I/O to or from a failed storage device or a failed network device is transparently redirected to another host, where an identical VM environment has been established. Please note that the two VM environments are not both executing at the same time, but that resources on the secondary host are reserved and cannot be used by any other VMs. These resources include not only RAM, but also storage and networking. If there is a host failure, the VM is restarted on the secondary host. Because resources were pre-allocated, everRun VM is able to provide guaranteed restart on the secondary host. The functionality provided by everRun VM when configured for Level 2 protection exceeds any functionality that VMware HA has today.
  • On the flip side, however, it’s also fair to note that VMware has not needed to provide component-level fault tolerance because they’ve supported storage multipathing and NIC teaming for quite some time. It’s my understanding that those features have only recently made it into the XenServer product line.
  • VMware Fault Tolerance (FT) and everRun VM Level 3 are comparable. Both establish an identical VM on another host and keep that VM “mirrored” with the original VM. If there is a host failure, the “mirrored” VM will automatically take over right where the primary was when it failed. It appears that everRun VM might have an edge here because it again supports component-level failover, but given that neither product is available yet it’s still a bit too early to be making calls on which product is “better”.
  • As for the “complexity” of one product versus the other, both have their own complexities. Marathon everRun VM requires a dedicated network link, called the “Availability Link”, in order to provide the component-level protection. I would assume the Availability Link will be needed for everRun VM Level 3 as well. That corresponds directly to VMware FT’s logging NIC. VMware HA does not require any special NICs or unique configurations; it’s unclear if the same is true for everRun VM Level 1/XenServer HA protection. I’ll have to call Marathon out on their knocks against setting up NIC teaming and storage multipathing; those tasks may be complicated in XenServer environments but are drop-dead simple in VMware ESX environments. The same goes for enabling VMware HA and VMware DRS.
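To make the admission-control point from the first bullet concrete, here is a minimal sketch in Python. It is a toy model only, not how VMware HA or XenServer HA actually calculate failover capacity: the `Host` class, the RAM-only accounting, the first-fit placement, and the "tolerate one host failure" rule are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical host model: only RAM is tracked, in GB."""
    name: str
    ram_gb: int
    vm_ram_gb: list = field(default_factory=list)  # RAM of each running VM

    @property
    def free_gb(self) -> int:
        return self.ram_gb - sum(self.vm_ram_gb)


def survives_any_single_host_failure(hosts) -> bool:
    """True if, for every host, its VMs would fit on the remaining hosts.

    Placement is first-fit, largest VM first: a rough approximation.
    """
    for failed in hosts:
        free = [h.free_gb for h in hosts if h is not failed]
        for vm in sorted(failed.vm_ram_gb, reverse=True):
            for i, capacity in enumerate(free):
                if vm <= capacity:
                    free[i] -= vm
                    break
            else:
                return False  # this VM has nowhere to restart
    return True


def power_on(hosts, target, vm_ram_gb, admission_control=True) -> bool:
    """Power on a VM on `target`; with admission control, refuse the
    request if it would leave the pool unable to absorb a host failure."""
    if vm_ram_gb > target.free_gb:
        return False
    target.vm_ram_gb.append(vm_ram_gb)
    if admission_control and not survives_any_single_host_failure(hosts):
        target.vm_ram_gb.pop()  # roll back the power-on
        return False
    return True


# Two 16 GB hosts, lightly loaded to start with.
a = Host("host-a", 16, [8])
b = Host("host-b", 16, [4])
pool = [a, b]

# With admission control, an 8 GB VM on host-b is refused: if host-a
# failed afterwards, its 8 GB VM would no longer fit on host-b.
print(power_on(pool, b, 8, admission_control=True))

# Without it, the power-on succeeds and the pool is overcommitted
# with respect to failover.
print(power_on(pool, b, 8, admission_control=False))
print(survives_any_single_host_failure(pool))
```

Note that even with the check in place nothing is reserved; the capacity is only calculated at power-on time. A pre-allocated secondary, as described for everRun VM Level 2, would instead set the resources aside on one specific host.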

As you can see, each product has its own set of strengths and weaknesses.

As a final note, as SearchServerVirtualization.com stated, comparisons between these two product sets are a bit irrelevant anyway: VMware’s functionality works only with VMware ESX environments, and Marathon’s functionality works only with XenServer. It’s not like users have to choose between them in the same virtualization environment.

I welcome everyone’s input and thoughts on this matter. Please contribute in the comments to this article.


  1. Duncan’s avatar

    You’re right that users don’t have to choose between them in the same environment, but in my opinion HA, along with DRS, is one of the key components of virtualization. I can imagine people would also choose a certain hypervisor based on DR features. And about FT and Level 3: they aren’t comparable. Maybe I misunderstood it, but Level 3 is only available for certain platforms (Windows only)… FT just synchronizes, and doesn’t care about the actual OS.

    I totally agree that both products have their own pros and cons. And I think it’s great that there’s finally some competition; this will only make the development of cool technology go faster than it’s going today.

  2. slowe’s avatar

    Duncan,

    I was not aware that everRun VM Level 3 protection was only for Windows-based VMs; do you have a link for that information? If that’s true, then it clearly tilts the balance toward VMware FT, which is guest OS-agnostic.

    I would also agree that the addition of DRS to a VMware FT-based solution is quite valuable, because the secondary VM can be intelligently placed anywhere within the cluster based on resource availability. I haven’t seen any indication that everRun can do the same; in fact, it looks like they work only with host pairs.

  3. Chuck’s avatar

    This is from the everRun VM whitepaper:
    PROTECT ANY WINDOWS APPLICATION
    everRun VM is application independent – it works
    with any Windows Server application. No cluster
    awareness needed. No customization required.

    It doesn’t say that it won’t work with another guest OS, but if it works anything like everRun FT it requires drivers inside the guest and possibly modifications to the HAL.

    We currently have multiple everRun FT installations, and the biggest limitation we’ve run into is that the guest VM can only use a single CPU (or core). I was excited to hear the VMware announcement because the services we’re using everRun for would benefit from SMP and could easily be virtualized; however, one of the comments on Mike D.’s blog seems to indicate that VMware FT has the same single-CPU limitation.

  4. Jack Pastor’s avatar

    Look … nobody is getting a free Tandem here. Regardless of who comes out first with true “system level” fault tolerance, we’re still talking about single virtual CPUs being protected, so if you’re planning on using either technology to protect your Exchange or SQL servers, plan on a wait, eh?

    I would not say VMware has any kind of upper hand (nor does XenServer), necessarily. VMware’s touted “memory over-commit”, one of the few features left that they can lord over Xen, makes restarting in HA more complex. XenServer’s “weakness” in dedicating static memory to VMs actually makes it far more predictable in deciding whether sufficient resources are available on the remaining hosts to start VMs from a failed host.

    This is still a “niche” feature to protect non-critical apps on low-powered VMs, but let’s look at reality. The world of virtualization is evolving rapidly, and also-rans like Xen and Hyper-V are gaining features far more rapidly than VMware expected.

    It appears that a lot of value-add is going to come from third parties, which might erode some of the price advantages of Citrix / Virtual Iron etc., but will certainly give the “kitchen-sink-of-features” approach of VMware a challenge among those who prefer the “best-of-breed” cliché over “one-throat-to-choke.”

  5. Mike DiPetrillo’s avatar

    Scott,

    Thanks for putting a “neutral” perspective on things. I think you’re spot on in your analysis. I also agree with Duncan on the value of DRS in helping to distribute load in the cluster so you can actually use VMware FT without impacting other loads.

    Keep up the good work on the blog!

  6. slowe’s avatar

    Chuck,

    I believe you are correct about the single vCPU limitation on VMs that will be protected by VMware FT.

    Jack,

    All virtualization solutions have their strengths and their weaknesses. VMware is not perfect and isn’t necessarily the right solution for every organization–and the same could be said of Hyper-V, XenServer, KVM, or Virtual Iron.

    Mike,

    If you’re in agreement with me, then I must not have been “neutral” enough! :-)

    Seriously, though, it’s important to recognize that everRun does have some advantages over VMware HA, and VMware HA has some advantages over everRun. Organizations need to choose the solution that best fits their needs.

  7. MTC’s avatar

    Great post and fair evaluation of both technologies. VMware is clearly the industry leader and has a great product; however, it is nice to see that customers have a choice in which solution they can deploy. Over time that gap will start to close with respect to feature sets.

    VMware FT is great technology and only validates the efforts and offerings of Marathon and Citrix. everRun FT will be very similar in nature compared to VMware FT, and as you stated, it just adds the dimension of component level re-direction and can be deployed without shared storage.

    Some customers may choose to tier their highly available applications and adopt a lower collapse ratio over a dense consolidation ratio per server, like they do with tiered storage today.

    Where component-level HA brings value is in those situations where customers are cost conscious and deploy on blade solutions or inexpensive rack-and-stack servers such as Dell 2950s, where real estate is a limitation in terms of slot count. A single failed component that represents a SPOF can cause a service-level outage of VMs without initiating an HA event. Many customers deploy on this class of server based on cost. There are simply not enough slots in some cases to team across multiple NICs, or you are in a trade-off where you deploy a single FC HBA. In that case even redundant FC switches do not help.

    Again, these are common deployments and I am not insinuating that IT people are ignorant for choosing a blade architecture. There are always trade-offs… fast, safe, or cheap, and companies are budget conscious in today’s market.

    Healthy competition is just plain a good thing for the customers and drives innovation on

  8. Virtualization Master’s avatar

    Hi,

    I have come across a great post and video on the upcoming VMware FT at http://www.virtualizationteam.com/virtualization-vmware/vmware-esx-40-ft-fault-tolerant-sneak-peek.html so check it out to find out about VMware’s upcoming surprise with VMware 4.0. It will be the best continuity feature available in the virtualization market. What do you think?

    Enjoy,
    Virtualization Master

  9. Choo’s avatar

    Right. BUT, VMware HA and FT need a SAN infrastructure, so the cost is higher than with everRun. Also, with VMware HA we must consider a VirtualCenter failure: then the VMs stop running.

  10. slowe’s avatar

    Choo, you are correct in that VMware HA and VMware FT require a SAN infrastructure. The rest of your statements, however, are incorrect.

    If vCenter Server (VirtualCenter Server) fails, VMs do *NOT* stop running. A VM won’t stop running unless the ESX/ESXi host upon which it is running fails. Additionally, once VMware HA is configured, the loss of vCenter Server won’t prevent VMware HA from operating, so that if an ESX/ESXi host fails, VMware HA *WILL* restart that host’s VMs on another host in the cluster.

    Thanks for your comment!

  11. chad king’s avatar

    Not that it may matter much now – but throw DRS FT in the mix now – lots of fun :-)

  12. Mike’s avatar

    Is there anyone that can tell me how to assign the target ID to a drive? It’s an FT setup, and I have one drive with the same ID as the other drive on the same server. I am not able to do a mirror copy. I’m stuck… HP says it’s Marathon, Marathon says it’s HP… honestly, I’m tired of dealing with both. Any help would be greatly appreciated.

    Thanks,