
Welcome to Technology Short Take #41, the latest in my series of random thoughts, articles, and links from around the Internet. Here’s hoping you find something useful!

Networking

  • Network Functions Virtualization (NFV) is a networking topic that is starting to get more and more attention (some may equate “attention” with “hype”; I’ll allow you to draw your own conclusion there). In any case, I liked how this article really hit upon what I personally feel is something many people are overlooking in NFV. Many vendors are simply rushing to provide virtualized versions of their solution without addressing the orchestration and automation side of the house. I’m looking forward to part 2 on this topic, in which the author plans to share more technical details.
  • Rob Sherwood, CTO of Big Switch, recently published a reasonably in-depth look at “modern OpenFlow” implementations and how they can leverage multiple tables in hardware. Some good information in here, especially on OpenFlow basics (good for those of you who aren’t familiar with OpenFlow).
  • Connecting Docker containers to Open vSwitch is one thing, but what about using Docker containers to run Open vSwitch in userspace? Read this.
  • Ivan knocks centralized SDN control planes in this post. It sounds like Ivan favors scale-out architectures over the scale-up architectures typically seen in centralized control plane deployments.
  • Looking for more VMware NSX content? Anthony Burke has started a new series focusing on VMware NSX in pure vSphere environments. As far as I can tell, Anthony is up to 4 posts in the series so far. Check them out here: part 1, part 2, part 3, and part 4. Enjoy!

Servers/Hardware

  • Good friend Simon Seagrave is back to the online world again with this heads-up on a potential NIC issue with an HP ProLiant firmware update. The post also contains a link to a fix for the issue. Glad to see you back again, Simon!
  • Tom Howarth asks, “Is the x86 blade server dead?” (OK, so he didn’t use those words specifically. I’m paraphrasing for dramatic effect.) The basic premise of Tom’s position is that new technologies like server-side caching and VSAN/Ceph/Sanbolic (turning direct-attached storage into shared storage) will dramatically change the landscape of the data center. I would generally agree, although I’m not sure that I agree with Tom’s statement that “complexity is reduced” with these technologies. I think we’re just shifting the complexity to a different place, although it’s a place where I think we can better manage the complexity (and perhaps mask it). What do you think?


Cloud Computing/Cloud Management

  • Juan Manuel Rey has launched a series of blog posts on deploying OpenStack with KVM and VMware NSX. He has three parts published so far; all good stuff. See part 1, part 2, and part 3.
  • Kyle Mestery brought to my attention (via Twitter) this list of the “best newly-available OpenStack guides and how-to’s”. It was good to see a couple of Cody Bunch’s articles on the list; Cody’s been producing some really useful OpenStack content recently.
  • I haven’t had the opportunity to use SaltStack yet, but I’m hearing good things about it. It’s always helpful (to me, at least) to be able to look at products in the context of solving a real-world problem, which is why seeing this post with details on using SaltStack to automate OpenStack deployment was helpful.
  • Here’s a heads-up on a potential issue with the vCAC upgrade—the upgrade apparently changes some configuration files. The linked blog post provides more details on which files get changed. If you’re looking at doing this upgrade, read this to make sure you aren’t adversely affected.
  • Here’s a post with some additional information on OpenStack live migration that you might find useful.

Operating Systems/Applications

  • RHEL7, Docker, and Puppet together? Here’s a post on just such a use case (oh, I forgot to mention OpenStack’s involved, too).
  • Have you ever walked through a spider web because you didn’t see it ahead of time? (Not very fun.) Sometimes I feel that way with certain technologies or projects—like there are connections there with other technologies, projects, trends, etc., that aren’t quite “visible” just yet. That’s where I am right now with the recent hype around containers and how they are going to replace VMs. I’m not so sure I agree with that just yet…but I have more noodling to do on the topic.

Storage

  • “Server SAN” seems to be the name that is emerging to describe various technologies and architectures that create pools of storage from direct-attached storage (DAS). This would include products like VMware VSAN as well as projects like Ceph and others. Stu Miniman has a nice write-up on Server SAN over at Wikibon; if you’re not familiar with some of the architectures involved, that might be a good place to start. Also at Wikibon, David Floyer has a write-up on the rise of Server SAN that goes into a bit more detail on business and technology drivers, friction to adoption, and some recommendations.
  • Red Hat recently announced they were acquiring Inktank, the company behind the open source scale-out Ceph project. Jon Benedict, aka “Captain KVM,” weighs in with his thoughts on the matter. Of course, there’s no shortage of thoughts on the acquisition—a quick web search will prove that—but I find it interesting that none of the “big names” in storage social media had anything to say (not that I could find, anyway). Howard? Stephen? Chris? Martin? Bueller?

Virtualization

  • Doug Youd pulled together a nice summary of some of the issues and facts around routed vMotion (vMotion across layer 3 boundaries, such as across a Clos fabric/leaf-spine topology). It’s definitely worth a read (and not just because I get mentioned in the article, either—although that doesn’t hurt).
  • I’ve talked before—although it’s been a while—about Hyper-V’s choice to rely on host-level NIC teaming in order to provide network link redundancy to virtual machines. Ben Armstrong talks about another option, guest-level NIC teaming, in this post. I’m not so sure that using guest-level teaming is any better than relying on host-level NIC teaming; what’s really needed is a more full-featured virtual networking layer.
  • Want to run nested ESXi on vCHS? Well, it’s not supported…but William Lam shows you how anyway. Gotta love it!
  • Brian Graf shows you how to remove IP pools using PowerCLI.

Well, that’s it for this time around. As always, I welcome all courteous comments, so feel free to share your thoughts, ideas, rants, links, or feedback in the comments below.


Welcome to Technology Short Take #28, the first Technology Short Take for 2013. As always, I hope that you find something useful or informative here. Enjoy!

Networking

  • Ivan Pepelnjak recently wrote a piece titled “Edge and Core OpenFlow (and why MPLS is not NAT)”. It’s an informative piece—Ivan’s stuff is always informative—but what really drew my attention was his mention of a paper by Martin Casado, Teemu Koponen, and others that calls for a combination of MPLS and OpenFlow (and an evolution of OpenFlow into “edge” and “core” versions) to build next-generation networks. I’ve downloaded the paper and intend to review it in more detail. I’d love to hear from any networking experts who’ve read the paper—what are your thoughts?
  • Speaking of Ivan…it also appears that he’s quite pleased with Microsoft’s implementation of NVGRE in Hyper-V. Sounds like some of the other vendors need to get on the ball.
  • Here’s a nice explanation of CloudStack’s physical networking architecture.
  • The first fruits of Brad Hedlund’s decision to join VMware/Nicira have shown up in this joint article by Brad, Bruce Davie, and Martin Casado describing the role of network virtualization in the software-defined data center. (It doesn’t matter how many times I say or write “software-defined data center,” it still feels like a marketing term.) This post is fairly high-level and abstract; I’m looking forward to seeing more detailed and in-depth posts in the future.
  • Art Fewell speculates that the networking industry has “lost our way” and become a “big bag of protocols” in this article. I do agree with one of the final conclusions that Fewell makes in his article: that SDN (a poorly-defined and often over-used term) is the methodology of cloud computing applied to networking. Therefore, SDN is cloud networking. That, in my humble opinion, is a more holistic and useful way of looking at SDN.
  • It appears that the vCloud Connector posts (here and here) that (apparently) incorrectly identify VXLAN as a component/prerequisite of vCloud Connector have yet to be corrected. (Hat tip to Kenneth Hui at VCE.)

Servers/Hardware

Nothing this time around, but I’ll watch for content to include in future posts.

Security

  • Here’s a link to a brief (too brief, in my opinion, but perhaps I’m just being overly critical) post on KVM virtualization security, authored by Dell TechCenter. It provides some good information on securing the libvirt communication channel.

Cloud Computing/Cloud Management

  • Long-time VMware users probably remember Mike DiPetrillo, whose website has now, unfortunately, gone offline. I mention this because I’ve had this article on RabbitMQ AMQP with vCloud Director sitting in my list of “articles to write about” for a while, but some of the images were missing and I couldn’t find a link for the article. I finally found a link to a reprinted version of the article on DZone Enterprise Integration. Perhaps the article will be of some use to someone.
  • Sam Johnston talks about reliability in the cloud with a discussion on the merits of “reliable software” (software designed for failure) vs. “unreliable software” (more traditional software not designed for failure). It’s a good article, but I found the discussion between Sam and Massimo (of VMware) to be equally useful.

Operating Systems/Applications

Storage

  • Want some good details on the space-efficient sparse disk format in vSphere 5.1? Andre Leibovici has you covered right here.
  • Read this article for good information from Andre on a potential timeout issue with recomposing desktops and using the View Storage Accelerator (aka context-based read cache, CRBC).
  • Apparently Cormac Hogan, aka @VMwareStorage on Twitter, hasn’t gotten the memo that “best practices” is now outlawed. He should have named this series on NFS with vSphere “NFS Recommended Practices”, but even misnamed as they are, the posts still have useful information. Check out part 1, part 2, and part 3.
  • If you’d like to get a feel for how VMware sees the future of flash storage in vSphere environments, read this.

Virtualization

  • This is a slightly older post, but informative and useful nevertheless. Cormac posted an article on VAAI offloads and KAVG latency when observed in esxtop. The summary of the article is that the commands esxtop is tracking are internal to the ESXi kernel only; therefore, abnormal KAVG values do not represent any sort of problem. (Note there’s also an associated VMware KB article.)
  • More good information from Cormac here on the use of the SunRPC.MaxConnPerIP advanced setting and its impact on NFS mounts and NFS connections.
  • Another slightly older article (from September 2012) is this one from Frank Denneman on how vSphere 5.1 handles parallel Storage vMotion operations.
  • A fellow IT pro contacted me on Twitter to see if I had any idea why some shares on his Windows Server VM weren’t working. As it turns out, the problem is related to hotplug functionality; the OS sees the second drive as “removable” due to hotplug functionality, and therefore shares don’t work. The problem is outlined in a bit more detail here.
  • William Lam outlines how to use new tagging functionality in esxcli in vSphere 5.1 for more comprehensive scripted configurations. The new tagging functionality—if I’m reading William’s write-up correctly—means that you can configure VMkernel interfaces for any of the supported traffic types via esxcli. Neat.
  • Chris Wahl has a nice write-up on the behavior of Network I/O Control with multi-NIC vMotion traffic. It was pointed out in the comments that the behavior Chris describes is documented, but the write-up is still handy, and an important factor to keep in mind in your designs.

I suppose I should end it here, before this “short take” turns into a “long take”! In any case, courteous comments are always welcome, so if you have additional information, clarifications, or corrections to share regarding any of the articles or links in this post, feel free to speak up below.


Welcome to Technology Short Take #15, the latest in my irregular series of posts on various articles and links on networking, servers, storage, and virtualization—everything a growing data center engineer needs!

Networking

My thoughts this time around are pretty heavily focused on VXLAN, which continues to get lots of attention. I talked about posting a dissection of VXLAN, but I have failed miserably; fortunately, other people smarter than me have stepped up to the plate. Here are a few VXLAN-related posts and articles I’ve found over the last couple of weeks:

  • There is a three-part series over at Coding Relic that does a great job of explaining VXLAN, the components of VXLAN, and how it works. Here are the links to the series: part 1, part 2, and part 3. One note of clarification: in part 3 of the series, Denny talks about a VTEP gateway. Right now, the VTEP gateway is the server itself; anytime a packet on a VXLAN-enabled network leaves the physical server to go to a different physical server, it will be VXLAN-encapsulated. It won’t be decapsulated until it hits the destination VTEP (the ESXi server hosting the destination VM). If (when?) VXLAN awareness hits physical switches, then the possibility of a VTEP gateway existing outside the server exists. Personally, it kind of makes sense—to me, at least—to build VTEP gateway functionality into vShield Edge.
  • Some people aren’t quite so enamored with VXLAN; one such individual is Greg Ferro. I respect Greg a great deal, so it was interesting to me to read his article on why VXLAN is “full of fail”. Some of his comments are only slightly related to VXLAN (the rant over IEEE vs. IETF, for example), but Greg’s comment about VMware building a new standard instead of “leveraging the value of networking infrastructure” echoes some of my own thoughts. I understand that VXLAN accomplishes things that existing standards apparently do not, but was a new standard really necessary?
  • Omar Sultan of Cisco took the time to compile some questions and answers about VXLAN. One thing that is made more clear—for me, at least—in Omar’s post is the fact that VXLAN doesn’t address connectivity to the vApps from the “outside” world. While VXLAN provides a logical isolated network segment that can span multiple Layer 3 networks and allow applications to communicate with each other, VXLAN doesn’t address the Layer 3 addressing that must exist outside the VXLAN tunnel. In fact, in my discussions with some of the IETF draft authors at VMworld, they indicated that VXLAN would require a NAT device or a DNS update in order to address changes in externally-accessible applications. This, by the way, is why you’ll still need technologies like OTV and LISP (or their equivalents); see this post for more information on how VXLAN, OTV, and LISP are complementary. If I’m wrong, please feel free to correct me.
  • In case you’re still unclear about the key problem that VXLAN attempts to address, this quote from Ivan Pepelnjak might help (the full article is here):

    VXLAN tries to solve a very specific IaaS infrastructure problem: replace VLANs with something that might scale better. In a massive multi-tenant data center having thousands of customers, each one asking for multiple isolated IP subnets, you quickly run out of VLANs.

  • Finally, you might find this PDF helpful. Ignore the first 13 slides or so; they’re marketing fluff, to be honest. However, the remainder of the slides have some useful information on VXLAN and how it’s expected to be implemented.
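
If the encapsulation mechanics are still fuzzy, here’s a rough sketch in Python of what a VTEP does on egress: prepend an 8-byte VXLAN header, carrying a 24-bit VNI, to the VM’s original Ethernet frame before handing it to UDP/IP. (The field layout follows the IETF draft; the UDP port and addresses are illustrative, not prescriptive.)

```python
import struct

VXLAN_UDP_PORT = 4789  # the port IANA later assigned; early implementations varied

def vxlan_header(vni):
    """Build the 8-byte VXLAN header: an 'I' flag marking the VNI as
    valid, reserved bits, and the 24-bit VXLAN Network Identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("the VNI is a 24-bit field")
    flags = 0x08  # only the I bit is defined; everything else is reserved
    # flags byte + 3 reserved bytes, then VNI (3 bytes) + 1 reserved byte
    return struct.pack("!B3x", flags) + struct.pack("!I", vni << 8)

def encapsulate(inner_frame, vni):
    """What a VTEP does on egress: prepend the VXLAN header to the VM's
    original Ethernet frame. The outer UDP/IP/Ethernet headers are added
    by the host's own stack, which is what lets the segment cross Layer 3
    boundaries while the VMs inside still see a single L2 network."""
    return vxlan_header(vni) + inner_frame

# 2^24 VNIs vs. 4,094 usable VLAN IDs: the scaling argument in a nutshell
assert 2 ** 24 > 4094
```

Note that the outer IP header belongs to the VTEP, not the VM, which is exactly why the VXLAN segment can span Layer 3 networks, and why external (north-south) connectivity still needs to be solved separately, as discussed above.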

Servers/Hardware

I didn’t really stumble across anything strictly server hardware-related; either I’m just not plugged into the right resources (anyone want to make some recommendations?) or it was just a quiet period. I’ll assume it was the former.


Virtualization

  • Did you see this post about new network simulation functionality in VMware Workstation 8?
  • Here’s a good walk-through on setting up vMotion across multiple network interfaces.
  • VMware vSphere Design co-author Maish Saidel-Keesing has a post here on how to approximate the functionality of netstat on ESXi.
  • William Lam has a “how to” on installing the VMware VSA with running VMs.
  • Fellow vSpecialist Andre Leibovici did a write-up on a proof of concept that the vSpecialists did for a customer involving Vblock, VPLEX, and VDI. This was a pretty cool use case, in my opinion, and worth having a look if you need to design a highly available environment.
  • Thinking about playing with vShield 5? That’s a good idea, but check here to learn from the mistakes of others first. You’ll thank me later.
  • The question of defragmenting guest OS disks has come up again and again; here’s the latest take from Cormac Hogan of VMware. He makes some great points, but I suspect that this question is still far from settled.

It’s time to wrap up now; I hope that you found something useful. As always, thanks for reading! Feel free to share your views or thoughts in the comments below.


Beth Pariseau recently published an article discussing the practical value of long-distance vMotion, partially in response to EMC’s announcement of VPLEX Geo at EMC World 2011. In that article, Beth quotes some text from a tweet I posted as well as some text from Chad Sakac’s recent post on VPLEX Geo. However, there are a couple of inaccuracies in Beth’s article that I really feel need to be cleared up:

  1. Long-distance vMotion and stretched clusters are not the same thing.
  2. L2 adjacency for virtual machines is not the same as L2 adjacency for the vMotion interfaces.

Regarding point #1, in her article, Beth implies that Chad’s statement “Stretched vSphere clusters over [long] distances are, as of right now, still not supported” is a statement that long-distance vMotion is not supported. Long-distance vMotion, over distances with latencies of less than 5 ms round trip time (RTT), is fully supported. What’s not supported is a stretched cluster, which is not a prerequisite for long-distance vMotion (as I pointed out in the stretched clusters presentation Beth also referenced). If you want to do long-distance vMotion, you don’t need to set up a stretched cluster, so statements of support for stretched clusters cannot be applied as statements of support for long-distance vMotion. Let’s not confuse the two, as they are separate and distinct.

Regarding point #2, L2 adjacency for virtual machines (VMs) is absolutely necessary for long-distance vMotion. As I explained here, it is possible to use a Layer 3 protocol to handle the actual VMkernel (vMotion) traffic, but the VMs themselves still require Layer 2 adjacency. If you don’t maintain a single Layer 2 domain for the VMs, then the VMs would have to change their IP addresses on a live migration. That’s REALLY BAD, and it completely breaks live migration. Once again, these are two separate and distinct requirements: it’s the VMs’ networking, not the vMotion interfaces, that drives the need for large L2 domains.
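
To make the “IP addresses can’t change” point concrete: when a VM does move within a single L2 domain, the destination host simply announces the VM’s unchanged MAC and IP so that switches and peers update their tables (ESX actually sends a RARP frame after a vMotion; the gratuitous ARP sketch below, with hypothetical addresses, illustrates the same idea):

```python
import struct

def gratuitous_arp(mac, ip):
    """Build an ARP announcement in which sender and target are the same
    host. Broadcast on the (shared) Layer 2 segment, it lets every switch
    and peer relearn where the VM's unchanged MAC/IP now lives."""
    mac_b = bytes.fromhex(mac.replace(":", ""))
    ip_b = bytes(int(octet) for octet in ip.split("."))
    # htype=Ethernet, ptype=IPv4, hlen=6, plen=4, oper=1 (request)
    header = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    return header + mac_b + ip_b + mac_b + ip_b  # sender == target

# Hypothetical VM that just landed on a new host, same IP as before
announcement = gratuitous_arp("00:50:56:aa:bb:cc", "192.0.2.10")
```

None of this works if the VM lands in a different subnet: the announcement would be for an address that isn’t valid there, which is the failure mode described above.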

Am I off? Speak your mind in the comments below.


vMotion, its practicality, and its Layer 2 adjacency requirements are topics I’ve visited a few times over the last several months. The trend got kicked off with a post on vMotion reality, in which I attempted to debunk an article claiming vMotion was only a myth. The series continued with a discussion of the practicality of vMotion, where I again discussed the various Layer 2 requirements for vMotion.

In the vMotion practicality article, reader Paul Pindell, an employee of F5 Networks, discusses the networking requirements for vMotion. To quote from his comment:

Notice that there is no requirement for the vMotion VMkernel interfaces of the ESX(i) hosts to have what was termed Layer 2 adjacency. The vMotion VMkernel interfaces are not required to be on the same subnet, VLAN, nor on the same L2 broadcast domain. The vMotion traffic on VMkernel interfaces is routable. IP-based storage traffic on VMkernel interfaces is also routable. Thus there are no L2 adjacency requirements for vMotion to succeed.

I was intrigued by this statement, so I contacted Duncan Epping (of Yellow Bricks fame) and discussed the matter with him. Duncan has also posted on this topic on his site; both his post and my post are the result of our discussion and collaboration on the matter.

So is Layer 2 adjacency for vMotion a requirement, or not? In the end, the answer is that Layer 2 adjacency for VMkernel interfaces configured for vMotion is not required; vMotion over a Layer 3 interface will work. The caveat is that routed vMotion, as it has sometimes been called, hasn’t been through the extensive VMware QA process and therefore is not yet supported. (Please don’t mistake my use of the word “yet” as any sort of commitment by VMware to actually support it.)

In summary, then: vMotion across two different subnets will, in fact, work, but it’s not yet supported by VMware. As additional information becomes available—as Duncan indicated, the VMware KB article is going to be updated to avoid misunderstanding—I’ll update this post accordingly.
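
To put “two different subnets” in concrete terms: two vMotion-enabled VMkernel interfaces are adjacent, in the sense discussed here, when they land in the same IP subnet. A quick sketch using Python’s ipaddress module (the addresses and prefix length are hypothetical):

```python
import ipaddress

def same_subnet(ip_a, ip_b, prefix_len):
    """True if two vMotion VMkernel IPs fall in the same subnet, i.e.
    their traffic would not need to cross a router to reach each other."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b

# Two hosts on the same vMotion VLAN: the supported configuration
assert same_subnet("10.0.37.11", "10.0.37.12", 24)
# Hosts in different subnets: "routed vMotion" territory -- it works,
# but (as noted above) it hasn't been through VMware QA
assert not same_subnet("10.0.37.11", "10.0.38.11", 24)
```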


About a month ago I posted an article titled The vMotion Reality. In that article, I challenged the assertion that vMotion was a myth, not widely implemented in data centers today due to networking complexity and networking limitations.

In a recent comment to that article, a reader again mentions networking limitations—in this instance, the realistic limits of a large Layer 2 broadcast domain—as a gating factor for vMotion and limiting its “practical use” in today’s data centers. Based on the original article’s assertions and the arguments found in this comment, I’m beginning to believe that a lot of people have a basic misunderstanding of the Layer 2 adjacency requirements for vMotion and what that really means. In this post, I’m going to discuss vMotion’s Layer 2 requirements and how they interact (and don’t interact) with other VMware networking functionality. My hope is that this article will provide a greater understanding of this topic.

First, I’d like to use a diagram to help explain what we’re discussing here. In the diagram below, there is a single ESX/ESXi host with a management interface, a vMotion interface, and a couple of network interfaces dedicated to virtual machine (VM) traffic.

VLAN Behaviors with VMware ESX/ESXi

As you can see from this highly-simplified diagram, there are three basic types of network interfaces that ESX/ESXi uses: management interfaces, VMkernel interfaces, and VM networking interfaces. Each of them is configured separately and, in many configurations, quite differently.

For example, the management interface (in VMware ESX this is the Service Console interface, in VMware ESXi this is a separate instance of a VMkernel interface) is typically configured to connect to an access port with access to a single VLAN. In Cisco switch configurations, the interface configuration would look something like this:

interface GigabitEthernet 1/1
  switchport mode access
  switchport access vlan 27
  spanning-tree portfast

There might be other commands present, but these are the basic recommended settings. This makes the management interfaces on an ESX/ESXi host act, look, feel, and behave exactly like the network interfaces on pretty much every other system in the data center. Although VMware HA/DRS cluster design considerations might lead you toward certain Layer 2 boundaries, there’s nothing stopping you from putting management interfaces from different VMware ESX/ESXi hosts in different VLANs (and thus different Layer 3 subnets) and still conducting vMotion operations between them.

So, if it’s not the management interfaces that are gating the practicality of vMotion in today’s data centers, it must be the VM networking interfaces, right? Not exactly. Although the voices speaking up against vMotion in this online discussion often cite VLAN and Layer 3 addressing concerns as the primary problem with vMotion—stating, rightfully so, that the IP address of a migrated virtual machine cannot and should not change—these individuals are overlooking the recommended configuration for VM networking interfaces in a VMware ESX/ESXi environment.

Physical network interfaces in a VMware ESX/ESXi host that will be used for VM networking traffic are most commonly configured to act as 802.1Q VLAN trunks. For example, the Cisco switch configuration for a port connected to a network interface being used for VM networking traffic would look something like this:

interface GigabitEthernet1/10
  switchport trunk encapsulation dot1q
  switchport mode trunk
  switchport trunk allowed vlan 1-499
  switchport trunk native vlan 499
  spanning-tree portfast trunk

As before, some commands might be missing or some additional commands might be present, but this gives you the basic configuration. In this configuration, the VM networking interfaces are actually capable of supporting multiple VLANs simultaneously. (More information on the configuration of VLANs with VMware ESX/ESXi can be found in this aging but still very applicable article.)
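
For readers who’d rather see the frame format than the switch config: a trunk port carries frames from many VLANs at once, distinguished only by a 4-byte 802.1Q tag inserted after the source MAC. Here’s a simplified sketch in Python (the field layout follows 802.1Q; the MACs and payload are placeholders):

```python
import struct

TPID = 0x8100  # 802.1Q Tag Protocol Identifier

def tag_frame(dst_mac, src_mac, vlan_id, ethertype, payload):
    """Insert an 802.1Q tag after the source MAC, the way a vSwitch tags
    a VM's frame before sending it out a trunked uplink."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1-4094")
    tci = vlan_id  # priority and DEI bits left at zero for simplicity
    return (dst_mac + src_mac
            + struct.pack("!HH", TPID, tci)   # the 4-byte 802.1Q tag
            + struct.pack("!H", ethertype)
            + payload)

# Frames for two different VLANs share the same physical uplink;
# only the tag differs (placeholder MACs and payload)
frame_v100 = tag_frame(b"\xff" * 6, b"\x00" * 6, 100, 0x0800, b"\x00" * 46)
frame_v200 = tag_frame(b"\xff" * 6, b"\x00" * 6, 200, 0x0800, b"\x00" * 46)
```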

In practical use what this means is that any given VMware ESX/ESXi host could have VMs running on it that exist on a completely different and separate VLAN than the management interface. In fact, any given VMware ESX/ESXi host might have multiple VMs running across multiple separate and distinct VLANs. And as long as the ESX/ESXi hosts are properly configured to support the same VLANs, you can easily vMotion VMs between ESX/ESXi hosts when those VMs are in different VLANs. So, the net effect is that ESX/ESXi can easily support multiple VLANs at the same time for VM networking traffic, and this means that vMotion’s practical use isn’t gated by some inherent limitation of the VM networking interfaces themselves.

Where in the world, then, does this Layer 2 adjacency thing keep coming from? If it’s not management interfaces (it isn’t), and it’s not VM networking interfaces (it isn’t), then what is it? It’s the VMkernel interface that is configured to support vMotion. In order to vMotion a VM from one ESX/ESXi host to another, each host’s vMotion-enabled VMkernel interface has to be in the same IP subnet (i.e., in the same Layer 2 VLAN or broadcast domain). Going back to Cisco switch configurations, a vMotion-enabled VMkernel port will be configured very much like a management interface:

interface GigabitEthernet 1/5
  switchport mode access
  switchport access vlan 37
  spanning-tree portfast

This means that the vMotion-enabled VMkernel port is just an access port (no 802.1Q trunking) in a single VLAN.

Is this really a limitation on the practical use of vMotion? Hardly. In my initial rebuttal of the claims against vMotion, I pointed out that because it is only the VMkernel interface that must share Layer 2 adjacency, all this really means is that a vMotion domain is limited to the number of vMotion-enabled VMkernel interfaces you can put into a single Layer 2 VLAN/broadcast domain. VMs are unaffected, as you saw earlier, as long as the VM networking interfaces are correctly configured to support the appropriate VLAN tags, and the management interfaces do not, generally speaking, have any significant impact.

Factor in the consolidation ratio, as I did in my earlier post, and you’ll see that it’s possible to support very large numbers of VMs spread across as many different VLANs as you want with a small number of ESX/ESXi hosts. Consider 200 ESX/ESXi hosts—whose vMotion-enabled VMkernel interfaces would have to share a broadcast domain—with a consolidation ratio of 12:1. That’s 2,400 VMs that can be supported across as many different VLANs simultaneously as we care to configure. Do you want 400 of those VMs in one VLAN while 300 are in a different VLAN? No problem; you only need to configure the physical switches and the virtual switches appropriately.

Let’s summarize the key points here:

  1. Only the vMotion-enabled VMkernel interface needs to have Layer 2 adjacency within a data center.
  2. Neither management interfaces nor VM networking interfaces require Layer 2 adjacency.
  3. VM networking interfaces are easily capable of supporting multiple VLANs at the same time.
  4. When the VM networking interfaces on multiple ESX/ESXi hosts are identically configured, VMs on different VLANs can be migrated with no network disruption even though the ESX/ESXi hosts themselves might all be in the same VLAN.
  5. Consolidation ratios multiply the number of VMs that can be supported with Layer 2-adjacent vMotion interfaces.

Based on this information, I hope it’s clearer that vMotion is, in fact, quite practical for today’s data centers. Yes, there are design considerations that come into play, especially when it comes to long-distance vMotion (which requires stretched VLANs between multiple sites). However, this is still an emerging use case; the far broader use case is within a single data center. Within a single data center, all the key points I provided to you apply, and there is no practical restriction to leveraging vMotion.

If you still disagree (or even if you agree!), feel free to speak up in the comments. Courteous and professional comments (with full disclosure, where applicable) are always welcome.


The quite popular GigaOm site recently published an article titled “The VMotion Myth”, in which the author, Alex Benik, debunks the myth of vMotion and the live migration of virtual machines (VMs).

In his article, Benik states that the ability to dynamically move workloads around inside a single data center or between two data centers is, in his words, “far from an operational reality today”. While I’ll grant you that inter-data center vMotion isn’t the norm, vMotion within a data center is very much an operational reality of today. I believe that Benik’s article is based on some incorrect information and incomplete viewpoints, and I’d like to clear things up a bit.

To quote from Benik’s article:

Currently, moving a VM from a one physical machine to another has two important constraints. First, both machines must share the same storage back end, typically a Fibre Channel/iSCSI SAN or network-attached storage. Second, the physical machines must reside in the same VLAN or subnet. This means that inside a single data center, one can only move a VM across a relatively small number of physical machines. Not exactly what the marketing guys would have you believe.

Benik is correct in that shared storage is required in order to use vMotion. The use of shared storage in VMware environments is quite common for this very reason. Note also that shared storage is required in order to leverage other VMware-specific functionality such as VMware High Availability (HA), VMware Distributed Resource Scheduler (DRS), and VMware Fault Tolerance (FT).

On the second point—regarding VLAN requirements—Benik is not entirely correct. You see, by their very nature VMware ESX/ESXi hosts tend to have multiple network interfaces. The majority of these network interfaces, in particular those that carry the traffic to and from VMs, are what are called trunk ports. Trunk ports are special network interfaces that carry multiple VLANs at the same time. Because these network interfaces carry multiple VLANs simultaneously, VMware ESX/ESXi hosts can support VMs on multiple VLANs simultaneously. (For a more exhaustive discussion of VLANs with VMware ESX/ESXi, see this collection of networking articles I’ve written.)

But that’s only part of the picture. It is true that one specific type of interface on a VMware ESX/ESXi host must reside in the same VLAN on all the hosts between which live migration is required: the VMkernel interface that has been enabled for vMotion. Unless these VMkernel interfaces share Layer 2 connectivity on the same VLAN, vMotion cannot take place. I suppose it is upon this requirement that Benik bases his statement.
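To make these two networking requirements concrete, here’s a minimal sketch (hypothetical names, not VMware’s API) of when a live migration is possible: the hosts’ vMotion VMkernel interfaces must share a VLAN, and the VM’s own VLAN must be carried by the trunk ports on both hosts.

```python
# Hypothetical model of the vMotion networking requirements described above.
# Names and structure are illustrative only, not VMware's actual API.

from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    vmotion_vlan: int                               # VLAN of the vMotion-enabled VMkernel interface
    trunk_vlans: set = field(default_factory=set)   # VLANs carried by the VM-facing trunk ports


def can_vmotion(vm_vlan: int, src: Host, dst: Host) -> bool:
    """A VM can migrate if both hosts' vMotion VMkernel interfaces are on the
    same VLAN *and* both hosts' trunk ports carry the VM's own VLAN."""
    return (src.vmotion_vlan == dst.vmotion_vlan
            and vm_vlan in src.trunk_vlans
            and vm_vlan in dst.trunk_vlans)


esx1 = Host("esx1", vmotion_vlan=50, trunk_vlans={10, 20, 30})
esx2 = Host("esx2", vmotion_vlan=50, trunk_vlans={10, 20, 30})
esx3 = Host("esx3", vmotion_vlan=60, trunk_vlans={10, 20, 30})

print(can_vmotion(20, esx1, esx2))  # True: shared vMotion VLAN, trunks carry VLAN 20
print(can_vmotion(20, esx1, esx3))  # False: vMotion VMkernel interfaces on different VLANs
```

Note that the VM’s VLAN and the vMotion VLAN are independent of each other, which is exactly why hosts can support VMs on many VLANs while sharing a single vMotion VLAN.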

However, the requirement that a group of physical systems share a common VLAN on a single interface is hardly a limitation. First, let’s assume that you are using an 8:1 consolidation ratio; this is an extremely conservative ratio considering that most servers have 8 cores (thus resulting in a 1:1 VM-to-core ratio). Still, assuming an 8:1 consolidation ratio and a maximum of 253 VMware ESX/ESXi hosts on a single Class C IP subnet, that’s enough physical systems to drive over 2,000 VMs (2,024 VMs, to be precise). And remember that these 2,024 VMs can be distributed across any number of VLANs themselves because the network interfaces that carry their traffic are capable of supporting multiple VLANs simultaneously. This means that the networking group can continue to use VLANs for breaking up broadcast domains and segmenting traffic, just like they do in the physical world.

Bump the consolidation ratio up to 16:1 (still only 2:1 VM-to-core ratio on an 8 core server) and the number of VMs that can be supported with a single VLAN for vMotion is over 4,000. How is this a limitation again? Somehow I think that we’d need to pay more attention to CPU, RAM, and storage usage before being concerned about the fact that vMotion requires Layer 2 connectivity between hosts. And this isn’t even taking into consideration that organizations might want multiple vMotion domains!
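The arithmetic above is simple enough to sketch, assuming 253 usable host addresses on a Class C subnet as described:

```python
# Back-of-the-envelope capacity of a single vMotion VLAN, per the
# paragraphs above: hosts on one Class C subnet times VMs per host.

def vms_per_vmotion_domain(hosts: int, consolidation_ratio: int) -> int:
    return hosts * consolidation_ratio


print(vms_per_vmotion_domain(253, 8))   # 2024 VMs at a conservative 8:1 ratio
print(vms_per_vmotion_domain(253, 16))  # 4048 VMs at 16:1
```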

Clearly, the “limitation” that a single interface on each physical host share a common VLAN isn’t a limitation at all. Yes, it is a design consideration to keep in mind. But I would hardly consider it a limitation and I definitely don’t think that it’s preventing customers from using vMotion in their data centers today. No, Mr. Benik, there’s no vMotion myth—only vMotion reality.


In early April when I traveled to Houston to meet with HP about the ProLiant G6 servers—a trip which resulted in this blog post—I got into a discussion with representatives from HP and Intel regarding the Xeon 5500 CPUs and Enhanced VMotion Compatibility (EVC).

As you probably already know, EVC builds upon hardware extensions provided by AMD and Intel (AMD in the AMD-V extensions, Intel in FlexMigration) that allow a CPU to “alter” its personality or identification so as to allow live migration (i.e., VMotion) between CPUs that would otherwise be incompatible. Without these hardware extensions, you would have had to resort to unsupported CPU masking, which I’ve also discussed (here’s a summary of articles I’ve written on the topic). With the hardware extensions and EVC, the hypervisor can now take over the process of creating custom CPU masks that take all CPUs in the cluster down to the “lowest common denominator” (i.e., EVC masks off all features except those that are common to all CPUs in the cluster).
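The “lowest common denominator” behavior can be illustrated with a simple set intersection; the feature names below are illustrative examples, not exact CPUID flag names:

```python
# Illustrative sketch of EVC's "lowest common denominator" masking: the
# guest-visible CPU feature set for the cluster is the intersection of
# what every host offers. Feature names are examples, not real CPUID flags.

cluster_features = {
    "older-xeon-host": {"sse2", "sse3", "ssse3"},
    "xeon-5400-host":  {"sse2", "sse3", "ssse3", "sse4.1"},
    "xeon-5500-host":  {"sse2", "sse3", "ssse3", "sse4.1", "sse4.2"},
}

# EVC masks each CPU down to the features common to all hosts in the cluster.
evc_baseline = set.intersection(*cluster_features.values())
print(sorted(evc_baseline))  # ['sse2', 'sse3', 'ssse3']
```

In this sketch the newer SSE4.x instructions disappear from the baseline because the oldest host lacks them, which is precisely the concern I raised with HP and Intel below.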

Ed Haletky provides a good overview of VMware EVC, in case you need more information.

Anyway, that’s enough background information. When I was in Houston, HP and Intel were discussing all the great new features that were available in the Xeon 5500. Knowing that introducing new CPU features generally meant introducing VMotion boundaries (like I describe here), I asked this question: “What happens to all these new features when you put a Xeon 5500-based server into a cluster with older servers and use EVC to guarantee VMotion compatibility?” My thought was that EVC would downgrade the Xeon 5500 by masking new features, thus negating many of the performance benefits offered by the Xeon 5500.

Obviously, the architectural changes introduced by the Xeon 5500 (specifically, QuickPath Interconnect) wouldn’t be affected, but what about Extended Page Tables (EPT)? Or VMDq? Or FlexPriority?

There were folks there from HP and Intel, and they were all stumped. They had to call in a few experts, but finally we got an answer from an Intel engineer on the VMware Alliance Team who was able to provide more information:

… The only features that would be disabled are some new instructions that are accessible to end-user applications living in Ring 3. For NHM-EP those would be the SSE4.2 and SSE4.1 instructions… No other features would be compromised. All the features of NHM-EP that provide performance and are under the direct control of the OS or hypervisor in Ring 0 will be fully usable: EPT, HT, NUMA, FlexPriority, VMDq, VT-x, VT-d, etc.

So there you have it: you can mix Xeon 5500-based servers in with older servers running previous generations of Xeon CPUs without sacrificing the performance benefits gained from the Xeon 5500.


Over the last few years I’ve written a few articles on CPU masking within VMware environments. To make them easier to locate, I wanted to bring them all together here. So, here are the links to these articles:

Sneaking Around VMotion Limitations
VMotion Compatibility
More on CPU Masking

Keep in mind that almost all forms of CPU masking are unsupported by VMware, so use this information at your own risk. However, in non-production environments that don’t support Enhanced VMotion Compatibility (EVC), this may be your only option.
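Conceptually, a manual CPU mask works like a bitwise AND applied to the feature bits a host CPU reports. Here’s a minimal sketch with made-up bit positions, not real CPUID leaves:

```python
# Conceptual sketch of manual CPU masking: the guest-visible feature
# register is the host's register ANDed with a user-supplied mask.
# Bit positions are invented for illustration; real CPUID leaves differ.

HOST_FEATURES = 0b1111_0110   # bits the physical CPU actually reports
CUSTOM_MASK   = 0b1100_0110   # mask hides the two "newer" feature bits

guest_visible = HOST_FEATURES & CUSTOM_MASK
print(f"{guest_visible:08b}")  # 11000110
```

EVC automates essentially this kind of masking across a whole cluster, which is why manual masking is mostly relevant only where EVC isn’t available.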


My irregular “Virtualization Short Takes” series was put on hold some time ago after I started work on Mastering VMware vSphere 4. Now that work on the book is starting to wind down just a bit, I thought it would be a good time to try to resurrect the series. So, without further delay, welcome to the return of Virtualization Short Takes!

  • Triggered by a series of blog posts by Arnim van Lieshout on VMware ESX memory management (Part 1, Part 2, and Part 3), Scott Herold decided to join the fray with this blog post. Both Scott’s post and Arnim’s posts are good reading for anyone interested in getting a better idea of what’s happening “under the covers,” so to speak, when it comes to memory management.
  • Perhaps prompted by my post on upgrading virtual machines in vSphere, a lot of information has come to light regarding the PVSCSI driver. Some are advocating changes to best practices to incorporate the PVSCSI driver, but others seem to be questioning the need to move away from a single drive model (a necessary move since PVSCSI isn’t supported for boot drives). Personally, I just want VMware to support the PVSCSI driver on boot drives.
  • Eric Sloof confirms for us that name resolution is still the Achilles’ Heel of VMware High Availability in VMware vSphere.
  • I don’t remember where I picked up this VMware KB article, but it sure would be handy if VMware could provide more information about the issue, such as what CPUs might be affected. Otherwise, you’re kind of shooting in the dark, aren’t you?
  • Upgraded to VMware vSphere, and now having issues with VMotion? Thanks to VMwarewolf, this pair of VMware KB articles (here and here) might help resolve the issue.
  • Chad Sakac of EMC, co-conspirator for the storage portion of Mastering VMware vSphere 4 (pre-order here), has been putting out some very good posts:
  • Leo Raikhman pointed me to this article about IRQ sharing between the Service Console and the VMkernel. I think I’ve mentioned this issue here before…but after more than 1,000 posts, it’s hard to keep track of everything. In any case, there’s also a VMware KB article on the matter.
  • And speaking of Leo, he’s been putting up some great information too: notes on migrating Ubuntu servers (in turn derived from these notes by Cody at ProfessionalVMware), a rant on CDP support in ESX, and a note about the EMC Storage Viewer plugin. Good work, Leo!
  • If you are interested in a run-down of the storage-related changes in VMware vSphere, check out this post from Stephen Foskett.
  • Rick Vanover notes a few changes to the VMFS version numbers here. The key takeaway is that no action is required, but you may want to plan some additional tasks after your vSphere upgrade to optimize the environment.
  • In this article, Chris Mellor muses on how far VMware may go in assimilating features provided by their technology partners. This is a common question; many people see the addition of thin provisioning within vSphere as a direct affront to array vendors like NetApp, 3PAR, and others who also provide thin provisioning features in the array themselves. I’m not so convinced that this feature is as competitive as it is complementary. Perhaps I’ll write a post about that in the near future…oh wait, never mind, Chad already did!
  • File this one away in the “VMware-becoming-more-like-Microsoft” folder.
  • My occasional mentions of Crossbow prompted a full-on explanation of the Open Networking functionality of OpenSolaris by a Sun engineer. It kind of looks like SR-IOV and VMDirectPath to me…sort of. Don’t you think?
  • If you are thinking about how to incorporate HP Virtual Connect Flex-10 into your VMware environment, Frank Denneman has some thoughts to share. I’ve been told by HP that I have some equipment en route with which I can do some additional testing (the results of which will be published here, of course!), but I haven’t seen it yet.
  • OK, I guess that should just about do it. Thanks for reading, and please share your thoughts, interesting links, or (pertinent) rants in the comments.

