OTV and VXLAN Layer 3 Connectivity Compared

Building large-scale L2 networks, including stretched L2 networks, seems to be all the rage these days, driven in part by virtual machine mobility (aka vMotion in VMware vSphere environments or XenMotion in Citrix XenServer environments). While this isn’t always a good idea—some might say it’s never a good idea—it is still something that many organizations are evaluating.

With the announcement of VXLAN at VMworld 2011, a new question seems to have arisen: can I use VXLAN instead of (insert some other protocol here) to create my stretched L2 networks? In this post, I’d like to compare the use of VXLAN with OTV (Overlay Transport Virtualization) for that very purpose. Of course, since VXLAN hasn’t actually been released, the discussion is partially theoretical.

My primary focus in this post will be how each of these protocols handles traffic patterns in the course of addressing the need for L2 connectivity over routed L3 networks.

First, let’s look at VXLAN. The figure below is taken from my revised L3 connectivity with VXLAN post, which I encourage you to read for more details.

As you can see, once a VM inside a VXLAN segment is migrated to a new network, the traffic “trombones” back and forth across the VXLAN segment because all traffic has to pass through a single vShield Edge (VSE) instance. This brings up a key limitation of VXLAN that I think is important to point out: VXLAN has an innate dependency on VSE, and VSE cannot be made redundant. That’s right—you can’t have VSE-specific failover functionality; instead, you have to rely on vSphere HA, VM Monitoring, and other features. That means failover times in the minutes, not seconds. What do you think that will do to network connections?
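To make the tromboning concrete, here is a rough sketch that counts how many times a single request/response exchange crosses the data center interconnect when every routed packet has to pass through one gateway. The function name, the site labels, and the simplified "client, gateway, VM" model are my own assumptions for illustration; none of this is part of VXLAN or VSE.

```python
def dci_crossings(client_site: str, gateway_site: str, vm_site: str) -> int:
    """Count interconnect crossings for one request plus its response.

    Simplified, hypothetical model: every routed packet must pass through
    the single gateway, and a crossing happens whenever two consecutive
    hops sit in different sites.
    """
    # Request path: client -> gateway -> VM
    request = int(client_site != gateway_site) + int(gateway_site != vm_site)
    # Response path: VM -> gateway -> client
    response = int(vm_site != gateway_site) + int(gateway_site != client_site)
    return request + response


# VM migrated to site B, talking to a client also in site B,
# but the only gateway is still back in site A.
tromboned = dci_crossings(client_site="B", gateway_site="A", vm_site="B")

# Same conversation if a gateway were available locally in site B.
local = dci_crossings(client_site="B", gateway_site="B", vm_site="B")

print(tromboned, local)
```

In this toy model, two endpoints sitting in the same site still cross the interconnect four times per exchange when the only gateway lives in the other site, and not at all when a local gateway is available.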

Now, let’s compare VXLAN’s L3 connectivity with OTV. First, here’s a diagram to show connectivity with OTV before a VM is migrated to the second site:

No real surprises here. Note that a typical OTV deployment following “recommended practices” will use redundant Nexus 7000 switches, as shown in the diagram. That’s a key advantage that OTV has over VXLAN: redundancy is easily built into the solution, with failover times in the seconds (or better).

Now, take a look at the post-migration traffic flows with OTV:

In case you didn’t notice it, let me point out the obvious: note the lack of traffic tromboning here. Here’s how it’s accomplished (and documented in this blog post by Ron Fuller, aka @ccie5851 or VDCBadger to his friends):

  • Each Nexus 7000 pair runs HSRP.
  • The HSRP hello packets are filtered (blocked) from the OTV interfaces. This keeps the HSRP pairs in each data center from knowing about the pair in the other data center.
  • Each HSRP pair runs the same virtual IP (the default gateway for the 10.1.1.0/24 subnet).

In this configuration, once the VM migrates to the second site the HSRP pair at the second site won’t need to send traffic across the OTV link to reach the migrated VM. This appears to be a significant advantage to OTV—a greater knowledge of the routing topology allows OTV to be more intelligent about how traffic should be directed across/around the network.
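The effect of filtering the HSRP hellos can be sketched with a small model. This is only an illustration under stated assumptions: `HsrpPair`, `active_gateway_site`, the site names, and the priority values are hypothetical, and real HSRP elections involve more than a priority comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HsrpPair:
    """Hypothetical stand-in for the Nexus 7000 pair at one site."""
    site: str
    priority: int


def active_gateway_site(vm_site: str, pairs: list, hellos_filtered: bool) -> str:
    """Return the site whose HSRP pair answers for the shared virtual IP."""
    if hellos_filtered:
        # Hellos never cross the OTV link, so each pair stays active
        # locally and the VM uses the gateway in its own site.
        for pair in pairs:
            if pair.site == vm_site:
                return pair.site
    # Without filtering, all routers see each other's hellos and only one
    # is active for the virtual IP; the highest priority wins.
    return max(pairs, key=lambda p: p.priority).site


pairs = [HsrpPair(site="dc1", priority=110), HsrpPair(site="dc2", priority=100)]

# VM has migrated to dc2.
print(active_gateway_site("dc2", pairs, hellos_filtered=True))
print(active_gateway_site("dc2", pairs, hellos_filtered=False))
```

With filtering in place the migrated VM's default gateway is local to its new site; without it, the VM would keep sending routed traffic back across the OTV link to the single active pair.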

<aside>Of course, this doesn’t address L3 routing concerns from subnets not directly attached to the Nexus 7000 pairs. For that, we’d need something like LISP.</aside>

As I see it—and networking experts are welcome to jump in if I’m mistaken—this gives OTV two key advantages over VXLAN:

  1. OTV, because it is running on physical networking equipment, is more intelligent than VXLAN about how traffic is directed/routed in/around/across a network. This can mean more efficient utilization of a data center interconnect because of reduced “traffic tromboning.”
  2. OTV, because it is running on physical networking equipment, can provide better redundancy and faster failover than VXLAN (which relies on single instances of VSE).

It’s entirely possible that if VXLAN ever makes it into physical network equipment, these advantages of OTV will be nullified.

It’s also important to point out that while OTV and VXLAN have some overlap in functionality, they are partially targeted at solving different problems. While both protocols address L2 connectivity across L3 networks, VXLAN also addresses the exhaustion of the VLAN ID space in larger networks (especially service provider networks). This is an issue that OTV does not try to address. However, it seems to me that OTV would co-exist better with a solution like Q-in-Q, which could (as far as I can tell) address the VLAN ID exhaustion issue.
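For a sense of scale on the VLAN ID exhaustion issue, the arithmetic can be sketched as follows. The field widths are the standard ones (a 12-bit 802.1Q VLAN ID, a 24-bit VXLAN network identifier); the variable names are mine, and the Q-in-Q figure simply assumes every outer/inner tag combination is usable, which a real deployment may not allow.

```python
VLAN_ID_BITS = 12      # 802.1Q VLAN ID field width
VXLAN_VNI_BITS = 24    # VXLAN network identifier field width

# VLAN IDs 0 and 4095 are reserved, leaving 4094 usable VLANs.
usable_vlans = 2 ** VLAN_ID_BITS - 2

# VXLAN segment identifiers available from a 24-bit field.
vxlan_segments = 2 ** VXLAN_VNI_BITS

# Q-in-Q stacks two 802.1Q tags; assume every outer/inner pair is usable.
qinq_combinations = usable_vlans ** 2

print(f"802.1Q VLANs:   {usable_vlans:,}")
print(f"VXLAN segments: {vxlan_segments:,}")
print(f"Q-in-Q pairs:   {qinq_combinations:,}")
```

On raw identifier count alone, Q-in-Q and VXLAN land in the same range of roughly 16 million segments, which is why the two come up together when VLAN ID exhaustion is the concern.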

Once again, I encourage network experts to chime in and share their views. If I’ve misstated something, please let me know. Questions, thoughts, and comments are always welcome.


  1. Antonio

    I think OTV and VXLAN have different uses. OTV helps extend L2 between data centers, and VXLAN extends L2 between VM hosts.

  2. Brad Hedlund

    Hi Scott,
    An interesting thought came to mind after reading this..
    If you decide to implement VXLAN, you have precluded yourself from stretching the L2 segments within VXLAN to another DC using a physical network technology like OTV or VPLS, because there is no bridging mechanism available yet between VXLAN and a physical LAN.

    Those who choose to implement VXLAN will have the app stack active in one data center, and use the more proven and robust LB + GSLB method for failing applications over to another data center.

    Why on earth would anyone have an app stack fragmented between two data centers, facilitated by vmotion, and all the L2 interconnect complexities that come with it? That’s a rhetorical question btw, you don’t need to answer that. :-)

    Cheers,
    Brad
    (Dell Force10)

  3. Wade Holmes

    Hi Scott,

    I don’t really think your comparison is accurate, as you are mixing the features of a product available today (VSE), and basing the architectural comparison of a released product with integration with the unknown futures of an unreleased technology (VXLAN). Unless you are basing the comparison on the capabilities of VSE when integrated with VXLAN in the future (which is publicly unknown at this time), the VXLAN, OTV comparison is null and void.

  4. slowe

    Antonio, as I pointed out in the post, I recognize that VXLAN and OTV cannot be directly compared, since there are things that VXLAN was designed to do (like extend the VLAN address space) that OTV was not designed to do.

    Brad, you are—of course—correct. Since there is not (yet) termination of VXLAN segments on physical devices, the use of VXLAN and OTV/VPLS simultaneously for the same VLANs isn’t something organizations will be able to accomplish.

    Wade, I respectfully disagree. You’ll note in the article that I explicitly stated this discussion is partially theoretical since VXLAN has not yet been released. That being said, the mechanics of how we expect VXLAN to work are not likely to change between now and actual release, and therefore this discussion *is* valid. Since VMware has not seen fit to discuss any potential future features of VSE (such as redundancy), that information cannot be incorporated into this discussion. When VXLAN is finally released, we can revisit this discussion and see how the situation has changed at that time. Until then, or until more information is available, customers still need to make decisions about technology choices, and discussions like this are necessary. Thanks!

  5. dj

    Great info.

    I would be interested in adding open standards such as VPLS or the upcoming E-VPN (formerly MAC-VPN) to this comparison. I am never really all that thrilled with having to deploy proprietary standards, especially ones that seem to be supported on only certain models/cards in a vendor’s portfolio.

    Plus, MPLS-based technology provides TE and sub-second failover, which I believe are nice features.

  6. Ryan B

    Is VSE really a requirement for VXLAN, or can you put any multihomed VM with routing functionality in its place?

  7. Wade Holmes

    I agree that these discussions, and bringing to light technology and design considerations that should be made when evaluating VXLAN are healthy. The only part I had issue with is coming to a conclusion based on theoretical information. Until theory is made practice, no real conclusion can be made.

  8. Duncan

    Nice article Scott. VMware is working on addressing some of the concerns mentioned in your article. I would suggest you communicate these to the appropriate people within our organization to ensure they are aware and can be correctly prioritized.

    Thanks,

  9. slowe

    Excellent. Duncan, it’s good to know that VMware is working to address some of the concerns I mentioned in my article. If you would be so kind as to introduce me to the appropriate contacts within VMware, I’d be happy to discuss these issues with them. Thanks!

  10. Massimo

    Scott, I won’t embark on a “what co-exists better with what” discussion, but I believe one of the limitations of QinQ is that it doesn’t support duplicated MAC addresses… which may be a problem in particular environments where multiple layer 2 segments are created to clone workloads in a test/dev scenario.

    Happy to be corrected if that is not the case but that’s what I remember off the top of my head.

    Cheers.

    Massimo.

  11. slowe

    Massimo, I believe you are correct in that Q-in-Q would not address duplicate MAC addresses—an issue that VXLAN *will* address, if I’m not mistaken. Hence my statement that while there are differences between these protocols, those differences are (in part, at least) due to the fact that they strive to address different sets of problems. Thanks for your comment!

  12. Andre Leibovici

    Scott,

    Nice write-up, as always. I completely get the technical viewpoint; however, I wonder what the licencing costs for both products would look like.

    According to http://etherealmind.com/nexus-7000-discount-otv-license-nxos/ the cost per N7K chassis would be ~USD$40000. Total of USD$160000

    Could this be a decision point for smaller orgs to adopt stretched clusters?

    Andre

  13. slowe

    Andre, there is no question that licensing and acquisition costs will be a factor to consider. Thanks for pointing that out!

  14. Jon Hudson

    Keep in mind that VXLAN is VERY early.

    And while I agree OTV is more mature, for many it’s dead on arrival since being pulled from the IETF standards process in early 2011. (personally very bummed by this)

    And while VXLAN has no charter yet (or didn’t last I checked), they are at least trying to create a standards-based solution in the IETF.

    I’ll always choose an A- solution that is standards-based over a proprietary A+ solution.

    Jon

    @the_socialist
    (Brocade pays my bills, they do not however practice mind control)

  15. Kulin Shah

    Scott, time to revisit the comparative discussion (is there even one?) now that VxLAN is supported on physical devices (Arista next-gen 7100 series switches, Brocade FPGA-based VDX) that can potentially be placed not only as ToR for bare-metal servers and appliances but also act as a gateway switch between Data Centers.

    I am guessing this would address each of the customer concerns that OTV was addressing, plus do so in an open, multi-vendor, and more scalable manner.

  16. Jon Hawks

    It’s nice to see a stake in the ground to spark conversations that reap rounded opinions and expand knowledge. My experience with OTV is that the Nexus core in data center #1 controls the VLANs running across OTV in data center #2. In the implementation, data center #2 didn’t ever know about those OTV VLANs. The servers ran under control of data center #1, the users didn’t know where the servers lived, and it ran as though we had pulled a wire from dc#1 to dc#2, just to take advantage of dc#2’s space and power. If dc#1 died, the devices running on OTV VLANs were stranded until a network administrator intervened.

    Now, given the slowness of configuring physical network devices (some that even have virtual ‘somethings’ built in them, i.e. firewalls, load-balancers, switches), we certainly need to put an abstraction layer on them to speed up provisioning, provided we don’t muck up troubleshooting. Faced with such slowness, I know that if I sponsored massive VM farms of any flavor, I’d want to get clear of the physical limits and all those hard-fought and embedded rules for routing, access control, and so on. It would sell my VM farm if I could abstract from any/all physical network limits, put on my own hypervisor for networking, and play all I wanted to, simply. Of course all those special rules for security may not be the same, but as long as I used the same names and labels on my own security facade, I might do well. Ok, so how then do I make active-active data center designs work? How do I impose disaster failover vs recovery, assuming that failover is instant and recovery is days?

    Coming out of the box I may not have it all at-the-ready, but there may be a chance that I can put a stretched vlan out there and just like that little slider on a weight-scale in the locker room, I can move my vlan anywhere and weigh-in, so to speak.

    I wonder if the underlying design of VxLAN is to simply accommodate the abstraction layer (much like a hypervisor provides) that is necessary to get out of the racket of physical network administration. It gets its own broadcast domain that you can apparently push through the entire multi-tenant network and across data centers. So, if I can bury my layer 2 vlans inside a layer 3 and carry them from point to point, I avoid collisions with tenants who like using the same rfc 1918 space. Hmmm – sounds like an internal method of using GRE tunnel theory.

    In the instant-gratification world of I.T., where demands are highest, I’d love to see someone a lot smarter than I take hold of the basic networking requirements imposed by IPv4 and work through the many ways to leverage layer 2 and layer 3 vlans in a multi-tenant environment that provides the speed and agility requirements of instantaneous and perfect network solutions, using only a few factoids handed to them.

    What else is out there?
    Nicira may help but needs to be proven when put to the test, especially, trouble-shooting.
    Should we attempt to think? Make our own solutions based on existing vendor capabilities? EoMPLS? GREs? OTV maybe?

    Need some better perspectives here. Mostly, for a virtualized machine environment that needs to be active-active across a pair of data centers.

    My preference would be to have three data centers at-the-ready, with the same footprint for hosts, storage and network, and inter-linked by pairs of 10 Gbps trunked links. I want to be able to move and host anywhere, anytime, at will. What kind of network does this well?

  17. slowe

    Kulin, thanks for your comment. (And great to meet you in person at VMworld!) There is no question that this post—now about 9 months old—needs to be revisited as hardware-based VTEPs for VXLAN are now becoming available. As you rightfully point out, this does change the picture somewhat and makes VXLAN and OTV more comparable. Of course, the real question is what happens to VXLAN and the hardware VTEPs when the IETF NVO3 group finishes their work on an overlay standard…

    Jon Hawks, OTV certainly has its drawbacks (every technology does). As I mentioned to Kulin above, the answer to many of your questions will, I think, be found in a standardized IETF protocol that provides the overlay encapsulation that both OTV and VXLAN provide today in nonstandard form. The adoption of an IETF standard will encourage interoperability among hypervisors and hardware vendors, something that doesn’t exist at all today (to my knowledge).

  18. Jon Hudson

    We have a LOT to do in a _very_ short period of time. However, I am very optimistic we are on the right track, and the chairs have been kicking a$$ and keeping this moving lightning fast (especially for a standards group). Many people are really standing out and doing some great work.

    Also keep in mind that there is nothing keeping us from incorporating as much of VXLAN into NVo3 as makes sense. There has even been talk of providing some method of perhaps supporting multiple encapsulation methods. There are a lot of good opportunities to come up with something that really makes a difference.

    For those who attend VMworld SF this year you could almost taste VXLAN in the air. So much excitement. Good demos from folks like Arista and Brocade (who pays my bills). And I think you are going to see multiple products from each company providing VTEP support in hardware as time goes on. We are VERY early in this game, where we sit in 5 years will be interesting.

    Jon Hawks, you bring up some great points. And I will say on a personal note, I’m quite bummed OTV didn’t continue on a standards track. Doing so did not guarantee multi vendor support (see lisp :-/) but it would have been nice to give customers more choice.

    And I REALLY encourage you guys (and everyone reading) to follow the mailing list and add value where you think you can. We’ve seen good folks like @bradhedlund jump in on occasion, and as I said earlier, Ivan has become a frequent and welcomed voice on the list. http://datatracker.ietf.org/wg/nvo3/charter/

    The more viewpoints we get the better NVo3 will be. We HAVE to get a standard that will work across all hypervisors and all vendors. This current VXLAN, STT, NVGRE mess we are in doesn’t help anyone.

    As you may be able to tell, I’m a wee bit excited =)