Networking

Recent Cisco Product Launch

Cisco recently launched a number of new products and new versions of products aimed at showing Cisco’s dedication and innovation in network switching. You can get Cisco’s summary of the products here. I’ll start with the hardware-focused announcements.

Hardware-Focused Announcements

Cisco’s announcements on the hardware side primarily centered on new switching capabilities in the form of 40/100 Gigabit Ethernet (GE) cards for the Nexus 7000 series platform and a 40 GE card for the Catalyst 6500 series. (The 40 GE card for the Catalyst 6500 series does require the newer Supervisor Engine 2T, however.)

<aside>Did you know that the 40/100 GE specification, 802.3ba, was ratified in June 2010? I didn’t realize it had been ratified that long ago.</aside>

For the Nexus 7000 series, Cisco unveiled two cards:

  • The M2-Series 6-port 40 Gigabit Ethernet Module with XL Option (how’s that for a mouthful?) sports, as the name suggests, 6 non-blocking 40 GE ports. With 16 of these blades in the 18-slot Nexus 7018, that provides up to 96 ports of 40 GE connectivity. The “XL Option” in the name of the card enables the card—in conjunction with the Scalable Feature License—to support more IPv4 routes (up to 1 million, depending on several factors), more IPv6 routes (up to 350,000, again depending on several factors), and more access control list (ACL) entries. These increased route and ACL limits could be useful in environments with multiple Virtual Routing and Forwarding (VRF) or multiple Virtual Device Context (VDC) instances. Based on the Cisco data sheet, it looks like this card can work with either Fabric-1 or Fabric-2 modules, although you’ll need Fabric-2 modules for the most throughput.
  • The M2-Series 2-port 100 Gigabit Ethernet Module with XL Option provides two non-blocking 100 GE ports. The “XL Option” functions in the same way as with the 6-port 40 GE card, and it does work with both Fabric-1 and Fabric-2 modules. Here’s the official Cisco data sheet.

Based on the data sheet, it looks like both of these cards will require version 6.1 of NX-OS, which—to my knowledge—is a brand-new release. (Version 6.0 of NX-OS for the Nexus 7000 series was released in late December.)

One interesting note about the 40 GE card is that it supports the use of a breakout cable that allows a single 40 GE port to support four 10 GE connections. This is true for both the Nexus 7000 and Catalyst 6500 cards, as far as I can tell. (Cisco references a FourX connector for the Catalyst 6500 card, but does not reference the same connector for the Nexus blade, instead simply mentioning a “breakout cable”.)
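The port counts quoted above are simple multiplication, and a quick sketch makes the breakout-cable implication concrete. The constants come from the figures in this post; the assumption that every 40 GE port in the chassis could be broken out at the same time is mine and is not confirmed by the data sheet.

```python
# Port-count arithmetic using the figures quoted in the post.
PORTS_PER_M2_40GE_CARD = 6   # 6-port 40 GE M2-Series module
CARDS_IN_NEXUS_7018 = 16     # "16 of these blades in the 18-slot Nexus 7018"
BREAKOUT_FANOUT = 4          # one 40 GE port -> four 10 GE connections


def chassis_ports(cards: int, ports_per_card: int) -> int:
    """Total front-panel ports for a chassis full of identical cards."""
    return cards * ports_per_card


forty_ge_ports = chassis_ports(CARDS_IN_NEXUS_7018, PORTS_PER_M2_40GE_CARD)

# Assumption (not from the data sheet): every port uses a breakout cable.
ten_ge_via_breakout = forty_ge_ports * BREAKOUT_FANOUT

print(forty_ge_ports, ten_ge_via_breakout)
```

This reproduces the 96-port figure and suggests an upper bound of 384 10 GE connections, if breakout were usable on every port.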

Cisco also introduced the Catalyst 4500-X, a fixed configuration 10 GE aggregation switch. They mentioned “40 GE readiness,” but it’s not clear when 40 GE uplinks will make their way to this particular platform.

Rounding out the hardware announcements was the Nexus 3064-X, a follow-up to the low-latency Nexus 3000 series of switches introduced some time ago. This version offers lower power consumption and additional reductions in switching latency. The specific switching latency reductions were not specified anywhere that I found.

Software-Focused Announcements

The software-focused announcements were primarily centered around the Nexus 1000V/1010 and a new feature called Easy Virtual Network (EVN).

  • A new version of the Nexus 1000V was announced that supports VXLAN (Virtual Extensible LAN). I’ve discussed VXLAN extensively (see here), as have others in the industry, so I won’t rehash that again.
  • Also of note is that the Cisco Virtual Security Gateway (VSG) will offer zone-based firewalling services to VMs on VXLAN segments.
  • The Nexus 1010-X is a “beefed up” version of the Nexus 1010 (yes, I know this is technically a hardware product), intended to support more virtual networks. See here for more information on the differences between the Nexus 1010 and the Nexus 1010-X.
  • Easy Virtual Network (EVN) is the most interesting of the software-based announcements (to me, at least). Cisco touts EVN as “fully compatible with established standards, including Multiprotocol Label Switching (MPLS), MPLS VPN over IP (multipoint generic routing encapsulation, also known as mGRE), Multi-Virtual Route Forwarding (also known as VRF-Lite), and others.” (More information available here.) However, it appears that EVN doesn’t actually use any of these mechanisms. In fact, it’s unclear to me exactly what mechanism EVN does use. The data sheet (linked above) mentions the use of a VNET tag, and indicates that the VNET tag is stored in the 802.1q VLAN ID field. This looks like Cisco is creating yet another Layer 3 VPN solution, instead of leveraging existing solutions. Why not add VXLAN support to the hardware switches instead of creating EVN? Maybe I’m completely missing the mark here…feel free to correct me (courteously, of course!).
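For anyone who wants to picture where a VNET tag would sit, here is a minimal sketch of the standard 802.1Q tag layout: a 16-bit TPID of 0x8100 followed by a 16-bit field holding a 3-bit priority, a 1-bit drop-eligible flag, and the 12-bit VLAN ID. The function names are hypothetical, and this only illustrates the generic tag format. It makes no claim about how EVN itself builds or interprets the tag.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def build_dot1q_tag(vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Pack a 4-byte 802.1Q tag: TPID, then PCP(3) | DEI(1) | VID(12)."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= pcp <= 7 or dei not in (0, 1):
        raise ValueError("PCP must be 0-7 and DEI must be 0 or 1")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_dot1q_tag(tag: bytes) -> dict:
    """Unpack a 4-byte 802.1Q tag into its three fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"pcp": tci >> 13, "dei": (tci >> 12) & 1, "vid": tci & 0xFFF}


tag = build_dot1q_tag(vid=101, pcp=5)
print(tag.hex(), parse_dot1q_tag(tag))
```

Whatever identifier is carried this way is limited to the same 12 bits as an ordinary VLAN ID.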

Why This Matters to You

All these product announcements and new product versions are pretty cool, but you might be wondering, “Why does this matter to me?” Good question. Here are my thoughts:

  • The 40 GE and 100 GE cards are important in data centers where we are seeing increased deployment of 10 GE to the servers. Once motherboard manufacturers start using 10 GE LoM (LAN on Motherboard) ports, the deployment of 10 GE to servers in data centers will naturally increase. Deploying 40 GE and 100 GE uplinks in 10 GE-heavy environments makes sense (depending on the details, naturally).
  • You already knew the VXLAN-capable 1000V was on the way (this was alluded to in the original VXLAN announcement at VMworld 2011), so no real surprises there.
  • The Nexus 1010-X simply allows you to run more VSBs (virtual service blades), such as the Nexus 1000V VSM (virtual supervisor module). If you’re a Nexus 1000V customer, you might have already started investigating the use of the Nexus 1010 to host the VSMs. Large customers (service providers, perhaps?) had a need for more VSBs on a single Nexus 1010, hence the Nexus 1010-X.
  • EVN…well, I’m stuck on EVN. I certainly see the need for simpler separation of traffic, but again I must ask why Cisco appears to be creating something new instead of re-using protocols that are perfectly suited for this purpose? Maybe I need a networking expert to explain it to me. (When I understand it, I’ll post an explanation that everyone else can understand. Fair?)

As always, feel free to post any clarifications or corrections in the comments below. I’d love to hear from any networking gurus on any of the points that I raised in this article.

Welcome to Technology Short Take #20, the latest collection of various data center-related links, articles, and thoughts. I hope you find something useful here.

Networking

  • For all the writing and thinking I’ve done on VXLAN (here, here, and here), someone (thanks Ed!) mentioned something to me recently that I hadn’t even considered. (It’s so simple I’m embarrassed that I overlooked it.) VXLAN uses UDP for its encapsulation. What about the dropped packets, lack of sequencing, etc., that are possible with UDP? What impact is that going to have on the “inner protocol” that’s wrapped inside the VXLAN UDP packets? Or is this not an issue in modern networks any longer?
  • Since Ivan thinks we do need to worry about spaghetti and horseshoes, I’m curious to know if he thinks we need to worry about UDP as the transport for VXLAN.
  • Jake Howering of Cisco (with whom I’ve worked on a few occasions) posted a couple of good articles on the Cisco Data Center and Cloud blog. Jake focuses on DCI (data center interconnect), so his articles naturally reflect that focus. The first article discusses the use of LISP for ingress path optimization in VM mobility scenarios. Ingress path optimization is only half of the solution, though; Jake discusses the other half in the second article, which covers egress path optimization with FHRP filtering, something I discussed in my recent VXLAN/OTV article.
  • It looks like you can’t create a vPC (virtual port channel) between a Nexus 55xx and a Nexus 50x0 switch.
  • Kyle Mestery speculates a bit here about integrating VXLAN in OpenStack Quantum.

Servers/Applications/Operating Systems

  • I’ll put this under the “Servers” category even though it’s storage-related: In this post, Ryan Hughes explains how to use the UCSM CLI to do some cool—and useful—boot from SAN troubleshooting.
  • Purely by accident, I stumbled across this humorously-titled series regarding Oracle on Fibre Channel. It turns out the author works for EMC, although I didn’t know that ahead of time. In any case, if you’re interested in finding out how manly men deploy Oracle, check out the series titled “Manly Men Only Deploy Oracle with Fibre Channel” (parts 1, 2, 3, 4, 5, 6, 7, and 8).

Storage

  • Clint Kitson recently published a guide on setting up the FC zoning for WAN communications on a VPLEX Metro cluster. This article is a good complement to my own MDS zoning articles (creating MDS zones, managing MDS zones, using device aliases) and shows some practical examples of those commands in action.
  • Simon Seagrave points out a firmware update for Iomega IX2 and IX4 storage devices that fixes problems with Time Machine under Mac OS X 10.7 “Lion”. Thanks Simon!
  • This post by Hans De Leenheer was written in July of last year, but it only came to my attention in late November after I met Hans in Vienna during my “round the world” trip. (Great guy, by the way.) He asked me to respond to his criticism of EMC’s “mega-launch” in early 2011. I wish I could say Hans is way off, but I personally agree that the hyperbole was a bit much. I do have to disagree a bit with his comment about no new products; while EMC didn’t introduce all new hardware for some lines (like the VMAX), they did introduce new array software (like Enginuity 5875 for the VMAX). Does that count as a “new product”? I guess the answer to that depends on whether you work in marketing. At least it’s good to hear that Hans feels EMC is, in his words, “a solid company making solid storage solutions”. Perhaps EMC should focus more on that…
  • Erik Smith has written a very detailed post on FC/FCoE hard zoning versus soft zoning. It’s an excellent post, and well worth the time if you aren’t already an FC/FCoE expert. By the way, if you haven’t yet subscribed to Erik’s RSS feed, I highly recommend it. He puts out some great stuff.
  • David Robertson wrote up the steps for creating a Linux-based SMC/SPA server, in the event you use EMC Symmetrix in your environment.

Virtualization

Here are a few other virtualization-related links that I also thought you might find interesting or useful:

Are operational considerations overlooked in virtualization?
VMware KB – Detaching a datastore or storage device from multiple ESXi 5.0 hosts
What happened to vShield in vSphere 5?
Linux VMware Tools Install Changes to Upstart
VMware KB – Disabling VAAI Thin Provisioning Block Space Reclamation (UNMAP) in ESXi 5.0

Cloud Computing

  • You’ll note that I didn’t lump cloud computing under virtualization, since I think that there’s more to “cloud” than virtualization (although I would grant that virtualization is a big part of it). So much more, in fact, that people are referring to cloud computing as complex. (Gasp!) But this post puts cloud complexity in perspective. If you are merely a consumer of services, then cloud is simple; if, however, you are a provider of services, then cloud is more complex. Good article.

That’s it for this time around. If you have any comments or thoughts on anything shared here, feel free to speak up in the comments.

Welcome to Technology Short Take #19, the first Technology Short Take for 2012. Here’s this year’s first collection of links, articles, and thoughts regarding virtualization, storage, networking, and other data center technology-related topics. I hope you find something useful!

Networking

  • While configuration limits aren’t the most exciting reading, they are important from time to time. Here are some configuration limits for the UCS 6100 and 6200 series.
  • Understanding the differences—both positive and negative—between the various approaches to solving a particular challenge is a key skill. That’s why I like this article on HP Flex-10 versus NIOC for VDI. The author (Dwayne) weighs the pros and cons of both approaches in helping to shape network traffic for VDI deployments using 10Gb Ethernet.
  • It would appear that my recent VXLAN and OTV connectivity posts (incorrect VXLAN post here, corrected VXLAN post here, and OTV/VXLAN post here) sparked a discussion about whether we really need to concern ourselves with traffic trombones. On one side we have Brad Hedlund speculating that the network should be treated like a large virtual I/O fabric; on the other side we have Greg Ferro countering that we do need to be concerned about the topology of the network. I can see both sides of the argument, but at this stage of the game, I’m inclined to agree more with Greg. In the future (it’s unclear how far in the future) I think that Brad’s points will be more valid, but not right now.
  • This post by Ivan Pepelnjak on VXLAN, IP multicast, OpenFlow, and control planes highlights some of the current limitations with VXLAN and thus reinforces why I think that Brad’s arguments are a bit ahead of their time.
  • A few folks had some write-ups on Embrane Heleos: Greg Ferro, Jason Edelman, Brad Hedlund, Brad Casemore, and Ivan Pepelnjak. My question (and this is spurred in part by some comments by Brad Casemore): is this another Cisco spin-in move?

Servers/Operating Systems/Applications

Storage

Virtualization

And that’s it for this time around; as always, I hope you’ve found something useful here. Courteous comments are always welcome; feel free to speak up below.

Rather than posting some sort of “2011 in review” article where I talk about how many visitors the site had or how many RSS subscribers there are, I thought I’d instead focus on the upcoming year and some of the projects in which I’ll be involved. Describing some of the projects that I’m undertaking in 2012 should give you, the readers, a rough idea of the types of content that will likely appear in the coming year.

Here are some of my 2012 projects (some of these I’ve already tweeted about):

  1. I’m going to learn to script in Perl. Many people have asked why Perl and why not Python or Ruby or something else. Honestly, I don’t have a really good answer for you. I tried (unsuccessfully) to teach myself Perl a couple of years ago, so I still have the O’Reilly Learning Perl book. Rather than spending money to learn some other scripting language, it seemed reasonable to revisit Perl again and just leverage the resources I already have. You might see a few Perl-related posts here and there as I work through Learning Perl, but I’ll try not to bore you with elementary stuff.

  2. I’m going to learn German. Same scenario here—many people have asked why German and why not Spanish or French. I do have an answer this time: I seem to be spending a fair amount of time in Vienna, so German seemed to make sense. I also have a series of customer meetings planned in Germany in the first quarter of this year. Plus, German is completely new and different than anything I’ve learned before, and I wanted to challenge myself to learn and think in new ways. It’s unlikely that this will find its way into any blog posts, but you never know…

  3. I’m going to become much more familiar with the Xen hypervisor. I haven’t yet decided if I’ll focus strictly on the open source version of Xen or Citrix XenServer; I’m open to suggestions there. No, this doesn’t mean that I’m abandoning VMware or anything like that; I just want to expand my knowledge. You can’t simply discount Xen; after all, Amazon EC2 is built on Xen. Along with this dive into Xen, I’ll also be looking very closely at Open vSwitch and OpenStack. I’d expect that a great deal of this education will eventually end up in various blog posts here.

  4. I’m going to pursue my CCNP. I “re-achieved” CCNA last year, and this year I’m pursuing my CCNP. As with Xen, I’m confident that the learning curve required to move closer to (or even achieve) CCNP will result in a number of related blog posts on various networking technologies or concepts.

I do have a few other projects planned for this upcoming year, but I’m not quite ready to discuss those publicly yet. At least one of these other projects will be something new that I haven’t done before. Stretching myself and my skills/experience in new directions is a bit of a theme this year.

If you have any tips/tricks/advice to share on any of these upcoming projects, or if there are specific things related to these projects that you’d like to see blogged about here, please let me know in the comments. Thanks, and I hope that 2012 is going to be as exciting for you as it will be for me!

Building large-scale L2 networks, including stretched L2 networks, seems to be all the rage these days, driven in part by virtual machine mobility (aka vMotion in VMware vSphere environments or XenMotion in Citrix XenServer environments). While this isn’t always a good idea—some might say it’s never a good idea—it is still something that many organizations are evaluating.

With the announcement of VXLAN at VMworld 2011, a new question seems to have arisen: can I use VXLAN instead of (insert some other protocol here) to create my stretched L2 networks? In this post, I’d like to compare the use of VXLAN with OTV (Overlay Transport Virtualization) for that very purpose. Of course, since VXLAN hasn’t actually been released, the discussion is partially theoretical.

My primary focus in this post will be how each of these protocols handles traffic patterns in the course of addressing the need for L2 connectivity over routed L3 networks.

First, let’s look at VXLAN. The figure below is taken from my revised L3 connectivity with VXLAN post, which I encourage you to read for more details.

As you can see, once a VM inside a VXLAN segment is migrated to a new network, the traffic “trombones” back and forth across the VXLAN segment because all traffic has to pass through a single vShield Edge (VSE) instance. This brings up a key limitation of VXLAN that I think is important to point out: VXLAN has an innate dependency on VSE, and VSE cannot be made redundant. That’s right—you can’t have VSE-specific failover functionality; instead, you have to rely on vSphere HA, VM Monitoring, and other features. That means failover times in the minutes, not seconds. What do you think that will do to network connections?

Now, let’s compare VXLAN’s L3 connectivity with OTV. First, here’s a diagram to show connectivity with OTV before a VM is migrated to the second site:

No real surprises here. I’ll just point out here that a typical OTV deployment following “recommended practices” will use redundant Nexus 7000 switches, as shown here. That’s a key advantage that OTV has over VXLAN—the ability to provide redundancy is there and redundancy is easily built into the solution, with failover times in the seconds (or better).

Now, take a look at the post-migration traffic flows with OTV:

In case you didn’t notice it, let me point out the obvious: note the lack of traffic tromboning here. Here’s how it’s accomplished (and documented in this blog post by Ron Fuller, aka @ccie5851 or VDCBadger to his friends):

  • Each Nexus 7000 pair runs HSRP.
  • The HSRP hello packets are filtered (blocked) from the OTV interfaces. This keeps the HSRP pairs in each data center from knowing about the pair in the other data center.
  • Each HSRP pair runs the same virtual IP (the default gateway for the 10.1.1.0/24 subnet).

In this configuration, once the VM migrates to the second site the HSRP pair at the second site won’t need to send traffic across the OTV link to reach the migrated VM. This appears to be a significant advantage to OTV—a greater knowledge of the routing topology allows OTV to be more intelligent about how traffic should be directed across/around the network.

<aside>Of course, this doesn’t address L3 routing concerns from subnets not directly attached to the Nexus 7000 pairs. For that, we’d need something like LISP.</aside>

As I see it—and networking experts are welcome to jump in if I’m mistaken—this gives OTV two key advantages over VXLAN:

  1. OTV, because it is running on physical networking equipment, is more intelligent than VXLAN about how traffic is directed/routed in/around/across a network. This can result in more efficient utilization of a data center interconnect as a result of reduced “traffic tromboning.”
  2. OTV, because it is running on physical networking equipment, can provide better redundancy and faster failover than VXLAN (which relies on single instances of VSE).
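The tromboning difference described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: two sites joined by one data center interconnect (DCI), a VM that sends a packet to a routed peer, and a default gateway that lives at one site (the single-VSE case) or at both sites (the filtered-HSRP case). The site names and the function are hypothetical, and the model ignores ingress path selection entirely.

```python
from typing import Set


def dci_crossings(vm_site: str, gateway_sites: Set[str], peer_site: str) -> int:
    """Count DCI crossings for one packet from a VM to a routed peer.

    The VM uses a gateway at its own site when one answers for the
    default-gateway IP there; otherwise it must reach a remote gateway.
    """
    if not gateway_sites:
        raise ValueError("at least one gateway site is required")
    gateway = vm_site if vm_site in gateway_sites else min(gateway_sites)
    crossings = 0
    if gateway != vm_site:    # VM -> gateway leg crosses the DCI
        crossings += 1
    if peer_site != gateway:  # gateway -> peer leg crosses the DCI
        crossings += 1
    return crossings


single_gateway = {"site1"}          # one gateway instance, left behind at site1
filtered_hsrp = {"site1", "site2"}  # same virtual IP active at both sites

# The VM has migrated to site2 and talks to a routed peer that is also at site2.
print(dci_crossings("site2", single_gateway, "site2"))
print(dci_crossings("site2", filtered_hsrp, "site2"))
```

In this model the single-gateway case crosses the interconnect twice (out and back), while the per-site gateway case never leaves the second site.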

It’s entirely possible that these advantages of OTV will be nullified if VXLAN ever makes it into physical network equipment.

It’s also important to point out that while OTV and VXLAN have some overlap in functionality, they are partially targeted at solving different problems. While both protocols address L2 connectivity across L3 networks, VXLAN also addresses the exhaustion of the VLAN address space in larger networks (especially service provider networks). This is an issue that OTV does not try to address. However, it seems to me that OTV would co-exist better with a solution like Q-in-Q, which could (as far as I can tell) address the VLAN ID exhaustion issue.

Once again, I encourage network experts to chime in and share their views. If I’ve misstated something, please let me know. Questions, thoughts, and comments are always welcome.

Within the last couple of days, I received an e-mail notification that UIM/Operations 3.0 had been finalized and was now generally available (i.e., it was now considered GA).

For those that aren’t familiar, UIM has two flavors:

  • UIM/Provisioning (also referred to as UIM/P) is tasked with handling provisioning/de-provisioning tasks in a Vblock. This would include tasks like deploying UCS B-series blades, zoning FC fabrics, and setting up storage pools.
  • UIM/Operations (also referred to as UIM/O) is tasked with providing near real-time visibility into the Vblock, as well as root cause and impact analysis.

In addition to support for UIM/P 3.0 (more info here) and all associated Vblock types, this latest release of UIM/O adds the following features:

  • Model-based deterministic automated root cause analysis for faults in a Vblock environment
  • Automated impact analysis that visualizes impact on higher-order abstractions such as vApps, UIM Services (these are defined within UIM/P) and Vblocks
  • Event forwarding via SNMP traps to enable northbound integration
  • Automation of trap reception from MDS and Nexus switches
  • Saving and restoring user preferences

As with UIM/P, the new version of UIM/O is available to authorized users on Powerlink:

Home > Support > Software Downloads and Licensing > Downloads E-I > Ionix Unified Infrastructure Manager/Operations

Documentation for UIM/O 3.0 is also available on Powerlink:

Home > Support > Technical Documentation and Advisories > Software ~ E-I ~ Documentation > Ionix Family > Ionix for Data Center Automation and Compliance > Ionix Unified Infrastructure Manager/Operations > 3.0 and Service Packs

(Think that’s a deep enough structure to navigate?)

Enjoy!

Welcome to Technology Short Take #18! I hope you find something useful in this collection of networking, OS, storage, and virtualization links. Enjoy!

Networking

The number of articles in my “Networking” bucket continues to overflow; I have so many articles on so many topics (soft switching, OpenFlow, Open vSwitch, MPLS) that it’s hard to get my head wrapped around all of it. Here are a few posts that stuck out to me:

  • Ivan Pepelnjak has a very well-written post explaining the various ways that virtual networking can be decoupled from the physical network.
  • I stumbled across a trio of articles by Denton Gentry on hash tables (part 1, part 2, and part 3). This is an interesting perspective I hadn’t considered before; as we move more into software-defined networks (SDNs), why are we continuing to use the same mechanisms as before? Why not take advantage of more efficient mechanisms as part of this transition?

Servers/Operating Systems

  • Nigel Poulton and I traded a few tweets during HP Discover Vienna about SCSI Express (or SCSI over PCIe, SoP). He wrote up his thoughts about SoP and its future in the storage industry here. Further Twitter-based discussions about fabrics led him to say that HP buying Xsigo would bring the competition back against UCS. I’m not so sure I agree. Xsigo’s server fabric technology/product is interesting, but it seems to me that it’s still adding layers of abstraction that aren’t necessary. As SR-IOV, MR-IOV, and PCIe extension mature, it seems to me that Ethernet as the fabric is going to win. If that’s the case, and HP wants to bring the hurt against UCS, they’re going to have to invest in Ethernet-based fabrics.
  • Speaking of UCS, here’s a “how to” on deploying the UCS Platform Emulator on vSphere. You might also like the UCS PE configuration follow-up post.
  • Here’s what looks to be a handy Mac OS X utility to track how long until your Active Directory password expires. Sounds simple, yes, but useful.

Storage

Virtualization

  • Jason Boche, after some collaboration with Bob Plankers, wrote up a good procedure for expanding the vCloud Director Transfer Server storage space. It’s definitely worth a read if you’re going to be working with vCloud Director.
  • Microsoft has released version 3.2 of the Linux Integration Services for Hyper-V. The new release adds integrated mouse support and updated network drivers, and fixes an issue with SCVMM compatibility.
  • Julian Wood, who I had the opportunity to meet in Copenhagen at VMworld 2011, has published a four-part series on managing vSphere 5 certificates. Follow these links for the series: part 1, part 2, part 3, and part 4.
  • Thinking of deploying Oracle on vSphere? You should probably read this three-part series from VMware’s Business Critical Applications blog: part 1 is here, part 2 is here, and part 3 is here.
  • I’m so used to dealing with VLANs in a vSphere environment, I didn’t consider the challenges that might come up when using them with VMware Workstation. Fortunately, this author did—read his post on mapping VLANs to VMnets in VMware Workstation.
  • I thought that this article on virtual disks with business critical applications would be a deep dive on which virtual disk formats (thin, lazy zeroed, eager zeroed) are best suited for various applications. While the article does discuss the different virtual disk formats, unfortunately that’s as far as it goes.
  • Fellow VMware vSphere Design co-author Forbes Guthrie highlights an important design concern with AutoDeploy: what about a virtual vCenter instance? Read his full article for the in-depth discussion.
  • This post by William Lam gives a good overview of when vSphere MoRefs change (or don’t change).
  • Here’s a good explanation why NIC teaming can’t be used with iSCSI binding.
  • Cormac Hogan also posted a nice overview of some new vmkfstools enhancements in vSphere 5.
  • Terence Luk posts a detailed procedure to help recover VMware Site Recovery Manager in the event of a failure of one of the SRM servers. Good information—thanks Terence!

And that’s it for this time around. Feel free to add your thoughts in the comments below—all comments are welcome! (Please provide full disclosure of vendor affiliations/employment where applicable. Thanks!)

In my earlier post on VXLAN and Layer 3 connectivity, I had a fatal flaw in my thinking and in my diagrams that was corrected for me in the comments to that post. In this post, I want to revisit the idea of Layer 3 connectivity with VXLAN and include the corrected information (and new diagrams).

The “fatal flaw” was that I was working under the impression that we’d have to change network address translation (NAT) mappings on the vShield Edge (VSE) instance that was handling NAT for a particular VXLAN segment. As a result of this incorrect thinking, I stated that VXLAN broke Layer 3 connectivity. As it turns out, I was wrong.

Instead—and this makes perfect sense now that my flawed thinking was pointed out—the VSE instance continues to serve as the default Layer 3 gateway for the workload(s) inside the VXLAN segment.

Consider this diagram, which shows how a workload external to a VXLAN segment communicates with a workload inside a VXLAN segment:

Note that in this diagram, the Linux workload outside the VXLAN segment communicates via the VSE instance handling NAT for that particular VXLAN segment. The VSE instance (VSE 1) passes that communication to the internal workload, and the return traffic follows the same path. Layer 3 connectivity outside of the VXLAN segment is handled via traditional/normal Layer 2/3 methods.

Now consider this diagram, which shows the same communication, but after the Windows-based workload inside the VXLAN segment has now migrated to a different location:

Note that even though the Windows-based workload inside the VXLAN segment now resides on a completely separate VTEP (ESXi 2, in this case), the traffic from the Linux-based workload outside the VXLAN segment continues to move through VSE 1. That’s because VSE 1 is still the Layer 3 default gateway for the IP subnet inside the VXLAN segment. Therefore—and this is where I was wrong earlier—Layer 3 connectivity is not broken, but it does have to “horseshoe” across to the other data center and then back again, as illustrated above. This is the classic traffic pattern that we see with other overlay technologies, like OTV.

For me, while this addresses Layer 3 connectivity after a migration with VXLAN, it does bring up other questions:

  • How does one provide redundancy at the VSE level? Is there VRRP support in VSE, or an equivalent function?
  • Because Layer 3 connectivity is maintained, what now is the role of OTV? Is OTV relegated to handling Layer 2 extensions only for non-virtualized workloads?
  • How do we now propose to handle the “horseshoe” routing issue? It would seem to me that the only way to address this would be to port support for LISP (or an equivalent protocol) into VSE.

Feel free to post any questions, thoughts, or corrections in the comments below. Thanks!

Examining VXLAN

It’s taken me far too long to write this post, that’s for sure. Since the announcement of VXLAN at VMworld earlier in the year, I’ve been searching for additional information on these questions: “What is VXLAN? How does it fit into the broader networking landscape? Why did we need a new standard?” I talked to Cisco, I attended a VMworld session about networking futures, I talked to some of the authors of the IETF draft on VXLAN, I read (most of) the VXLAN draft, and I studied some existing protocols that one might think could have been put to use. I think I’m finally ready to try to address these questions.

What is VXLAN?

The answer to this question is taken directly from the IETF draft (the emphasis is mine):

This document describes Virtual eXtensible Local Area Network (VXLAN), which is used to address the need for overlay networks within virtualized data centers accommodating multiple tenants.

I think it’s important to keep this purpose in mind. While it’s a bit simplistic to state it this way, VXLAN is—essentially—a proposed standards-based replacement for the proprietary MAC-in-MAC encapsulation that is currently used in vCloud Director. Instead of using MAC-in-MAC encapsulation, VXLAN uses MAC-in-IP encapsulation, with multicast groups to handle MAC learning and unique UDP source ports to help with load balancing across multiple links. Yes, that’s a bit of a simplification, but I think it gets the main point across.
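To make the MAC-in-IP idea a little more concrete, here is a minimal sketch of the two pieces mentioned above: the 8-byte VXLAN header that carries the segment identifier, and a UDP source port derived from the inner frame so that different flows can hash onto different links. The header layout (a flags byte with the "VNI valid" bit set, then a 24-bit identifier padded by reserved bits) follows the VXLAN specification as I understand it. The CRC32 hash and the port range are my own illustrative choices, not something the draft mandates, and the function names are hypothetical.

```python
import struct
import zlib

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags(8) rsvd(24) | VNI(24) rsvd(8)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def source_port(inner_frame: bytes, low: int = 49152, high: int = 65535) -> int:
    """Pick a UDP source port from a hash of the inner Ethernet header.

    Assumption for illustration: CRC32 over the first 14 bytes (destination
    MAC, source MAC, EtherType), mapped into the range low..high.
    """
    digest = zlib.crc32(inner_frame[:14])
    return low + digest % (high - low + 1)


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Return the UDP payload: VXLAN header followed by the original L2 frame."""
    return vxlan_header(vni) + inner_frame


# A made-up inner frame: two MAC addresses, the IPv4 EtherType, dummy payload.
frame = bytes.fromhex("005056aabbcc" "005056ddeeff" "0800") + b"payload"
payload = encapsulate(frame, vni=5000)
print(payload[:8].hex(), source_port(frame))
```

The outer IP and UDP headers are left to the sending host's network stack here. The point is only that the entire original Layer 2 frame rides inside a UDP payload, prefixed by the segment identifier.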

How does VXLAN fit into the broader networking landscape?

Trying to answer this question is what has occupied the majority of the time it’s taken to write this post. You can’t explain how VXLAN fits into the broader networking landscape without having a minimal understanding, at least, of what the rest of the networking landscape looks like. I had to dig a bit deeper into MPLS, OTV, FabricPath/TRILL, and other standards/emerging standards. I’m sure that I’ve still omitted some technologies that should have been included, and I know that there is still (so much) more to learn about the technologies I did include.

Based on the information I was able to gather, the answer to this second question really builds on the answer to the first question. VXLAN only really addresses a few fundamental concerns:

  • A shortage of VLAN address space (the theoretical limit is 4094 VLANs, with many switches supporting fewer than that)
  • An inability to support multi-tenancy (both from a scale perspective as well as a separation perspective)
  • Problems with Layer 2 connectivity across disparate virtual data centers

VXLAN addresses these concerns in this way:

  • It adds a 24-bit VXLAN Network Identifier (VNI), expanding the realm of potentially unique identifiers to just shy of 17 million (16.7 million). This addresses any scale-based concerns of multitenancy.
  • It wraps Layer 2 frames in Layer 3 packets. This addresses the other part of any multitenancy concerns (VXLAN hides duplicate MAC addresses, duplicate IP addresses, and duplicate VLAN IDs found in separate VNIs). This also addresses the Layer 2 connectivity issues between disparate virtual data centers.
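The numbers and the wrapping described above can be sketched in a few lines of Python. This is only an illustration, assuming the 8-byte VXLAN header layout from the specification (an 8-bit flags field with the "VNI valid" bit set, 24 reserved bits, the 24-bit VNI, and 8 more reserved bits). The helper names `pack_vxlan_header`, `unpack_vni`, and `encapsulate` are hypothetical, and the outer UDP/IP/Ethernet headers are left out entirely.

```python
import struct

VLAN_ID_BITS = 12
VNI_BITS = 24

# VLAN IDs 0 and 4095 are reserved, which leaves 4094 usable VLANs.
usable_vlans = 2 ** VLAN_ID_BITS - 2
# A 24-bit VNI gives roughly 16.7 million identifiers.
vni_space = 2 ** VNI_BITS

# Flags byte with the "VNI valid" bit set (assumed from the spec layout).
VXLAN_FLAG_VNI_VALID = 0x08


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI (hypothetical helper)."""
    if not 0 <= vni < vni_space:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags byte followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Read the VNI back out of the first 8 bytes of a VXLAN header."""
    _, second_word = struct.unpack("!II", header[:8])
    return second_word >> 8


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix the original Layer 2 frame with a VXLAN header."""
    return pack_vxlan_header(vni) + inner_frame


if __name__ == "__main__":
    print(usable_vlans, vni_space)
    print(pack_vxlan_header(5000).hex())
```

Two tenants can reuse the same MAC addresses or VLAN IDs inside their frames. As long as the frames are wrapped with different VNIs, the receiving endpoint can keep them apart by looking only at the VXLAN header.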

And that’s really about it. It doesn’t address Layer 2 multipathing/STP, it doesn’t address Layer 2 connectivity in the physical world (Layer 2 connectivity is only preserved at the virtualization level), and it doesn’t address Layer 3 routing issues created by stretched VLANs and VM mobility designs. Which brings us to our third question…

Why did we need a new standard?

This answer builds on the previous two answers. Once you have a clear understanding of what VXLAN was designed to do, and how VXLAN fits into the rest of the networking protocols, then this answer is pretty easy:

  • As noted above, VXLAN only preserves Layer 2 connectivity at the virtualization level; it doesn’t address Layer 2 connectivity in the physical world. Because it doesn’t, you still need protocols like OTV to address Layer 2 connectivity at the physical level.
  • Because you still need protocols like OTV to achieve VM mobility (for the time being, at least), you’re still going to need protocols like LISP to fix funny routing issues being caused by IP addresses from the same subnet existing in multiple locations at the same time.
  • Because VXLAN doesn’t address Layer 2 multipathing concerns, you still need protocols like TRILL and technologies like FabricPath.
  • Using MPLS (which, by the way, would also address the 3 concerns VXLAN addresses) would require MPLS-enabled/MPLS-aware equipment throughout the data center, which would make an MPLS-based solution difficult for many enterprises to adopt. Using an IP encapsulation scheme means that existing physical networking equipment doesn’t have to change. (Although it might change, to add VXLAN support, at some point in the future.)

I was not a fan of VMware (apparently) driving the creation of an entirely new networking standard. However, as I dug into this, I began to see that while other solutions almost addressed these concerns, none of them were a really good fit. Yes, using MPLS probably would have worked. Using GRE might have worked (take NVGRE, for example, but that’s also a proposed new protocol). To really address the concerns head-on, though, required a solution that was written/created expressly for that purpose, and that’s VXLAN. It’s just important, though, to really understand what VXLAN is as well as what VXLAN isn’t. Otherwise, you’ll find yourself trying to fit VXLAN to a problem for which it really wasn’t intended, which, by the way, is the same kind of mismatch that led to VXLAN being created in the first place.

Comments, corrections, and clarifications are always welcome!


Some Initial MPLS Reading

I mentioned on Twitter yesterday that I was doing some basic/introductory reading on MPLS, and someone asked what materials I was using. While I’m still very early in the process of trying to understand MPLS, I thought I might share the resources I’ve used so far in trying to wrap my head around MPLS, what it is, and the basics of how it works.

Here are some of the sites I’ve used so far:

MPLS Terminology
MPLS VPN terminology
MPLS Basics – LSR Terminology
Cisco Nexus 7000 Series NX-OS MPLS Configuration Guide
MPLS, Multi-Protocol Label Switching

As you can see, right now I’m focusing on what I call the grammar—that is, the day-to-day terminology and acronyms that are prevalent throughout any and all discussions of MPLS. Being able to recognize and know what an LSR is or what label imposition means is important and prepares me for future stages of learning. (Some people may recognize my use of “grammar” here as taken from the classical education approach.)

Even based on my limited reading so far, I’m beginning to get an idea of why MPLS can be so useful—and why MPLS can be complex. I’m looking forward to continuing my MPLS education. Resources and recommended reading are welcome in the comments!

