Networking


Rather than posting some sort of “2011 in review” article where I talk about how many visitors the site had or how many RSS subscribers there are, I thought I’d instead focus on the upcoming year and some of the projects in which I’ll be involved. Describing some of the projects that I’m undertaking in 2012 should give you, the readers, a rough idea of the types of content that will likely appear in the coming year.

Here are some of my 2012 projects (some of these I’ve already tweeted about):

  1. I’m going to learn to script in Perl. Many people have asked why Perl and why not Python or Ruby or something else. Honestly, I don’t have a really good answer for you. I tried (unsuccessfully) to teach myself Perl a couple of years ago, so I still have the O’Reilly Learning Perl book. Rather than spending money to learn some other scripting language, it seemed reasonable to revisit Perl again and just leverage the resources I already have. You might see a few Perl-related posts here and there as I work through Learning Perl, but I’ll try not to bore you with elementary stuff.

  2. I’m going to learn German. Same scenario here; many people have asked why German and why not Spanish or French. I do have an answer this time: I seem to be spending a fair amount of time in Vienna, so German seemed to make sense. I also have a series of customer meetings planned in Germany in the first quarter of this year. Plus, German is completely new and different from anything I’ve learned before, and I wanted to challenge myself to learn and think in new ways. It’s unlikely that this will find its way into any blog posts, but you never know…

  3. I’m going to become much more familiar with the Xen hypervisor. I haven’t yet decided if I’ll focus strictly on the open source version of Xen or Citrix XenServer; I’m open to suggestions there. No, this doesn’t mean that I’m abandoning VMware or anything like that; I just want to expand my knowledge. You can’t simply discount Xen; after all, Amazon EC2 is built on Xen. Along with this dive into Xen, I’ll also be looking very closely at Open vSwitch and OpenStack. I’d expect that a great deal of this education will eventually end up in various blog posts here.

  4. I’m going to pursue my CCNP. I “re-achieved” CCNA last year, and this year I’m pursuing my CCNP. As with Xen, I’m confident that the learning curve required to move closer to (or even achieve) CCNP will result in a number of related blog posts on various networking technologies or concepts.

I do have a few other projects planned for this upcoming year, but I’m not quite ready to discuss those publicly yet. At least one of these other projects will be something new that I haven’t done before. Stretching myself and my skills/experience in new directions is a bit of a theme this year.

If you have any tips/tricks/advice to share on any of these upcoming projects, or if there are specific things related to these projects that you’d like to see blogged about here, please let me know in the comments. Thanks, and I hope that 2012 is going to be as exciting for you as it will be for me!


Building large-scale L2 networks, including stretched L2 networks, seems to be all the rage these days, driven in part by virtual machine mobility (aka vMotion in VMware vSphere environments or XenMotion in Citrix XenServer environments). While this isn’t always a good idea—some might say it’s never a good idea—it is still something that many organizations are evaluating.

With the announcement of VXLAN at VMworld 2011, a new question seems to have arisen: can I use VXLAN instead of (insert some other protocol here) to create my stretched L2 networks? In this post, I’d like to compare the use of VXLAN with OTV (Overlay Transport Virtualization) for that very purpose. Of course, since VXLAN hasn’t actually been released, the discussion is partially theoretical.

My primary focus in this post will be how each of these protocols handles traffic patterns in the course of addressing the need for L2 connectivity over routed L3 networks.

First, let’s look at VXLAN. The figure below is taken from my revised L3 connectivity with VXLAN post, which I encourage you to read for more details.

As you can see, once a VM inside a VXLAN segment is migrated to a new network, the traffic “trombones” back and forth across the VXLAN segment because all traffic has to pass through a single vShield Edge (VSE) instance. This brings up a key limitation of VXLAN that I think is important to point out: VXLAN has an innate dependency on VSE, and VSE cannot be made redundant. That’s right—you can’t have VSE-specific failover functionality; instead, you have to rely on vSphere HA, VM Monitoring, and other features. That means failover times in the minutes, not seconds. What do you think that will do to network connections?

Now, let’s compare VXLAN’s L3 connectivity with OTV. First, here’s a diagram to show connectivity with OTV before a VM is migrated to the second site:

No real surprises here. I’ll just point out here that a typical OTV deployment following “recommended practices” will use redundant Nexus 7000 switches, as shown here. That’s a key advantage that OTV has over VXLAN—the ability to provide redundancy is there and redundancy is easily built into the solution, with failover times in the seconds (or better).

Now, take a look at the post-migration traffic flows with OTV:

In case you didn’t notice it, let me point out the obvious: note the lack of traffic tromboning here. Here’s how it’s accomplished (and documented in this blog post by Ron Fuller, aka @ccie5851 or VDCBadger to his friends):

  • Each Nexus 7000 pair runs HSRP.
  • The HSRP hello packets are filtered (blocked) from the OTV interfaces. This keeps the HSRP pairs in each data center from knowing about the pair in the other data center.
  • Each HSRP pair runs the same virtual IP (the default gateway for the 10.1.1.0/24 subnet).

In this configuration, once the VM migrates to the second site the HSRP pair at the second site won’t need to send traffic across the OTV link to reach the migrated VM. This appears to be a significant advantage to OTV—a greater knowledge of the routing topology allows OTV to be more intelligent about how traffic should be directed across/around the network.
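To make the difference concrete, here is a toy Python model of the two gateway placements. This is my own illustration, not anything taken from OTV, HSRP, or VXLAN; the site names and the `dci_crossings` function are hypothetical, and the model only counts how often traffic between a migrated VM and its default gateway has to cross the data center interconnect (DCI).

```python
from typing import Set

# Toy model: count interconnect crossings between a VM and its default gateway.
# Site names and gateway placements are illustrative assumptions.

def dci_crossings(vm_site: str, gateway_sites: Set[str]) -> int:
    """Return 0 if a gateway is active at the VM's site, else 1 (one trip across the DCI)."""
    return 0 if vm_site in gateway_sites else 1

# A single gateway (for example, one VSE instance) that stays at site A after the migration.
single_gateway = {"site-a"}
# Filtered HSRP: each site has its own active gateway answering for the same virtual IP.
localized_gateways = {"site-a", "site-b"}

for label, gateways in (("single gateway", single_gateway),
                        ("localized gateways", localized_gateways)):
    # The VM now lives at site B; outbound and return traffic each reach the gateway once.
    round_trip = 2 * dci_crossings("site-b", gateways)
    print(f"{label}: {round_trip} DCI crossings per round trip")
```

The model ignores everything except gateway location, which is the one variable the HSRP filtering changes.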

<aside>Of course, this doesn’t address L3 routing concerns from subnets not directly attached to the Nexus 7000 pairs. For that, we’d need something like LISP.</aside>

As I see it—and networking experts are welcome to jump in if I’m mistaken—this gives OTV two key advantages over VXLAN:

  1. OTV, because it is running on physical networking equipment, is more intelligent than VXLAN about how traffic is directed/routed in/around/across a network. This can result in more efficient utilization of a data center interconnect as a result of reduced “traffic tromboning.”
  2. OTV, because it is running on physical networking equipment, can provide better redundancy and faster failover than VXLAN (which relies on single instances of VSE).

It’s entirely possible that these advantages of OTV will be nullified if VXLAN ever makes it into physical network equipment.

It’s also important to point out that while OTV and VXLAN have some overlap in functionality they are partially targeted at solving different problems. While both protocols address L2 connectivity across L3 networks, VXLAN also addresses the exhaustion of the VLAN address space in larger networks (especially service provider networks). This is an issue that OTV does not try to address. However, it seems to me that OTV would co-exist better with a solution like Q-in-Q, which could (as far as I can tell) address the VLAN ID exhaustion issue.

Once again, I encourage network experts to chime in and share their views. If I’ve misstated something, please let me know. Questions, thoughts, and comments are always welcome.


Within the last couple of days, I received an e-mail notification that UIM/Operations 3.0 had been finalized and was now generally available (i.e., it was now considered GA).

For those that aren’t familiar, UIM has two flavors:

  • UIM/Provisioning (also referred to as UIM/P), which is tasked with handling provisioning/de-provisioning tasks in a Vblock. This would include tasks like deploying UCS B-series blades, zoning FC fabrics, and setting up storage pools.
  • UIM/Operations (also referred to as UIM/O), which is tasked with providing near real-time visibility into the Vblock, as well as root cause and impact analysis.

In addition to support for UIM/P 3.0 (more info here) and all associated Vblock types, this latest release of UIM/O adds the following features:

  • Model-based deterministic automated root cause analysis for faults in a Vblock environment
  • Automated impact analysis that visualizes impact on higher-order abstractions such as vApps, UIM Services (these are defined within UIM/P) and Vblocks
  • Event forwarding via SNMP traps to enable northbound integration
  • Automation of trap reception from MDS and Nexus switches
  • Saving and restoring user preferences

As with UIM/P, the new version of UIM/O is available to authorized users on Powerlink:

Home > Support > Software Downloads and Licensing > Downloads E-I > Ionix Unified Infrastructure Manager/Operations

Documentation for UIM/O 3.0 is also available on Powerlink:

Home > Support > Technical Documentation and Advisories > Software ~ E-I ~ Documentation > Ionix Family > Ionix for Data Center Automation and Compliance > Ionix Unified Infrastructure Manager/Operations > 3.0 and Service Packs

(Think that’s a deep enough structure to navigate?)

Enjoy!


Welcome to Technology Short Take #18! I hope you find something useful in this collection of networking, OS, storage, and virtualization links. Enjoy!

Networking

The number of articles in my “Networking” bucket continues to overflow; I have so many articles on so many topics (soft switching, OpenFlow, Open vSwitch, MPLS) that it’s hard to get my head wrapped around all of it. Here are a few posts that stuck out to me:

  • Ivan Pepelnjak has a very well-written post explaining the various ways that virtual networking can be decoupled from the physical network.
  • I stumbled across a trio of articles by Denton Gentry on hash tables (part 1, part 2, and part 3). This is an interesting perspective I hadn’t considered before; as we move more into software-defined networks (SDNs), why are we continuing to use the same mechanisms as before? Why not take advantage of more efficient mechanisms as part of this transition?

Servers/Operating Systems

  • Nigel Poulton and I traded a few tweets during HP Discover Vienna about SCSI Express (or SCSI over PCIe, SoP). He wrote up his thoughts about SoP and its future in the storage industry here. Further Twitter-based discussions about fabrics led him to say that HP buying Xsigo would bring the competition back against UCS. I’m not so sure I agree. Xsigo’s server fabric technology/product is interesting, but it seems to me that it’s still adding layers of abstraction that aren’t necessary. As SR-IOV, MR-IOV, and PCIe extension mature, it seems to me that Ethernet as the fabric is going to win. If that’s the case, and HP wants to bring the hurt against UCS, they’re going to have to invest in Ethernet-based fabrics.
  • Speaking of UCS, here’s a “how to” on deploying the UCS Platform Emulator on vSphere. You might also like the UCS PE configuration follow-up post.
  • Here’s what looks to be a handy Mac OS X utility to track how long until your Active Directory password expires. Sounds simple, yes, but useful.

Storage

Virtualization

  • Jason Boche, after some collaboration with Bob Plankers, wrote up a good procedure for expanding the vCloud Director Transfer Server storage space. It’s definitely worth a read if you’re going to be working with vCloud Director.
  • Microsoft has released version 3.2 of the Linux Integration Services for Hyper-V. The new release adds integrated mouse support and updated network drivers, and it fixes an issue with SCVMM compatibility.
  • Julian Wood, who I had the opportunity to meet in Copenhagen at VMworld 2011, has published a four-part series on managing vSphere 5 certificates. Follow these links for the series: part 1, part 2, part 3, and part 4.
  • Thinking of deploying Oracle on vSphere? You should probably read this three-part series from VMware’s Business Critical Applications blog: part 1 is here, part 2 is here, and part 3 is here.
  • I’m so used to dealing with VLANs in a vSphere environment, I didn’t consider the challenges that might come up when using them with VMware Workstation. Fortunately, this author did—read his post on mapping VLANs to VMnets in VMware Workstation.
  • I thought that this article on virtual disks with business critical applications would be a deep dive on which virtual disk formats (thin, lazy zeroed, eager zeroed) are best suited for various applications. While the article does discuss the different virtual disk formats, unfortunately that’s as far as it goes.
  • Fellow VMware vSphere Design co-author Forbes Guthrie highlights an important design concern with AutoDeploy: what about a virtual vCenter instance? Read his full article for the in-depth discussion.
  • This post by William Lam gives a good overview of when vSphere MoRefs change (or don’t change).
  • Here’s a good explanation why NIC teaming can’t be used with iSCSI binding.
  • Cormac Hogan also posted a nice overview of some new vmkfstools enhancements in vSphere 5.
  • Terence Luk posts a detailed procedure to help recover VMware Site Recovery Manager in the event of a failure of one of the SRM servers. Good information—thanks Terence!

And that’s it for this time around. Feel free to add your thoughts in the comments below—all comments are welcome! (Please provide full disclosure of vendor affiliations/employment where applicable. Thanks!)


In my earlier post on VXLAN and Layer 3 connectivity, I had a fatal flaw in my thinking and in my diagrams that was corrected for me in the comments to that post. In this post, I want to revisit the idea of Layer 3 connectivity with VXLAN and include the corrected information (and new diagrams).

The “fatal flaw” was that I was working under the impression that we’d have to change network address translation (NAT) mappings on the vShield Edge (VSE) instance that was handling NAT for a particular VXLAN segment. As a result of this incorrect thinking, I stated that VXLAN broke Layer 3 connectivity. As it turns out, I was wrong.

Instead—and this makes perfect sense now that my flawed thinking was pointed out—the VSE instance continues to serve as the default Layer 3 gateway for the workload(s) inside the VXLAN segment.

Consider this diagram, which shows how a workload external to a VXLAN segment communicates with a workload inside a VXLAN segment:

Note that in this diagram, the Linux workload outside the VXLAN segment communicates via the VSE instance handling NAT for that particular VXLAN segment. The VSE instance (VSE 1) passes that communication to the internal workload, and the return traffic follows the same path. Layer 3 connectivity outside of the VXLAN segment is handled via traditional/normal Layer 2/3 methods.

Now consider this diagram, which shows the same communication, but after the Windows-based workload inside the VXLAN segment has now migrated to a different location:

Note that even though the Windows-based workload inside the VXLAN segment now resides on a completely separate VTEP (ESXi 2, in this case), the traffic from the Linux-based workload outside the VXLAN segment continues to move through VSE 1. That’s because VSE 1 is still the Layer 3 default gateway for the IP subnet inside the VXLAN segment. Therefore—and this is where I was wrong earlier—Layer 3 connectivity is not broken, but it does have to “horseshoe” across to the other data center and then back again, as illustrated above. This is the classic traffic pattern that we see with other overlay technologies, like OTV.

For me, while this addresses Layer 3 connectivity after a migration with VXLAN, it does bring up other questions:

  • How does one provide redundancy at the VSE level? Is there VRRP support in VSE, or an equivalent function?
  • Because Layer 3 connectivity is maintained, what now is the role of OTV? Is OTV relegated to handling Layer 2 extensions only for non-virtualized workloads?
  • How do we now propose to handle the “horseshoe” routing issue? It would seem to me that the only way to address this would be to port support for LISP (or an equivalent protocol) into VSE.

Feel free to post any questions, thoughts, or corrections in the comments below. Thanks!


Examining VXLAN

It’s taken me far too long to write this post, that’s for sure. Since the announcement of VXLAN at VMworld earlier in the year, I’ve been searching for additional information on these questions: “What is VXLAN? How does it fit into the broader networking landscape? Why did we need a new standard?” I talked to Cisco, I attended a VMworld session about networking futures, I talked to some of the authors of the IETF draft on VXLAN, I read (most of) the VXLAN draft, and I studied some existing protocols that one might think could have been put to use. I think I’m finally ready to try to address these questions.

What is VXLAN?

The answer to this question is taken directly from the IETF draft (the emphasis is mine):

This document describes Virtual eXtensible Local Area Network (VXLAN), which is used to address the need for overlay networks within virtualized data centers accommodating multiple tenants.

I think it’s important to keep this purpose in mind. While it’s a bit simplistic to state it this way, VXLAN is—essentially—a proposed standards-based replacement for the proprietary MAC-in-MAC encapsulation that is currently used in vCloud Director. Instead of using MAC-in-MAC encapsulation, VXLAN uses MAC-in-IP encapsulation, with multicast groups to handle MAC learning and unique UDP source ports to help with load balancing across multiple links. Yes, that’s a bit of a simplification, but I think it gets the main point across.
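The load-balancing idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the CRC32 hash, the port range, and the function names are hypothetical choices, not something a particular VTEP implementation is known to use. The point is simply that hashing the inner Ethernet header gives every inner flow a stable outer UDP source port, so the physical network can spread different flows across multiple links.

```python
import zlib

# Assumed ephemeral port range; both the range and the CRC32 hash are illustrative choices.
PORT_MIN, PORT_MAX = 49152, 65535

def vxlan_source_port(inner_frame: bytes) -> int:
    """Derive a UDP source port from the inner Ethernet header (dst MAC, src MAC, EtherType)."""
    inner_header = inner_frame[:14]
    digest = zlib.crc32(inner_header)
    return PORT_MIN + digest % (PORT_MAX - PORT_MIN + 1)

def mac(text: str) -> bytes:
    """Convert a colon-separated MAC address string to its 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))

# Two frames from the same inner flow (same MACs and EtherType, different payloads).
flow_a = mac("00:50:56:aa:00:01") + mac("00:50:56:aa:00:02") + b"\x08\x00" + b"payload-1"
flow_a_again = flow_a[:14] + b"payload-2"

# Frames of one flow always map to the same outer source port.
print(vxlan_source_port(flow_a) == vxlan_source_port(flow_a_again))
```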

How does VXLAN fit into the broader networking landscape?

Trying to answer this question is what has occupied the majority of the time it’s taken to write this post. You can’t explain how VXLAN fits into the broader networking landscape without having a minimal understanding, at least, of what the rest of the networking landscape looks like. I had to dig a bit deeper into MPLS, OTV, FabricPath/TRILL, and other standards/emerging standards. I’m sure that I’ve still omitted some technologies that should have been included, and I know that there is still (so much) more to learn about the technologies I did include.

Based on the information I was able to gather, the answer to this second question really builds on the answer to the first question. VXLAN only really addresses a few fundamental concerns:

  • A shortage of VLAN address space (the theoretical limit is 4094 VLANs, with many switches supporting fewer than that)
  • An inability to support multi-tenancy (both from a scale perspective as well as a separation perspective)
  • Problems with Layer 2 connectivity across disparate virtual data centers

VXLAN addresses these concerns in this way:

  • It adds a 24-bit VXLAN Network Identifier (VNI), expanding the realm of potentially unique identifiers to just shy of 17 million (16.7 million). This addresses any scale-based concerns of multitenancy.
  • It wraps Layer 2 frames in Layer 3 packets. This addresses the other part of any multitenancy concerns (VXLAN hides duplicate MAC addresses, duplicate IP addresses, and duplicate VLAN IDs found in separate VNIs). This also addresses the Layer 2 connectivity issues between disparate virtual data centers.
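The scale difference is easy to check with a little arithmetic. The short Python sketch below just works out the numbers quoted above; the variable names are mine.

```python
VLAN_ID_BITS = 12   # size of the 802.1Q VLAN ID field
VNI_BITS = 24       # size of the VXLAN Network Identifier

usable_vlans = 2**VLAN_ID_BITS - 2   # IDs 0 and 4095 are reserved
vni_values = 2**VNI_BITS

print(usable_vlans)                  # 4094
print(vni_values)                    # 16777216
print(vni_values // usable_vlans)    # roughly how many times larger the VNI space is
```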

And that’s really about it. It doesn’t address Layer 2 multipathing/STP, it doesn’t address Layer 2 connectivity in the physical world (layer 2 connectivity is only preserved at the virtualization level), and it doesn’t address Layer 3 routing issues created by stretched VLANs and VM mobility designs. Which brings us to our third question…

Why did we need a new standard?

This answer builds on the previous two answers. Once you have a clear understanding of what VXLAN was designed to do, and how VXLAN fits into the rest of the networking protocols, then this answer is pretty easy:

  • If you’ve been reading my articles, you know already that VXLAN doesn’t preserve all forms of Layer 3 connectivity. Because it doesn’t, you still need protocols like OTV to address Layer 2/3 connectivity at the physical level.
  • Because you still need protocols like OTV to achieve VM mobility (for the time being, at least), you’re still going to need protocols like LISP to fix funny routing issues being caused by IP addresses from the same subnet existing in multiple locations at the same time.
  • Because VXLAN doesn’t address Layer 2 multipathing concerns, you still need protocols like TRILL and technologies like FabricPath.
  • Because using MPLS—which, by the way, would also address the 3 concerns VXLAN addresses—would require MPLS-enabled/MPLS-aware equipment throughout the data center, that would make an MPLS-based solution difficult for many enterprises to adopt. Using an IP encapsulation scheme means that existing physical networking equipment doesn’t have to change. (Although it might change—to add VXLAN support—at some point in the future.)

I was not a fan of VMware (apparently) driving the creation of an entirely new networking standard. However, as I dug into this, I began to see that while other solutions almost addressed these concerns, none of them were a really good fit. Yes, using MPLS probably would have worked. Using GRE might have worked (take NVGRE, for example, but that’s also a proposed new protocol). To really address the concerns head-on, though, required a solution that was written/created expressly for that purpose, and that’s VXLAN. It’s just important, though, to really understand what VXLAN is as well as what VXLAN isn’t. Otherwise, you’ll find yourself trying to fit VXLAN to a solution for which it really wasn’t intended—which, by the way, was why VXLAN was created in the first place.

Comments, corrections, and clarifications are always welcome!


Some Initial MPLS Reading

I mentioned on Twitter yesterday that I was doing some basic/introductory reading on MPLS, and someone asked what materials I was using. While I’m still very early in the process of trying to understand MPLS, I thought I might share the resources I’ve used so far in trying to wrap my head around MPLS, what it is, and the basics of how it works.

Here are some of the sites I’ve used so far:

MPLS Terminology
MPLS VPN terminology
MPLS Basics – LSR Terminology
Cisco Nexus 7000 Series NX-OS MPLS Configuration Guide
MPLS, Multi-Protocol Label Switching

As you can see, right now I’m focusing on what I call the grammar—that is, the day-to-day terminology and acronyms that are prevalent throughout any and all discussions of MPLS. Being able to recognize and know what an LSR is or what label imposition means is important and prepares me for future stages of learning. (Some people may recognize my use of “grammar” here as taken from the classical education approach.)

Even based on my limited reading so far, I’m beginning to get an idea of why MPLS can be so useful—and why MPLS can be complex. I’m looking forward to continuing my MPLS education. Resources and recommended reading are welcome in the comments!


Note: I’ve posted a follow-up to this article with some corrected information. Please read here.

I’ve been doing quite a bit of networking-related reading over the last few weeks, and VXLAN has been a key topic of this networking-related reading (along with OTV, MPLS, and OpenFlow). Since VXLAN’s announcement at VMworld US 2011, there have been some pretty good articles written and published about VXLAN. Here are a few, for example:

Digging Deeper into VXLAN, Part 1
VXLAN Deep Dive, Part 2: Looking at the Options
Digging Deeper in VXLAN, Pt 3, More FAQs
The Care and Feeding of VXLAN
VXLAN Part Deux
VXLAN Conclusion
Google+ discussions on VXLAN
VXLAN Primer – Part 1, BORGcube Blogs

However, the one thing that I haven’t seen a great discussion about is the impact of VXLAN on Layer 3 connectivity. I personally have fielded a number of questions about whether VXLAN will fix Layer 3 network connectivity problems with stretched clusters. So, I thought I’d take a stab here. Networking gurus (you know who you are), feel free to straighten me out if I’m wrong.

First, let’s start with a few basic things that we know about VXLAN:

  • We know that VXLAN encapsulates Layer 2 frames into Layer 3 packets (using UDP).
  • We know that VXLAN adds a 24-bit VXLAN Network Identifier (VNI) that allows for roughly 16.7 million unique values.
  • We know that VXLAN Segments are built between VXLAN Tunnel End Points (VTEPs). In the initial implementation of VXLAN, the VTEP will be the Nexus 1000V VEM on an ESXi host.
  • We know that (for now) VXLAN is not understood by any physical networking devices (the transport that carries the encapsulated frames only needs to be an IP-based network). (VXLAN encapsulation is a subset of OTV encapsulation, so in theory the Nexus 7000 hardware is capable of decoding VXLAN.)
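For readers who think better in code, here is a minimal Python sketch of the encapsulation step. It reflects my reading of the header layout in the IETF draft (an 8-byte header with a flags byte, the 24-bit VNI, and reserved bits); the function names are hypothetical, and the outer UDP, IP, and Ethernet headers that a real VTEP would add are deliberately left out.

```python
import struct

VNI_FLAG = 0x08  # "I" flag: indicates that the VNI field is valid

def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags in the top byte, followed by 24 reserved bits.
    # Second 32-bit word: VNI in the top 24 bits, followed by 8 reserved bits.
    return struct.pack("!II", VNI_FLAG << 24, vni << 8)

def unpack_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    _flags_word, vni_word = struct.unpack("!II", header[:8])
    return vni_word >> 8

def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix the original Layer 2 frame with a VXLAN header (outer headers omitted)."""
    return pack_vxlan_header(vni) + inner_frame

packet = encapsulate(738, b"original-layer-2-frame")
print(unpack_vni(packet))  # 738
```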

With that information in mind, I’d like to use the following diagram to frame the discussion.

In the diagram, there are two ESXi hosts acting as VTEPs. Between them exist two VXLAN segments with two different VNIs (VNI 738 and VNI 864). Because VXLAN works by encapsulating Layer 2 frames into Layer 3 packets and then routing these packets between VTEPs, VXLAN accomplishes one of its primary goals: it extends Layer 2 connectivity across Layer 3 networks.

But what does that mean, exactly?

Let’s look a bit more closely. The brown shape loosely represents Layer 2 connectivity within VNI 738 (a given VXLAN segment) and its associated VLAN(s). The Windows-based VM on the ESXi host on the left can communicate via Layer 2 with the Linux-based VM on the ESXi host on the right, even though those ESXi hosts reside in completely different broadcast domains separated by a Layer 3 routed network. The key phrase here, in my mind, is that VXLAN extends Layer 2 connectivity within a given VXLAN segment.

This is not, however, the sort of “extending Layer 2 connectivity across Layer 3 networks” that people are expecting.

What people are expecting from this phrase is that you could migrate a VM from the ESXi host on the left to the ESXi host on the right (as indicated in the diagram by the large arrow pointing from left to right) and still have full IP connectivity.

In this case, the VM itself will be able to maintain the same IP address, and other VMs in the same VXLAN segment will continue to communicate with the migrated VM without any issues. But hold on a second…

We know that VXLAN allows for duplicate IP addressing schemes across different VXLAN segments (but not in the same VXLAN segment), duplicate MAC addresses across different VXLAN segments (but not in the same VXLAN segment), and duplicate VLAN IDs across different VXLAN segments (but not in the same VXLAN segment). You could, for example, use the same IP addressing scheme, same MAC addresses, and same VLAN IDs in the brown (VNI 738) and blue (VNI 864) VXLAN segments. VXLAN wouldn’t care, and the VMs inside those VXLAN segments would be unaware of this duplication.
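One way to picture why the duplicates don’t collide is to think of a VTEP’s forwarding table as being keyed on the VNI as well as the MAC address. The Python sketch below is a simplified illustration of that idea, not a description of how any real VTEP stores its state; the `learn` and `lookup` helpers and the addresses are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (VNI, inner MAC) -> IP address of the VTEP behind which that MAC lives.
forwarding_table: Dict[Tuple[int, str], str] = {}

def learn(vni: int, mac: str, vtep_ip: str) -> None:
    """Record which VTEP a MAC address was seen behind, scoped to one VXLAN segment."""
    forwarding_table[(vni, mac.lower())] = vtep_ip

def lookup(vni: int, mac: str) -> Optional[str]:
    """Find the VTEP for a MAC within one segment; None means the MAC is unknown there."""
    return forwarding_table.get((vni, mac.lower()))

# The same MAC address appears in both segments without conflict.
learn(738, "00:50:56:AA:00:01", "192.0.2.10")
learn(864, "00:50:56:AA:00:01", "192.0.2.20")

print(lookup(738, "00:50:56:aa:00:01"))  # 192.0.2.10
print(lookup(864, "00:50:56:aa:00:01"))  # 192.0.2.20
```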

However, what VXLAN doesn’t address is IP translation; that functionality is relegated to a network address translator. In this case, it’s vShield Edge (VSE). So, in the instance where a VM is migrated between different Layer 3 networks, note that the only way to maintain IP connectivity from outside the VXLAN segment is to update the address translation tables and—here’s the kicker—assign the VM a new (and different) externally-accessible IP address. That breaks IP connectivity—even with dynamic DNS updates, clients will still have cached DNS responses pointing them back to the (now inactive) old external IP address. Thus, VXLAN breaks Layer 2/3 connectivity to other systems outside the VXLAN segment.

This issue, by the way, would be why various networking gurus have repeatedly stated that VXLAN does not replace OTV. To fix the issue described above, you’d have to use OTV to stretch the external-to-VXLAN VLANs so that the NAT mappings could remain unchanged and the externally accessible IP address would remain the same.

Before you assume that I’m knocking VXLAN, let me reaffirm that I’m not. I only felt that there hadn’t been a good, solid, understandable explanation of what sorts of connectivity were and were not extended/affected by VXLAN. Hopefully, this post has helped bring some clarity to the topic.

If I have misrepresented anything, presented something incorrectly, or if you have questions/clarifications, please let me know in the comments. Thanks!

UPDATE: As a couple of readers pointed out in the comments (thanks!), the Layer 3 connectivity isn’t quite as dire as what I’ve described. Rather than the VM’s address having to change due to a change in NAT mappings on a VSE, the VM’s traffic will “trombone” back to the original VSE that acts as the VXLAN segment’s default gateway. Again, thanks for the clarification/correction all!


One of the features added in vSphere 5 was a software FCoE initiator. This is a feature that has gotten relatively little attention, other than a brief mention here and there. I’m not entirely sure why, but this past week I had the opportunity to work with the software FCoE initiator in a bit more detail. In this post I’m going to describe how to set up the software FCoE initiator; in future posts, I hope to be able to provide some performance and troubleshooting information.

Prerequisites

In order to use the software FCoE initiator, you must have a network interface card (NIC) that supports the FCoE offloads. The Intel X520 NICs certainly do; these are the cards that I used in my testing. (Disclaimer: Intel provided me a pair of X520 NICs at no charge to use in this testing.) There might be others, but I don’t know; the VMware HCL doesn’t seem to provide an easy way of identifying those NICs that do support the FCoE offloads vs. those NICs that don’t.

If you have a NIC that doesn’t support FCoE offloads, you won’t even be able to add a software FCoE initiator:

If, on the other hand, your NIC does support FCoE offloads, you’ll see the option to add a software FCoE initiator, like this:

Obviously, you’ll want to be sure that your NIC does support FCoE offloads before continuing.

Setting Up Networking for Software FCoE

Before you can actually set up a software FCoE initiator, you’ll first need to configure your networking appropriately. There is one important piece of information you’ll want to be sure to have before you continue: the ID of the VLAN for FCoE traffic.

If you’ve read my article on setting up FCoE on a Nexus 5000, then you know that—on a Nexus 5000 series switch, anyway—you must map the FCoE VSAN to a VLAN (using the fcoe vsan XXX command). You’re going to need that VLAN ID, so make sure that you have it. In a dual fabric setup, you’re going to have two different VLAN IDs. You’ll need both.

Once you have those VLAN IDs, you can then proceed with the networking setup:

  1. Create a VMkernel port on your ESXi host. When prompted for the VLAN ID, specify the FCoE VLAN on fabric A.
  2. I don’t know why (and maybe this will be fixed in a future release), but you’ll need to assign an IP address to each VMkernel port. The IP address will not be used, so just pick an address on a subnet that you don’t use. (Don’t use a subnet that you are using elsewhere on the ESXi host or you could run into IP routing weirdness—remember that the VMkernel uses the first interface it finds for a particular route.)
  3. After you’ve created the VMkernel port, modify the NIC teaming policy to only use the physical uplink that is physically connected to fabric A. This might require a bit of sleuthing and/or using CDP/LLDP data to ensure that you have the right uplink selected.
  4. Create a second VMkernel port on your ESXi host, this time specifying the FCoE VLAN ID for fabric B. Modify the NIC teaming policy to only use the physical uplink connected to fabric B, again using physical tracing and CDP/LLDP data as needed to verify it.

At this point, you should now have two VMkernel ports, each with separate (unused) IP addresses and configured for separate VLAN IDs and separate physical uplinks. The VLAN IDs and the physical uplinks should correspond to the FCoE fabrics (fabric A and fabric B).
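The end state described above can be written down as a small checklist. The following Python sketch is only an illustration, assuming hypothetical port names, VLAN IDs, uplink names, and subnets (none of these come from my lab). It checks that the two VMkernel ports use different VLANs, uplinks, and IP addresses, and that the placeholder IP addresses don’t land on a subnet already in use on the host.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class FcoeVmkPort:
    """One VMkernel port planned for software FCoE (all values hypothetical)."""
    name: str      # e.g. "vmk2"
    fabric: str    # "A" or "B"
    vlan_id: int   # FCoE VLAN mapped to the VSAN on that fabric
    uplink: str    # the single active physical uplink, e.g. "vmnic4"
    ip: str        # placeholder address; FCoE traffic does not use it


def check_fcoe_ports(ports, subnets_in_use):
    """Return a list of problems found in the plan; an empty list means OK."""
    problems = []
    if len(ports) != 2:
        problems.append("expected exactly two VMkernel ports (fabric A and B)")
    # Each port must have its own fabric, VLAN, uplink, and IP address.
    for attr in ("fabric", "vlan_id", "uplink", "ip"):
        values = [getattr(p, attr) for p in ports]
        if len(set(values)) != len(values):
            problems.append(f"ports share the same {attr}")
    for p in ports:
        if not 1 <= p.vlan_id <= 4094:
            problems.append(f"{p.name}: VLAN {p.vlan_id} is out of range")
        # The placeholder IP should sit on a subnet the host does not use.
        for net in subnets_in_use:
            if ip_address(p.ip) in ip_network(net):
                problems.append(f"{p.name}: {p.ip} falls in in-use subnet {net}")
    return problems


# Hypothetical plan: one port per fabric, each pinned to a single uplink.
plan = [
    FcoeVmkPort("vmk2", "A", 300, "vmnic4", "192.168.250.11"),
    FcoeVmkPort("vmk3", "B", 301, "vmnic5", "192.168.251.11"),
]
print(check_fcoe_ports(plan, ["10.0.0.0/24", "172.16.1.0/24"]))
```

An empty list means the plan matches the description; anything else is a plain-language hint about what to fix before adding the software FCoE adapters.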

Setting Up the Software FCoE Initiator

With the networking in place, you can actually add the software FCoE initiator using the “Add…” button on the Storage Adapters view of the Configuration tab. This opens up the Add Software Adapter dialog box I showed you earlier, where you can select “Add Software FCoE Adapter” and click OK.

That option opens the following dialog box:

You’ll note that the VLAN ID is 0 and isn’t changeable. I couldn’t find any way to enable this field, and in my testing it proved unnecessary to change it (it worked anyway). Select the appropriate physical uplink and click OK. You’ll do this twice—once for fabric A, and again for fabric B. After you’ve done this twice, you’ll have two software FCoE adapters:

For each of these two software FCoE adapters, you’ll see a node WWN and a port WWN listed. You can use these values in creating the zones and zonesets (see here for more information). First, though, you’ll want to be sure that the software FCoE adapter is actually talking to the fabric correctly; the best way to do that is to use the show flogi data command (short for show flogi database) on a Nexus 5000; other vendors’ switches will use a slightly different command. The outcome of the show flogi data command will look something like this:

In this screenshot (taken from an SSH session into a Nexus 5010), you can see that a device has logged into vfc1009 on VSAN 300. If you compare the port name and node name, you’ll see that they match one of the software FCoE adapters shown earlier. This is only one of the two fabrics; a matching result was seen from the other fabric, showing that both software FCoE adapters successfully logged into their respective fabrics.
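The comparison described above (port and node names on the switch versus the WWNs shown for the software FCoE adapters) is easy to script once you have the command output as text. Here’s a rough Python sketch; the sample output, its column layout, the FCID, and the WWNs are all made-up placeholders rather than values from my lab, so adjust the parsing to whatever your switch actually prints.

```python
import re

# Matches a colon-separated 8-byte WWN such as 20:00:00:11:22:33:44:55.
WWN_RE = re.compile(r"^(?:[0-9a-f]{2}:){7}[0-9a-f]{2}$", re.IGNORECASE)


def parse_flogi(text):
    """Extract (interface, vsan, port_wwn, node_wwn) tuples from FLOGI output.

    Assumes each login row has five whitespace-separated columns:
    interface, VSAN, FCID, port name, node name. Other lines are skipped.
    """
    logins = []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) == 5 and WWN_RE.match(cols[3]) and WWN_RE.match(cols[4]):
            logins.append((cols[0], int(cols[1]), cols[3].lower(), cols[4].lower()))
    return logins


def adapter_logged_in(logins, port_wwn, node_wwn):
    """Return the (interface, vsan) where the adapter logged in, or None."""
    for interface, vsan, pwwn, nwwn in logins:
        if pwwn == port_wwn.lower() and nwwn == node_wwn.lower():
            return interface, vsan
    return None


# Hypothetical sample text; the WWNs and FCID are placeholders.
sample = """\
INTERFACE  VSAN  FCID      PORT NAME                NODE NAME
vfc1009    300   0x010000  20:00:00:11:22:33:44:55  10:00:00:11:22:33:44:55
"""

logins = parse_flogi(sample)
print(adapter_logged_in(logins, "20:00:00:11:22:33:44:55", "10:00:00:11:22:33:44:55"))
```

Run it once per fabric, feeding in the output from each switch and the WWNs of the corresponding software FCoE adapter; a result of None means that adapter hasn’t logged into that fabric.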

At this point, the rest of the configuration consists of creating the appropriate device aliases (if desired); defining zones and zonesets; and presenting storage from the storage array(s) to the initiator(s). Since these are topics that are fairly well understood and well documented elsewhere, I won’t rehash them again here.

In future posts, I hope to be able to provide some performance and troubleshooting information. However, it will likely be early 2012 before I have that opportunity. In the meantime, if anyone has any additional information they’d like to share (including any corrections or clarifications), please feel free to add it to the comments below.

Welcome to Technology Short Take #17, another of my irregularly-scheduled collections of various data center technology-related links, thoughts, and comments. Here’s hoping you find something useful!

Networking

  • I think it was J Metz of Cisco who posted this to Twitter, but this is a good reference to the various 10 Gigabit Ethernet modules.
  • I’ve spoken quite a bit about stretched clusters and their potential benefits. For an opposing view—especially regarding the use of stretched clusters as a disaster avoidance solution—check out this article. It’s a nice counterpoint, especially from the perspective of the network.
  • Anyone know anything about sFlow?
  • Here’s a good post on VXLAN that has some useful information. I’d just like to point out that VXLAN is really only intended to address Layer 2 communications “within” a vApp or a collection of VMs (perhaps a single organization’s VMs), and doesn’t do anything to address Layer 3 routing/accessibility for clients (or “consumers”) attempting to connect to those systems. For that, you’ll still need—at least today—technologies like OTV, LISP, and others.
  • A quick thought that I’m still exploring: what’s the impact of OpenFlow on technologies like VXLAN, NVGRE, and others? Does SDN eliminate the need for these technologies? I’d be curious to hear your thoughts.

Servers/Operating Systems

  • If you’ve adopted Mac OS X Lion 10.7, you might have noticed some problems connecting to older servers/NAS devices running AFP (Apple Filing Protocol). This Apple KB article describes a fix. Although I’m running Snow Leopard now, I was running Lion on a new MacBook Pro and I can attest that this fix does work.
  • This Microsoft KB article describes how to extend the Windows Server 2008 evaluation period. I’ve found this useful for Windows Server 2008 instances in the lab that I need for longer than 60 days but that I don’t necessarily want to activate (because they are transient).

Storage

  • Jason Boche blogged about a way to remove stubborn hosts from Unisphere. I’ve personally never seen this problem, but it’s nice to know how to address it should it occur.
  • Who would’ve thought that an HDD could serve as a cache for an SSD? Shouldn’t it be the other way around? Normally, that would probably be the case, but as described here there are certain instances and ways in which using an HDD as a cache for an SSD can improve performance.
  • Scott Drummonds wraps up his three-part series on flash storage in part 3, which contains information on sizing flash storage. If you haven’t been reading this series, I’d recommend giving it a look.
  • Scott also weighs in on the flash as SSD vs. flash on PCIe discussion. I’d have to agree that interfaces are important, and the ability of the industry to successfully leverage flash on the PCIe bus is (today) fairly limited.
  • Henri updated his VNXe blog series with a new post on EFD and RR performance. No real surprises here, although I do have one question for Henri: is that your car in the blog header?

Virtualization

  • Interested in setting up host-only networking on VMware Fusion 4? Here’s a quick guide.
  • Kenneth Bell offers up some quick guidelines on when to deploy MCS versus PVS in a XenDesktop environment. MCS vs. PVS is a topic of some discussion on the vSpecialist mailing list, as the two have very different IOPS requirements and I/O profiles.
  • Speaking of VDI, Andre Leibovici has two articles that I wanted to point out. First, Andre does a deep dive on Video RAM in VMware View 5 with 3D; this has tons of good information that is useful for a VDI architect. (The note about the extra .VSWP overhead, for example, is priceless.) Andre also has a good piece on VDI and Microsoft Outlook that’s worth reading, laying out the various options for Outlook-related storage. If you want to be good at VDI, Andre is definitely a great resource to follow.
  • Running Linux in your VMware vSphere environment? If you haven’t already, check out Bob Plankers’ Linux Virtual Machine Tuning Guide for some useful tips on tuning Linux in a VM.
  • Seen this page?
  • You’ve probably already heard about Nick Weaver’s new “Uber” tool, a new VM alignment tool called UBERAlign. This tool is designed to address VM alignment, a problem with how guest file systems are formatted within a VMDK. For more information, see Nick’s announcement here.
  • Don’t disable DRS when you’re using vCloud Director. It’s as simple as that. (If you want to know why, read Chris Colotti’s post.)
  • Here are a couple of great diagrams by Hany Michael on vCloud Director management pods (both public cloud and private cloud management).
  • People automatically assume that “virtualization” means consolidating multiple workloads onto a single physical server. However, virtualization is really just a layer of abstraction, and that layer of abstraction can be used in a variety of ways. I spoke about this in early 2010. This article (written back in March of 2011) by Brad Hedlund picks up on that theme to show another way that virtualization—or, as he calls it, “inverse virtualization”—can be applied to today’s data centers and today’s applications.
  • My discussion on the end of the infrastructure engineer generated some conversations, which is good. One of the responses was by Aaron Sweemer in which he discusses the new (but not new) “data layer” and expresses a need for infrastructure engineers to be aware of this data layer. I’d agree with a general need for all infrastructure engineers to be aware of the layers above them in the stack; I’m just not convinced that we all need to become application developers.
  • Here’s a great post by William Lam on the missing piece to creating your own vSEL cloud. I’ll tell you, William blogs some of the coolest stuff…I wish I could dig in as deep as he does in some of this stuff.
  • Here’s a nice look at the use of PowerCLI to help with the automation of DRS rules.
  • One of my projects for the upcoming year is becoming more knowledgeable and conversant with the open source Xen hypervisor and Citrix XenServer. I think that the XenServer Design Handbook is going to be a useful resource for that project.
  • Interested in more information on deploying Oracle databases on vSphere? Michael Webster, aka @vcdxnz001 on Twitter, has a lengthy article with lots of information regarding Oracle on vSphere.
  • This VMware KB article describes how to enable centralized logging for vCloud Director cells. This is particularly important for HA environments, where VMware’s recommended HA strategy involves the use of multiple vCD cells.

I guess I should wrap it up here, before this post gets any longer. Thanks for reading this far, and feel free to speak up in the comments!
