November 2010


The Cisco Nexus 2000 series fabric extender (or “fex”) is an implementation of network interface virtualization (NIV). Because the Nexus 2000 fabric extender acts as a remote line card to the upstream NIV-capable bridge (typically a Nexus 5000 series switch), all configuration takes place on the upstream bridge. In this post, I’ll describe how to connect a Nexus 2148T to a Nexus 5010. In reality, the process is incredibly simple and not really worthy of a blog post, but in the interest of completeness I’ll document it here.

The key to making the fabric extender work is the switchport mode fex-fabric command, used on a specific interface to let the Nexus 5000 know that a fabric extender is connected to that port. The fex associate command then binds that port to a FEX number (100 in these examples):

nexus(config)# interface eth2/1
nexus(config-if)# switchport mode fex-fabric
nexus(config-if)# fex associate 100

You can use the switchport mode fex-fabric command on a specific interface, as shown above, or on a port-channel:

nexus(config)# interface port-channel 2
nexus(config-if)# switchport mode fex-fabric
nexus(config-if)# interface eth2/1
nexus(config-if)# switchport mode fex-fabric
nexus(config-if)# channel-group 2 mode on
nexus(config-if)# interface eth2/2
nexus(config-if)# switchport mode fex-fabric
nexus(config-if)# channel-group 2 mode on
nexus(config-if)# interface port-channel 2
nexus(config-if)# fex associate 100

In this example, I created a port-channel with two interfaces and associated the fabric extender to the port-channel, providing link-level redundancy for the connection between the Nexus 2148 and the upstream Nexus 5010.

To verify that the fabric extender is coming online properly, you can use a few different commands. The show fex detail command produces some useful output (only the first several lines are included here):

FEX: 100 Description: FEX0100   state: Online
  FEX version: 4.2(1)N1(1) [Switch version: 4.2(1)N1(1)]
  FEX Interim version: 4.2(1)N1(1)
  Switch Interim version: 4.2(1)N1(1)
  Extender Model: N2K-C2148T-1GE, Extender Serial: JAF1321ANJN
  Part No: 73-12009-05
  Card Id: 70, Mac Addr: 00:0d:ec:cf:03:42, Num Macs: 64
  Module Sw Gen: 12594 [Switch Sw Gen: 21]
  post level: complete
  pinning-mode: static Max-links: 1
  Fabric port for control traffic: Eth2/1
  Fabric interface state:
    Po2 - Interface Up. State: Active
    Eth2/1 - Interface Up. State: Active
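
If you want to check this output from a script rather than by eye, here is a minimal Python sketch. It assumes you have already captured the show fex detail text by some means (how you capture it is out of scope here), and the function names parse_fex_detail and fex_is_healthy are my own, not part of any Cisco tool. The sample text is abbreviated from the output above.

```python
import re

# Abbreviated copy of the "show fex detail" output shown in this post.
SAMPLE = """\
FEX: 100 Description: FEX0100   state: Online
  FEX version: 4.2(1)N1(1) [Switch version: 4.2(1)N1(1)]
  Extender Model: N2K-C2148T-1GE, Extender Serial: JAF1321ANJN
  pinning-mode: static Max-links: 1
  Fabric port for control traffic: Eth2/1
  Fabric interface state:
    Po2 - Interface Up. State: Active
    Eth2/1 - Interface Up. State: Active
"""


def parse_fex_detail(text):
    """Return the FEX number, state, and fabric interface states (hypothetical helper)."""
    header = re.search(
        r"^FEX:\s*(\d+)\s+Description:\s*(\S+)\s+state:\s*(\S+)",
        text,
        re.MULTILINE,
    )
    if header is None:
        raise ValueError("no FEX header line found")

    # Lines such as "    Po2 - Interface Up. State: Active"
    fabric = {}
    for m in re.finditer(
        r"^[ \t]+(\S+) - Interface (\S+)\. State: (\S+)", text, re.MULTILINE
    ):
        fabric[m.group(1)] = (m.group(2), m.group(3))

    return {
        "fex_id": int(header.group(1)),
        "description": header.group(2),
        "state": header.group(3),
        "fabric": fabric,
    }


def fex_is_healthy(info):
    """True when the FEX is Online and every listed fabric interface is Up/Active."""
    return (
        info["state"] == "Online"
        and bool(info["fabric"])
        and all(link == "Up" and state == "Active"
                for link, state in info["fabric"].values())
    )


if __name__ == "__main__":
    info = parse_fex_detail(SAMPLE)
    print(info["fex_id"], info["state"], fex_is_healthy(info))
```

The regular expressions are written against the output format of this one software release; other NX-OS versions may format these lines differently, so treat the patterns as a starting point.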

And that’s really about it. Yes, there are more advanced configurations—I’m interested in exploring the use of virtual port-channels upstream of a Nexus 2000 series fabric extender to multiple Nexus 5000 switches—but this will get you started.

Feel free to post corrections, suggestions, or any other feedback in the comments!


As part of an ongoing effort to expand the functionality of the vSpecialist lab in the EMC RTP facility, we recently added a pair of Cisco MDS 9134 Fibre Channel switches. These Fibre Channel switches are connected to a pair of Cisco Nexus 5010 switches, which handle Unified Fabric connections from a collection of CNA-equipped servers. To connect the Nexus switches to the MDS switches, we used SAN port channels to bond multiple Fibre Channel interfaces together for both redundancy and increased aggregate throughput. Here is how to configure SAN port channels to connect a Cisco Nexus switch to a Cisco MDS switch.

If you are interested, more in-depth information can be found here on Cisco’s web site.

Although I’ve broken out the configuration for the MDS and the Nexus into separate sections, the commands are very similar. In my situation, the MDS 9134 was running NX-OS 5.0(1a) and the Nexus 5010 was running NX-OS 4.2(1)N1(1).

Configuring the Cisco MDS 9134

To configure the MDS 9134 with a SAN port channel, use the following commands.

First, create the SAN port channel with the interface port-channel command, like this:

mds(config)# interface port-channel 1

You can replace the “1” at the end of that command with any number from 1 to 256; it’s just the numeric identifier for that SAN port channel. The SAN port channel number does not have to match on both ends.

Once you’ve created the SAN port channel, add individual interfaces with the channel-group command:

mds(config)# interface fc1/16
mds(config-if)# channel-group 1

The “1” specified in the channel-group command has to match the number specified in the earlier interface port-channel command. This might seem obvious, but I wanted to point it out nevertheless.

Repeat this process for each interface you want to add to the SAN port channel. In my example, I used two interfaces.

When you add an interface to the SAN port channel, NX-OS reminds you to perform a matching configuration on the switch at the other end, then use the no shutdown command to make the interfaces (and the SAN port channel) active. Let’s look first at the commands for configuring the Nexus, then we’ll examine what it looks like when we bring the SAN port channel online.

Configuring the Cisco Nexus 5010

The commands here are very similar to those on the MDS 9134. First, you need to create the SAN port channel using the interface san-port-channel command (note the slight difference in commands between the MDS and the Nexus here):

nexus(config)# interface san-port-channel 1

As with the MDS, the number at the end simply serves as a unique identifier for the SAN port channel and can range from 1 to 256.

Then add interfaces to the SAN port channel using the channel-group command:

nexus(config)# interface fc2/1
nexus(config-if)# channel-group 1
nexus(config-if)# interface fc2/2
nexus(config-if)# channel-group 1

As I’ve shown above, simply repeat the process for each interface you want to add to the SAN port channel. As on the MDS, NX-OS reminds you to perform a matching configuration on the opposite end of the link and then issue the no shutdown command.

Bringing Up the SAN Port Channel

Once a matching configuration is in place on both ends, you can use the no shutdown command (which you can abbreviate to simply no shut) to activate the interfaces and the SAN port channel. After activating the interfaces, a show interface port-channel (on the MDS) or a show interface san-port-channel (on the Nexus) will show you the status of the SAN port channel. Only the first few lines of output are shown below (this output is taken from the Nexus):

nexus# sh int san-port-channel 1
san-port-channel 1 is trunking (Not all VSANs UP on the trunk)
    Hardware is Fibre Channel
    Port WWN is 24:01:00:05:9b:7b:0c:80
    Admin port mode is auto, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 4 Gbps
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        ()
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              (1)

A couple of useful pieces of information are available here:

  • First, you can see that the SAN port channel is not fully up; it’s still initializing. This is shown by the “Not all VSANs UP on the trunk” message, as well as by the “Trunk vsans (initializing)” line.
  • Second, you can see that only a single member is up. Note that the speed of the SAN port channel is listed as 4 Gbps, the speed of one link rather than the combined speed of both.
  • Third, note that this is a trunking port, meaning that it could carry multiple VSANs. This is noted by the “Port mode is TE” line as well as the first line of the output (“san-port-channel 1 is trunking”).

As it turns out, I’d cabled the connections wrong; after I fixed the connections and gave the SAN port channel a small amount of time to initialize, the output was different (this output is taken from the MDS):

mds# sh int port-channel 1
port-channel 1 is trunking
    Hardware is Fibre Channel
    Port WWN is 24:01:00:05:73:a7:72:00
    Admin port mode is auto, trunk mode is on
    snmp link state traps are enabled
    Port mode is TE
    Port vsan is 1
    Speed is 8 Gbps
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        (1)
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              ()

Now you can see that both members of the SAN port channel are active (“Speed is 8 Gbps”) and that all VSANs are trunking across the SAN port channel.
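
To make the before/after comparison concrete, here is a small Python sketch that pulls the speed and the trunk VSAN lists out of this kind of output. It is only an illustration: the function names parse_san_port_channel and fully_up are hypothetical, the sample text is abbreviated from the two outputs above, and VSAN entries are kept as plain strings, so a range would not be expanded.

```python
import re

# Abbreviated copies of the two outputs shown in this post.
BEFORE = """\
san-port-channel 1 is trunking (Not all VSANs UP on the trunk)
    Port mode is TE
    Speed is 4 Gbps
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        ()
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              (1)
"""

AFTER = """\
port-channel 1 is trunking
    Port mode is TE
    Speed is 8 Gbps
    Trunk vsans (admin allowed and active)  (1)
    Trunk vsans (up)                        (1)
    Trunk vsans (isolated)                  ()
    Trunk vsans (initializing)              ()
"""


def parse_san_port_channel(text):
    """Extract the speed and per-state trunk VSAN lists (hypothetical helper)."""
    speed = re.search(r"Speed is (\d+) Gbps", text)
    vsans = {}
    # Lines such as "Trunk vsans (up)    (1)": state name, then VSAN list.
    for m in re.finditer(r"Trunk vsans \(([^)]*)\)\s+\(([^)]*)\)", text):
        entries = [v.strip() for v in m.group(2).split(",") if v.strip()]
        vsans[m.group(1)] = entries
    return {
        "speed_gbps": int(speed.group(1)) if speed else None,
        "vsans": vsans,
    }


def fully_up(info):
    """True when at least one VSAN is up and none are initializing or isolated."""
    vsans = info["vsans"]
    return (
        bool(vsans.get("up"))
        and not vsans.get("initializing")
        and not vsans.get("isolated")
    )


if __name__ == "__main__":
    for label, text in (("before", BEFORE), ("after", AFTER)):
        info = parse_san_port_channel(text)
        print(label, info["speed_gbps"], "Gbps, fully up:", fully_up(info))
```

Run against the two samples, the first reports 4 Gbps and not fully up, and the second reports 8 Gbps and fully up, which matches what the raw output shows.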

At this point, you are now ready to proceed with creating VSANs, zones, and zonesets. Refer to these articles for more information on MDS zone creation and management via CLI:

New User’s Guide to Configuring Cisco MDS Zones via CLI
New User’s Guide to Managing Cisco MDS Zones via CLI

As always, questions, clarifications, or corrections are welcome—just add them below in the comments. Thanks!


Welcome to Technology Short Take #7! This time around I have a collection of links from networking, servers, storage, and virtualization. Our hot topics in this issue include Fibre Channel over Ethernet (FCoE) and its need—or lack thereof—for congestion management, Ubuntu on Hyper-V, the benefits of VAAI, and more!

Networking

I have a lot of FCoE-related links this time around. I’m not sure if that means FCoE has been getting more coverage or if it’s just a case of confirmation bias.

  • Need to decrypt a Cisco type 7 password? This page provides instructions on how it can be done. (Please be sure to use your powers for good, not evil.)
  • This blog post catalogue is a link list to a treasure trove of networking information.
  • I suppose this is one way of dealing with requests to do long-distance vMotion. I’m not so sure I agree that it’s an effective way.
  • The use of NIV to create the equivalent of multi-hop FCoE is something I discussed a while ago, but Brad Hedlund recently revisited it again. I can see both sides of the argument—both “for” and “against” considering fabric extenders as multi-hop FCoE—and I can see the need to use standard terminology to describe things. Without standard terminology, “multi-hop FCoE” means different things to different vendors and it’s hard for customers to make valid comparisons.
  • Erik Smith, a relatively new blogger, has a great introduction to FIP, FIP Snooping Bridges, and FCFs. If you’re new to FCoE—or even if you aren’t and want more detail—this is a great read with loads of relevant information. I’m looking forward to more of Erik’s posts on this topic.
  • The blog battle over FCoE’s need for QCN rages on. Joe Onisick does a good job of explaining QCN and why it might/might not be necessary, so if you’re unfamiliar with the debate that’s a good place to start. Ivan Pepelnjak breaks down 802.1Qau (the QCN standard) even further, providing more details on its operation and behavior. He then weighs in on the debate with this quick explanation and this comparison to Frame Relay. In the end, the answer to the question of FCoE’s need for QCN really boils down to everyone’s favorite IT answer: “It depends.” In this case, it depends upon your network design. With more DCB-capable switches between the end nodes and the FCFs, QCN becomes more valuable. With fewer (or no) DCB-capable switches between the end nodes and the FCFs, QCN offers far less benefit.

Servers

I’m adding this section because I have some articles that apply to servers, but not necessarily to virtualization. Since it fits in nicely with the data center theme of Technology Short Takes, it seems like a reasonable addition.

  • Jeff Allen, a UCS-focused CSE at Cisco, recently posted this guide to SAN boot with Cisco UCS. It’s definitely worth a read, especially if you’re new to UCS or haven’t done boot from SAN with UCS before.
  • I haven’t had nearly the time to blog about Cisco UCS as I would have liked, but Brian Gracely included me in this list of people to follow for Cisco UCS information. Thanks, Brian! I’ll do my best to earn my inclusion on the list.
  • Chris Fendya of WWT posted instructions on how to slipstream the Cisco UCS drivers into the installation of Windows Server 2003.

Storage

It’s funny to me that the storage section of these posts is typically the shortest. There are plenty of storage-related blogs out there, but almost all of them are high-level and tend not to provide the sort of down-to-earth “in the trenches” information I like to include. If readers have any suggestions for blogs that provide this sort of information, I’d love to hear them.

  • InformationWeek recently published this article on how to break free from Tier 1 SAN vendors. (Disclosure: I work for just such a Tier 1 SAN vendor.) I can’t say that I agree with the author’s reasoning; by the same token, customers should be able to go out and buy white box servers. Yet, companies such as HP and Dell are still selling lots of servers. Why is that? Because the value of a top-tier server is greater than the sum of its parts, and the same can be said for Tier 1 storage arrays. Now, having said that, I do agree that storage virtualization—which was the real focus of the InformationWeek article—can bring a lot of value and flexibility to the data center. I just don’t think that storage virtualization and Tier 1 storage arrays are mutually exclusive.
  • Here is a good “how to” on enabling ALUA and Round Robin multipathing with ESX and a CLARiiON CX4 array.
  • Bob Plankers has a great article on the impact of VAAI on storage operations. In this post, he shows how the write rate for his VAAI-capable HDS AMS 2500 drops to nothing when cloning templates. This is a great demonstration of how VAAI helps offload storage operations from the hosts to the array. Keep in mind that VAAI might not make operations faster, but it will make them more efficient. (It’s a subtle distinction, but an important one nevertheless.)
  • In the event you are considering pursuing CCIE Storage—a task that I’ve been strongly considering undertaking—Brian Feeny posted a list of CCIE Storage preparation resources.

Virtualization

That wraps up this installment of Technology Short Takes. As always, your comments, thoughts, suggestions, or clarifications are welcome, so please speak up in the comments!


I’ve been invited to participate in a couple of upcoming podcasts and thought I’d post something here in case you are interested in listening in.

First up is the Packet Pushers Podcast, hosted by Greg Ferro, Ethan Banks, and Dan Hughes. Joining me is Ivan Pepelnjak of Cisco IOS Hints and Tricks. We’ll be discussing areas of intersection between networking and virtualization and the resulting concerns. It should be a great podcast; I’m both excited and a bit apprehensive. After all, it’s not every day that you get the opportunity to “talk shop” with a group of very talented and very accomplished professionals. I hope I can hold my own!

The second podcast is Coffee With Thomas, hosted by Thomas Jones. I have a feeling this podcast won’t be quite so intense, since Thomas’ podcasts are intended to be casual and conversational. Still, Thomas has some pretty pointed questions he’s planning on asking, so we’ll see!

If you get the opportunity to download and listen to either of these podcasts after they’ve been published (it will be a few weeks yet), I and the podcast producers would certainly appreciate your support.


Welcome to Technology Short Take #6, the latest collection of links, articles, and thoughts on virtualization, networking, storage, and the intersection of the three. I’m going to try a slightly new format for this post in my Technology Short Take series; I’m interested in knowing what you think about the format.

Networking

  • If you’ve worked with UNIX (or UNIX variants like Linux or BSD), then you’re probably familiar with regular expressions. This post breaks down the use of regular expressions in Cisco IOS, which I found useful.
  • To get a better understanding of some of the IEEE standards that serve to make Ethernet lossless (and thus a viable transport for FCoE), I recommend having a look at two articles on Cisco IOS Hints and Tricks. The first article is on 802.1Qaz (Enhanced Transmission Selection, or ETS) and the second article touches on PFC/ETS and how it impacts storage traffic. The article on PFC’s interaction with storage traffic is particularly good, as it really sheds light on how PFC/ETS will interact with FCoE in a way that at first seems counterintuitive. Once you think about it for a little bit, though, it does make sense.
  • This article also provides a quick summary of the various Data Center Bridging (DCB) standards, in case you need a review of what’s involved. (Another post on the DCB standards is available here as well.)
  • Aaron Conaway has a good post on using SLA monitoring on the PIX/ASA.
  • Need to add a static ARP entry on a Nexus switch? Jeff (aka fryguy_pa on Twitter) has a quick how-to on adding static ARP entries on NX-OS.
  • Carole Warner Reece posted what I thought was a useful how-to on setting up back-to-back vPCs on Cisco Nexus switches. In her example, she used two pairs of Nexus 7000 switches, but it could have just as easily been Nexus 7000s and Nexus 5000s. I’m not sure I entirely understand the benefit of this configuration; she states that it’s loop-free but I think I’m going to need to think on this for a bit longer in order to fully get why someone might use this particular configuration.
  • And speaking of ways to eliminate Spanning Tree, Jeremy Filliben has a comparison of a few of the methods. In the spirit of redundancy, here’s another good post on the topic of Spanning Tree and its role in FCoE multipathing.
  • Josh O’Brien reminds readers in this post that there’s a reason the no ip routing command exists in Layer 3-capable switches. “Just because you can, doesn’t mean you should…”
  • I experimented with policy-based routing in my virtualized GNS3 environment a while ago, but never could make it work. I didn’t really have a use case; I was just experimenting and (hopefully) learning. I might have to use this post for some guidance next time I give it another try.

Storage

  • If you’re in the market for a new storage array, you might want to check support for the vStorage APIs for Array Integration (VAAI). This page has a list of VAAI-capable arrays and the necessary firmware required to support VAAI.
  • Erik Zandboer describes a potential issue with VMware ESXi 4.1, host profiles, and CLARiiON auto-registration. I’m not aware of any workaround for the problem yet.
  • I think that this has been covered quite well elsewhere, but I did want to point it out for the sake of completeness. Fellow vSpecialist team member Nick Weaver (lynxbat on Twitter) has created a PowerShell front-end for the EMC Celerra arrays, which he’s calling UBERShell. It’s quite impressive what Nick’s managed to create. Go have a look if you haven’t already.

Virtualization

  • I don’t know if this one counts as virtualization or networking, but either way I guess it doesn’t really matter. I found this VMware KB article on the use of the “Route based on IP hash” setting and the fact that it is not supported on Cisco UCS. This is because the UCS 6100 series fabric interconnects currently don’t support any form of cross-stack link aggregation.
  • Here’s another article that I’m not quite sure is networking or virtualization, but I’ll put it under virtualization. Cisco UCS offers the ability to do fabric failover, i.e., provide NIC redundancy in hardware so that the OS/hypervisor doesn’t need to. Prior to the 1.4 release of the UCS Manager software, though, there were problems with learned MAC addresses, like the ones given to VMs. I’ll let Brad Hedlund explain the rest of it to you in this great post on UCS fabric failover. As with so many of Brad’s articles, this one is well-written and very informative.
  • I think I’m missing something with regard to Hyper-V’s new dynamic memory feature. I’ve been reading a few posts (this one springs to mind) about how using dynamic memory can increase VM density. I agree with that. As I understand Hyper-V’s dynamic memory, it’s about configuring a VM with the minimum amount of RAM and then letting it “burst” up to a pre-configured maximum, but without actually overcommitting memory on the host (because, according to Microsoft, overcommitment is bad). So what happens to the extra memory that VMs aren’t using? You can’t overcommit, so that memory sits there idle, even though you could probably use it. Or am I missing something?
  • In the event you’re interested in exploring Hyper-V’s dynamic memory feature, Ben Armstrong describes here what’s required to use it.
  • This two-part series on VSS and VMware is great reading to get a better understanding of how VSS integration in VMware works (or doesn’t work). Part 1 is here; Part 2 is here. Also, there’s a good follow-up to both these articles that provides additional information on application-level quiescing.
  • Jason Boche has been on a bit of a blogging marathon recently, cranking out some good stuff. There’s this article on reducing FT logging traffic for disk read intensive workloads, this post on an issue with VMware DPM, a post on hardware status and maintenance mode (applies only to pre-vSphere 4.1 environments), and finally a post on the conversion from CPU Ready to %RDY. All of these are good articles and well worth the read.
  • Duncan’s latest article on using esxplot on Mac OS X is pretty cool; I might have to try that myself. Of course, Duncan produces lots and lots of good stuff (which is why he’s perennially voted #1). One example: this brief summary of some VMware HA futures. If you haven’t visited his site recently (or if you aren’t subscribed to his RSS feed), you’re doing yourself a disservice.
  • Reviewing vscsiStats data using 3D surface charts is awesome. I love it.
  • These two articles by William Lam on running VMs on Dropbox and backing up VMs to Dropbox are interesting (and different), but not entirely practical. Of course, I don’t think that William necessarily intended them to be practical; they struck me as more of an “I wonder if I could…” type of situation. Still, it’s interesting to see how data synchronization tools like Dropbox could change the way we view/use VMs. If only there were a hardware-based solution that was transparent to the hypervisor…oh, wait, there is: it’s called VPLEX! (Sorry, I couldn’t resist. I’ll try to behave next time.)

I have a ton more links and posts that I’d love to include, but in the interest of time and length I’ll stop here for today. I hope that you find something interesting and useful here!


Another topic arose over the last few days on the vSpecialist mailing list around an event that is logged by VMware vCenter Server when you use Storage I/O Control (SIOC) in conjunction with EMC MirrorView. (MirrorView, for those that don’t know, is a replication solution for the EMC CLARiiON arrays.) The focus of the discussion was the fact that vCenter Server logs an event to the effect that an “external I/O workload has been detected on shared datastore running Storage I/O Control (SIOC) for congestion management”. This particular event is documented by VMware in this VMware KB article.

One of the recommendations for using SIOC is that you not connect “external workloads”—that is, workloads that are not managed by the same vCenter Server instance—to a shared datastore that is enabled for SIOC. I called this out in my liveblog of this SIOC session at VMworld 2010. Clearly, it’s impossible for vCenter Server to enforce limits and shares defined by SIOC when there are workloads external to and not under the control of vCenter Server.

In this particular case, it’s entirely OK to ignore this event. Using SIOC in conjunction with MirrorView is OK; the event just stems from the fact that there is, in fact, an external workload (MirrorView) that is affecting the datastore. Some vSpecialists suggested that the event shouldn’t even be there if we are simply going to tell users to disregard it, and I agree. For now, though, the recommendation is to simply ignore the event and continue.

Longer term, I think it’s safe to say that detection of these sorts of conditions, and the reporting of these conditions, will improve. First off, as was pointed out in the discussion thread among the vSpecialists, it’s pretty cool that VMware vSphere can detect this. As the integration between the storage layer and VMware vSphere improves (think more detailed instrumentation of the storage layer being made visible to the hypervisor), this sort of storage awareness will also improve.

Feel free to post any comments, questions, or clarifications below.

UPDATE: I mention above that it’s OK to ignore this event, but in reality it would be prudent to double-check the reason that this event is being logged. If the only reason this event is being generated is due to replication via MirrorView, then the event is benign and you shouldn’t be terribly concerned. However, you might find that there are other factors that are generating this event, in which case you should most definitely take action. Be sure to review the VMware KB article and verify that there are not other contributing factors that might be causing this event.


Last week I published a quick note about RecoverPoint-VAAI interoperability that outlined some potential concerns around the use of VAAI with RecoverPoint. In that post—which was based on current information from the RecoverPoint product management team—I called out the need to disable some VAAI functionality because it was our understanding that RecoverPoint ignored certain VAAI commands instead of rejecting them as not implemented. (Rejecting them is actually the preferred behavior, since it forces the VMware ESX/ESXi hosts to fall back to pre-VAAI operation.)

Today I received word that the current version of RecoverPoint (available today) does properly reject unsupported VAAI commands instead of ignoring them, when used in conjunction with the array splitter out of FLARE 30. (You might initially question the need for the splitter out of FLARE 30, but recall that FLARE 30 is the version needed to support VAAI.) This is good news!

So what does this mean? There are two key takeaways:

  1. You do not need to disable any VAAI functionality on your VMware ESX/ESXi hosts. With the FLARE 30 array splitter, RecoverPoint will properly reject unimplemented or unsupported VAAI commands.
  2. Remember that the current version of RecoverPoint (available right now) does support hardware-accelerated locking.

I also received confirmation that the next release of RecoverPoint will implement proper rejection of unimplemented or unsupported VAAI commands when used with intelligent fabric splitters. Again, this is good news—it means that you won’t have to disable VAAI functionality with fabric splitters with the next release. For the current release, though, you’ll still need to disable VAAI functionality with the fabric splitters.

Here’s a quick summary, then, of the configurations and the steps required:

  • With the FLARE 30 array splitter: No additional configuration required. Hardware-accelerated locking is fully supported, and all other commands properly rejected. There is no need to disable VAAI on the VMware ESX/ESXi hosts.
  • With the fabric splitters: In the current release, VAAI commands are ignored, not rejected. You need to disable VAAI on the hosts (see here for information on how). The next release will reject VAAI commands properly; at that point, VAAI can be left enabled.

If you have any questions or comments, please let me know.


Last week I was invited to speak at a joint session of the East Tennessee VMware and EMC user groups in Knoxville. Despite some bad weather that kept some people from attending, the meeting was a great success. I’m posting a copy here, via SlideShare, of the presentation that I used for that meeting. Note that the actual presentation had some embedded videos in it that SlideShare won’t translate; these appear as blank slides.

I hope you enjoy it and find it informative!

 

Questions, suggestions, and clarifications are always welcome! Feel free to speak up in the comments.
