November 2009


The next meeting of the Triangle Area VMware User Group will be next Thursday, December 10, 2009, from 11:30AM to 2PM. The meeting will be held at RTP Organization in Building 18/1807 at 104 TW Alexander Drive in Research Triangle Park. A map of the location (courtesy of Google Maps) is available here, and a schedule/agenda of the meeting is available here.

I will be presenting at the meeting during a “Stump the Expert” (or maybe it should be “Stump the vExpert”?) session at the end of the meeting. I insisted that it wouldn’t be very hard to stump this vExpert; if you don’t believe me, go back and watch the video from the Ask The Experts VMworld sessions. Nevertheless, I appreciate the opportunity to join VMware, Veeam, and local users during the meeting and I hope that I’ll be able to provide some useful information.

I hope to see you there!


Two technologies that seem to have come to the fore recently are NPIV (N_Port ID Virtualization) and NPV (N_Port Virtualization). Judging just by the names, you might think that these two technologies are the same thing. While they are related in some aspects and can be used in a complementary way, they are quite different. What I’d like to do in this post is help explain these two technologies, how they are different, and how they can be used. I hope to follow up in future posts with some hands-on examples of configuring these technologies on various types of equipment.

First, though, I need to cover some basics. This is unnecessary for those of you who are Fibre Channel experts, but for the rest of the world it might be useful:

  • N_Port: An N_Port is an end node port on the Fibre Channel fabric. This could be an HBA (Host Bus Adapter) in a server or a target port on a storage array.
  • F_Port: An F_Port is a port on a Fibre Channel switch that is connected to an N_Port. So, the port into which a server’s HBA or a storage array’s target port is connected is an F_Port.
  • E_Port: An E_Port is a port on a Fibre Channel switch that is connected to another Fibre Channel switch. The connection between two E_Ports forms an Inter-Switch Link (ISL).

There are other types of ports as well—NL_Port, FL_Port, G_Port, TE_Port—but for the purposes of this discussion these three will get us started. With these definitions in mind, I’ll start by discussing N_Port ID Virtualization (NPIV).

N_Port ID Virtualization (NPIV)

Normally, an N_Port would have a single N_Port_ID associated with it; this N_Port_ID is a 24-bit address assigned by the Fibre Channel switch during the FLOGI process. The N_Port_ID is not the same as the World Wide Port Name (WWPN), although there is typically a one-to-one relationship between WWPN and N_Port_ID. Thus, for any given physical N_Port, there would be exactly one WWPN and one N_Port_ID associated with it.

What NPIV does is allow a single physical N_Port to have multiple WWPNs, and therefore multiple N_Port_IDs, associated with it. After the normal FLOGI process, an NPIV-enabled physical N_Port can subsequently issue additional commands to register more WWPNs and receive more N_Port_IDs (one for each WWPN). The Fibre Channel switch must also support NPIV, as the F_Port on the other end of the link would “see” multiple WWPNs and multiple N_Port_IDs coming from the host and must know how to handle this behavior.

Once all the applicable WWPNs have been registered, each of these WWPNs can be used for SAN zoning or LUN presentation. There is no distinction between the physical WWPN and the virtual WWPNs; they all behave in exactly the same fashion and you can use them in exactly the same ways.
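To make the login sequence concrete, here is a minimal sketch of one switch F_Port handing out N_Port_IDs. It is a toy model, not a protocol implementation: the class name `FPort`, the WWPN strings, and the rule that extra logins differ only in the low byte of the address are all assumptions made for illustration. The "additional commands" mentioned above are modeled as FDISC requests, which is the request type the Fibre Channel standards use for NPIV logins.

```python
from dataclasses import dataclass, field


@dataclass
class FPort:
    """Toy model of one switch F_Port that assigns 24-bit N_Port_IDs."""
    domain: int                 # switch domain ID (top 8 bits of the address)
    area: int                   # middle 8 bits; assumed here to be one value per F_Port
    npiv_enabled: bool = True
    logins: dict = field(default_factory=dict)   # WWPN -> N_Port_ID

    def _next_id(self) -> int:
        # Assumption: logins on the same F_Port differ only in the low byte.
        return (self.domain << 16) | (self.area << 8) | len(self.logins)

    def flogi(self, wwpn: str) -> int:
        """Fabric login for the physical N_Port (always the first login)."""
        if self.logins:
            raise RuntimeError("FLOGI already completed on this port")
        self.logins[wwpn] = self._next_id()
        return self.logins[wwpn]

    def fdisc(self, wwpn: str) -> int:
        """Additional login that registers one more WWPN via NPIV."""
        if not self.logins:
            raise RuntimeError("FDISC requires a prior FLOGI")
        if not self.npiv_enabled:
            raise RuntimeError("this switch port does not support NPIV")
        if wwpn in self.logins:
            raise ValueError("WWPN already registered")
        self.logins[wwpn] = self._next_id()
        return self.logins[wwpn]


# Hypothetical WWPNs, for illustration only.
port = FPort(domain=0x0A, area=0x01)
physical_id = port.flogi("20:00:00:25:b5:00:00:01")
virtual_id = port.fdisc("20:00:00:25:b5:00:00:02")
print(f"physical {physical_id:06x}, virtual {virtual_id:06x}")
# prints "physical 0a0100, virtual 0a0101"
```

Setting `npiv_enabled=False` makes the second login fail, which mirrors the point that the switch side must support NPIV as well.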

So why might this functionality be useful? Consider a virtualized environment, where you would like to be able to present a LUN via Fibre Channel to a specific virtual machine only:

  • Without NPIV, it’s not possible because the N_Port on the physical host would have only a single WWPN (and N_Port_ID). Any LUNs would have to be zoned and presented to this single WWPN. Because all VMs on that host share the same physical N_Port, and therefore the same WWPN and N_Port_ID, any LUNs zoned to this WWPN would be visible to all of them.
  • With NPIV, the physical N_Port can register additional WWPNs (and N_Port_IDs). Each VM can have its own WWPN. When you build SAN zones and present LUNs using the VM-specific WWPN, then the LUNs will only be visible to that VM and not to any other VMs.
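The two cases above can be sketched as a lookup from initiator WWPN to presented LUNs. This is a simplified model of the description in this post only; the WWPN strings, LUN names, and the `visible_luns` helper are hypothetical, and real hypervisor implementations may impose additional zoning or masking requirements.

```python
def visible_luns(initiator_wwpn, masking):
    """Return the LUNs whose masking entry includes the given initiator WWPN."""
    return sorted(lun for lun, wwpns in masking.items() if initiator_wwpn in wwpns)


# Hypothetical identifiers, for illustration only.
HOST_WWPN = "host-hba-wwpn"
VM_A_WWPN = "vm-a-virtual-wwpn"
VM_B_WWPN = "vm-b-virtual-wwpn"

# Without NPIV: every VM reaches the fabric through the host's single WWPN.
masking_no_npiv = {"lun0": {HOST_WWPN}, "lun1": {HOST_WWPN}}
wwpn_no_npiv = {"vm-a": HOST_WWPN, "vm-b": HOST_WWPN}

# With NPIV: each VM has its own WWPN, and each LUN is presented to one of them.
masking_npiv = {"lun0": {VM_A_WWPN}, "lun1": {VM_B_WWPN}}
wwpn_npiv = {"vm-a": VM_A_WWPN, "vm-b": VM_B_WWPN}

for label, masking, wwpns in (
    ("without NPIV", masking_no_npiv, wwpn_no_npiv),
    ("with NPIV", masking_npiv, wwpn_npiv),
):
    for vm, wwpn in wwpns.items():
        print(label, vm, visible_luns(wwpn, masking))
```

Without NPIV both VMs see both LUNs; with NPIV each VM sees only the LUN presented to its own WWPN.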

Virtualization is not the only use case for NPIV, although it is certainly one of the easiest to understand.

<aside>As an aside, it’s interesting to me that VMotion works and is supported with NPIV as long as the RDMs and all associated VMDKs are in the same datastore. Looking at how the physical N_Port has the additional WWPNs and N_Port_IDs associated with it, you’d think that VMotion wouldn’t work. I wonder: does the HBA on the destination ESX/ESXi host have to “re-register” the WWPNs and N_Port_IDs on that physical N_Port as part of the VMotion process?</aside>

Now that I’ve discussed NPIV, I’d like to turn the discussion to N_Port Virtualization (NPV).

N_Port Virtualization (NPV)

While NPIV is primarily a host-based solution, NPV is primarily a switch-based technology. It is designed to reduce switch management and overhead in larger SAN deployments. Consider that every Fibre Channel switch in a fabric needs a different domain ID, and that the total number of domain IDs in a fabric is limited. In some cases, this limit can be fairly low depending upon the devices attached to the fabric. The problem, though, is that you often need to add Fibre Channel switches in order to scale the size of your fabric. There is therefore an inherent conflict between trying to reduce the overall number of switches in order to keep the domain ID count low while also needing to add switches in order to have a sufficiently high port count. NPV is intended to help address this problem.

NPV introduces a new type of Fibre Channel port, the NP_Port. The NP_Port connects to an F_Port and acts as a proxy for other N_Ports on the NPV-enabled switch. Essentially, the NP_Port “looks” like an NPIV-enabled host to the F_Port on the other end. An NPV-enabled switch will register additional WWPNs (and receive additional N_Port_IDs) via NPIV on behalf of the N_Ports connected to it. The physical N_Ports don’t have any knowledge this is occurring and don’t need any support for it; it’s all handled by the NPV-enabled switch.

Obviously, this means that the upstream Fibre Channel switch must support NPIV, since the NP_Port “looks” and “acts” like an NPIV-enabled host to the upstream F_Port. Additionally, because the NPV-enabled switch now looks like an end host, it no longer needs a domain ID to participate in the Fibre Channel fabric. Using NPV, you can add switches and ports to your fabric without adding domain IDs.
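Here is a rough sketch of that proxy behavior. It is a toy model under stated assumptions: the class names `CoreSwitch` and `NpvSwitch`, the WWPN strings, and the address layout (domain, F_Port index, login counter) are invented for illustration, and the host's ordinary login is assumed to be re-issued upstream as an NPIV-style FDISC.

```python
class CoreSwitch:
    """Upstream NPIV-capable switch; the only device here that owns a domain ID."""

    def __init__(self, domain):
        self.domain = domain
        self.logins = {}       # WWPN -> N_Port_ID
        self._per_port = {}    # F_Port index -> number of logins so far

    def login(self, f_port, wwpn, kind):
        if kind not in ("FLOGI", "FDISC"):
            raise ValueError("unknown login type")
        count = self._per_port.get(f_port, 0)
        if kind == "FLOGI" and count != 0:
            raise RuntimeError("FLOGI must be the first login on an F_Port")
        if kind == "FDISC" and count == 0:
            raise RuntimeError("FDISC requires a prior FLOGI on this F_Port")
        # Assumed layout: domain | F_Port index | per-port login counter.
        nport_id = (self.domain << 16) | (f_port << 8) | count
        self.logins[wwpn] = nport_id
        self._per_port[f_port] = count + 1
        return nport_id


class NpvSwitch:
    """Edge switch in NPV mode: no domain ID of its own, proxies host logins."""

    def __init__(self, core, uplink_f_port, np_port_wwpn):
        self.core = core
        self.uplink = uplink_f_port
        # The NP_Port logs in first, exactly as an NPIV-enabled host would.
        self.np_port_id = core.login(uplink_f_port, np_port_wwpn, "FLOGI")

    def host_flogi(self, host_wwpn):
        # The host sends a normal FLOGI and needs no NPIV support;
        # the NPV switch registers the host upstream as an additional login.
        return self.core.login(self.uplink, host_wwpn, "FDISC")


# Hypothetical names, for illustration only.
core = CoreSwitch(domain=0x0A)
edge = NpvSwitch(core, uplink_f_port=3, np_port_wwpn="edge-np-port")
host_ids = [edge.host_flogi(w) for w in ("host1-hba", "host2-hba")]
print(f"{edge.np_port_id:06x}", [f"{i:06x}" for i in host_ids])
# prints "0a0300 ['0a0301', '0a0302']"
```

Every address carries the core switch's domain (0x0A), which is the point: the edge switch adds ports without consuming a domain ID.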

So why is this functionality useful? There is the immediate benefit of being able to scale your Fibre Channel fabric without having to add domain IDs, yes, but in what sorts of environments might this be particularly useful? Consider a blade server environment, like an HP c7000 chassis, where there are Fibre Channel switches in the back of the chassis. By using NPV on these switches, you can add them to your fabric without having to assign a domain ID to each and every one of them.

Here’s another example. Consider an environment where you are mixing different types of Fibre Channel switches and are concerned about interoperability. As long as there is NPIV support, you can enable NPV on one set of switches. The NPV-enabled switches will then act like NPIV-enabled hosts, and you won’t have to worry about connecting E_Ports and creating ISLs between different brands of Fibre Channel switches.

I hope you’ve found this explanation of NPIV and NPV helpful and accurate. In the future, I hope to follow up with some additional posts—including diagrams—that show how these can be used in action. Until then, feel free to post any questions, thoughts, or corrections in the comments below. Your feedback is always welcome!

Disclosure: Some industry contacts at Cisco Systems provided me with information regarding NPV and its operation and behavior, but this post is neither sponsored nor endorsed by anyone.


Welcome back to yet another Virtualization Short Take! Here is a collection of virtualization-related items—some recent, some not, but hopefully all interesting and/or useful.

  • Matt Hensley posted a link to this VIOPS document on how to set up VMware SRM 4.0 with an EMC Celerra storage array. I haven’t had the chance to read through it yet.
  • Jason Boche informs us that both Lab Manager 3 and Lab Manager 4 have problems with the VMXNET3 virtual NIC. In this blog post, Jason describes how his attempts to install Lab Manager server into a VM with the VMXNET3 NIC were failing. Fortunately, Jason provides a workaround as well, but you’ll have to read his article to get that information.
  • Bruce Hoard over at Virtualization Review (disclaimer: I write a regular column for the print edition of Virtualization Review) stirred up a bit of controversy with his post about Hyper-V’s three problems. The first problem is indeed a problem, but not an architectural or technological problem; VMware is indeed the market leader and has a quite solid user base. The second two “problems” stem from Microsoft’s architectural decision to embed the hypervisor into Windows Server. Like any other technology decision, this decision has its advantages and disadvantages (these technology decisions are a real double-edged sword). Based on historical data, it would seem that the need to patch Windows Server will impact the uptime of the Windows virtualization solution; however, this is not to say that VMware ESX/ESXi is without its own patches and associated downtime. I guess the key takeaway here is that VMware seems to be doing a much better job of lessening (or even removing) the impact of the downtime through things like VMotion, DRS, HA, maintenance mode, and the like.
  • Apparently there is a problem with the GA release of the Host Update utility that is installed along with the vSphere Client, as outlined here by Barry Coombs. Downloading the latest version and reinstalling seems to fix the issue.
  • And while we are on the subject of ESX upgrades, here’s another one: if the /boot partition is too small, the upgrade to ESX 4.0.0 will fail. This isn’t really anything too new and, as Joep points out, is documented in the vSphere Upgrade Guide. I prefer clean installations of VMware ESX/ESXi anyway.
  • Dave Mishchenko details his adventures (part 1, part 2, and part 3) in managing ESXi without the VI Client or the vCLI. While it’s interesting and contains some useful information, I’m not so sure that the exercise is useful in any way other than academically. First of all, Dave enables SSH access to ESXi, which is unsupported. Second, while he shows that it’s possible to manage ESXi without the VI Client or the vCLI, it doesn’t seem to be very efficient. Still, there is some useful information to be gleaned for those who want to know more about ESXi and its inner workings.
  • I think Simon Seagrave and Jason Boche were collaborating in secret, since they both wrote posts about using vSphere’s power savings/frequency scaling functionality. Simon’s post is dated October 27; Jason’s post is dated November 11. Coincidence? I don’t think so. C’mon, guys, go ahead and admit it.
  • Thinking of using the Shared Recovery Site feature in VMware SRM 4.0? This VMware KB article might come in handy.
  • I’m of the opinion that every blogger has a few “masterpiece” posts. These are posts that are just so good, so relevant, so useful, that they almost transcend the other content on the blogger’s site. Based on traffic patterns, one of my “masterpiece” posts is the one on ESX Server, NIC teaming, and VLAN trunking. It’s not the most well-written post I’ve ever published, but it seems to have a lasting impact. Why do I mention this? Because I believe that Chad Sakac’s post on VMware I/O queues, microbursting, and multipathing is one of his “masterpiece” posts. Like Scott Drummonds, I’ve read that post multiple times, and every time I read it I get something else out of it, and I’m reminded of just how much I have yet to learn. Time to get back out of that comfort zone!
  • Oh, and speaking of Chad’s blog…this post is handy, too.

That’s all for now, folks. Stay tuned for the next installment, where I’ll once again share a collection of links about virtualization. Until then, feel free to share your own links in the comments.


Storage Short Take #5

I’ve decided to resurrect my Storage Short Take series, after almost a year since the last one was published. I find myself spending more and more time in the storage realm—which is completely fine with me—and so more and more information coming to me in various forms is related to storage. While I’m far from the likes of storage rockstars such as Robin Harris, Stephen Foskett, Storagebod, and others, hopefully you’ll find something interesting and useful here. Enjoy!

  • This blog post by Frank Denneman on the HP LeftHand product is outstanding. I learned more from this post than from a lot of posts recently. Great work, Frank!
  • Need a bit more information on FCoE? Nigel Poulton has a great post here (it’s a tad bit older, but I’ve just stumbled across it) with good details for those who might not be familiar with FCoE. It’s worth a read if you haven’t already taken the time to come up to speed on FCoE and its “related” technologies.
  • What led me to Nigel’s FCoE post was this post by Storagezilla in which he rants about “vendor flapheads” who “are intentionally obscuring it’s [FCoE's] limitations”. You’ve got that right! Wanting to present a reasonably impartial and complete view of FCoE was partially the impetus behind my end-to-end FCoE post and the subsequent clarification. Thankfully, I think that the misinformation around FCoE is starting to die down.
  • This post has a bit of useful information on HP EVA path policies and vSphere multipathing. I would have liked a bit more detail than what was provided, but the content is good nevertheless.
  • Devang Panchigar’s recap of HP TechDay day 1, which focused on HP StorageWorks technologies, has some good information, especially if you aren’t already familiar with some of HP’s various storage platforms.
  • Chad Sakac of EMC has some very useful information on Asymmetric Logical Unit Access (ALUA), VMware vSphere, and EMC CLARiiON arrays. If you’re using EMC storage with your VMware vSphere 4 environment, and you have a CX4, and you’re running FLARE 28.5 or later, it might be worthwhile to switch your NMP path selection policy to Round Robin (RR).
  • Speaking of RR with vSphere, somewhere I remember seeing information on changing the default number of I/Os down a path, and tweaking that for best performance. Was that in Chad’s VMworld session? Anyone remember?
  • If you’re looking for a high-level overview of SAN and NAS virtualization, this InfoWorld article can help you get started. You’ll soon want to delve deeper than this article can provide, but it’s a reasonable starting point, at least.

That’s it for this time around. Feel free to share other interesting or useful links in the comments.


I was reading a completely unrelated post on Alessandro’s site this morning about how VKernel is reacting to VMware’s release of CapacityIQ when a thought occurred to me: is VMware legitimizing the competition?

Here’s the excerpt from Alessandro’s post that started me thinking:

And of course VKernel now is also in hurry to clarify that support for Microsoft Hyper-V and Citrix XenServer is coming.

Now, let me ask you this question: what is one of the largest complaints about products like Microsoft Hyper-V and Citrix XenServer? It’s the size of the partner ecosystem. Customers are a bit more hesitant to deploy these other solutions in part because there aren’t as many partner solutions out there to complement the virtualization solutions.

So, as VMware expands into new markets like capacity management and monitoring, backups, etc., former VMware-only partners are forced to adapt their products to work with Hyper-V and XenServer in order to protect themselves. This causes the size of the partner ecosystem for VMware’s competitors to grow, eliminating that complaint and removing one of VMware’s competitive advantages. In effect, VMware’s own actions are building out the partner ecosystem for their competitors and thus legitimizing the competition.

Am I crazy? Am I wrong? What is a company like VMware to do, if anything? I’d love to hear your thoughts.

UPDATE: Some readers have pointed out, rightfully so, that “legitimizing” isn’t really the best word to use here. Perhaps “assisting” or “helping” is a better word?


I’ve been doing a pretty fair amount of work recently with the Cisco Nexus 5000 series of switches, as evidenced by the flurry of Nexus-related articles:

Connecting Nexus 5000 to Older Gigabit Ethernet Switches
Setting Up FCoE on a Nexus 5000
FCoE and VLAN Trunking on Nexus 5000

One thing I hadn’t yet documented was how to enable jumbo frames on a Nexus 5000. Since jumbo frames are now officially supported for VMkernel traffic with VMware vSphere, the combination of jumbo frames and 10Gb Ethernet is an attractive one. I’ve covered the ESX/ESXi side (ordinary vSwitches here and distributed vSwitches here), but here’s the Nexus side.

The commands are pretty straightforward, and I’ve included the commands for both NX-OS 4.0 and NX-OS 4.1 (they are different between versions). Important note: if you enabled jumbo frames under NX-OS 4.0 and then upgraded the switch to version 4.1, you’ll need to re-do your jumbo frame configuration.

For NX-OS 4.1, the commands to enable jumbo frames are:

switch(config)# policy-map type network-qos jumbo
switch(config-pmap-nq)# class type network-qos class-default
switch(config-pmap-c-nq)# mtu 9216
switch(config-pmap-c-nq)# exit
switch(config-pmap-nq)# exit
switch(config)# system qos
switch(config-sys-qos)# service-policy type network-qos jumbo

Now, contrast the commands above with the following commands, which you would have used to enable jumbo frames on NX-OS 4.0:

switch(config)# policy-map jumbo
switch(config-pmap)# class class-default
switch(config-pmap-c)# mtu 9216
switch(config-pmap-c)# exit
switch(config-pmap)# exit
switch(config)# system qos
switch(config-system)# service-policy jumbo

The end result of these differences is this: if you upgrade NX-OS from 4.0 to 4.1, then your jumbo frames configuration will go away, and you’ll need to enter the commands for version 4.1 in order to enable jumbo frame support again. This little gotcha caused me quite a headache when my NFS-based datastores suddenly went offline after the NX-OS upgrade.

More information on the necessary commands can be found here for version 4.0 and here for version 4.1.
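If you push these commands from a script, the version difference can be captured in a small lookup. The sketch below is only an illustration: the function name `jumbo_frame_commands` and the example version strings are my own, and the command lists are taken from the listings in this post with the prompts removed (plus one extra `exit` in the 4.0 list, not in the original listing, so that `system qos` is entered from global configuration mode).

```python
import re

# Commands from the listings in this post, prompts removed.
JUMBO_COMMANDS = {
    "4.0": [
        "policy-map jumbo",
        "class class-default",
        "mtu 9216",
        "exit",
        "exit",  # extra exit (assumption, not in the original listing)
        "system qos",
        "service-policy jumbo",
    ],
    "4.1": [
        "policy-map type network-qos jumbo",
        "class type network-qos class-default",
        "mtu 9216",
        "exit",
        "exit",
        "system qos",
        "service-policy type network-qos jumbo",
    ],
}


def jumbo_frame_commands(nxos_version):
    """Return the config-mode commands for the major.minor train of a version string."""
    match = re.match(r"(\d+)\.(\d+)", nxos_version)
    if not match:
        raise ValueError(f"cannot parse NX-OS version: {nxos_version!r}")
    train = f"{match.group(1)}.{match.group(2)}"
    if train not in JUMBO_COMMANDS:
        raise ValueError(f"no jumbo frame commands recorded for NX-OS {train}")
    return list(JUMBO_COMMANDS[train])


# Example version string is hypothetical; after a 4.0 to 4.1 upgrade,
# re-apply the configuration using the 4.1 syntax.
for command in jumbo_frame_commands("4.1(3)N1(1)"):
    print(command)
```

Raising an error for unknown release trains is deliberate: the syntax already changed once between releases, so guessing for a newer version would be unsafe.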


In the event you haven’t heard, Gestalt IT has organized the Gestalt IT Field Day to take place next week, November 12 and 13, somewhere on the West Coast. This is a very exciting event that brings together multiple vendors and multiple bloggers to discuss their technology and their products in a non-NDA environment. This is a great way for companies to help increase knowledge and awareness of their products. Be prepared, though, to get both the good and the bad—most bloggers won’t hold back!

The list of blogger attendees is an all-star list of folks like (in no particular order) Rich Brambley, Chris Evans, Robin Harris, Greg Ferro, Rod Haywood, John Obeto, Nigel Poulton, Simon Seagrave, and more! This is quite a gathering of folks. The sponsor companies should be very grateful to get in front of this audience!

For more information on Gestalt IT Field Day, check out the following links:

Gestalt IT Field Day
Field Day Frequently Asked Questions
Full Attendee List
Full Sponsor List
Tech Field Day “Do You Know…” Contest

Although I wasn’t able to make it to this event, I’m hopeful that Gestalt IT will organize future events that I will be able to attend. In the meantime, though, look for some great information from these folks next week!


It’s a bit later than I would have liked (sorry Jason!), but I wanted to write up a brief post-mortem on the VCDX Design Exam, which I took—and passed with a score of 408—this past Tuesday.

The exam wasn’t particularly difficult in the sense of needing to know specific details. For example, I saw very few questions asking about a specific command-line parameter, the output of a specific command, or how many X that feature Y supports. There were a few, but not many. That doesn’t mean you don’t need to know that sort of stuff, because you do—but in a very different way.

The exam is moderately difficult in the sense that you are required to take the product knowledge you have and put it together to solve a scenario. The real kicker, in my mind, is the “Select the best answer” prompt—implying that there was more than one technically correct answer. But which answer was best? Yes, you could do it via method A, but is method B better? Which method best satisfies the requirements described in the question? In this aspect, I did find the test challenging. Not incredibly difficult, but challenging.

Unfortunately, the test suffered from a few issues. There were several questions that had missing components necessary for a correct answer. I commented on those, but what else can you do? The design portion of the exam—by now you probably know there is a multiple choice section and a graphical design section—had a terrible interface. I struggled more with the interface than I did the question. Looking at it from VMware’s perspective, I’m sure that it was incredibly difficult to come up with something like that, and I give them credit for actually trying (and not totally failing).

All in all, I think that VMware did a reasonably good job with the exam. It was very different from the Enterprise Admin exam but equally challenging in a very different way.

Now comes the real question: what did I use to study? Here are the resources that I used:

  • The VI3 connections and ports diagram, downloaded here from Forbes’ site
  • iSCSI Design Considerations and Deployment Guide from VMware
  • Configuring and Troubleshooting N_Port ID Virtualization document from VMware
  • Setup for Microsoft Cluster Service document from VMware
  • Fibre Channel SAN Configuration Guide from VMware
  • Virtual Machine Backup Guide from VMware
  • Resource Management Guide from VMware
  • Duncan’s VMware HA deepdive
  • VI3 security hardening white paper from VMware

I used all this in conjunction with the Design Exam blueprint, of course.

So, there you go: there’s my brief analysis of my VCDX Design Exam experience. Your mileage may vary, of course, since there are numerous versions of the exam.


VMware, Cisco, and EMC made their official announcement of the VCE Coalition and the joint venture Acadia this morning. You can read one of the press releases here via MarketWire.

Acadia is interesting, but it really isn’t the meat of the announcement, in my opinion. The real substance of the matter is the nature of the coalition. There are many interesting questions/thoughts circling in my head right at the moment:

  • What impact will this have on VMware’s relationship(s) with HP, IBM, and Dell? “Throwing their hat in the ring” with Cisco’s UCS, so to speak, may greatly endanger VMware’s much larger (with respect to revenue) relationships with other OEMs. What will happen to VMware if those OEMs “throw their hat in the ring” with Microsoft and Hyper-V? This is not a good place to be.
  • The acrimonious Cisco-HP relationship adds further fuel to the concerns over VMware’s close alliance with Cisco’s computing platform.
  • Does this new coalition signal a move away from the “arms-length” relationship between EMC and VMware, a move that some (competitors, notably) have been talking about for some time? If so, what danger does that put VMware in with regard to storage relationships?
  • It seems to me that VMware has the most to lose here. What does EMC lose if this doesn’t go well? Nothing, really. What about Cisco? Nothing, really. VMware, on the other hand…well, it could be ugly.
  • What does this coalition offer that the three companies couldn’t deliver without the coalition? Why risk important relationships? This is a big question in my mind. Lots of technology companies have delivered validated designs without any sort of formal coalition. Why is one necessary in this case?
  • On the other end of the spectrum—keeping Acadia out of the picture for the moment—is this “new coalition” really anything more than what the three companies have already been doing? Is this really anything more than each of the companies dedicating resources to this effort? I know from my own direct interaction with at least one of these vendors that resources had already been dedicated to the VCE technology intersection before any sort of formal announcement. So, does this formal announcement really mean anything at all?

I don’t have any answers (yet), but you can at least read my thoughts—and contribute back to them via the comments—without having to pay $499 to some analyst firm.

By the way, if you’d like some other viewpoints on this matter, here are a couple from opposing sides:

NetApp – Jay’s Blog: The Importance of Being Open
Chuck’s Blog: Announcing the VCE Coalition

Feel free to speak up in the comments below (courteous comments only, please, and be sure to include full vendor disclosure where appropriate). Thanks!


In my earlier post on how to configure FCoE on a Nexus 5000, one of the readers suggested in the comments that it was necessary to have the interfaces in VLAN trunk mode via the switchport mode trunk command. I didn’t pay that much attention to it because the interfaces were indeed in VLAN trunk mode.

Fast forward to yesterday, when I was troubleshooting a problem between a Gen2 QLogic CNA and the Nexus 5010 in my lab (I tweeted about it). Although the Ethernet side of the CNA works just fine, the CNA refuses to bring up an FCoE connection. In the process of troubleshooting, Brad Hedlund (check his outstanding web site) suggested to me in a Twitter direct message that I should double-check the VLAN trunking status of the interface. That part I’d already heard from the reader who commented on the first post, but the next part was new to me (emphasis mine):

Gen2 requires ‘switchport mode trunk’ on the 5K. Gen1 doesn’t. Also make sure FCoE VLANs are allowed on the trunk.

Ah, now there’s something I hadn’t heard! That prompted me to do a bit of testing this morning (yes, I know I’m supposed to be studying for the VCDX Design Exam this afternoon). In my testing, I confirmed that a Gen1 CNA (I’m using Gen1 Emulex CNAs) does not require VLAN trunking to be enabled on the Ethernet interface.

There does appear to be a “gotcha” though: if the Ethernet interface is in access mode, its access VLAN must be the same as the FCoE VLAN; otherwise, the vfc interface will report down.

In summary:

  • If you are using a Gen2 CNA, you must put the Ethernet interface in VLAN trunk mode.
  • If you are using a Gen1 CNA, the Ethernet interface may be in either access mode or trunk mode.
  • If the interface is in trunk mode, be sure that you have allowed the FCoE VLAN via the switchport trunk allowed vlan command.
  • If the interface is in access mode, be sure that you have placed the interface in the FCoE VLAN via the switchport access vlan command.
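The four rules above can be written down as a small check. This is a sketch of the rules as summarized in this post, not of the switch's actual logic; the `EthInterface` class, the `vfc_can_come_up` function, and the VLAN numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EthInterface:
    """Minimal stand-in for the Ethernet interface a vfc is bound to."""
    mode: str                                   # "access" or "trunk"
    access_vlan: int = 1
    allowed_vlans: set = field(default_factory=set)


def vfc_can_come_up(cna_generation, iface, fcoe_vlan):
    """Apply the summary rules; return (ok, reason)."""
    if iface.mode == "trunk":
        if fcoe_vlan in iface.allowed_vlans:
            return True, "trunk carries the FCoE VLAN"
        return False, "FCoE VLAN is not allowed on the trunk"
    if iface.mode == "access":
        if cna_generation >= 2:
            return False, "Gen2 CNA requires trunk mode"
        if iface.access_vlan == fcoe_vlan:
            return True, "access VLAN matches the FCoE VLAN"
        return False, "access VLAN differs from the FCoE VLAN"
    raise ValueError(f"unknown switchport mode: {iface.mode!r}")


# Hypothetical VLAN numbers, for illustration only.
FCOE_VLAN = 100
cases = [
    (2, EthInterface("trunk", allowed_vlans={1, 100})),
    (2, EthInterface("access", access_vlan=100)),
    (1, EthInterface("access", access_vlan=100)),
    (1, EthInterface("access", access_vlan=1)),
]
for generation, iface in cases:
    print(generation, iface.mode, vfc_can_come_up(generation, iface, FCOE_VLAN))
```

Returning a reason alongside the boolean keeps the check useful for troubleshooting, since a vfc that reports down gives little indication of which rule was violated.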

If there are any other subtleties or nuances I’ve missed, please post them in the comments below so that future readers will benefit. Thank you!
