PASS

Articles in this category are republished to the SQLPASS virtualization site.

I had a customer contact me about scaling network throughput when using NFS datastores. Specifically, this customer was interested in knowing if it was possible to utilize more than 1 NIC with IP-based storage. The customer is currently using link aggregation (EtherChannel on a Cisco switch). I pointed the customer to my post on NIC utilization, in which I explain the prerequisites for utilizing more than 1 NIC in this sort of configuration. To refresh your memory, those prerequisites are:

  • The vSwitch must be configured for “Route based on IP hash”
  • The physical NICs connected to the vSwitch as uplinks must all be configured as active in the failover order
  • The physical switch must be configured for link aggregation
  • There must be multiple, unique source-destination IP address pairs involved

The customer responded with a question (which I’m paraphrasing here): “That’s all? It will just automatically use more than one link?”

Well…sort of.

There is one little caveat. Cisco IOS uses a hashing algorithm to determine which link a particular traffic flow between a source and destination will use. This algorithm is controlled by the port-channel load-balance command. Assuming that you’re using source-destination IP hashing, that means the Cisco switch will use a hash of the source IP address and the destination IP address to determine which link it will use. This page has more detailed information.

It’s theoretically possible, based on the number of links in the port channel, that some traffic flows between different pairs of source-destination IP addresses might end up on the same link. That means it’s not necessarily just as simple as setting up multiple NFS exports or iSCSI targets on different IP addresses—you also need to know if the IP addresses you are using will actually result in the traffic being distributed across the links.

How does one tell? Good question, and one I’m glad you asked. You can tell using this command (this command assumes you are using IP-based hashing):

switch# test etherchannel load-balance interface <Port channel interface> ip <Src IP Addr> <Dst IP Addr>

So, let’s say that you have an ESX/ESXi host with a VMkernel interface whose address is 172.16.5.10. Let’s say that you have a storage array (NetApp FAS, EMC Celerra, etc.) that supports NFS and you want to mount two different NFS exports on two different IP addresses so that traffic from this ESX/ESXi host to the storage array is distributed across the links. You could use the test etherchannel load-balance command to help you determine which addresses would actually distribute traffic across the links:

switch# test etherchannel load-balance interface Po3 ip 172.16.5.10 172.16.5.100

For more examples of what the output would look like, take a look at this image. This was taken off a Cisco Catalyst 3560G running in my test lab (and yes, the IP addresses have been changed to protect the innocent).

This would give you one way of testing whether your link aggregation configuration would actually use multiple links, or only a single link due to the IP hash calculation. Don’t forget that esxtop can also show you NIC utilization; here’s an example of both uplinks being used in this sort of configuration.

Unfortunately, what I can’t tell you right now is what algorithm the vSwitch itself uses to place traffic onto the uplinks. Does it follow the same sort of mechanism as the Cisco switch? I don’t know. If anyone has any information on that, it would be tremendously helpful.

If anyone has any other pertinent information or resources on this topic, please add them to the comments below.

UPDATE: Duncan Epping pointed out that an article by Ken Cline from earlier this year describes the mechanism VMware uses to determine which uplink on a vSwitch will be used. This algorithm performs an XOR operation on the Least Significant Byte (LSB) of the source and destination IP addresses, then takes that result modulo the number of uplinks. Thanks, Duncan and Ken!
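
As a rough illustration of the vSwitch mechanism described in the update, here is a minimal shell sketch that XORs the last octet of each address and takes the result modulo the number of uplinks. The function name uplink_index is hypothetical, the zero-based index it prints is only a label (which physical NIC it corresponds to is not something this sketch can tell you), and it models only the vSwitch side as summarized above, not the Cisco switch's hash.

```shell
# Sketch of the vSwitch IP-hash uplink choice as summarized above:
# XOR the least significant byte of the source and destination IPv4
# addresses, then take the result modulo the number of uplinks.
# uplink_index is a hypothetical helper name, not a VMware tool.
uplink_index() {
    src_lsb="${1##*.}"   # last octet of the source address
    dst_lsb="${2##*.}"   # last octet of the destination address
    uplinks="$3"         # number of uplinks on the vSwitch
    echo $(( (src_lsb ^ dst_lsb) % uplinks ))
}

# Compare two candidate NFS target addresses against one VMkernel address.
for target in 172.16.5.100 172.16.5.101; do
    echo "$target -> uplink $(uplink_index 172.16.5.10 "$target" 2)"
done
```

With two uplinks, this prints uplink 0 for 172.16.5.100 and uplink 1 for 172.16.5.101, so under this model those two target addresses would land on different uplinks. For the physical switch side, the test etherchannel load-balance command shown earlier is still the way to check.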

I wanted to go ahead and get another issue of Virtualization Short Takes out the door before VMworld, as I suspect that I’ll be covered up both during and for some time after VMworld. So, here’s my latest collection of links and articles about virtualization, storage, and anything else I find interesting.

  • Chad Sakac brings up an important issue for EMC CLARiiON users also using vSphere and iSCSI; be sure to read the full post for all the details. Basically, this bug in the FLARE code puts us back to using multiple IP subnets to scale iSCSI traffic. Bummer. I imagine they’ll get it fixed up pretty quick, but until then it’s back to the old way of scaling IP-based storage traffic. Chad’s posts on VMware-storage integration (Part 1 and Part 2) are good reads as well.
  • Nick Triantos weighs in with a good post on how to configure ALUA support and Round Robin I/O in vSphere. This looks useful; too bad the old NetApp gear I have in the lab won’t run the latest Data ONTAP version so I can test this myself. Oh, and you should also check out Nick’s post on the NetApp Collector and Analyzer for Virtual Environments, which looks like it might be a handy tool for sizing NetApp storage environments.
  • Duncan Epping points out a couple of issues related to VMFS block size in this post on snapshots and block size. Good find!
  • Ben Armstrong puts up a great post about competitive arguments. I have to say that I have a new respect for Ben after reading this post. He’d always presented himself very professionally, but his open approach to comparing virtualization products is very refreshing, and one that I wish more people would adopt. I’m particularly impressed that Ben quoted Proverbs 27:17 in his post.
  • Aaron Sweemer posted a newsletter from a co-worker on his site that has some great information. You should definitely have a look; I think you’ll find something useful there.
  • Rick Scherer posted the steps necessary to remove a rogue vCenter Chargeback plug-in. Useful, but I wish all plug-ins provided a mechanism like this.
  • Jason Nash brings to light a bug in Cisco Nexus 1000V when used in conjunction with CNAs. Be sure to have a look if this has any similarity to your environment. Like Jason, I have some Gen 1 Emulex CNAs so I may run into the same issue myself as I build out the Nexus hardware in the lab.
  • The Systems Engineer (no name provided) gives a handy one-line command to map ESX datastores to EMC CLARiiON LUNs. I’ll have to give this one a try once I get my CLARiiON up and running.
  • Somewhere along the way I picked up the URL to this VMware KB article about problems with iSCSI or NFS over an EtherChannel link. Hmmm, that looks interesting, but when you read the article it points out that the issue exists when you are using EtherChannel but the vSwitch is configured as “Route based on originating virtual port ID.” That’s a configuration mismatch, so of course you’re going to have problems! Simply change the vSwitch to “Route based on IP hash” (the strongly recommended setting when using EtherChannel) and the problems go away.
  • Stevie Chambers (formerly of VMware, now with Cisco) posts about 10 technology advances since 2005. The article is mostly about the Intel Xeon 5500 CPUs and a couple other features specific to Cisco’s Unified Computing System (UCS); namely, the Palo adapter and the Catalina ASIC. While he wanders a bit, I think Stevie’s point is about how virtualization architects and operations staff need to understand the impact of these technologies and how they affect the virtualization solution—a useful point, indeed.
  • Paul Fazzone has a couple of great posts on the Cisco Nexus 1000V: first an article with an overview of VM network security with the Nexus 1000V, then a second article describing how the Nexus 1000V compares to multiple vSwitches. Both are good reads for people seeking a bit more information on deployment scenarios for the Nexus 1000V.
  • Computerworld posted this article about the 7 half-truths of virtualization. The underlying point behind all of these “half-truths” is that in order for an organization to really reap the benefits of virtualization, that organization needs to change, to adapt, and to grow with the virtualization initiative. If you just virtualize and don’t change anything else, your ROI will be limited at best. I particularly agree with #5: if you’re investigating VDI for short-term cost savings, you’re barking up the wrong tree.
  • This is kind of cool. I might put this on my home network.
  • I haven’t had my chance to talk with Arista yet, but I’m surprised that there hasn’t been more buzz around their announcement of vEOS. In fact, I had to hear about it (other than a very brief e-mail from Doug Gourlay) from a Cisco contact! How crazy is that? I suppose, as I mentioned on Twitter, that Arista is going to make a big push next week during VMworld 2009 in San Francisco.

That wraps up this edition of Virtualization Short Takes. Next week will be a busy week; look for lots of coverage from the conference in San Francisco as well as summaries of my vendor meetings (and there are lots of them!). Until then, take care!

I’ll preface this article by saying that I am not (yet) an expert with Cisco’s Unified Computing System (UCS), so if I have incorrect information I’m certainly open to clarification. Some would also accuse me of being a UCS-hater, since I had the audacity to call UCS a blade server (the horror!). Truth is, I’m on the side of the customer, and as we all know there is no such thing as a “one size fits all” solution. Cisco can’t provide one, and HP can’t provide one.

The mudslinging that I’m talking about is taking place between Steve Chambers (formerly with VMware, now with Cisco) and HP. HP published a page with a list of reasons why Cisco UCS should be dismissed, and Steve responded on his personal blog. Here are the links to the pages in question:

The Real Story about Cisco’s “One Giant Switch” view of the Datacenter (this was based, in part at least, on the next link)
Buyer beware of the “one giant switch” data center network model
HP on the run

I thought I might take a few points from these differing perspectives and try to call out some mudslinging that’s occurring on both sides. To be fair, Steve states in the comments to his article that it was intended to be entertaining and light-hearted, so please keep that in mind.

Point #1: Complexity

The reality of these solutions is that they are both equally complex, just in different ways. HP’s BladeSystem Matrix uses reasonably well-understood and mature technologies, while Cisco UCS uses newer technologies that aren’t as widely understood. This is not a knock against either; as I’ve said before in many other contexts and many other situations, there are advantages and disadvantages to every approach. HP’s advantage is that it leverages the knowledge and experience that people have with their existing technologies: StorageWorks storage solutions, ProLiant blades, ProCurve networking, and HP software. The disadvantage is that HP is still tied to the same “legacy” technologies.

In building UCS, Cisco’s advantage is that the solution uses the latest technologies (including some that are still Cisco-proprietary) and doesn’t have any ties to “legacy” technologies. The disadvantage, naturally, is that this technological leap creates additional perceived complexity because people have to learn the new technologies embedded within UCS.

Adding to the simple fact that both of these solutions are equally complex in different ways is the fact that you must re-architect your storage in order to gain the full advantage of either solution. To get the full benefit of both UCS and HP BladeSystem Matrix, you need to be doing boot-from-SAN. (Clearly, this doesn’t apply to virtualized systems, but both Cisco and HP are touting their solutions as equally applicable to non-virtualized workloads.) This is a fact that, in my opinion, has been significantly understated.

Neither HP nor Cisco really have the right to proclaim their solution is less complex than the other. Both solutions are complex in their own ways.

Point #2: Standards-Based vs. Proprietary

Again, neither HP nor Cisco really have any room to throw the rock labeled “Proprietary”. Both solutions have their own measure of vendor lock-in. HP is right; you can’t put an HP blade or an IBM blade into a Cisco UCS chassis. Steve Chambers is right; you can’t put a Dell blade or a Cisco blade server into an HP chassis. The reality, folks, is that every vendor’s solution has a certain amount of vendor lock-in. Does VMware vSphere have vendor lock-in? Sure, but so does Hyper-V and Citrix XenServer. Does Microsoft Windows have vendor lock-in? Of course, but so does…so does…well, you get the idea.

HP says VNTag is proprietary and won’t even work with some of Cisco’s own switches. OK, let’s talk proprietary…does Flex-10 work with other vendors’ switches? The fact of the matter is that both Cisco and HP have their own forms of vendor lock-in and neither can cry “foul” on the other. It’s a draw.

Point #3: The “Giant Network Switch”

At one point in HP’s article (I believe it was under the Complexity heading) they make this point about the network traffic in a Cisco UCS environment:

In Cisco’s one-giant-switch model, all traffic must travel over a physical wire to a physical switch for every operation. Consequently, it appears that traffic even between two virtual servers running next to each other on the same physical would have to traverse the network, making an elaborate “hairpin turn” within the physical switch, only to traverse the network again before reaching the other virtual server on the same physical machine. Return traffic (or a “response” from the second virtual machine) would have to do the same. Each of these packet traversals logically accounts for multiple interrupts, data copies and delays for your multi-core processor.

I do have to call “partial FUD” on this one. In a virtualized environment, even a virtualized environment running the Cisco Nexus 1000V, traffic from one virtual server to another virtual server on the same host never leaves that host. HP’s statement seems to imply otherwise, but as far as I know that traffic stays on the host. However, HP’s statement is partially true: traffic from one virtual server on one physical host does have to travel to the fabric interconnect and then back again in order to communicate with a virtual server running on another physical host in the same chassis. The fabric extenders don’t provide any switching functionality; that all occurs in the interconnect.

Based on the information I’ve seen thus far, I would say that using Cisco’s SR-IOV-based “Palo” adapter and attaching VMs directly to a virtual PCIe NIC would put you into the situation HP is describing, which then just reinforces a question that Brad Hedlund and I tossed back and forth a couple of times: is hypervisor-bypass, aka VMDirectPath, with “Palo” the right design for all environments? In my opinion, no; I again go back to my statement that there is no “one size fits all” solution. And considering that the use of hypervisor-bypass with “Palo” would put you into a situation where traffic between two virtual machines on the same physical host has to travel to the fabric interconnect and back again, I’m even less inclined to use that architecture.

In the end, it’s pretty clear to me that both HP and Cisco have some advantages and disadvantages to their respective solutions, and neither vendor really has the room to label the other as “more complex” or “more proprietary” than the other. But what do you think? Do you agree or disagree? Courteous comments (with full vendor disclosure) are welcome.

You might have read the article I wrote here titled vSphere Virtual Machine Upgrade Process, in which I described a process whereby you could upgrade your VMs to VM hardware version 7 (the version used with vSphere) as well as use the latest paravirtualized network and SCSI drivers (VMXNET3 and PVSCSI). Both PVSCSI and VMXNET3 offer greater performance with the same CPU utilization.

Rightfully so, some readers and other bloggers pointed out that PVSCSI isn’t supported for boot disks (Rich Brambley put up a really good post, for example). Rich, among others, suggested moving virtual machines back to a “two disk model,” with a boot disk and a separate data disk; this would allow for the greater performance of the PVSCSI controller on the data disk. This seemed to be a reasonable workaround. I don’t recall hearing about any significant issues with VMXNET3. Using the newer network driver seemed to be a good move all the way around.

Unfortunately, there is another drawback to both of these devices. Rich caught this drawback in his article, but relegated it to a small mention at the very end of the article that even I overlooked at first (emphasis mine):

There are some other factors to consider as well. For example, vSphere Fault Tolerance cannot be enabled on a VM using PVSCSI.

That’s right—you cannot use VMware Fault Tolerance (FT) on a virtual machine that is using the PVSCSI device. However, this restriction doesn’t just apply to the PVSCSI device; it also applies to VMXNET3! VMware FT cannot be enabled on a virtual machine using either the VMXNET3 or PVSCSI devices; vCenter Server will simply report an error that the network interface or disk controller isn’t supported for VMware FT.

In my opinion, this is a significant enough limitation that it warrants its own post. If you are planning on using VMware FT in your environment, be sure not to configure any virtual machines to use VMXNET3 or PVSCSI if they might need to be protected with VMware FT. In this case, you’ll have to choose between maximum performance and maximum protection; you don’t get both.

UPDATE: Rich Brambley shared links to two resources that describe the incompatibility between VMware FT and PVSCSI and VMXNET3:

VMware Communities: Unable to configure FT with error “Unsupported virtual machine configuration for Fault Tolerance. Device ‘Network adapter 1’ is not supported”
VMware Fault Tolerance Requirements and Limitations

I was going through my list of actions in OmniFocus, looking at my projects and actions and evaluating each of them. In my “Potential Posts” project, where I keep links to articles that I might use in a blog post, I found the URL for this article by Steve Kaplan about virtualization, Cisco Nexus, and blade servers. The basic idea of his article is that virtualization and the Cisco Nexus—specifically, the unified fabric—are going to combine to kill blade servers.

I do agree with Steve that there is no innate relationship that means running VMware on blades is somehow “automagically” better:

It is amazing how frequently we hear IT managers talk about deploying blade servers as an integral component of their new virtual infrastructures – as if there were an obvious synergy between VMware and blade server architectures.

Absolutely! Blades are an option, just like rack mounted servers, and it’s up to the customer to choose (or us as consultants to recommend) the form factor that best meets the business needs. It might be blade servers, or it might be rack mounted servers. It just depends. So, on this one point, I agree with Steve.

Yet, at the same time, I also disagree with this point that Steve makes in his article:

Blade servers have always been an impediment to an optimal virtual infrastructure because they introduce limitations in efficiently utilizing power and cooling resources, budget, flexibility, manageability, bios and firmware updates, performance and troubleshooting.

Here is where Steve and I start to disagree. In fact, this specific article was something of the catalyst for a series of posts, written by colleague and friend Aaron Delp, detailing how blade servers and virtualization work well together:

Blades and Virtualization Aren’t Mutually Exclusive: Part One, HP Power Sizing
Blades and Virtualization Aren’t Mutually Exclusive: Part Two, IBM Power Sizing
Blades and Virtualization Aren’t Mutually Exclusive: Part Three, IBM Traditional Expansion Options
Blades and Virtualization Aren’t Mutually Exclusive: Part Four, HP Traditional Expansion Options

While this series of articles doesn’t squarely address all of the arguments against blades and virtualization, the series does make it clear that blades can produce power savings vs. rack mounted servers, and that blades do offer enough expansion options to accommodate the majority of virtualization deployments.

I also disagree with Steve about the value of the unified fabric, especially considering that right now unified fabric can exist only at the edge of the network and not at the core. That being the case, I find it hard to say that unified fabric is going to kill blade servers. So, again, I have to disagree with Steve’s position.

However, Steve’s not entirely wrong—virtualization, FCoE and 10Gb Ethernet, and yes even unified fabric will change how blade servers are designed and deployed. Cisco’s Unified Computing System (UCS) is one example of how blade servers are going to adapt to these agents of change, and I believe we’ll see more examples from other leading vendors in the coming months and years. But will blades die away entirely? No, I don’t think so.

Think I’m crazy? Think I’m out of my mind? Feel free to speak up in the comments—courteous comments are always welcome.

A lot of the content on this site is oriented toward VMware ESX/ESXi users who have a pretty fair amount of experience. As I was working with some customers today, though, I realized that there really isn’t much content on this site for new users. That’s about to change. As the first in a series of posts, here’s some new user information on creating vSwitches and port groups in VMware ESX using the command-line interface (CLI).

For new users who are seeking a thorough explanation of how VMware ESX networking functions, I’ll recommend a series of articles by Ken Cline titled The Great vSwitch Debate. Ken goes into a great level of detail. Go read that, then you can come back here.

Before I get started it’s important to understand that, for the most part, the information in this article applies only to VMware ESX. VMware ESXi doesn’t have a Linux-based Service Console like VMware ESX, and therefore doesn’t have a readily-accessible CLI from which to run these sorts of commands. There is a remote CLI available, which I’ll discuss in a future post, but for now I’ll focus only on VMware ESX.

The majority of all the networking configuration you will need to perform on VMware ESX boils down to just a couple commands:

  • esxcfg-vswitch: You will use this command to manipulate virtual switches (vSwitches) and port groups.
  • esxcfg-nics: You will use this command to view (and potentially manipulate) the physical network interface cards (NICs) in the VMware ESX host.

Configuring VMware ESX networking boils down to a couple basic tasks:

  1. Creating, configuring, and deleting vSwitches
  2. Creating, configuring, and deleting port groups

I’ll start with creating, configuring, and deleting vSwitches.

Creating, Configuring, and Deleting vSwitches

You’ll primarily use the esxcfg-vswitch command for the majority of these tasks. Unless I specifically indicate otherwise, all the commands, parameters, and arguments are case-sensitive.

To create a vSwitch, use this command:

esxcfg-vswitch -a <vSwitch Name>

To link a physical NIC to a vSwitch—which is necessary in order for the vSwitch to pass traffic onto the physical network or to receive traffic from the physical network—use this command:

esxcfg-vswitch -L <Physical NIC> <vSwitch Name>

In the event you don’t have information on the physical NICs, you can use this command to list the physical NICs:

esxcfg-nics -l (lowercase L)

Conversely, if you need to unlink (remove) a physical NIC from a vSwitch, use this command:

esxcfg-vswitch -U <Physical NIC> <vSwitch Name>

To change the Maximum Transmission Unit (MTU) size on a vSwitch, use this command:

esxcfg-vswitch -m <MTU size> <vSwitch Name>

To delete a vSwitch, use this command:

esxcfg-vswitch -d <vSwitch Name>
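
To see how these commands fit together, here is a minimal sketch that creates a vSwitch, links an uplink, sets the MTU, and lists the result. The names vSwitch1 and vmnic1 and the MTU of 9000 are assumptions for illustration only, and run and create_uplinked_vswitch are hypothetical helper functions, not part of VMware ESX. By default the sketch only prints the commands; set DRY_RUN=0 in the Service Console to actually execute them.

```shell
# run: print the command by default; execute it only when DRY_RUN=0.
# (Hypothetical helper so the sketch is safe to try anywhere.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_uplinked_vswitch <vSwitch Name> <Physical NIC> <MTU size>
# (Hypothetical helper wrapping the esxcfg-vswitch commands shown above.)
create_uplinked_vswitch() {
    vswitch="$1"
    nic="$2"
    mtu="$3"
    run esxcfg-vswitch -a "$vswitch"          # create the vSwitch
    run esxcfg-vswitch -L "$nic" "$vswitch"   # link the physical NIC as an uplink
    run esxcfg-vswitch -m "$mtu" "$vswitch"   # set the MTU on the vSwitch
    run esxcfg-vswitch -l                     # list vSwitches to verify
}

# Example values are assumptions; check esxcfg-nics -l for your real NIC names.
create_uplinked_vswitch vSwitch1 vmnic1 9000
```

Run esxcfg-nics -l first to confirm which physical NICs exist, and only raise the MTU if the physical network is configured to match.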

Creating, Configuring, and Deleting Port Groups

As with virtual switches, esxcfg-vswitch is the command you will use to work with port groups. Once again, unless I specifically indicate otherwise, all the commands, parameters, and arguments are case-sensitive.

To create a port group, use this command:

esxcfg-vswitch -A <Port Group Name> <vSwitch Name>

To set the VLAN ID for a port group, use this command:

esxcfg-vswitch -v <VLAN ID> -p <Port Group Name> <vSwitch Name>

To delete a port group, use this command:

esxcfg-vswitch -D <Port Group Name> <vSwitch Name>

To view the current list of vSwitches, port groups, and uplinks, use this command:

esxcfg-vswitch -l (lowercase L)
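
Putting the port group commands together, a minimal sketch might look like the following. The port group name Production, VLAN ID 100, and vSwitch name vSwitch1 are assumptions for illustration, and run and create_port_group are hypothetical helper functions rather than anything shipped with VMware ESX. As written, the sketch prints the commands instead of running them; set DRY_RUN=0 in the Service Console to execute them.

```shell
# run: print the command by default; execute it only when DRY_RUN=0.
# (Hypothetical helper so the sketch is safe to try anywhere.)
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_port_group <Port Group Name> <VLAN ID> <vSwitch Name>
# (Hypothetical helper wrapping the esxcfg-vswitch commands shown above.)
create_port_group() {
    portgroup="$1"
    vlan="$2"
    vswitch="$3"
    run esxcfg-vswitch -A "$portgroup" "$vswitch"              # create the port group
    run esxcfg-vswitch -v "$vlan" -p "$portgroup" "$vswitch"   # assign the VLAN ID
    run esxcfg-vswitch -l                                      # list to verify
}

# Example values are assumptions; substitute your own names and VLAN ID.
create_port_group Production 100 vSwitch1
```

Note the uppercase -A for adding a port group versus the lowercase -a for adding a vSwitch; the case sensitivity mentioned above matters here.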

There are more networking-related tasks that you can perform from the CLI, but for a new user these commands should handle the lion’s share of all the networking configuration. Good luck!
