In early December of 2006, I wrote a very popular article on VMware ESX, NIC teaming, and VLAN trunking. In that article, I laid out the configuration for using both NIC teaming and VLAN trunking. In particular, the NIC teaming configuration in that article described the use of Cisco Gigabit EtherChannel for link aggregation, in which both the physical switch and the vSwitch are configured to distribute traffic across all the links between them.
Since that time, the question has come up many times: which method is better, with EtherChannel or without? Many engineers prefer not to use EtherChannel (or its standardized equivalent, static LACP/802.3ad) because of the added complexity involved. It’s easier to just team the NICs at the vSwitch level and leave the physical switches alone. That is true, but what about performance? And what impact does this have on NIC utilization?
There are two ways of handling NIC teaming in VMware ESX:
- Without any physical switch configuration
- With physical switch configuration (EtherChannel, static LACP/802.3ad, or its equivalent)
In the NIC teaming/VLAN trunking article I referenced above, I noted that there is a corresponding vSwitch configuration that matches each of these types of NIC teaming:
- For NIC teaming without physical switch configuration, the vSwitch must be set to either “Route based on originating virtual port ID”, “Route based on source MAC hash”, or “Use explicit failover order”
- For NIC teaming with physical switch configuration—EtherChannel, static LACP/802.3ad, or its equivalent—the vSwitch must be set to “Route based on ip hash”
To better understand how these settings and configurations affect NIC utilization, I set out to do some tests in the lab. Most of my tests centered on IP-based storage traffic from the host (i.e., using NFS or iSCSI for VMDKs), and I tested only two basic configurations: “Route based on originating virtual port ID” without link aggregation, and “Route based on ip hash” with link aggregation. Although the tests were slanted toward IP-based storage traffic, the underlying principles should be the same for other types of traffic as well. Here’s what I found.
NIC Teaming Without Link Aggregation
First, it’s important to understand the basic behavior in this configuration. Because the vSwitch is set to “Route based on originating virtual port ID”, network traffic will be placed onto a specific uplink and won’t use any other uplinks until that uplink fails. (This is described in more detail in this PDF from VMware.) Every VM and every VMkernel port gets its own virtual port ID. These virtual port IDs are visible using esxtop (launch esxtop, then press “n” to switch to network statistics). That’s simple enough, but what does this mean in practical terms?
- Each VM will only use a single network uplink, regardless of how many different connections that particular VM may be handling. All traffic to and from that VM will be placed on that single uplink, regardless of how many uplinks are configured on the vSwitch.
- Each VMkernel NIC will only use a single network uplink. This is true both for VMotion as well as IP-based storage traffic, and is true regardless of how many uplinks are configured on the vSwitch.
- Even when the traffic patterns are such that using multiple uplinks would be helpful—for example, when a VM is copying data to or from two different network locations at the same time, or when a VMkernel NIC is accessing two different iSCSI targets—only a single uplink will be utilized.
This last bullet is particularly important. Consider the implications in a VMware Infrastructure 3 (VI3) environment using the software iSCSI initiator with multiple iSCSI targets. Even though multiple iSCSI targets may be configured, all the iSCSI targets will share one uplink from that vSwitch using this configuration. Obviously, that is not ideal.
Note that this doesn’t really impact VMotion traffic, since VMotion is a point-to-point type of connection. VMotion would only be impacted if it were placed on a vSwitch with other types of traffic whose virtual port IDs were assigned to the same uplink.
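As a rough illustration of the behavior described above, here is a minimal Python sketch. The exact way ESX assigns a virtual port to an uplink is not spelled out in this article, so the modulo placement below is an assumption made purely for illustration, and the port IDs, vmnic names, and IP addresses are hypothetical. The point it shows is that the destination never enters the decision, so every flow from a given port lands on the same uplink.

```python
def uplink_for_port(port_id, uplinks):
    """Pick one uplink per virtual port (assumed: simple modulo placement)."""
    return uplinks[port_id % len(uplinks)]

uplinks = ["vmnic0", "vmnic1"]

# Hypothetical virtual port IDs, like those shown in esxtop's network view.
ports = {"vm-web": 16, "vm-db": 17, "vmk-iscsi": 18}

placement = {name: uplink_for_port(pid, uplinks) for name, pid in ports.items()}

# The VMkernel port talks to two iSCSI targets, but the target address is
# never consulted, so both flows end up on the same uplink.
iscsi_targets = ["10.0.0.50", "10.0.0.51"]
iscsi_uplinks = {uplink_for_port(ports["vmk-iscsi"], uplinks) for _ in iscsi_targets}

print(placement)
print(iscsi_uplinks)
```

Different ports can land on different uplinks, which is why a vSwitch with many VMs still spreads load reasonably well, but any single port stays pinned to one uplink.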
NIC Teaming With Link Aggregation
In this configuration, EtherChannel/static LACP/802.3ad is configured on the physical switch and the ESX vSwitch is configured for “Route based on ip hash.” With this configuration, the behavior changes quite dramatically.
- Traffic to or from a VM could be placed onto any uplink on the vSwitch, depending upon the source and destination IP addresses. Each pair of source and destination IP addresses could be placed on different uplinks, but any given pair of IP addresses can use only a single uplink. In other words, multiple connections to or from the VM will benefit, but each individual connection can only utilize a single link.
- Each VMkernel NIC will utilize multiple uplinks only if multiple destination IP addresses are involved. Conceivably, you could also use multiple VMkernel NICs with multiple source IP addresses, but I haven’t tested that configuration.
- Traffic that is primarily point-to-point won’t see any major benefit from this configuration. A single VM being accessed by another single client won’t see a traffic boost other than that possibly gained by the placement of other traffic onto other uplinks.
This configuration can help improve uplink utilization on a vSwitch because traffic is dynamically placed onto the uplinks based on the source and destination IP addresses. This helps improve overall NIC utilization when there are multiple VMs or when a VMkernel NIC is accessing multiple IP-based storage targets. Note again, though, that individual connections will only ever be able to utilize a single uplink.
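The per-pair behavior can be sketched in a few lines of Python. The hash used here (XOR of the source and destination IPv4 addresses, modulo the number of uplinks) is a commonly described simplification of “Route based on ip hash” and is an assumption, not something established by the tests above; the addresses are hypothetical.

```python
import ipaddress

def ip_hash_uplink(src, dst, num_uplinks):
    """Return the uplink index for a source/destination pair (assumed hash)."""
    s = int(ipaddress.IPv4Address(src))
    d = int(ipaddress.IPv4Address(dst))
    # Assumed model: XOR the two addresses, then take the result modulo
    # the number of uplinks in the team.
    return (s ^ d) % num_uplinks

vmk = "10.0.0.10"  # hypothetical VMkernel NIC address

# Two different destinations can map to two different uplinks...
first = ip_hash_uplink(vmk, "10.0.0.50", 2)
second = ip_hash_uplink(vmk, "10.0.0.51", 2)

# ...but the same pair always maps to the same uplink, however many
# connections are opened between those two addresses.
repeat = ip_hash_uplink(vmk, "10.0.0.50", 2)

print(first, second, repeat)
```

Because the input is only the address pair, opening more connections between the same two hosts does not change the result; only adding source or destination addresses does.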
Practical Application
I come back again to the question I asked earlier: what does this mean in practical terms?
- If you want to scale IP-based storage traffic, you’ll have to use link aggregation and multiple targets. Using link aggregation with a single target (destination IP address) won’t use more than a single uplink; similarly, no link aggregation with multiple targets will still result in only a single uplink being used. Only with link aggregation and multiple targets will multiple uplinks get used.
- Link aggregation will help with better overall uplink utilization for vSwitches hosting VMs. Because there are multiple source/destination address pairs in play, the vSwitch will spread them around the uplinks dynamically.
- To achieve the best possible uplink utilization as well as provide redundancy, you’ll need physical switches that support cross-stack link aggregation. I believe the Cisco Catalyst 3750 switches do, as do the Catalyst 3120 switches for the HP c-Class blade chassis. I don’t know about other vendors, since I deal primarily with Cisco.
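The four combinations in the first bullet (link aggregation or not, one target or several) can be laid out as a small Python model. This is a sketch under stated assumptions: the no-aggregation case is modeled as always using one uplink for a VMkernel NIC, the aggregation case uses an assumed XOR-and-modulo IP hash, and all addresses are hypothetical.

```python
import ipaddress

def uplinks_used(link_aggregation, src, targets, num_uplinks):
    """Count distinct uplinks one VMkernel NIC would use (simplified model)."""
    if not link_aggregation:
        # Port-ID policy: the destination is ignored, so one uplink carries it all.
        return 1
    s = int(ipaddress.IPv4Address(src))
    # Assumed IP-hash model: XOR source and destination, modulo uplink count.
    chosen = {(s ^ int(ipaddress.IPv4Address(t))) % num_uplinks for t in targets}
    return len(chosen)

vmk = "10.0.0.10"                         # hypothetical VMkernel address
one_target = ["10.0.0.50"]                # hypothetical storage targets
two_targets = ["10.0.0.50", "10.0.0.51"]

results = {
    ("no aggregation", "one target"): uplinks_used(False, vmk, one_target, 2),
    ("no aggregation", "two targets"): uplinks_used(False, vmk, two_targets, 2),
    ("aggregation", "one target"): uplinks_used(True, vmk, one_target, 2),
    ("aggregation", "two targets"): uplinks_used(True, vmk, two_targets, 2),
}

for scenario, count in results.items():
    print(scenario, count)
```

Only the last scenario reports two uplinks. Under the assumed hash, the choice of target addresses also matters: targets at 10.0.0.50 and 10.0.0.52 would hash to the same uplink for this source, so multiple targets are necessary but may not be sufficient.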
Clearly, this has some implications for efficient and scalable VI3 designs. I’d love to hear everyone’s feedback on this matter. In my humble opinion, extended conversations about this topic can only serve to better educate the community as a whole.
UPDATE: Reader Tim Washburn pointed out that “Route based on source MAC hash” actually can’t be used in conjunction with link aggregation; its behavior is identical to “Route based on originating virtual port ID”. Thanks for the correction, Tim!
Tags: Cisco, ESX, Networking, Virtualization, VMware


20 comments
Wednesday, July 16, 2008 at 5:28 pm
Duncan
I totally agree with you on this one. The cross stack channel feature is a nice one, but in my opinion most of the time it’s too expensive to justify. Especially considering the fact that when running on iSCSI most customers have to deal with a single destination IP address, which indeed results in the same path being taken, always. So in this case, the cheap good old Virtual Port ID will work just fine. And you can easily set it up on two separate switches for redundancy reasons without complications if you don’t forget about STP.
Anyway, it would be very cool if and when iSCSI multipathing is supported together with round robin load balancing for instance. But we will just have to wait.
Wednesday, July 16, 2008 at 6:38 pm
Andrew
Thanks for a great article, as always.
I have a question: I am currently using a single NETAPP as a NFS storage mount for the virtual machine disks. Both the ESX hosts, and the physical switches are configured to use port channeling (or ip hashing in the case of ESX).
Is it possible to give the NETAPP multiple IPs on the private VLAN and mount the same volume multiple times (e.g., from 1.1.1.1, mount /vol/vol1, from 1.1.1.2, mount /vol/vol1…even though it’s the same NETAPP and same volume)? Then divide the VMs across the two mounts…in my head, since there are different IPs, even though it’s the same two hosts, the port channel will divide the traffic across the multiple interfaces.
There is some configuration overhead, but would this have the same effect as having multiple storage servers?
Thanks.
Wednesday, July 16, 2008 at 7:31 pm
Matt Brown
I currently have 8 nics on my VMWare boxes, they are connected to two pairs of 3750G cisco catalysts switches that are each connected together via a fiber link. (4 cisco switches in total). I have yet to figure out how to create an LACP aggregate across the 2 switch sets as the Port-channels don’t talk to each other from switch to switch unless they are connected via the cisco cross link cable… but then they share the same IOS so upgrades will take down the whole switch.
What do you think?
I’m trying to come up with a good solution for vSwitch setup and load balancing for my ESX Boxes, I’ve now got 11 ESX Servers with 8 nics each all connected to the 3750g switches.
Here’s my current VMWare config
http://universitytechnology.blogspot.com/2008/06/recommended-network-setup-for-vmware.html
Wednesday, July 16, 2008 at 7:52 pm
Duane Haas
Scott
Great post. I am currently using an EMC NS502 via iSCSI, and was curious about the comment on multiple targets. Right now my software initiator is configured to go after one IP address. That one IP address is configured to go across three uplink ports (LACP) on the EMC. Are you suggesting that if I create another IP (target) on the NS502, I might see improved performance? My environment isn’t that large at the moment, but I always love to deploy best practices when configuring and setting up my environment. Thanks as always; your time and dedication to the VMware community are much appreciated.
Wednesday, July 16, 2008 at 9:33 pm
slowe
Duncan,
Thanks for the feedback! Let me know when that iSCSI multipathing round-robin functionality is ready….
Andrew,
You shouldn’t mount the same NFS datastore on multiple IP addresses. If you can, use a multi-mode VIF on the NetApp (this will require configuration on the switches), then assign multiple IP address aliases to the VIF. Run each datastore on its own IP address. This should provide adequate traffic distribution across the links for both the ESX servers as well as the NetApp storage array.
Matt,
You can’t use cross-stack link aggregation without the cross-link cable; ordinary links won’t do it. There are drawbacks to using a stack, but you have to weigh those drawbacks against the ability to better distribute your traffic across the links. I reviewed your configuration and there are some changes I would probably make if it were me, but without knowing the rest of the environment it’s difficult to make good recommendations. Feel free to e-mail me (see the About page) and we can discuss it privately.
Duane,
The LACP configuration on the NS502 does help load-balance incoming iSCSI requests from multiple ESX servers/iSCSI initiators. If you want to have even the possibility of getting more than 1Gbps of throughput from an individual ESX server to the NS502, you will need to use EtherChannel for your ESX vSwitches and multiple iSCSI targets on the NS502. This wouldn’t necessarily help improve performance for a single LUN, since a single LUN is typically accessed via a single target, but it would help with multiple LUNs. I’ll defer to EMC experts (Chad, are you reading?) to tell me if I’m incorrect.
Thursday, July 17, 2008 at 5:22 am
Alastair
Scott,
Thanks for a great blog, love your work.
One thing to point out is that “Route based on Source MAC” is in the same camp as “Route based on source port ID” in that each virtual NIC uses one and only one physical NIC, rather than like “Route based on IP hash” where one vNIC will use multiple pNICs.
To me the big risk around IP Hash is that the vSwitch must connect to a single switch config, so that becomes a much more significant point of failure.
I definitely agree with the point about iSCSI storage and multiple targets being the primary place where IP Hash is better, personally I’d only use it there and I’d probably dedicate a vSwitch to that function.
When you look at multiple VMK ports with different IP addresses, remember they are all under one TCP stack, so they need to be on different subnets before the VMK route table will end up pointing to different vNICs, then pNICs.
To me IP Storage nirvana involves 10GBE (or infiniband) to let us have a much fatter pipe.
Tuesday, July 29, 2008 at 9:22 am
Erwan
Hi,
I have used ESX (VI3) for 6 months and really enjoy it. I’m wondering if it makes sense to install a firewall as a VM. Each network interface of the firewall (3 at least) must be connected to a different VLAN (tags must be configured on my switch and vSwitch), and that’s not a problem for me. My real question is, what about security? I don’t see a real risk with this implementation, assuming the vSwitch handles VLAN tags correctly. The firewall I want to run as a VM will be the front firewall (connecting directly to the Internet).
Please, give me your advice about this.
Regards.
Monday, August 18, 2008 at 5:01 pm
Mike Astrosky
Came across your blog while searching for basic vmware/networking tutorials. I am new to vmware (just learned to spell it) and am having a lot of issues. Can you point me to some good tutorials? While I actually read about 10% of your blog it will be a year or two before I understand half of what you are explaining.
I have ESX loaded on a Dell 2950 server with redundant NICs and a separate quad-port NIC card. No underlying OS - just ESX server. I can access ESX just fine with the web interface and installed one VM. I have Infrastructure server loaded on a different box. I have added the ESX host to the Infrastructure box. I cannot access the VM directly - only through the web interface of the ESX server or through the infrastructure client. How do I get the VM to get IP addresses?
I appreciate your time for a response (either here or direct) even if it is to say that you cannot give out this kind of help at this time.
Sorry to bother you.
Michael
Tuesday, August 19, 2008 at 5:58 am
slowe
Mike,
You need to use the Infrastructure Client to access the console of the VM, at which point you can log into the guest OS and assign an IP address, etc. From there, it should be business as usual.
Good luck!
Wednesday, August 27, 2008 at 2:12 am
Meki Chan
hi, thank you for this great information
Regards,
Meki Chan
Cisco Trick
Wednesday, August 27, 2008 at 2:21 pm
SlamRand
Scott,
First, great articles! This is the second solution you’ve given me, led here by Google searches. (I used another article of yours to enable Jumbo Frames between ESXi and my Dell MD3000i iSCSI SAN.)
I used this article plus your Dec ‘06 article on NIC Teaming to configure NIC Teaming with LACP between ESXi and my Catalyst 3750 switch stack. I’ve enabled the teaming on the port groups for both my iSCSI VLAN and my Production LAN VLAN. Although both iSCSI communication and LAN communication are working fine, I’m not sure whether the teaming is actually configured/working properly - not because I’m having any connection troubles, but because I don’t know how to measure the network activity within ESX (or on the switch ports) to see which links are being utilized.
The only thing I’ve found to go on is the same thing that’s making me question whether it’s working: on my Catalyst stack, both of the channel groups are reporting status as Down. (I’m using Cisco’s GUI configuration tool “Cisco Network Assistant”). I’ve also played around in the IOS command line looking for a way to display more verbose status information, but I haven’t found anything I know how to interpret as indicating one way or the other, whether the port groups are doing what we want them to do.
Do you have any suggestions on ways to:
1) Verify that the port groups are actually configured *and functioning* correctly on the switch? What commands can I use to explore this, and what output am I looking for?
2) Verify that network traffic to different LUN targets on my iSCSI SAN is actually being carried across both links, thanks to the IP hash settings on ESX and etherchannel settings on the switch?
FYI, my iSCSI SAN has a single controller with two host ports, each port configured with a unique IP, so I’m assuming that meets the requirements of multiple destination IPs needed for the NIC teaming to even matter.
Thanks for all the help you’re giving to the VMWare community — especially us newbies to ESX!
Cheers,
Bryan
Thursday, August 28, 2008 at 11:06 am
ken
good stuff, as usual.
Tuesday, September 9, 2008 at 10:45 pm
alex
Hi Scott,
I’m in a situation where I may find myself having to implement iSCSI with software initiators.
We have allocated 2 dual-port NICs for this, ie 4 NIC ports in total.
The NAS has 2 controllers, one controller to network switch A, the second controller to network switch B. Then a connection from switch A and B to dual-port NIC #1, and ditto for dual-port NIC #2.
Going by your blog, it would indicate I could only get throughput through 1 of the 4 NIC ports. Is there a way to configure so I can get throughput through more than one NIC port?
Cheers
Alex
Tuesday, September 9, 2008 at 11:07 pm
slowe
Alex,
I’m guessing, although you didn’t state it, that your network switches don’t support cross-switch link aggregation (i.e., the ability to perform link aggregation across multiple physical switches). That being the case, it *is* possible to get more throughput, but it’s not as straightforward as it may seem.
What you’ll need to do is create two VMkernel ports on different subnets and use multiple iSCSI targets on those different subnets. This will allow ESX to use more than one NIC, although it sounds like you’ll be hard-pressed to get ESX to use all four.
Hope this helps!
Wednesday, September 10, 2008 at 5:42 am
alex
Hi Scott
Thanks for your help.
You are correct, our switches do not support cross-switch link aggregation. Are you able to advise me which Cisco switch(es) if any support this functionality… if no Cisco switches do, what switch does?
Cheers,
Alex
Wednesday, September 10, 2008 at 5:44 am
alex
Also why do you say I will be hard-pressed to get ESX to use all four? Can I not create 4 vmKernel ports on 4 subnets rather than 2 vmkernel ports on 2 subnets?
Cheers
Alex
Wednesday, September 10, 2008 at 11:19 am
slowe
Alex,
In addition to creating 4 different VMkernel ports on 4 different subnets, you’d also need to configure your storage system to have 4 different iSCSI targets (one iSCSI target on each of the 4 different subnets). Then, *IF* your storage array is truly active/active and can present LUNs on multiple interfaces at the same time, you might be able to actively use all the interfaces. Otherwise, you’ll have to manually place LUNs on different iSCSI targets in order to get all of the links used. It’s certainly doable, but it will take effort and planning.
With regards to the Cisco switches, I believe that the Cisco Catalyst 3750 switches offer a stacking option that also allows you to do cross-switch link aggregation. Some of the specific configurations of the high-end chassis switches, like the Catalyst 6500, also offer this functionality.
Hope this helps!
Tuesday, September 16, 2008 at 3:43 pm
Robert Eanes
Scott,
I’ve followed the advice to use link aggregation and NIC teaming, and everything seems to be working as well as it can (1 link utilized for any one pair of IP addresses). What I haven’t been able to find any information on is what to set the other objects in ESXi to. The vSwitch needs to be IP hash, but what about the VM network or the VMkernel object? Each of these objects can override the settings on the vSwitch, and I was wondering what the effects of changing these are for the VMs.
Thanks,
Rob
Tuesday, October 14, 2008 at 10:53 pm
Sean Clark
Wow! Great post. This was just the answer I was looking for. This is why putting VMotion on the same VMkernel nic as iSCSI is probably a bad idea. (Saw that one at customer site a couple weeks ago).
But this is also the reason that the Equallogic’s best practice is to only use the ESX software iSCSI initiator for a VM’s system drive and use a guest-based iSCSI initiator for data. In that way, they can utilize multiple NICs to distribute the load.