ESX Server, NIC Teaming, and VLAN Trunking

Before we get into the details, allow me to give credit where credit is due. First, thanks to Dan Parsons of IT Obsession for an article that jump-started the process with notes on the Cisco IOS configuration. Next, credit goes to the VMTN Forums, especially this thread, in which some extremely useful information was exchanged. I would be remiss if I did not adequately credit these sources for the information that helped make this testing successful.

There are actually two different pieces described in this article. The first is NIC teaming, in which we logically bind together multiple physical NICs for increased throughput and increased fault tolerance. The second is VLAN trunking, in which we configure the physical switch to pass VLAN traffic directly to ESX Server, which will then distribute the traffic according to the port groups and VLAN IDs configured on the server. I wrote about ESX and VLAN trunking a long time ago and ran into some issues then; here I’ll describe how to work around those issues.

So, let’s have a look at these two pieces. We’ll start with NIC teaming.

Configuring NIC Teaming

There’s a bit of confusion regarding NIC teaming in ESX Server and when switch support is required. You can most certainly create NIC teams (or “bonds”) in ESX Server without any switch support whatsoever. Once those NIC teams have been created, you can configure load balancing and failover policies. However, those policies will affect outbound traffic only. In order to control inbound traffic, we have to get the physical switches involved. This article is written from the perspective of using Cisco Catalyst IOS-based physical switches. (In my testing I used a Catalyst 3560.)

To create a NIC team that will work for both inbound and outbound traffic, we’ll create a port channel using the following commands:

s3(config)#int port-channel1
s3(config-if)#description NIC team for ESX server
s3(config-if)#int gi0/23
s3(config-if)#channel-group 1 mode on
s3(config-if)#int gi0/24
s3(config-if)#channel-group 1 mode on

This creates port-channel1 (you’d need to change this name if you already have port-channel1 defined, perhaps for switch-to-switch trunk aggregation) and assigns GigabitEthernet0/23 and GigabitEthernet0/24 to the team. Now, however, you need to ensure that the switch and ESX Server are using matching load balancing mechanisms. To find out the switch’s current load balancing mechanism, use this command in enable mode:

show etherchannel load-balance

This will report the current load balancing algorithm in use by the switch. On my Catalyst 3560 running IOS 12.2(25), the default load balancing algorithm was set to “Source MAC Address”. On my ESX Server 3.0.1 host, the default load balancing mechanism was set to “Route based on the originating virtual port ID”. The result? The NIC team didn’t work at all: I couldn’t ping any of the VMs on the host, and the VMs couldn’t reach the rest of the physical network. It wasn’t until I matched up the switch and server load balancing algorithms that things started working.

To set the switch load-balancing algorithm, use one of the following commands in global configuration mode:

port-channel load-balance src-dst-ip (to enable IP-based load balancing)
port-channel load-balance src-mac (to enable MAC-based load balancing)

There are other options available, but these are the two that seem to match most closely to the ESX Server options. I was unable to make this work at all without switching the configuration to “src-dst-ip” on the switch side and “Route based on ip hash” on the ESX Server side. From what I’ve been able to gather, the “src-dst-ip” option gives you better utilization across the members of the NIC team than some of the other options. (Anyone care to contribute a URL that provides some definitive information on that statement?)
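
At this point it’s worth a quick sanity check on the switch side. These are standard IOS show commands run from enable mode; the exact output format varies a bit between IOS releases:

show etherchannel 1 summary (confirms that Gi0/23 and Gi0/24 are bundled into Po1)
show etherchannel load-balance (confirms the load balancing algorithm currently in effect)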

Creating the NIC team on the ESX Server side is as simple as adding physical NICs to the vSwitch and setting the load balancing policy appropriately. At this point, the NIC team should be working.
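
For reference, here’s a rough sketch of the equivalent Service Console commands, assuming a vSwitch named vSwitch1 with uplinks vmnic2 and vmnic3 (substitute your own vSwitch and vmnic names); the load balancing policy itself (“Route based on ip hash” when using EtherChannel, as noted above) is set through the VI Client in the vSwitch’s NIC Teaming policy settings:

esxcfg-vswitch -a vSwitch1 (creates the vSwitch if it doesn’t already exist)
esxcfg-vswitch -L vmnic2 vSwitch1 (links the first physical NIC to the vSwitch)
esxcfg-vswitch -L vmnic3 vSwitch1 (links the second physical NIC to the vSwitch)
esxcfg-vswitch -l (lists the vSwitches and their uplinks so you can confirm the changes)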

Configuring VLAN Trunking

In my testing, I set up the NIC team and the VLAN trunk at the same time. When I ran into connectivity issues as a result of the mismatched load balancing policies, I thought they were VLAN-related issues, so I spent a fair amount of time troubleshooting the VLAN side of things. It turns out, of course, that it wasn’t the VLAN configuration at all. (In addition, one of the VMs that I was testing had some issues as well, and that contributed to my initial difficulties.)

To configure the VLAN trunking, use the following commands on the physical switch:

s3(config)#int port-channel1
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport trunk allowed vlan all
s3(config-if)#switchport mode trunk
s3(config-if)#switchport trunk native vlan 4094

This configures the NIC team (port-channel1, as created earlier) as an 802.1Q VLAN trunk. You then need to repeat this process for the member ports in the NIC team:

s3(config)#int gi0/23
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport trunk allowed vlan all
s3(config-if)#switchport mode trunk
s3(config-if)#switchport trunk native vlan 4094
s3(config-if)#int gi0/24
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport trunk allowed vlan all
s3(config-if)#switchport mode trunk
s3(config-if)#switchport trunk native vlan 4094

If you haven’t already created VLAN 4094, you’ll need to do that as well:

s3(config)#int vlan 4094
s3(config-if)#no ip address
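
One caveat worth noting: VLAN 4094 is an extended-range VLAN, and with VTP version 1 or 2 the switch generally needs to be in VTP transparent mode before extended-range VLANs can be created. If the Layer 2 VLAN doesn’t already exist (the SVI above will stay down in that case), something along these lines should create it:

s3(config)#vtp mode transparent
s3(config)#vlan 4094
s3(config-vlan)#exit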

The “switchport trunk native vlan 4094” command is what fixes the problem I had last time I worked with ESX Server and VLAN trunks; namely, that most switches don’t tag traffic from the native VLAN across a VLAN trunk. By setting the native VLAN for the trunk to something other than VLAN 1 (the default native VLAN), we essentially force the switch to tag all traffic across the trunk. This allows ESX Server to handle VMs that are assigned to the native VLAN as well as other VLANs.

On the ESX Server side, we just need to edit the vSwitch and create a new port group. In the port group, specify the VLAN ID that matches the VLAN ID from the physical switch. After the new port group has been created, you can place your VMs on that new port group (VLAN) and, assuming you have a router somewhere to route between the VLANs, you should have full connectivity to your newly segregated virtual machines.
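
If you prefer the command line, the same port group can be created from the Service Console. The port group name and VLAN ID below are just placeholders, and vSwitch1 is the same assumed vSwitch name used earlier:

esxcfg-vswitch -A "VLAN100" vSwitch1 (adds a port group named VLAN100 to the vSwitch)
esxcfg-vswitch -v 100 -p "VLAN100" vSwitch1 (assigns VLAN ID 100 to that port group)
esxcfg-vswitch -l (shows the port groups and their VLAN IDs so you can verify the change)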

Final Notes

I did encounter a few weird things during the setup of this configuration (I plan to leave the configuration in place for a while to uncover any other problems).

  • First, during troubleshooting, I deleted a port group on one vSwitch and then re-created it on another vSwitch. However, the virtual machine didn’t recognize the connection. There was no indication inside the VM that the connection wasn’t live; it just didn’t work. It wasn’t until I edited the VM, set the virtual NIC to a different port group, and then set it back again that it started working as expected. Lesson learned: don’t delete port groups.
  • Second, after creating a port group on a vSwitch with no VLAN ID, one of the other port groups on the same vSwitch appeared to “lose” its VLAN ID, at least as far as VirtualCenter was concerned. In other words, the VLAN ID was listed as “*” in VirtualCenter, even though a VLAN ID was indeed configured for that port group. The “esxcfg-vswitch -l” command (that’s a lowercase L) on the host still showed the assigned VLAN ID for that port group, however.
  • It was also the “esxcfg-vswitch” command that helped me troubleshoot the problem with the deleted/re-created port group described above. Even after re-creating the port group, esxcfg-vswitch still showed 0 used ports for that port group on that vSwitch, which told me that the virtual machine’s network connection was still somehow askew. (See below for a quick note on this command.)
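
For those who haven’t used it, the command is run from the Service Console:

esxcfg-vswitch -l (lists each vSwitch along with its uplinks, port groups, VLAN IDs, and used-port counts)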

Hopefully this information will prove useful to those of you out there trying to set up NIC teaming and/or VLAN trunking in your environment. I would recommend taking this one step at a time, not all at once like I did; this will make it easier to troubleshoot problems as you progress through the configuration.


  1. Tim Hollingworth’s avatar

    Chauncey wants his SAN back.

  2. slowe’s avatar

    Hey, I let Greg decide where the SAN was going, so don’t pin this on me!!

    Toby said he thought he had enough equipment to send another unit over to Charlotte…he’ll probably end up with something newer and bigger than I have.

    Scott

  3. Ecio’s avatar

    Hi Scott,
    Just a couple of notes based on my limited experience (I’m doing my first tests these days at work).

    My config is ESX 3.0 (4 NICs) + 2 Cisco 3750s (stacked) + NetApp 3020C.

    After some testing (yesterday and today) I succeeded in configuring NIC teaming + VLAN trunking on 2 of the NICs and then used this connection to transport networking data, iSCSI, and so on. The ESX host is now using a datastore on the NetApp via iSCSI.

    Here’s what emerged:

    1) In my IOS version 12.2(25)SEB4 the ‘show etherchannel load-balance’ reports
    EtherChannel Load-Balancing Operational State (src-ip):
    so I’ve set “Route based on ip hash” on the NIC teaming page.

    2) I had to NOT use the “switchport trunk native vlan xxx” statement, because when I left it in the config I wasn’t able to use any VLAN other than xxx. When I deleted the statement I was able to use all of the VLANs without problems.

    3) (this is iSCSI related) If you don’t create a Service Console on the iSCSI network you won’t be able to scan/find LUNs on the storage[*]. I think this is due to the fact that our iSCSI network is not routed, so the ESX host can’t reach it without having a Service Console “foot” on that network (even though the VMkernel of course has an IP on that network).

    [*] This means that if you delete the iSCSI Service Console everything works until you try to find another LUN or reboot the ESX server -> BOOM, you can’t see the storage anymore (fortunately I found this before going into production :D )

    Ciao,
    Ecio

    PS sorry for my english

  4. slowe’s avatar

    Ecio,

    Yes, I saw it documented somewhere about the need for ESX to be able to make a connection to the iSCSI target, so that means routed traffic or a Service Console connection on the same subnet as the iSCSI target. I can’t recall where I saw it documented, but I *know* I remember seeing it in the documentation somewhere.

    Otherwise, glad to hear things are working well for you! Be sure to check out my recent article on recovering data inside VMs using NetApp snapshots:

    http://blog.scottlowe.org/2006/12/30/recovering-data-inside-vms-using-netapp-snapshots/

    Thanks,
    Scott

  5. Joey’s avatar

    Hey Scott,

    I really appreciate all the work you put into putting this together. Thanks. We have just purchased VMware and we are looking to do all that you have talked about. BUT… I was wondering if you have tried configuring the devices that are communicating via iSCSI to either a NetApp or EMC device to use jumbo frames. We are using Cisco 3750 48-port gigabit switches to hopefully do this.

    This is really where I want to take it. Because we don’t have all of our equipment as of yet, I am unable to test it to see if it will work.

    Please email me and let me know what you think. Thanks again.

  6. slowe’s avatar

    Joey,

    If you are using the software iSCSI initiator within ESX, jumbo frames are currently out of the question, as they aren’t supported. Hardware iSCSI initiators are, of course, a different story.

    Good luck with your implementation!

  7. Joey’s avatar

    Hey Scott,

    Are you saying that software iSCSI, like the Microsoft iSCSI initiator that I would have installed on a Win2k3 server connected to a SAN via iSCSI, doesn’t support jumbo frames? The NICs on the servers where I have this installed do support jumbo frames. Or are you saying there is something specific to ESX?

    I’m trying to find out all the pros and cons, so if you could shed some more light on the situation for me I’d appreciate it.

    Thanks.

  8. slowe’s avatar

    Joey,

    The software iSCSI initiator that ships with ESX Server does not support jumbo frames, so even if the rest of your infrastructure (NICs, switches, storage array) supports jumbo frames it still won’t really matter.

    The release notes for ESX 3.0.2 (which was just released yesterday) do not indicate any change in jumbo frame support.

    If jumbo frame support is a *MUST HAVE* requirement for you, then you’ll need to look into hardware iSCSI initiators.

    Hope this helps!

  9. mike’s avatar

    Can I just configure the server for link aggregation without configuring EtherChannel on the switch?

  10. slowe’s avatar

    Yes, you can. However, there is currently some debate as to whether this will distribute traffic across the various links as efficiently as using EtherChannel/LACP.

  11. Kelly Olivier’s avatar

    I agree, Scott. VMware misleads people into believing that their NIC teaming is LACP. However, EtherChannels work better per our testing. We also just use an EtherChannel on the Cisco side and configure the ESX box to use the IP hash. This has worked great.

  12. Mimmus’s avatar

    It’s true: with LACP, ESX doesn’t balance traffic toward ONE iSCSI target. You need to configure multiple destinations, using virtual iSCSI IPs.

  13. slowe’s avatar

    Mimmus,

    It’s my understanding that’s true for any solution built using EtherChannel/LACP, not just a VMware limitation; specifically, the data flow between any two single endpoints (IP addresses) cannot exceed the bandwidth of a single link. The advantage comes in the distribution of multiple data flows across multiple links.

  14. Charlie Brown’s avatar

    Hey Scott,

    You mentioned that you ran into some configuration issues, and I was wondering if you would have any idea about the issue that I’m having. We are using HP 685 blade enclosures with ProCurve switches. We have 2 virtual switches created with 2 NICs on each switch. We are also using VLAN tagging on the switches. The issue that I’m having is that every once in a while when a VM migrates to another server it will lose network connectivity. This pops up more frequently when we do patching of the ESX hosts and migrate multiple VMs around. One person on the team thinks it is due to an issue with Network Failover Detection set to Beacon Probing. I was wondering if you had any suggestions. For the life of me, I have been trying to reproduce the problem but cannot.

    Any suggestions would be appreciated.

  15. slowe’s avatar

    Charlie,

    There could be multiple issues at play here. I’ve heard of numerous issues with Beacon Probing, but I can’t definitively say that’s the problem. It could also be your vSwitch configuration and how you are sharing NICs for VM traffic, the Service Console, and VMotion.

    Start by switching away from Beacon Probing to Link Status. Then move on to overriding the vSwitch failover policy for specific port groups so that the Service Console prefers one NIC over another, VMotion prefers the other NIC, etc.

    Hope this helps!

  16. vijaysys’s avatar

    Hi…
    One quick question…
    How do I find the MAC address for an ESX server?

    Thanks and regards,
    VJ

  17. slowe’s avatar

    Vijaysys,

    You can find the MAC address for the Service Console of the ESX Server using the “ifconfig” command at the Service Console. MAC addresses for virtual machines hosted on an ESX Server can be determined by running the guest OS-specific commands, such as “ifconfig” or “ipconfig” within the guest.

    Hope this helps!

  18. Brian’s avatar

    Scott,

    Great post! We are looking to do this with ESX 3i on Dell 2950s and a NetApp 2050 SAN. I want to mount my volume using NFS instead of iSCSI to avoid SCSI locking. Will your teaming and trunking config still help out with NFS?

  19. slowe’s avatar

    Brian,

    Using NIC teaming will help with overall throughput to multiple NFS exports on separate IP addresses, but not with overall throughput to a single NFS export on a single IP address. You’ll also want to look at using multi-mode VIFs on the NetApp side as well:

    http://blog.scottlowe.org/2007/06/13/cisco-link-aggregation-and-netapp-vifs/
    http://blog.scottlowe.org/2008/01/08/lacp-with-cisco-switches-and-netapp-vifs/

    Hope this helps!

  20. John Flick’s avatar

    Question: If I am going from ESX to an iSCSI device, and my data flows are going from ESX to the iSCSI device, am I really “bonding” for a full X Gb of bandwidth? When I look at the NICs that are sending data out (for example, during a file copy from a guest that is copying from internal storage (where the C: .vmdk is) to the iSCSI device (where the D: .vmdk is)), I notice only 1 NIC is getting in on the act… i.e., I’m not filling up all 4 NICs I’ve “bonded” for iSCSI, hence I only get 1 Gbps of throughput. Shouldn’t I see all 4 bonded NICs just pound away at sending that traffic to the iSCSI device?

  21. slowe’s avatar

    John,

    This is a common misconception. In most cases, any “bonded” link will only be able to use the bandwidth of a single member of the link for a data flow between two single endpoints. To increase throughput across the bonded links, you need multiple endpoints, i.e., a one-to-many or many-to-one data flow. The software iSCSI initiator in ESX 3.0.x is a bit limited in this regard, although it looks like ESX 3.5 improves that.

  22. John Flick’s avatar

    So I guess I need 10GbE then?

    I have a SAN with 4 Gb ports bonded and I have ESX with 4 ports bonded… but you’re right… only 1 is used at a time.

  23. slowe’s avatar

    John,

    You _might_ be able to work around this by using multiple iSCSI targets (i.e., multiple IP addresses on the iSCSI storage array). Then, if you are using static link aggregation (EtherChannel) on both ends of the link, it may work better. No guarantees, though it may be worth a try.

  24. Alex’s avatar

    Just for anyone else looking, I had some major issues with VLAN tagging on an IBM BladeCenter with a server connectivity module.

    Symptoms were that no VLAN tagging was working; the only VLAN that WAS working was the native VLAN.
    Under vSwitch0 Properties on the Network Adapters tab, vmnic0’s observed IP range was again only the native VLAN (same with the networks under the status box on the right).

    THE SOLUTION was to start a web session to the switch module, click Non-Default Virtual LANs, and add VLANs to your port group.

    So simple, but I missed it again and again.

    Thanks for your blog Scott, it has been helpful time and time again :)

    -Alex

  25. Calvin’s avatar

    Would you care to write a guide for the HP ProCurve switches? I’m trying to make an ESX server with four 1GbE ports link-aggregate (2 ports, 2 links), but I’m always losing connectivity if both ports are plugged in.

  26. slowe’s avatar

    Calvin,

    I’d love to, but HP seems to be dragging their feet getting me a ProCurve switch to use in my lab for testing and validation. As soon as I can get my hands on a ProCurve switch, I’ll be happy to test the config and post some information here.

  27. PeterVG’s avatar

    Hi Scott,
    This has been bothering me for some time and I can’t seem to find a clear answer, so here it goes: We have ESX servers with 4 NICs for the VMs’ vSwitch. Two of the NICs connect to switch A and the other two connect to switch B. Both switches are Cisco Cat 6500s. I would like to configure incoming load balancing by creating a port channel on each switch with the 2 related NICs, effectively having 2 port channels connecting to my load-balanced (IP hash) vSwitch. Is this at all possible? If not, why?

  28. slowe’s avatar

    PeterVG,

    That’s a great question! My initial guess is that it won’t work because the vSwitch expects to apply the IP hash algorithm across all four uplinks when it really needs to be applied separately for each pair of uplinks. However, I’ve never tested this and I’ve never seen any documentation, so I could very well be wrong (it wouldn’t be the first time!). Unfortunately, I don’t have enough GigE-capable switches to actually run the test myself. Please do let me know if you get a firm answer.

  29. PeterVG’s avatar

    Hi Scott,
    I believe I might have an answer to this problem: http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1001938&sliceId=1&docTypeID=DT_KB_1_1&dialogID=14912551&stateId=1%200%2014916516
    The first line of the article states: ESX Server supports link aggregation only on a single physical switch or stacked switches, never on disparate trunked switches.
    So I guess I’m out of luck…

  30. slowe’s avatar

    PeterVG,

    Good find! I suspected that would be the case, but it’s good to see it documented.

  31. Dominick’s avatar

    Scott – thanks for the blog. Basic question: I’m using ESX 3.5 with a pair of dual-Gig NICs connected to a 6509 and an iSCSI SAN. I was thinking of a single vSwitch, physical connections teamed and trunked on alternate NICs (for redundancy on the chipsets) in active/active mode, with port groups for VMs, iSCSI/VMkernel, and Service Consoles – each using 2 physical NICs (one channel). Does this make sense for maximizing performance and redundancy? Is there a better method? Thanks!

  32. slowe’s avatar

    Dominick,

    Good question! Your timing is impeccable; I’m just wrapping up some network configuration testing in my lab and will be publishing results within the next few days.

    In any case, here we go…you won’t be able to use “Route based on IP hash” on your vSwitch because that requires a single physical switch (refer to the link in comment #29). So, for physical switch redundancy, we either have to a) go with two vSwitches, each with two uplinks; or b) use standard NIC bonding in ESX without any physical switch configuration. Option A has problems because you can’t really replicate the same port group configuration on both vSwitches; Option B has problems because depending upon the type of traffic, ESX doesn’t do a very good job of utilizing all the uplinks on the vSwitch. This leaves you in a difficult position.

    However, unless you have multiple iSCSI targets, Option B will provide you the best balance of performance and redundancy, so that’s the route I’d take.

    Good luck!

  33. Dominick’s avatar

    Thanks Scott – I’m using a single physical switch (4 NICs on the ESX box talking to a single 6509 – jumping blades for redundancy). I took a look at the link in #29, and that, along with your comment about ESX not doing a great job of utilizing the uplinks, makes me believe I would be best served using 802.3ad aggregation at the switch (alternating port pairs) and leaving the ESX load balancing setting at ‘route based on the originating virtual port ID’. As for the iSCSI SAN, we are stuck with the single VIP.

    I have the luxury of time, so I can actually test a few configurations to see which works best before placing the system into production in Sept.

    Looking forward to your test results – and thanks again for the response!

  34. slowe’s avatar

    Dominick,

    I believe that if you are going to use 802.3ad on the physical switches, you’ll be required to use “Route based on ip hash” on the vSwitches in order for connectivity to work. Keep in mind that this will only help improve the distribution of traffic across the links, not necessarily improve the throughput of any single point-to-point connection.

    Good luck!

  35. CoolRos’s avatar

    I too am attempting to connect an ESX server to redundant switches.

    My thought was to have a vSwitch with two NICs (or more), each connected to redundant Cisco switches. The Cisco switches have an interconnect as well, creating a physical loop.

    My concern is that I might create a bridging loop between the three switches, since the vSwitch does not support Spanning Tree Protocol.

    From what I can find, VMware says that STP isn’t necessary because vSwitches don’t bridge to each other. They don’t seem to address the chaos that can ensue in a Cisco world.

    1) Are my concerns substantiated?

    2) If so, is there a way to enable STP on a vSwitch?

  36. slowe’s avatar

    CoolRos,

    1. I wouldn’t be concerned about it. vSwitches can’t be linked to each other except by a VM.

    2. No. There is no STP support on a vSwitch. I presume this is because you can’t link vSwitches together, so there’s no need to worry about a bridging loop.

    I’m not sure what “chaos” may result in a Cisco world; I have many customers who are doing just exactly what you describe without any issues whatsoever. You won’t be able to do link aggregation/EtherChannel unless your switch supports cross-switch EtherChannel, but otherwise it should work just fine.

    Thanks for reading!

  37. soulshepard’s avatar

    Hi all,

    I have a Cisco 6509 switch (9 blades) and ESX 3.5.

    My network configuration is:

    Cisco: multiple trunked EtherChannels (X); gigabit ports in separate blades in the switch configured to be members of an EtherChannel with LACP and an allowed VLAN map (in short, configured as most examples indicate).

    ESX: created a vSwitch with the two physical NICs and, on that, port groups that have the VLAN tagged.

    My results are:

    Both NICs are up; when adding a Service Console I can ping it, so the VLAN mapping works. Shutting down one of the NICs works (I do see the teaming working).

    BUT I see the channel interface as down in my Cisco switch config. When issuing a “show int trunk” I do see both adapters as trunks, but not the port channel interface.

    What I am used to is that when using an (EtherChannel) team on Cisco, you should see this port channel interface as up; otherwise it would be the same as configuring no channel at all.

    My questions:

    - The only difference at the moment is that I do not have the src-dst-ip setup on the switch side. Is this causing the channel to be down?
    - How do other Cisco-alike configs show the EtherChannel status?
    - And trunking status: does it show the EtherChannel or the separate interfaces?
    - Erm, any pointers on what I am possibly doing wrong?

    Thanks in advance
    Soul

  38. soulshepard’s avatar

    Actually, I answered my own question.

    After reading

    http://www.vmware.com/files/pdf/virtual_networking_concepts.pdf

    I decided to ignore the “etherchannel unconditionally” advice and instead placed the EtherChannel in pure “mode on” (and no protocol definitions, of course). This brought my EtherChannels up. What I do find strange is that others with other settings do have an up EtherChannel… or possibly not? ;)

    PS: the document at the link also states that no LACP should be used. What I again find strange is that across the net I see people using LACP… strange.

  39. Greg’s avatar

    I skimmed through the replies so if I missed the answer I apologize.
    I was able to get trunking and teaming working on 3 interfaces. I configured my management VLAN as the native VLAN, and a handful of other VLANs used for VMs are allowed on the switch, which I can VLAN tag at the vSwitch without issue. I am, however, having problems PXE booting when I have all three of the ports connected to my switch. I have a helper entry which has the IP of my PXE server, and I am able to PXE boot fine on the native VLAN if I have only the PXE-capable NIC in the bond connected to the switch. When I have all three interfaces connected to the switch, the PXE boot fails with an ARP timeout. I also tried tagging the PXE adapter in the NIC configuration BIOS with another VLAN which also has a helper entry for the PXE server… this works identically to the native VLAN and only PXE boots when just the PXE-capable NIC is connected. I believe the team configuration is what is breaking my ability to PXE boot when all three of the ports are active. Has anyone come across a workaround for this? I am guessing I am going to have to sacrifice inbound load balancing and configure the 3 interfaces as trunks only?

  40. Brian D’s avatar

    My network guy and I looked over this earlier today in an effort to re-design my ESX environment. We ran into two issues. For the NIC teaming, port channels are required. However, you apparently can’t port channel across core switches (we’ve got two Cisco 4506s linked together.) This poses a problem for redundancy, since 4 NICs go to core 1 and the other 4 go to core 2 (so 8 NIC ports for the VMs – plus two others for SC and VMkernel.) The other problem is that the load-balance command you mentioned is a global command and would affect all of the ports, not just the ones that are port-channeled. When we tried to test this, Cisco did not recognize the command on that interface. So I assume maybe that you have your ESX boxes on their own switch?

    We also looked at the native VLAN options you discussed in your VMotion and VLAN security article. However in our case, you can already route between our VLANs so hopping wouldn’t be an issue (or so I’m told.) He made the point that you’d have to be inside the building to even get to our private VLANs, at which point, we’d have a much bigger problem :)

    Thoughts and comments are more than welcome. As I mentioned, I’m in the process of redesigning 8 different sites so that they’re all set up the same way. Thanks!

  41. slowe’s avatar

    Brian D,

    You’re correct; NIC teaming as it is described in this post does indeed require port channels (EtherChannel). That’s not to say, though, that you can’t use the built-in NIC teaming functionality within ESX, available by setting your vSwitch to “Route based on originating virtual port ID”. When you do that, of course, you will be bound to the NIC utilization guidelines that I describe elsewhere on the site.

    To be able to use port channels across switches, you need the super high-end stuff, like Cat 6509s with Sup720 modules and such. The Catalyst 3750 switches also do this via their “stacking” mechanism, as do some of the switch modules that fit into blade chassis from IBM and HP.

    Anyway, good luck on the redesign, and thanks for reading!

  42. Linus’s avatar

    Thanks Scott. Your post really helped me out in solving our problem of getting ESX to work with our Cisco 3750 switches.
    I know nothing about ESX servers and the ESX administrator told me he used LACP…

    //Linus Cisco Engineer

  43. Chris Neil’s avatar

    I’m building a new farm (cloud?) from scratch using ESX 3.5. I’ve configured EtherChannel on my 3750s with src-dst-ip balancing and set my vSwitch to use “Route based on IP hash”.

    Now everything seems to be working but the networking page is worrying me. I would expect all NICs in the team to see our entire LAN IP range.

    http://vitaredux.files.wordpress.com/2009/01/screenshot026.jpg

    Is this expected? If not, any ideas where to look to identify the problem?

  44. slowe’s avatar

    I wouldn’t be too terribly worried about that; I’ve seen that behavior before. As long as all the physical switch ports are configured as 802.1q VLAN trunks and all appropriate VLANs are allowed, you should be OK.

  45. Brian Sitton’s avatar

    In the article you said, “You can most certainly create NIC teams (or “bonds”) in ESX Server without any switch support whatsoever. Once those NIC teams have been created, you can configure load balancing and failover policies. However, those policies will affect outbound traffic only. ”

    If you don’t use EtherChannel, and you balance based on VPID (virtual port ID), then each VM will pass traffic out of a single NIC. On the physical switch side, it will see that MAC address on that single physical NIC, and inbound traffic will be switched back through the same port that traffic came out of, effectively giving all virtual machines an equal share of the aggregate bandwidth. Right?

    This works well in the many-to-many case. If you had one IP address talking to many clients, then a single NIC would be the bottleneck, right? But this would be rare in an ESX environment.

    My real question is: when does EtherChannel help? I assume it would in the “one to many” case I listed above, but would it help in any other cases? How would it help?

  46. Dennis’s avatar

    Thanks for the How-To, we’re going to give it a try tomorrow.

    One question though: I have the Sup720s and all the fancy stuff; can you give a basic rundown of how to trunk across the switches?

    I’m the server guy, not the network guy, but I like to know what I’m asking somebody to do.

    Thanks a bunch.
