ESX Server, NIC Teaming, and VLAN Trunking

Before we get into the details, allow me to give credit where credit is due. First, thanks to Dan Parsons of IT Obsession for an article that jump-started the process with notes on the Cisco IOS configuration. Next, credit goes to the VMTN Forums, especially this thread, in which some extremely useful information was exchanged. I would be remiss if I did not adequately credit these sources for the information that helped make this testing successful.

There are actually two different pieces described in this article. The first is NIC teaming, in which we logically bind together multiple physical NICs for increased throughput and increased fault tolerance. The second is VLAN trunking, in which we configure the physical switch to pass VLAN traffic directly to ESX Server, which will then distribute the traffic according to the port groups and VLAN IDs configured on the server. I wrote about ESX and VLAN trunking a long time ago and ran into some issues then; here I’ll describe how to work around the issues I ran into at that time.

So, let’s have a look at these two pieces. We’ll start with NIC teaming.

Configuring NIC Teaming

There’s a bit of confusion regarding NIC teaming in ESX Server and when switch support is required. You can most certainly create NIC teams (or “bonds”) in ESX Server without any switch support whatsoever. Once those NIC teams have been created, you can configure load balancing and failover policies. However, those policies will affect outbound traffic only. In order to control inbound traffic, we have to get the physical switches involved. This article is written from the perspective of using Cisco Catalyst IOS-based physical switches. (In my testing I used a Catalyst 3560.)

To create a NIC team that will work for both inbound and outbound traffic, we’ll create a port channel using the following commands:

s3(config)#int port-channel1
s3(config-if)#description NIC team for ESX server
s3(config-if)#int gi0/23
s3(config-if)#channel-group 1 mode on
s3(config-if)#int gi0/24
s3(config-if)#channel-group 1 mode on

This creates port-channel1 (you’d need to change this name if you already have port-channel1 defined, perhaps for switch-to-switch trunk aggregation) and assigns GigabitEthernet0/23 and GigabitEthernet0/24 to the team. Now, however, you need to ensure that the load balancing mechanisms used by the switch and by ESX Server match. To find out the switch’s current load balancing mechanism, use this command in enable mode:

show etherchannel load-balance

This will report the current load balancing algorithm in use by the switch. On my Catalyst 3560 running IOS 12.2(25), the default load balancing algorithm was set to “Source MAC Address”. On my ESX Server 3.0.1 server, the default load balancing mechanism was set to “Route based on the originating virtual port ID”. The result? The NIC team didn’t work at all—I couldn’t ping any of the VMs on the host, and the VMs couldn’t reach the rest of the physical network. It wasn’t until I matched up the switch/server load balancing algorithms that things started working.

To set the switch load-balancing algorithm, use one of the following commands in global configuration mode:

port-channel load-balance src-dst-ip (to enable IP-based load balancing)
port-channel load-balance src-mac (to enable MAC-based load balancing)

There are other options available, but these are the two that seem to match most closely to the ESX Server options. I was unable to make this work at all without switching the configuration to “src-dst-ip” on the switch side and “Route based on ip hash” on the ESX Server side. From what I’ve been able to gather, the “src-dst-ip” option gives you better utilization across the members of the NIC team than some of the other options. (Anyone care to contribute a URL that provides some definitive information on that statement?)
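To get a feel for how an IP-based policy spreads traffic, here is a minimal Python sketch. It assumes a simplified model in which the link is chosen by XORing the source and destination IPv4 addresses and taking the result modulo the number of links; the function name pick_uplink is hypothetical, and the real algorithms in IOS and ESX Server may use different bits of the addresses.

```python
import ipaddress


def pick_uplink(src_ip: str, dst_ip: str, num_links: int) -> int:
    """Return the index of the team member a flow would use.

    Simplified model (an assumption, not the exact vendor algorithm):
    XOR the two IPv4 addresses and take the result modulo the link count.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % num_links


# The same source/destination pair always maps to the same link.
print(pick_uplink("10.0.0.10", "10.0.0.20", 2))  # prints 0
# A different destination can land on the other link.
print(pick_uplink("10.0.0.10", "10.0.0.21", 2))  # prints 1
```

In this model the link is chosen per source/destination pair, not per packet, so a single conversation never uses more than one member of the team.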

Creating the NIC team on the ESX Server side is as simple as adding physical NICs to the vSwitch and setting the load balancing policy appropriately. At this point, the NIC team should be working.

Configuring VLAN Trunking

In my testing, I set up the NIC team and the VLAN trunk at the same time. When I ran into connectivity issues as a result of the mismatched load balancing policies, I thought they were VLAN-related issues, so I spent a fair amount of time troubleshooting the VLAN side of things. It turns out, of course, that it wasn’t the VLAN configuration at all. (In addition, one of the VMs that I was testing had some issues as well, and that contributed to my initial difficulties.)

To configure the VLAN trunking, use the following commands on the physical switch:

s3(config)#int port-channel1
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport trunk allowed vlan all
s3(config-if)#switchport mode trunk
s3(config-if)#switchport trunk native vlan 4094

This configures the NIC team (port-channel1, as created earlier) as an 802.1Q VLAN trunk. You then need to repeat this process for the member ports in the NIC team:

s3(config)#int gi0/23
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport trunk allowed vlan all
s3(config-if)#switchport mode trunk
s3(config-if)#switchport trunk native vlan 4094
s3(config-if)#int gi0/24
s3(config-if)#switchport trunk encapsulation dot1q
s3(config-if)#switchport trunk allowed vlan all
s3(config-if)#switchport mode trunk
s3(config-if)#switchport trunk native vlan 4094
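Because the same four trunk commands have to be entered identically on the port channel and on every member port, it can help to generate them rather than type them by hand. The following Python sketch is one way to do that; the helper name trunk_config is hypothetical, and it only builds the text of the commands shown above for you to paste into the switch.

```python
def trunk_config(interfaces, native_vlan=4094):
    """Build the IOS trunk commands from this article for each interface."""
    lines = []
    for name in interfaces:
        lines.append(f"interface {name}")
        lines.append(" switchport trunk encapsulation dot1q")
        lines.append(" switchport trunk allowed vlan all")
        lines.append(" switchport mode trunk")
        lines.append(f" switchport trunk native vlan {native_vlan}")
    return "\n".join(lines)


# Emit identical trunk settings for the port channel and both member ports.
print(trunk_config(["port-channel1", "gi0/23", "gi0/24"]))
```

Generating the text from a single template keeps the port channel and its members from drifting apart.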

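If you haven’t already created VLAN 4094, you’ll need to do that as well. The commands below create the VLAN interface without an IP address; depending on your switch and IOS version, you may also need to define the Layer 2 VLAN itself with the “vlan 4094” command in global configuration mode: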

s3(config)#int vlan 4094
s3(config-if)#no ip address

The “switchport trunk native vlan 4094” command is what fixes the problem I had last time I worked with ESX Server and VLAN trunks; namely, that most switches don’t tag traffic from the native VLAN across a VLAN trunk. By setting the native VLAN for the trunk to an otherwise unused VLAN instead of VLAN 1 (the default native VLAN), we essentially force the switch to tag all of the traffic we actually care about across the trunk. This allows ESX Server to handle VMs that are assigned to VLAN 1 (the former native VLAN) as well as to other VLANs.
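The effect of moving the native VLAN can be modeled in a few lines of Python. This is only a sketch of the 802.1Q rule described above (frames in the native VLAN leave the trunk untagged, everything else is tagged); the function name egress_frame is hypothetical.

```python
def egress_frame(vlan_id: int, native_vlan: int) -> str:
    """Describe how a frame in vlan_id leaves an 802.1Q trunk port."""
    if vlan_id == native_vlan:
        # Native VLAN traffic is sent without an 802.1Q tag.
        return "untagged"
    return f"tagged {vlan_id}"


# Default native VLAN 1: VLAN 1 traffic reaches ESX Server without a tag.
print(egress_frame(1, native_vlan=1))     # prints untagged
# Native VLAN moved to the unused VLAN 4094: VLAN 1 traffic is now tagged.
print(egress_frame(1, native_vlan=4094))  # prints tagged 1
```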

On the ESX Server side, we just need to edit the vSwitch and create a new port group. In the port group, specify the VLAN ID that matches the VLAN ID from the physical switch. After the new port group has been created, you can place your VMs on that new port group (VLAN) and, assuming you have a router somewhere to route between the VLANs, you should have full connectivity to your newly segregated virtual machines.
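As a rough mental model of what the vSwitch does with the trunked traffic, the Python sketch below matches an incoming frame's VLAN tag against the VLAN IDs of the port groups. It is a simplification under stated assumptions: the port group names are made up, a port group with VLAN ID 0 stands for "no VLAN ID" and receives untagged frames, and special cases are ignored.

```python
def receiving_port_groups(frame_vlan, port_groups):
    """Return the names of the port groups that would receive a frame.

    frame_vlan is the 802.1Q tag on the frame, or None for an untagged frame.
    port_groups maps a port group name to its configured VLAN ID (0 = none).
    """
    wanted = 0 if frame_vlan is None else frame_vlan
    return [name for name, vlan_id in port_groups.items() if vlan_id == wanted]


# Hypothetical port groups on one vSwitch.
port_groups = {"Production": 100, "Test": 101, "No VLAN ID": 0}

print(receiving_port_groups(100, port_groups))   # prints ['Production']
print(receiving_port_groups(None, port_groups))  # prints ['No VLAN ID']
print(receiving_port_groups(1, port_groups))     # prints []
```

The last call shows why a mismatch matters: a frame tagged for a VLAN that no port group is configured for is simply not delivered to any VM.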

Final Notes

I did encounter a couple of weird things during the setup of this configuration (I plan to leave the configuration in place for a while to uncover any other problems).

  • First, during troubleshooting, I deleted a port group on one vSwitch and then re-created it on another vSwitch. However, the virtual machine didn’t recognize the connection. There was no indication inside the VM that the connection wasn’t live; it just didn’t work. It wasn’t until I edited the VM, set the virtual NIC to a different port group, and then set it back again that it started working as expected. Lesson learned: don’t delete port groups.
  • Second, after creating a port group on a vSwitch with no VLAN ID, one of the other port groups on the same vSwitch appeared to “lose” its VLAN ID, at least as far as VirtualCenter was concerned. In other words, the VLAN ID was listed as “*” in VirtualCenter, even though a VLAN ID was indeed configured for that port group. The “esxcfg-vswitch -l” command (that’s a lowercase L) on the host still showed the assigned VLAN ID for that port group, however.
  • It was also the “esxcfg-vswitch” command that helped me troubleshoot the problem with the deleted/re-created port group described above. Even after re-creating the port group, esxcfg-vswitch still showed 0 used ports for that port group on that vSwitch, which told me that the virtual machine’s network connection was still somehow askew.

Hopefully this information will prove useful to those of you out there trying to set up NIC teaming and/or VLAN trunking in your environment. I would recommend taking this one step at a time, not all at once like I did; this will make it easier to troubleshoot problems as you progress through the configuration.

  1. slowe

    Dennis, I wish I could tell you, but I haven’t worked with the Sup720s yet, so I’m not familiar with the specific syntax. I’ll talk to one or more of the CCIEs at the office and see what they tell me.

  2. Dennis

    Perhaps it doesn’t matter; I can’t seem to get this to work on a single switch yet. I’m running ESXi 3.5 U3 and connected to a Cisco 6509.

    I created a vswitch and port group. Set the port group to vlan 4094 and lb to IP hash.

    This is what I have in the switch:

    interface GigabitEthernet9/42
    switchport
    switchport trunk encapsulation dot1q
    switchport trunk native vlan 4094
    switchport mode trunk
    switchport nonegotiate
    no ip address
    spanning-tree portfast trunk

    and

    Port        Mode         Encapsulation  Status        Native vlan
    Gi9/42      on           802.1q         trunking      4094

    Port        Vlans allowed on trunk
    Gi9/42      1-4094

    Port        Vlans allowed and active in management domain
    Gi9/42      1-3,32,64,92,96,128,160,190-197,199-200,224,235-245,248-252,299,998,4094

    Port        Vlans in spanning tree forwarding state and not pruned
    Gi9/42      1-3,32,64,92,96,128,160,190-197,199-200,224,235-245,248-252,299,998,4094

    Yes, I have it turned on for only one port right now; that’s all I can afford to remove from use for testing, but I assume it should still work.

    Ideas?

  3. Dennis

    Okay, we added a second nic/port to the etherchannel and vswitch. Then created a port group in it on each vlan we wanted to access. This seems to work. Of course, now the switch won’t let us add more than 2 ports without erroring on flow control settings.

    Time to call Cisco.

    I would still be amenable to advice configuring this across 2 switches if you find a way.

    Thanks,

    Dennis

  4. Paul Shannon

    Thanks Scott!

    I know this is an old post, but it just got me out of a little spot.

    Cheers

    Paul

  5. slowe

    I’m glad it helped, Paul!

  6. Rune Jon

    Some suggestions on the Cisco switch. If you want to tag all VLANs, including the native VLAN, there is no need to change the native VLAN to something other than VLAN 1; that still doesn’t tag the native VLAN. You just tell the switch to tag the native VLAN:
    Rack1SW1(config)#vlan dot1q tag native

    All VLANs are allowed on trunks by default.

  7. Craig

    I’m hoping you might be able to help with LAG configuration using a dual-nic’d ESXi server and a Dell 6248 switch. The options provided by the IOS are slightly different, and I don’t believe the LAG I’ve configured is using both physical NICs of my ESXi server.

    Any insight you might be able to provide would be helpful. Here is what my switch config looks like:

    interface ethernet 2/g25
    channel-group 1 mode auto
    switchport mode general
    no switchport general acceptable-frame-type tagged-only
    exit
    !
    interface ethernet 2/g26
    channel-group 1 mode auto
    switchport mode general
    no switchport general acceptable-frame-type tagged-only
    exit
    !
    interface port-channel 1
    switchport mode general
    no switchport general acceptable-frame-type tagged-only
    exit

    The LAG Hash options on the 6248 look a bit different than what you’ve outlined for Cisco switches. Options are as follows, my port-channel is configured using type 3 below:

    Hash Algorithm Type
    1 – Source MAC, VLAN, EtherType, source module and port Id
    2 – Destination MAC, VLAN, EtherType, source module and port Id
    3 – Source IP and source TCP/UDP port
    4 – Destination IP and destination TCP/UDP port
    5 – Source/Destination MAC, VLAN, EtherType, source MODID/port
    6 – Source/Destination IP and source/destination TCP/UDP port

    I have my vSwitch configured for “route based on IP hash”. This seemed to most closely match the options provided on the switch.

    I have connectivity to my guests, but I think I’m only going through a single physical NIC. Any thoughts?

  8. Craig

    I should add that if I change switchport mode to “trunk” (options are access, general, trunk) then I lose connectivity to my Guests.

  9. DWR

    I have 9 ESX servers that are connected via just regular 802.1q trunks, with no etherchannels configured. These trunks are configured on a Cisco 3750 switch stack.

    The internal VMware vSwitch is set up with 6 of the servers load balancing using Route based on ip hash, and the other 3 using Route based on port ID.

    Could this configuration be causing the following error on the 3750?
    %SW_MATM-4-MACFLAP_NOTIF: Host 0201.0000.0000 in vlan 2 is flapping between port Gi0/1 and port Gi0/2

  10. slowe

    DWR,

    You should not have your vSwitch configured to route based on IP hash without configuring your switch for link aggregation. It will definitely cause errors. Either change the load balancing configuration on the vSwitches or set up EtherChannel bundles on your physical switches.

  11. Joe Lawson

    Hi Scott,

    It appears that the IT Obsession site is no more but the Internet Archive still has the page:

    http://web.archive.org/web/20070329061934/http://www.itobsession.com/2005/12/20/nic-teaming-8023ad-in-vmware-esx-server/

  12. JiggySmack

    Scott,

    Very cool write-up. Your blog ROCKS! I wish you were my next door neighbor.

    Concerning this NIC team on ESX and setting it up on the 3750 switch: all is well. I thought, however, that even if a member of the NIC team fails or gets unplugged, everything still works, just minus the extra bandwidth. If I pull a NIC, then my Service Console gets kicked and I can’t get access until I reconnect the failed team member. I’m obviously wrong in my assumption. How can I get aggregate bandwidth, but fault tolerance as well?

    Thanks in advance.

  13. slowe

    JiggySmack, you should get both fault tolerance and aggregated bandwidth (depending upon the traffic flows) out of this configuration. If you are losing connectivity to a port group when you pull a single member of the NIC team, then something isn’t configured correctly.

  14. JiggySmack

    Well, my “test” for failover is removing one NIC from my vSwitch. Shouldn’t that suffice as a “downed NIC”? Maybe not, since I’m essentially breaking my NIC bonds apart. I have a 3-NIC etherchannel set up on my Cisco switch side and a 3-NIC “Team” set up on my vSwitch side. If I remove a NIC from my vSwitch, I’m breaking up my 3-to-3 bond, right, which essentially voids out my etherchannel... is that right? Am I on drugs? I’ve been trying to cut back.

    In my big picture, I’m really trying to get my Celerra NS20 iSCSI SAN connected and get the best possible performance but with fault tolerance. I’m not sure if you’ve used or seen that model of SAN. It’s got (4) 1 GB NICs on it. I’ve been reading this write-up and your other write-ups on iSCSI to try and get a grasp on just what the heck I’m doing over here. I did have a single LACP etherchannel iSCSI target set up, then I read it’s better to have multiple iSCSI targets, so I broke it up into 4 separate targets and put my LUNs balanced behind them. I had faster VMotions under the etherchannel.

    I’m thinking I should just work at Walmart and be a “greeter”. The only networking involved there is with people.

  15. slowe

    JiggySmack,

    You should be able to drop a link in the 3-link EtherChannel without causing the entire channel to go down. There must be something else going on there.

    As for the NS20 configuration, multiple iSCSI targets are definitely something you want to incorporate. There’s no way that ESX, the physical switches, or the NS20 can do anything with multiple links if you’ve only got a single iSCSI target.

    I don’t see how your VMotions could have been faster with an EtherChannel, since VMotion traffic is, by its very definition, one-to-one traffic. One-to-one traffic will never benefit from EtherChannel (which is one of the reasons behind using multiple iSCSI targets).

  16. JiggySmack

    Well, what I meant by faster was that I had an aggregate Etherchannel on the NS20 (4 ports = 400Mb/s as seen by the network devices). I broke up the Etherchannel into 4 separate NICs as 4 separate iSCSI targets. Each of my 4 LUNs went to its own target, which brought it back down to regular Fast Ethernet as far as ESX saw it.

    So what I did now is that I created a single LACP etherchannel device on the NS20, then created 4 virtual interfaces off of that LACP device.

    4 LUNs created, each going to its own iSCSI target. Each target has the 4 portal IPs assigned to it.

    I lose myself when I try and trace VMotion traffic in my head.

    What are your comments?

    Thanks so much for your previous input!!!!

  17. slowe

    JiggySmack,

    That 4 port EtherChannel on the NS20, however, would only use multiple links within the EtherChannel when there were multiple source IPs or multiple destination IPs. Since there was only a single iSCSI target, you would only see a benefit from multiple source IPs (i.e., multiple ESX hosts). Even then, each ESX host would still only get a maximum of 100Mbps for any given traffic flow. Remember: within a link aggregate, no single source-destination traffic flow will ever use more than 1 link within the aggregate.

    Hope this helps!

  18. nikon

    Hey, this is great, thanks very much Scott.

    With the manual etherchannel, is no negotiate needed?

  19. slowe

    No negotiation is needed or supported. VMware ESX will support static EtherChannel only.

  20. chris stand

    Those of you who are worrying/thinking about native vlans need to do a bit more reading, especially those of you who are having problems unless you add that line to your port channel.

    Native vlan tags packets that are not tagged already, and it will tag them with the native vlan number. OK, we all got that. (But not adding a tag does not mean that there is not a tag; it just happens to be “1”.)
    But if it is not working without that command, it means that you have some other configuration issues; maybe your ports don’t have the right “switchport access vlan” on them? Native vlan can be removed from 100% of switch port/trunk configurations IF you have the other parts set up correctly; it is kind of a vlan of last resort. And you should always shut down vlan 1 to help find these problems from the beginning, or you will end up doing silly things.
    You don’t want packets to move around accidentally; you want them to move around on purpose.

  21. Shai

    Hello,

    This is a great post and interesting discussion. I’m actually having a very different experience with ESX 4 and NFS datastores.
    I set up 2 NICs on a vSwitch with a VMkernel port configured to load balance using IP hash. I then configured 2 NFS datastores on 2 different IPs (making sure the hash function returns two different values). I use a Cisco 3750, and here is the interesting part:
    If I DO NOT configure etherchannel for the 2 ESX ports, I get aggregate bandwidth across both NICs (I can see that quite clearly in the guest OS and esxtop with 1600Mbps total).

    If I DO turn on etherchannel on the switch, I get HALF the bandwidth. Specifically, esxtop shows both NICs sending packets at 800Mbps, but only 1 of them is receiving and the total is only around 800Mbps.

    I am using src-dst-ip load balancing on the switch.

    In short, I get double the bandwidth out of 2 teamed NICs WITHOUT etherchannel than I do with etherchannel.

    Anyone care to explain?
    Thanks,
    Shai

  22. Olivier

    Simple: as already said, you must use ip hash instead of src-dst-ip with etherchannel.

    Did I understand your question correctly? Seems too simple.

  23. Neil

    Scott… excellent content on your site.

    I’m trying to implement this with a 2960G and I’m finding that the connectivity between esx and the pSwitch dies when I add the entry for ‘switchport trunk native vlan 222’ on the interfaces in the group.

    Are there any settings that are different with the 2960 as it is not a layer 3 switch?

    Thanks!

  24. slowe

    Neil,

    When you set the native VLAN, you must also change the VMware ESX port group configuration. The native VLAN will then become the untagged VLAN and whatever port group should receive the traffic will need to have the VLAN ID set to 0 (no VLAN ID). It sounds like the port group has a VLAN ID specified, so when you set the native VLAN then you lose connectivity.

    Hope this helps!

  25. ProInteg

    In summary based on my understanding through reading and testing, if you need more than 1Gbps throughput to a single data store running on a single iSCSI target (Openfiler, Left-Hand, NetApp, etc) then you need to build a 10GbE iSCSI network.

    Well I’m currently doing just that using 2 ProCurve 6500cl 6-port CX 10GbE switches with Intel 10GbE Dual CX port network adapters in my vSphere ESX 4.0 hosts. iSCSI is being used to access my HP Left-Hand SAN that has 1 10GbE port per node (Left-Hand SANs all come with 2 nodes for redundancy).

    How does it work? As soon as I finish building it I will let everyone know, but I’m thinking it should greatly simplify my networking to my backend iSCSI targets and allow me to use iSCSI and have the same if not better performance than using Fibre Channel. (I love both iSCSI and FC.)

    I have a couple of questions:

    Has anyone done this already who has insight into the pitfalls of 10GbE?

    What 10GbE switches did you use and what 10GbE network adapters?

    How did you configure ESX?

    How well is it working?

    Thanks in advance and again, I’ll post more after my testing.

    Adam
    Professional Integrations LLC
    Orange County, CA – US
    http://www.professionalintegrations.com

  26. Clayton McKenzie

    I read this, decided I understood it and then spent 2 weeks busting my poor, tiny, inexperienced little mind trying to figure out why my ESX boxes disappeared from the network each time I tried adding the switch interfaces to port-channel X using ‘channel-group X mode on’…

    The answer? Switchport state negotiation. The interfaces won’t become active in the port-channel if they’re set to negotiate first.

    ‘switchport nonegotiate’ on each of the interfaces solved it, and now everything’s good.

    Thanks Scott, this blog post really helped.

  27. Kevin Kramer

    I have a client that we need to configure 3 separate VLANs for. The one ESX server is housing our DHCP server. That server needs to provide DHCP to all the VLANs. We have added virtual NICs to the Win 2k8 server. I have assigned virtual machine groups to each VLAN and placed the NICs on the server in those machine groups. I have trunking set on the uplink port to the server; however, I do not get the correct IP address on each different VLAN, I only get IP addresses on my native VLAN. Is there any way to do this with just one NIC, or do I have to have two to assign VLAN trunking?

    Regards,

    Kevin Kramer
    Kotori Technologies, LLC.

  28. slowe

    Kevin,

    VLAN trunking is possible with or without NIC teaming. An easier way than adding multiple NICs to a single server might be to use DHCP forwarding on your Layer 3 router between these VLANs (assuming they are routable).

    Otherwise, just make sure your pNICs are attached to switch ports that are configured as VLAN trunks, configure the VLANs as allowed across the trunk (using the “switchport trunk allowed vlan” command), and then configure ESX/ESXi port groups with matching VLAN IDs. From there it should be easy as pie.

    Hope this helps!

  29. Steve

    We set up 4 NICs on the ESX side, and on the Cisco side we have 4 ports in a 3750G-24. The etherchannel shows down, but all traffic is passing fine.
    Our config is:
    interface Port-channel2
    description VMWare Vswitch SWESX03
    switchport trunk encapsulation dot1q
    switchport trunk allowed vlan 102,109,112
    switchport mode trunk
    switchport nonegotiate
    spanning-tree portfast trunk
    !
    interface GigabitEthernet1/0/1
    description SWESX03 NIC 2
    switchport trunk encapsulation dot1q
    switchport trunk allowed vlan 102,109,112
    switchport mode trunk
    switchport nonegotiate
    no mdix auto
    channel-group 2 mode desirable
    spanning-tree portfast trunk
    !
    interface GigabitEthernet1/0/3
    description SWESX03 NIC 3
    switchport trunk encapsulation dot1q
    switchport trunk allowed vlan 102,109,112
    switchport mode trunk
    switchport nonegotiate
    no mdix auto
    channel-group 2 mode desirable
    spanning-tree portfast trunk
    !

    Show Etherchannel:
    3750G-24#sh etherchannel 2 summary
    Flags:  D - down        P - bundled in port-channel
            I - stand-alone s - suspended
            H - Hot-standby (LACP only)
            R - Layer3      S - Layer2
            U - in use      f - failed to allocate aggregator

            M - not in use, minimum links not met
            u - unsuitable for bundling
            w - waiting to be aggregated
            d - default port

    Number of channel-groups in use: 2
    Number of aggregators: 2

    Group  Port-channel  Protocol    Ports
    ------+-------------+-----------+-----------------------------------------------
    2      Po2(SD)          -        Gi1/0/1(I)  Gi1/0/3(I)  Gi1/0/21(I)
                                     Gi1/0/22(I)

    What are we doing wrong? Or is this correct?

  30. Diego

    Quick question:

    I have 3 ESX host servers, each with 2 physical NICs.

    Would I create 3 port channel groups, one for each pair of NICs attached to a single server, or would I just create 1 port channel group for all 6 NICs?

  31. mike

    I have a question. We’re trying to reconfigure our VM environment. We have an IBM BC H series with 8 servers, 2 NICs per server, and 2 switches. We’ve decided to move forward with a total of 8 switches (Cisco 3012s) and 8 NICs per server. We have a Cisco 3750 stack strictly for our iSCSI SAN and 2 4506s for our data center cores. Currently we have our ESX hosts configured for Active/Standby on the uplinks with “originating VM Port-ID”. Now that you have a brief topology, my question is this: we would like to have 4 NICs dedicated to one vSwitch and the other 4 NICs dedicated to a second vSwitch. How can I “team” 4 NICs on the ESX hosts/blade servers, since all 4 connect to different Cisco 3012s within the BladeCenter? Configuration examples seem to only show an ESX host connected to one physical switch with multiple NICs running etherchannel, LACP, and IP hash. How can we get all 4 NICs to be Active/Active when they connect to 4 different 3012s in our BC chassis? Do you have a Cisco config example as well as a suggestion on the VMware side? Thanks so much!

  32. LB

    Unfortunately, the network on which my ESXi server resides does use VLAN 1 to pass VM traffic for the ESX servers. Not my doing, and I am unable to change it. I am in the process of implementing etherchannel on my three vSwitches. I have 2 vSwitches for VM traffic and the third vSwitch for IP storage. On vSwitch 0, which is on the native VLAN (VLAN 1), I have added 3 pNICs and will be setting up etherchannel. Since I cannot use untagged traffic over the trunk, will the ‘switchport trunk native vlan tag’ command help in this situation, or should I use a dummy VLAN (‘switchport trunk native vlan 4094’)? This would be applied to the port channel and the physical switch ports. Thanks!

  33. slowe

    LB,

    Use the dummy VLAN on the trunks and assign VLAN ID 1 to the port groups. That will work.

  34. Robert

    Ugh, I am still having problems with this setup.

    VLAN 10 Management
    VLAN 100 iSCSI / VMotion

    I can successfully create a 3Gb trunked (vlan10/vlan100) etherchannel group between my 3560 and the ESX 4.0 server on vswitch1 3x VMNICs. I then have a vswitch0 that is connected to a single trunked port on my 3560 with my console port that is setup for vlan10 1xVMNICs.

    Everything works, except I want a 4Gb trunk with everything on one switch.

    I deleted vswitch1 so I currently only have 1 vswitch with 1 vmnic.

    So, starting over, I thought I would be able to just add the single trunked port into the port-channel and then add the VMNICs in ESX after the fact, and everything would be happy. But I lose connectivity right when I add it into the port-channel, until I pull it back out. The port-channel and the single trunked port are set up identically. Testing showed that my port-channel in scenario 1 was working. Anyone have any ideas?

    interface Port-channel1
    switchport trunk encapsulation dot1q
    switchport trunk native vlan 4094
    switchport mode trunk
    switchport nonegotiate
    spanning-tree portfast trunk
    !
    interface GigabitEthernet0/19
    description …
    switchport trunk encapsulation dot1q
    switchport trunk native vlan 4094
    switchport mode trunk
    switchport nonegotiate
    !
    interface GigabitEthernet0/20
    description …
    switchport trunk encapsulation dot1q
    switchport trunk native vlan 4094
    switchport mode trunk
    switchport nonegotiate
    channel-group 1 mode on
    !
    interface GigabitEthernet0/21
    description …
    switchport trunk encapsulation dot1q
    switchport trunk native vlan 4094
    switchport mode trunk
    switchport nonegotiate
    channel-group 1 mode on
    !
    interface GigabitEthernet0/22
    description …
    switchport trunk encapsulation dot1q
    switchport trunk native vlan 4094
    switchport mode trunk
    switchport nonegotiate
    channel-group 1 mode on
    !

  35. Robert

    I fixed the issue by removing the port channel completely and then re-adding it, per a coworker’s suggestion, then adding all of the ports together. Not sure why this would cause any issues, but it’s resolved: 4Gb trunk.

  36. Alan

    I’ve configured 3 vlans on an ESXi host: 2 (public net) and 100, 101 (a couple of internal nets). 1 is Native, but unused, on the cisco 6509 it connects to.

    vlan 2 is working fine, neither 100 nor 101 are coming up. The cisco says “line protocol down”. Ideas? Debug output isn’t showing anything at all…

    Gi5/1 is the trunk port in question:

    interface GigabitEthernet5/1
    description trunk to vmx02
    switchport
    switchport trunk encapsulation dot1q
    switchport trunk allowed vlan 2,100-102
    switchport mode trunk

    core1#show int vlan101
    Vlan101 is up, line protocol is down
    Hardware is EtherSVI, address is 0011.5d6f.c00a (bia 0011.5d6f.c00a)
    Description: ServerNet NG
    Internet address is 172.20.19.253/22
    MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
    reliability 255/255, txload 1/255, rxload 1/255

    core1#sho int trunk

    Port        Mode         Encapsulation  Status        Native vlan
    Gi3/7       on           802.1q         trunking      1
    Gi3/16      on           802.1q         trunking      1
    Gi5/1       on           802.1q         trunking      1

    Port        Vlans allowed on trunk
    Gi3/7       1-4094
    Gi3/16      2,427,551,1998
    Gi5/1       2,100-102

    Port        Vlans allowed and active in management domain
    Gi3/7       1-2,5,201,296,300,303,362,364,401,409-410,427,551,561-562,601
    Gi3/16      2,427,551
    Gi5/1       2

    Port        Vlans in spanning tree forwarding state and not pruned
    Gi3/7       1-2,5,201,296,300,303,362,364,401,409-410,427,551,561-562,601
    Gi3/16      2,427,551
    Gi5/1       2

  37. Andrey

    Thanks Scott!

    I know this is an old post, but it just got me out of a little spot.

    One more thing: I created a port group with VLAN ID 4095 and set the adapter type in the host settings to E1000, which gave me the ability to assign VLAN IDs through the Intel 1000MT drivers installed in the guest OS.

    Cheers.

    Andrey

  38. Paul S’s avatar

    Our ESX vSwitches connect to two Brocade (Foundry) switches. When I configure the vSwitches, each in a different subnet, and look at the network properties, the only one that shows the right subnet range is the one on the same network as my management IP address. The other subnet ranges are incorrect. I talked to ESX and they said they pull that information from the Brocade switch, but I don’t see how, since the vSwitches are only layer 2, and if it were reading the ARP table a lot more IP addresses should be appearing. That said, everything still works, so I’m not sure the network information it is pulling is relevant.

  39. Bill’s avatar

    This blog has really helped me out. Thank you.

    Here is how we have used your information and the information from this post on VMTN ( http://communities.vmware.com/thread/254828 ). We have 4 on-board NICs (Broadcom) and a quad-port add-in card (Intel) in each host server. I saw a post about not mixing NIC families within the port channel (EtherChannel) group, so we have 2 Broadcom NICs teamed to switch 1 and 2 Intel NICs teamed to switch 2. This gives full redundancy in case of either a NIC or switch failure. We have elected not to stack our switches, so we cannot build the EtherChannel across the physical switches.

    All four NICs are assigned to one vSwitch. For each port group on our vSwitch, we have one pair of NICs (one team) active and the other pair standby. The load balancing on the Cisco Catalyst switches and on the vSwitch is set to IP hash.

    The problem is that a single port failure will cause just one standby NIC to become active, and testing shows that we then start having intermittent packet problems. The VMTN post hints that there is a way to have the entire EtherChannel fail if one port fails, but we have not been able to find it. Can you help?

  40. Kiya’s avatar

    Have an HP c3000 chassis with 2 Cisco 3020 blade switches. I have this same problem when testing. If I turn off one of the switch ports where there is NIC teaming involved (i.e., one with the service console), the ESX host is basically inaccessible. I looked around and found this article, and it made sense to me. Is this still applicable, and if so, why wouldn’t VMware or Cisco document it? I am pretty sure hundreds of people are doing NIC teaming.

    The Cisco 3020s then each connect to a Cisco 3750G core switch. Are there any other special configurations that need to be done on the core switches too?

    Furthermore, I’m still not getting why you created VLAN 4094. Do all switches now have to know about VLAN 4094? I kinda understand why you did it, but don’t understand how it will affect our environment, where the default VLAN is 1.

    Hope to hear from you before I start testing it in our environment.

  41. Michael Thompson’s avatar

    Sorry for bringing up an old post, but I had a question. Suppose you have two ESX servers with 4 Ethernet ports each and two switches that do not support cross-stack EtherChannel (in my case, I have two Cisco 3560Gs).

    Would it be viable to take the following approach:
    vSwitch0 is connected to vmnic0 & vmnic1. vmnic0 and vmnic1 are connected to the first physical switch in EtherChannel.
    vSwitch1 is connected to vmnic2 & vmnic3. vmnic2 and vmnic3 are connected to the second physical switch in EtherChannel.
    Each VM has two NICs, with one connection to each vSwitch. The two virtual NICs are configured as a team and have a single IP address.

    Or should I just blow it off, not use EtherChannel at all, and hook two NICs to each switch respectively using a default config?

  42. slowe’s avatar

    Michael, if it were me I’d toss the EtherChannel and just cross-connect the vSwitches to the physical switches. That will give you redundancy in the event of a NIC or physical switch failure. “Keep it as simple as possible, and no simpler.”
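
    For what it’s worth, here is a rough sketch of what the switch side of that might look like, assuming a Catalyst 3560 like the ones mentioned above. The interface number and VLAN list are placeholders; the point is that each ESX-facing port is a plain 802.1Q trunk with no channel-group statement, and the vSwitch stays on its default “route based on the originating virtual port ID” policy:

    ! Repeat on each ESX-facing port on both physical switches
    interface GigabitEthernet0/1
     description ESX uplink (no EtherChannel)
     switchport trunk encapsulation dot1q
     switchport trunk allowed vlan 10,20
     switchport mode trunk
     spanning-tree portfast trunk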

    Good luck!

  43. Paul’s avatar

    I just attempted the “EtherChannel on two different switches” configuration. It works some of the time. I was having intermittent connection issues with one of the VMs, and that led me to this post. I had to move one of the EtherChannel groups to standby so they would not both be active, and this solved the issue. I am going to move it back to the way it was and set up each port as a trunk port to get the VLANs over that I need and satisfy the fault tolerance I require. This post was very helpful, and I wish I had found it before I changed the configuration.

  44. John’s avatar

    Mmm… I cannot see the need for setting up EtherChannel on the physical switch.

    If the ESX servers are load balancing the outbound traffic based on MAC address or IP or whatever, then the MAC address table entries for these VMs in the physical switch will naturally be spread across all the ports trunking with the ESX hosts.

    Why is EtherChannel any better than this?

  45. Michael’s avatar

    Thanks for this article, you saved me time piecing it all together!

  46. Shane’s avatar

    Hi Scott,
    A quick question:
    Is it possible to do link aggregation on the server without having to configure the SW EtherChannel?
    Thanks!

  47. slowe’s avatar

    Shane, generally speaking you’ll require configuration on the switch. In fact, it’s possible to do a form of link aggregation without any server knowledge (I’m thinking virtual port channels) but even that requires switch configuration.

    Specific to the VMware use case, there are a variety of ways you can use multiple uplinks that do not require switch configuration, but link aggregation is not one of them. Link aggregation requires switch configuration.
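
    As a rough sketch of the switch side, assuming a Catalyst running IOS and two uplinks on placeholder ports gi0/23 and gi0/24: the ESX “route based on IP hash” policy on a standard vSwitch expects a static EtherChannel, so the ports are forced on rather than negotiated with PAgP or LACP.

    ! Hash on source and destination IP to match the vSwitch policy
    port-channel load-balance src-dst-ip
    !
    interface range GigabitEthernet0/23 - 24
     switchport trunk encapsulation dot1q
     switchport mode trunk
     ! "mode on" is static; no PAgP or LACP negotiation
     channel-group 1 mode on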

    Hope this helps!

  48. renato’s avatar

    cisco catalyst and juniper vlans
    Cisco Catalyst 3550, ScreenOS, IOS Fundamentals, Juniper NetScreen-25
    I have two VLANs configured on a Juniper NetScreen (a ScreenOS device).
    I have connected a Catalyst to the NIC I have configured the VLANs on. I need an example configuration to load into the Cisco so that I can ping both Juniper VLAN addresses from the Cisco.
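
    A minimal sketch of the Catalyst side, assuming the NetScreen tags both VLANs with 802.1Q on that interface. The port, the VLAN IDs (10 and 20), and the addresses are all placeholders and need to match what is configured on the NetScreen sub-interfaces:

    vlan 10
    vlan 20
    !
    interface FastEthernet0/1
     description trunk to NetScreen-25
     switchport trunk encapsulation dot1q
     switchport trunk allowed vlan 10,20
     switchport mode trunk
    !
    ! One SVI per VLAN, each in the same subnet as the NetScreen address for that VLAN
    interface Vlan10
     ip address 192.0.2.2 255.255.255.128
     no shutdown
    interface Vlan20
     ip address 192.0.2.130 255.255.255.128
     no shutdown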
