VLAN

In part 1 of this series, I covered some networking basics (OSI and DoD models; layer 2 vs. layer 3; bridging, switching, and routing; Spanning Tree Protocol; and ARP and flooding). In this part, I’m going to build on those basic concepts to introduce a few more fundamental building blocks. As the series progresses, I’ll continue to build on concepts and technologies introduced in earlier sections.

In this part, we’ll discuss:

  • VLANs
  • VLAN Trunks
  • Link Aggregation

I’ll start with VLANs.

VLANs

Recall that in part 1 I defined a broadcast domain as all the devices and hosts that are connected by bridges or switches (which operate at layer 2, the Data Link layer, of the OSI model). This means that, by default, every host plugged into a switch will automatically be part of the same broadcast domain. But what if you wanted to have multiple broadcast domains on the same switch? A virtual LAN, aka a VLAN, gives you exactly that ability. Defined in its simplest terms, a VLAN is a layer 2 broadcast domain. A switch that supports VLANs can be “subdivided” into multiple broadcast domains. In order for traffic to pass from one VLAN to another VLAN (i.e., from one broadcast domain into another broadcast domain), a layer 3 router is needed.

VLANs work by leveraging a 12-bit identifier in the Ethernet frame format (see here for more details). This 12-bit identifier, the VLAN ID (often loosely referred to as the VLAN tag), allows for up to 4,094 VLANs (2^12 = 4,096, with the all-zeroes value [0x000] and the all-ones value [0xFFF] reserved). Not all switches from all vendors support using all 4,094 VLANs; some switches might support fewer than that.

As a side note, it’s worth pointing out that 802.1Q doesn’t actually encapsulate the original Ethernet frame. In other words, it doesn’t wrap its own headers before and after the original frame; rather, it inserts a 4-byte 802.1Q tag (which carries the 12-bit VLAN ID) into the frame immediately after the source MAC address field.
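To make that a bit more concrete, here’s a rough sketch of a frame with and without the tag (field widths not to scale; the 4-byte tag consists of the 16-bit TPID value 0x8100 plus a 16-bit TCI that holds the priority bits, the DEI bit, and the 12-bit VLAN ID):

Untagged: | Dest MAC | Src MAC | EtherType | Payload | FCS |
Tagged:   | Dest MAC | Src MAC | 802.1Q tag (0x8100 + PCP/DEI/VLAN ID) | EtherType | Payload | FCS |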

Although adding support for VLANs to a switch allows that switch to support multiple broadcast domains, it doesn’t change certain layer 2 switching behaviors. In particular, the switch still floods frames destined for unknown MAC addresses (as described in part 1) while it learns which MAC addresses are associated with which switch ports. The key difference is that flooding will only occur within a VLAN, since flooding is limited to a broadcast domain.

Finally, it’s important to note that VLANs themselves are strictly layer 2 constructs, but because a layer 3 router is needed to pass traffic between them, VLANs are often associated with IP subnets—meaning that each VLAN is considered a unique IP network. It’s probably for this reason that some people use the terms “IP subnet” and “VLAN” interchangeably when, as you can see, they aren’t necessarily the same thing. (You could, for example, have two different IP subnets running on the same broadcast domain.) Generally speaking, though, it’s very common for each VLAN to represent a unique IP subnet.

VLAN Trunks

OK, so a VLAN allows me to subdivide a switch into multiple broadcast domains. What if I have multiple switches? The IEEE 802.1Q standard that defines VLANs also defines a way for two switches to multiplex VLANs over a single link. (Without this functionality, the only way to connect multiple switches together while still preserving VLANs would be to have separate physical connections for each VLAN—clearly not very efficient.) A connection that carries multiple VLANs is often referred to as a VLAN trunk (although some might say that this is a very Cisco-centric term). Note that VLAN trunks don’t handle switch configuration; if you want two switches connected by a VLAN trunk to “share” the same VLANs, you’ll still need to configure the VLANs on each switch.

Allow me to use a practical example to help illustrate this point. Assume you have SwitchA that has two VLANs defined (VLAN 100 and VLAN 200). Further assume that you have SwitchB with two VLANs, VLAN 200 and VLAN 300, defined. No layer 3 router is present. A VLAN trunk connects SwitchA and SwitchB. Here’s the resulting connectivity matrix:

  1. HostA in VLAN 100 attached to SwitchA won’t be able to communicate with any hosts attached to SwitchB (there is a VLAN trunk between the switches, but SwitchB doesn’t have VLAN 100 defined and VLAN 200 is a separate broadcast domain).
  2. HostA in VLAN 200 attached to SwitchA will be able to communicate with hosts in VLAN 200 attached to SwitchB (the VLAN is defined on both switches and there is a VLAN trunk between the switches).
  3. HostB in VLAN 200 attached to SwitchB will be able to communicate with hosts in VLAN 200 attached to SwitchA (the VLAN is defined on both switches and there is a VLAN trunk between the switches).
  4. HostC in VLAN 300 attached to SwitchB will not be able to communicate with any hosts attached to SwitchA (there is a VLAN trunk between the switches, but SwitchA doesn’t have VLAN 300 defined, and the VLANs that are defined are separate broadcast domains).

(There are a few “gotchas” to this relatively simple example, like native/untagged VLANs and VLAN pruning across the trunk, but this discussion should suffice for now.)
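To put the SwitchA/SwitchB example into configuration terms, here’s a minimal sketch of what SwitchA might look like using Cisco-style syntax. The interface name is just an example, the exact commands vary by platform and software version, and SwitchB would be configured similarly but with VLANs 200 and 300:

vlan 100
vlan 200
!
! uplink to SwitchB (interface name is an example only)
interface GigabitEthernet0/48
  switchport mode trunk
  switchport trunk allowed vlan 100,200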

The key takeaway is that in order for VLANs to span multiple physical switches, you need a) matching VLAN configurations across the physical switches, and b) a VLAN trunk connecting the physical switches. Without both of these, connectivity generally won’t work. When both of these conditions are present, the VLAN (and therefore the broadcast domain) is extended across both switches. It should be fairly obvious at this point that by extending the VLAN across both switches, you’ve subjected ports in that VLAN on both switches to broadcasts and flooding (because they are in the same broadcast domain).

(Side note: STP also had to be modified to account for VLANs and VLAN trunks to ensure that bridging loops were not created within each VLAN. This gave rise to a group of STP versions, such as PVST and the like.)

Link Aggregation

In part 1 I introduced Spanning Tree Protocol (STP) (more information here), which the networking experts created to eliminate switching loops (which were a Bad Thing because there is no TTL-like mechanism at layer 2 to remove “old” or “stale” traffic from the network). While preventing switching loops is a Good Thing, one by-product of STP is that it blocks redundant switch-to-switch connections in the same broadcast domain. So what could you do if you needed more bandwidth between switches than a single link could offer?

This is where link aggregation comes into play. Link aggregation allows switches to combine multiple physical links into one logical link, allowing for a greater amount of aggregate bandwidth between the switches. For example, I might configure 4 individual 1 Gbps physical links into a single logical link, giving me a total of 4 Gbps aggregate throughput between the two switches. The key phrase here is aggregate throughput; it’s really important to understand that any single traffic flow will only be able to use a single physical link.

Let me explain why. Let’s suppose that you have 4 links combined into a single logical link between two switches (I’ll be creative and call them SwitchA and SwitchB). When traffic enters SwitchA bound for SwitchB, SwitchA needs to decide how to place the traffic on the individual members of the logical link. Switches generally support a variety of load balancing mechanisms, including source-destination MAC addresses, source-destination IP addresses, and sometimes even layer 4 (TCP/UDP) source-destination ports. For the purposes of this discussion, I’ll assume that the load balancing is being done based on source and destination IP address. The traffic enters SwitchA, originating from HostA and bound for HostB attached to SwitchB. There is a link aggregate configured between SwitchA and SwitchB, so SwitchA performs a hash of the source and destination IP addresses. The result of that hash—which would be a number ranging from 0 to 3, since there are 4 links in the link aggregate—tells SwitchA which physical link to use in the logical link. SwitchA places the traffic on the physical link and off it goes. When HostB replies, SwitchB has to go through the same process, so it calculates a hash based on the source and destination IP addresses, determines which link in the aggregate to use, and off it goes.
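Conceptually, the link selection boils down to something like this (the exact inputs and hash algorithm vary by switch and configuration):

link index = hash(source IP, destination IP) mod (number of links in the aggregate)

So if hash(HostA’s IP, HostB’s IP) mod 4 works out to 2, then every frame SwitchA sends for that flow goes out on link 2.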

There are a couple of key takeaways from this:

  1. Both switches need to be configured to use link aggregation, and the configuration has to match on each end. If SwitchA was configured to use 2 links but SwitchB was configured to use 4 links, we’d very likely run into issues.
  2. Any given traffic flow between two endpoints will always be limited to the bandwidth of a single link within the aggregate. Why? Look back to the explanation above: the switches will create a hash based on some variables (MAC addresses, IP addresses, source and destination TCP/UDP ports) to determine which link to use. As long as all the variables are the same—which they would be for a single traffic flow—the hash is deterministic and always returns the same result. Therefore, the same link is always used for that particular traffic flow.

Takeaway #2 has significant implications, which I’ll explore in more detail in future posts. (Here’s an example; see slides 5 through 7 in particular.)

There are a number of protocols involved in link aggregation; the most common protocol is Link Aggregation Control Protocol (LACP), which is designed to enable switches to negotiate the use of link aggregation between them (when used properly, this helps address takeaway #1). You may also see references to “EtherChannel”; this is a Cisco-specific term that is also used to describe the use of link aggregation.
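To give you a rough idea of what this looks like in practice (and this is only a sketch, since the exact syntax varies by platform and software version), enabling LACP on a pair of Cisco IOS switch ports might look like this:

! interface names and channel-group number are examples only
interface GigabitEthernet1/1
  channel-group 1 mode active
interface GigabitEthernet1/2
  channel-group 1 mode active

The “mode active” keyword is what tells the switch to actively negotiate the aggregate using LACP; the switch on the other end of the links would need a matching configuration.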

That’s probably enough information for now. If you have any questions about any of the information I’ve presented here, please feel free to speak up in the comments. I welcome all courteous comments, so join in the discussion!


It’s interesting to me how much the idea of a VLAN has invaded the consciousness of data center IT professionals. Data center folks primarily tasked with managing compute workloads are nearly as familiar with VLANs as their colleagues primarily tasked with managing network connectivity. However, as networking undergoes a transformation at the hands of SDN, NFV, and network virtualization, what will happen to the VLAN?

The ubiquity of the VLAN is due, I think, to the fact that it serves as a reasonable “common ground” for both compute-focused and networking-focused professionals. Need a logical container for new workloads? We can use a VLAN for that. VMware is partially to blame for this—vSphere (and its predecessors) made it incredibly easy to use VLANs as a way of logically “partitioning” compute workloads on the same host. (To be fair, it was really the only tool available to accomplish the task at the time.)

Normally, finding a “common ground” is a good thing…until that common ground starts to get pushed beyond where it was intended to be used. I think this is where VLANs are now—getting pushed beyond where they were intended to be used, and that strain is the source of some discord between the compute-centric teams and the networking-centric teams:

  • The compute-centric teams need a logical container by which they can group workloads that might run potentially anywhere in the data center.
  • The networking-centric teams, though, recognize the challenges inherent in taking a VLAN (i.e., a single broadcast domain) and stretching it across a bunch of different top-of-rack (ToR) switches so that it’s available to any compute host. (The irony here is that we’re using a tool designed to break up broadcast domains—VLANs—and building large broadcast domains with them.)

What’s needed here is a separation, or layering, of functions. The compute-centric world needs some sort of logical identifier that can be used to group or identify traffic. However, this logical identifier needs to be separate and distinct from the identifier/tag/mark that the networking-centric folks need to use to build scalable networks with reasonably-sized broadcast domains. This is, in my view, one of the core functions of a network encapsulation protocol like STT, VXLAN, or NVGRE when used in the context of a network virtualization solution. Note that qualification—a network encapsulation protocol alone is like the suspension on a car: useful, but only in the context of the complete package. When a network encapsulation protocol is used in the context of the complete package (a network virtualization solution), the network encapsulation protocol can supply the logical identifier the compute-centric teams need (this would be VXLAN’s 24-bit VNI, or STT’s 64-bit Context ID) while simultaneously allowing the network-centric teams to use VLANs as they see fit for the best performance and resiliency of the network.

By the way, while using a network encapsulation protocol in the context of a network virtualization solution provides the decoupling that is so important to innovation, it’s important to note that decoupling does not equal loss of visibility. But that’s a topic for another post…

The end result is that compute-centric teams can create logical groupings that are not dependent on the configuration of the underlying network. Need a new logical grouping for some new line-of-business application? No problem, create it and start turning up workloads. Meanwhile, the networking-centric teams are free to design the network for optimal performance, resiliency, and cost-effectiveness without having to take the compute-centric team’s logical groups into consideration. Need to use a routed L3 architecture between pods/racks/ToR switches using VLANs? No problem—build it the way it needs to be built, and the network virtualization solution will handle creating the compute-centric logical grouping.

At least, that’s my thought. All my Thinking Out Loud posts are just that—me thinking out loud, providing a springboard to more conversation. What do you think? Am I mistaken? Feel free to speak up in the comments below. Courteous comments (with vendor disclosures where applicable, please) are always welcome.


In other articles, I’ve talked about how to use Open vSwitch (OVS) with VLANs to place guest domains (VMs) into a particular VLAN. In this article, I want to show you how to pass VLAN tags all the way into the guest domain—in other words, how to do VLAN trunking to guest domains using OVS. To do this, we’re going to leverage the OVS-libvirt integration I referenced in this post on using VLANs with OVS and libvirt.

For this to work, you must have an operating system in the guest domain that is capable of recognizing and using the VLAN tags that are being passed to it by OVS. In this article, I’ll use Ubuntu 12.04 as the OS in the guest domain. For other operating systems, the commands and/or procedures to configure VLAN support appropriately will probably differ, so keep that in mind.

There are two parts to making this work:

  1. Configuring OVS (manually or via libvirt) to pass VLAN tags to the guest OS.
  2. Configuring the guest domain’s installed OS to take advantage of the VLAN tags being passed up by OVS.

Let’s look at each of these parts separately. We’ll start with configuring OVS, either manually or via libvirt, to pass the VLAN tags up to the guest domain.

Configuring OVS to Pass VLAN Tags to the Guest Domain

There are two ways to accomplish this: you can do it manually, or you can do it via OVS integration with recent builds of libvirt.

Manually Configuring OVS

To configure OVS manually, you would need to:

  1. Identify which vnet port you want to configure for VLAN trunking
  2. Configure the vnet port to trunk the VLANs.

To identify which vnet port needs to be modified, you’ll want to figure out the guest domain interface(s) that is/are connected to the vnet port. You can do this by using this command (substitute the desired vnet port name in place of vnet0 in the following command):

ovs-vsctl list interface vnet0

In the output of the command, look for the external_ids line; it will contain an entry called “attached-mac”, and that represents the MAC address of the interface in the guest domain OS attached to this particular vnet port. You can compare this to the output of ip addr list or ifconfig -a in Ubuntu to find a matching MAC address in the guest domain. Correlating the two values allows you to determine which guest domain is attached to which vnet port, and then you can modify the correct vnet port appropriately.

You’d modify the vnet port using this command:

ovs-vsctl set port vnet0 trunks=20,30,40

You’d want to substitute the appropriate values for vnet0 and the VLAN IDs that you want passed up to the guest domain. Once you’ve made the change, you can verify the changes using this command (replacing vnet0 with the correct port):

ovs-vsctl list port vnet0

Note that if you want the guest domain to receive both untagged (native VLAN) traffic as well as tagged (trunked) traffic, there is an additional setting you must set:

ovs-vsctl set port vnet0 vlan_mode=native-untagged

With this setting in place, the OS installed into the guest domain will be able to communicate over the untagged (native) VLAN as well as using VLAN tags.

Using libvirt Integration

If the manual method of configuring OVS seems a bit cumbersome, using the libvirt integration makes it much easier.

Basically, you’ll follow the configuration outlined in this blog post to create a libvirt network that corresponds to an OVS bridge. Here’s an example of the XML code to accomplish this task:
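(Treat the following as a representative sketch rather than an exact recipe; the network name, bridge name, and VLAN IDs are placeholders you’d adjust for your environment.)

<network>
  <!-- network and bridge names below are examples only -->
  <name>ovs-network</name>
  <forward mode='bridge'/>
  <bridge name='ovsbr0'/>
  <virtualport type='openvswitch'/>
  <portgroup name='vlan-all'>
    <vlan trunk='yes'>
      <tag id='20'/>
      <tag id='30'/>
      <tag id='40'/>
    </vlan>
  </portgroup>
</network>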

Of particular interest for what we’re trying to accomplish here is the very last section, the portgroup named “vlan-all.” Note that for this specific portgroup, the vlan element has a property that specifies it is a trunk, and then there are multiple tag elements that list each VLAN ID that will be trunked across this network into the guest domain.

Using this configuration, when we create the guest domain and specify that it is attached to the network named “vlan-all” (matching the portgroup in the libvirt network definition), libvirt will automatically configure OVS appropriately (it will set the trunks value for that domain’s OVS port).

However, it will not configure the OVS port to allow untagged traffic as well (only tagged traffic will be passed). If you want the guest domain to receive untagged traffic also, you must set the vlan_mode value manually as outlined above.

Configuring the Guest Domain to Use VLAN Tags

Once you’ve followed the steps outlined above and have OVS configured correctly, then you’re ready to configure the OS in the guest domain. Keep in mind that I’m using Ubuntu 12.04 in this post, but you’re welcome to use any operating system that supports VLAN tags.

Assuming that eth0 is the interface in the guest domain that is receiving tagged traffic from OVS, this snippet in /etc/network/interfaces will create and configure a VLAN interface:
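(The VLAN ID and addressing shown below are examples only; adjust them to match your environment.)

# example VLAN interface; VLAN ID and IP settings are placeholders
auto eth0.20
iface eth0.20 inet static
    address 192.168.20.10
    netmask 255.255.255.0
    vlan-raw-device eth0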

Technically, the “vlan-raw-device” line isn’t needed because the parent device name is embedded in the name of the VLAN interface, but I like to include it for completeness and ease of debugging. (Your mileage may vary, of course.) The number appended to eth0 (for example, eth0.20) corresponds to the VLAN ID (VLAN 20, in this case) being passed up by OVS.

You can repeat this configuration for multiple VLAN interfaces.

Use Case

I’ll have to admit that I can’t immediately think of any particularly compelling use cases for this sort of configuration. At first glance, you might think that it would be useful in situations where you need logical separation, but I think there are better ways than VLANs to accomplish this task (and those ways are probably simpler). I primarily set out to document this in order to better solidify my knowledge of how OVS works and is configured. However, I’d be happy to hear from others on what they think might be interesting or useful use cases for this sort of configuration. Feel free to add your thoughts in the comments below. Courteous comments are always welcome!


In previous posts, I’ve shown you how to use Open vSwitch (OVS) with VLANs through fake bridges, as well as how to wrap a libvirt virtual network around an OVS fake bridge. Both of these techniques are acceptable for configuring VLANs with OVS, but in this post I want to talk about using VLANs with OVS via a greater level of libvirt integration. This has been talked about elsewhere, but I wasn’t able to make it work until libvirt 1.0.0 was released. (Update: I was able to make it work with an earlier version. See here.)

First, let’s recap what we know so far. If you know the port to which a particular domain (guest VM) is connected, you can configure that particular port as a VLAN trunk like this:

ovs-vsctl set port <port name> trunks=10,11,12

This configuration would pass the VLAN tags for VLANs 10, 11, and 12 all the way up to the domain, where—assuming the OS installed in the domain has VLAN support—you could configure network connectivity appropriately. (I hope to have a blog post up on this soon.)

Along the same lines, if you know the port to which a particular domain is connected, you could configure that port as a VLAN access port with a command like this:

ovs-vsctl set port <port name> tag=15

This command makes the domain a member of VLAN 15, much like the use of the switchport access vlan 15 command on a Cisco switch. (I probably don’t need to state that this isn’t the only way—see the other OVS/VLAN related posts above for more techniques to put a domain into a particular VLAN.)

These commands work perfectly well, but there’s a problem here—the VLAN information isn’t contained in the domain configuration. Instead, it’s in OVS, attached to an ephemeral port—meaning that when the domain is shut down, the port and the associated configuration disappear. What I’m going to show you in this post is how to use VLANs with OVS in conjunction with libvirt for persistent VLAN configurations.

This document was written using Ubuntu 12.04.1 LTS and Open vSwitch 1.4.0 (installed straight from the Precise Pangolin repositories using apt-get). Libvirt was compiled manually (see instructions here). Due to some bugs, it appears you need at least version 1.0.0 of libvirt. Although the Silicon Loons article I referenced earlier mentions an earlier version of libvirt, I was not able to make it work until the 1.0.0 release. Your mileage may vary, of course—I freely admit that I might have been doing something wrong in my earlier testing.

To make VLANs work with OVS and libvirt, two things are necessary:

  1. First, you must define a libvirt virtual network that contains the necessary portgroup definitions.
  2. Second, you must include the portgroup reference to the virtual network in the domain (guest VM) configuration.

Let’s look at each of these steps.

Creating the Virtual Network

The easiest way I’ve found to create the virtual network is to craft the network XML definition manually, then import it into libvirt using virsh net-define.

Here’s some sample XML code (I’ll break down the relevant parts after the code):
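(The listing below is a representative sketch; the network name is a placeholder, and you’d adjust the bridge name, portgroup names, and VLAN IDs to match your environment.)

<network>
  <!-- network and bridge names below are examples only -->
  <name>ovs-network</name>
  <forward mode='bridge'/>
  <bridge name='ovsbr0'/>
  <virtualport type='openvswitch'/>
  <portgroup name='vlan-01'/>
  <portgroup name='vlan-10'>
    <vlan>
      <tag id='10'/>
    </vlan>
  </portgroup>
  <portgroup name='vlan-20'>
    <vlan>
      <tag id='20'/>
    </vlan>
  </portgroup>
  <portgroup name='vlan-all'>
    <vlan trunk='yes'>
      <tag id='10'/>
      <tag id='20'/>
    </vlan>
  </portgroup>
</network>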

The key takeaways from this snippet of XML are:

  1. First, note that the OVS bridge is specified as the target bridge in the <bridge name=...> element. You’ll need to edit this as necessary to match your specific OVS configuration. For example, in my configuration, ovsbr0 refers to a bridge that handles only host management traffic.
  2. Second, note the <portgroup name=...> element. This is where the “magic” happens. Note that you can have no VLAN element (as in the vlan-01 portgroup), a VLAN tag (as in the vlan-10 or vlan-20 portgroups), or a set of VLAN tags to pass as a trunk (as in the vlan-all portgroup).

Once you’ve got the network definition in the libvirt XML format, you can import that configuration with virsh net-define <XML filename>. (Prepend this command with sudo if necessary.)

After it is imported, use virsh net-start <network name> to start the libvirt virtual network. If you make changes to the virtual network, such as adding or removing portgroups, be sure to restart the virtual network using virsh net-destroy <network name> followed by virsh net-start <network name>.
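For example, assuming the network is named ovs-network and its definition is saved in a file called ovs-network.xml (both names are just examples), the sequence would look something like this; the optional net-autostart step simply ensures the network comes back after a host reboot:

# file name and network name below are examples only
virsh net-define ovs-network.xml
virsh net-start ovs-network
virsh net-autostart ovs-network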

Now that the virtual network is defined, we can move on to creating the domain configuration.

Configuring the Domain Networking

As far as I’m aware, to include the appropriate network definitions in the domain XML configuration, you’ll have to edit the domain XML manually.

Here’s the relevant snippet of domain XML configuration:
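(The MAC address, network name, and portgroup name shown here are placeholders; the network and portgroup values need to match whatever you defined in the virtual network.)

<interface type='network'>
  <!-- MAC address and names are examples only -->
  <mac address='52:54:00:aa:bb:cc'/>
  <source network='ovs-network' portgroup='vlan-10'/>
  <model type='virtio'/>
</interface>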

You’ll likely have more configuration entries in your domain configuration, but the important one is the <source network=...> element, where you’ll specify both the name of the network you created as well as the name of the portgroup to which this domain should be attached.

With this configuration in place, when you start the domain, it will pass the necessary parameters to OVS to apply the desired VLAN configuration automatically. In other words, once you define the desired configuration in the domain XML, it’s maintained persistently inside the domain XML (instead of on the ephemeral port in OVS), re-applied anytime the domain is started.

Verifying the Configuration

Once the appropriate configuration is in place, you can see the OVS configuration created by libvirt when a domain is started by simply using ovs-vsctl show or—for more detailed information—ovs-vsctl list port <port name>. Of particular interest when using ovs-vsctl list port <port name> are the tag and/or trunks values; these are where VLAN configurations are applied.

Summary

In this post, I’ve shown you how to create libvirt virtual networks that integrate with OVS to provide persistent VLAN configurations for domains connected to an OVS bridge. The key benefit that arises from this configuration is that you no longer need to know to which OVS port a given domain is connected. Because the VLAN configuration is stored with the domain and applied to OVS automatically when the domain is started, you can be assured that a domain will always be attached to the correct VLAN when it starts.

As usual, I encourage your feedback on this article. If you have questions, thoughts, corrections, or clarifications, you are invited to speak up in the comments below.


In this post, I’ll be sharing with you information on how to do link aggregation (with LACP) and VLAN trunking on a Brocade FastIron switch with both VMware vSphere as well as Open vSwitch (OVS).

Throughout the majority of my career, my networking experience has been centered around Cisco’s products. You can easily tell that from looking at the articles I’ve published here. However, I’ve recently had the opportunity to spend some time working with a Brocade FastIron switch (a 48-port FastIron Edge X, specifically, running software version 7.2), and I wanted to write up what I’ve learned about how to do link aggregation and VLAN trunking in conjunction with both VMware vSphere and OVS.

Configuring Link Aggregation with LACP

When researching how to do link aggregation on a Brocade FastIron, I came across a number of different articles suggesting two different ways to configure link aggregation (ultimately I followed the information provided in this article and this article). I think that the difference in the configuration comes down to whether or not you want to use LACP, but I’m not completely sure. (If you’re a Brocade/Foundry expert, feel free to weigh in.)

To configure a link aggregate using LACP, use these commands:

  • You’ll use the link-aggregate configure key <unique key> command to identify which interfaces may participate in a given link aggregate. The key must range from 10000 to 65535, and has to be unique for each group of interfaces in a link aggregate bundle. The switch uses the key to identify which ports may be a part of a link aggregate.
  • You’ll use the link-aggregate active command to indicate the use of LACP for link aggregation configuration and negotiation.

For example, if you wanted to configure port 10 on a switch for link aggregation, the commands would look something like this:

switch(config)# interface ethernet 10
switch(config-if-e1000-10)# link-aggregate configure key 10000
switch(config-if-e1000-10)# link-aggregate active

For each additional port that should also belong in this same link aggregate bundle, you would repeat these commands and use the same key value. As I mentioned earlier, the identical key value is what tells the switch which interfaces belong in the same bundle.

Configuring the virtualization host is pretty straightforward from here:

  • If you are using vSphere, note that you’ll need to use vSphere 5.1 and a vSphere Distributed Switch (VDS) in order to use LACP. In order to use LACP, you’ll need to set your teaming policy to “Route based on IP hash,” and then you must enable LACP in the settings for the uplink group. Chris Wahl has a nice write-up here, including a list of the caveats of using LACP with vSphere. VMware also has a VMware KB article on the topic.
  • If you are using OVS, you can follow the instructions I provided in this post on link aggregation and LACP with Open vSwitch.

Configuring VLANs

Although VLANs are (generally) interoperable between different switch vendors due to the broad adoption of the 802.1Q standard, the details of each vendor’s implementation of VLANs are just different enough to make life difficult. In this particular case, since I learned Cisco’s VLAN implementation first, Brocade’s VLAN implementation on the FastIron Edge X series switches seemed rather odd. I’m sure that had I learned Brocade’s implementation first, Cisco’s version would seem odd.

In any case, the commands you use for VLANs are as follows:

  • To create a VLAN, use the vlan <VLAN identifier> command.
  • To add a port to that VLAN, so that traffic across that port is tagged for the specified VLAN, use the tagged ethernet <interface> command.
  • To add a range of ports to a VLAN, use the tagged ethernet <start interface> to <end interface> command.
  • To allow a port to carry both untagged (native, or default VLAN) and tagged traffic, you must use the dual-mode command. Otherwise, a port carries only untagged or tagged traffic. (This was a key difference in Brocade’s VLAN implementation that threw me off at first.)

So, if you wanted to create VLAN 30, add Ethernet interface 24 to that VLAN, and configure the interface to carry both tagged and untagged traffic, the commands would look something like this:

switch(config)# vlan 30 name server-subnet
switch(config-vlan-30)# tagged ethernet 24
switch(config-vlan-30)# interface ethernet 24
switch(config-if-e1000-24)# dual-mode

Once the VLANs are created and the interfaces are added to the VLANs, configuring the virtualization hosts is—once again—pretty straightforward:
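  • If you are using vSphere, you’ll generally just specify the appropriate VLAN ID on each port group (or use the VLAN trunking options if you want the tags passed all the way up to the guests); the exact steps depend on whether you’re using a standard vSwitch or a VDS.
  • If you are using OVS, you can set the tag or trunks values on the relevant ports (or use fake bridges), as described in my other OVS/VLAN posts.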

I hope this information is useful to someone. If anyone has any corrections or clarifications, I encourage you to add your information to the comments on this post.


In other posts, I’ve (briefly) talked about how to configure Open vSwitch (OVS) for use with VLANs. If you know the port to which a guest is connected, you can configure that particular port as a VLAN trunk like this:

ovs-vsctl set port <port name> trunks=10,11,12

This configuration would pass the VLAN tags for VLANs 10, 11, and 12 all the way up to the guest, where—assuming the OS installed in the guest has VLAN support—you could configure network connectivity appropriately.

Alternately, if you know the port to which a particular guest is connected, you could configure that port as a VLAN access port with a command like this:

ovs-vsctl set port <port name> tag=15

This command makes the guest a member of VLAN 15, much like the use of the switchport access vlan 15 command on a Cisco switch.

These commands are all well and good, but there are a couple of problems here:

  1. First, you must know which port corresponds to which guest domain. Thus far, I have been unable to determine what set of commands will help me (you) establish the mapping between ports/interfaces and guest domains. (If you know how, please speak up in the comments!)
  2. Second, even if you do know which port corresponds to which guest, the settings are ephemeral. That is, when you power off the guest, the port—and its associated configuration—goes away. You’d then need to reapply the configuration to the port when you start the guest domain again.

Clearly, this is not ideal. Fortunately, there is a workaround—a couple of them, actually. One workaround is to add OVS and VLAN support to libvirt (something that is actually mentioned here). This is a great idea—but it doesn’t work just yet. On some systems (I use Ubuntu 12.04.1 LTS with libvirt 0.10.2), the libvirt-OVS-VLAN integration causes an error. A patch has been submitted to libvirt to fix this problem (great work Kyle!), but it hasn’t (yet) made it into a release.

Without OVS/VLAN support in libvirt, we have only one other workaround: OVS fake bridges. OVS fake bridges look and act like a bridge, but are tied to a particular VLAN ID. (I haven’t seen/found a way to use a fake bridge to do VLAN trunking up to a guest domain. Anyone else know how?) In this post, I’m going to show you how to use OVS fake bridges to add VLAN support to your OVS environment.

This post was written using Ubuntu 12.04.1 LTS with Open vSwitch 1.4.0 (straight out of the Precise Pangolin repositories). Please note that the commands might be slightly different on other distributions or with other versions of OVS.

To create a fake bridge, you’ll use a modified form of the ovs-vsctl add-br command. The command is so subtly different that I missed it quite a few times when reading through the documentation for ovs-vsctl. Here’s the command you’ll need:

ovs-vsctl add-br <fake bridge> <parent bridge> <VLAN>

Let’s look at an example. Suppose you had an existing OVS bridge named ovsbr0, and you wanted to add a fake bridge to support VLAN 100. You would use this command:

ovs-vsctl add-br vlan100 ovsbr0 100

When you create (or edit) a guest domain, you’ll assign it to the new fake bridge (named vlan100 in this example). So, looking at the libvirt XML code for a guest domain, it might look something like this:

<interface type='bridge'>
  <mac address='11:22:33:aa:bb:cc'/>
  <source bridge='vlan100'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>

Naturally, you could also create a libvirt virtual network that corresponds to the fake bridge as well. (I’ll likely post a separate article around that idea.)

Then, when you powered up the guest domain and ran ovs-vsctl show, you’d see something like this:

Bridge "ovsbr0"
    Port "bond0"
        Interface "eth1"
        Interface "eth2"
    Port "ovsbr0"
        Interface "ovsbr0"
            type: internal
    Port "vnet0"
        tag: 100
        Interface "vnet0"
    Port "vlan100"
        tag: 100
        Interface "vlan100"
            type: internal

Note that the guest domain’s port/interface are automatically given the fake bridge’s VLAN tag, without any further interaction/configuration required by the user or administrator. Much better!

Assuming you’re using fake bridges (and if you’re using OVS and VLANs, I’m not sure how you wouldn’t be), there are a couple other commands you might find helpful as well:

  • The ovs-vsctl br-to-vlan command will print the VLAN ID for a given bridge. If the bridge is a real bridge, the command returns 0; if the bridge is a fake bridge, it returns the VLAN ID.
  • The ovs-vsctl br-to-parent command returns the parent bridge for a given fake bridge. If the specified bridge is a real bridge, it returns the real bridge.

Using fake bridges with link aggregation is also possible, as you can see from the snippet of OVS configuration above. More information on OVS with link aggregation is available here.

I hope this information is useful. OVS is a really powerful piece of software, and I’m enjoying learning more about it and how to use it. If anyone has any additional information, please feel free to speak up in the comments. All courteous comments are welcome!


Back in early March I was invited to speak at the South Florida VMUG, and I gave this presentation on vSphere networking challenges and solutions. The idea behind the presentation was to give attendees some visibility into IEEE and IETF efforts at creating new network technologies and protocols. I’m posting it here just in case someone might find it useful or helpful.

As always, your questions, corrections, or clarifications are welcome in the comments below.


If you don’t work in the networking space on a regular basis, it’s easy to overlook interoperability issues between equipment from different vendors. After all, a VLAN trunk is a VLAN trunk is a VLAN trunk, right? Alas, the answer is not always quite so simple.

While standards such as 802.1Q promise easy interoperability, the devil is usually in the details. I ran into just this sort of problem today in the lab. Specifically, I had a need to trunk VLANs between a pair of Cisco Nexus 5010 switches and a pair of Dell PowerConnect 6248 switches.

The configuration on the Cisco Nexus side was pretty straightforward (note that this was one of the first eight ports on the switch and was throttled down to 1Gbps):

interface ethernet 1/1
  switchport mode trunk
  speed 1000

I tried replicating this same setup on the PowerConnect switches using Dell’s switchport mode trunk command. Unfortunately, it didn’t work. I kept digging around, but regardless of the configuration the show interfaces switchport ethernet command would always show that VLAN 1 was marked as tagged. This clearly wouldn’t work; since VLAN 1 was defined as the native VLAN on the Cisco Nexus switch, it would be untagged on the Cisco side. I needed the VLAN to be untagged on the Dell side as well.

Quite by accident, I stumbled upon a slightly different command on the Dell: the switchport mode general command. The help text in the interface indicated that this was the correct configuration for 802.1Q operation.

I modified the Dell PowerConnect to use this configuration:

interface ethernet 1/g47
switchport mode general
switchport general allowed vlan add 1 untagged
switchport general allowed vlan add 900 tagged

With this configuration, the show interfaces switchport ethernet command now reported that VLAN 1 was untagged, as shown in the screenshot below.

A quick connectivity test showed that traffic was now flowing properly between the Dell PowerConnect 6248 switches and the Cisco Nexus 5010 switches. Problem resolved! Key takeaway: use switchport mode general for interoperability with other vendors’ switches.

If you have any experience with Dell PowerConnect switches and have additional information to share, please post it in the comments below.


About a month ago I posted an article titled The vMotion Reality. In that article, I challenged the assertion that vMotion was a myth, not widely implemented in data centers today due to networking complexity and networking limitations.

In a recent comment to that article, a reader again mentions networking limitations—in this instance, the realistic limits of a large Layer 2 broadcast domain—as a gating factor for vMotion and limiting its “practical use” in today’s data centers. Based on the original article’s assertions and the arguments found in this comment, I’m beginning to believe that a lot of people have a basic misunderstanding of the Layer 2 adjacency requirements for vMotion and what that really means. In this post, I’m going to discuss vMotion’s Layer 2 requirements and how they interact (and don’t interact) with other VMware networking functionality. My hope is that this article will provide a greater understanding of this topic.

First, I’d like to use a diagram to help explain what we’re discussing here. In the diagram below, there is a single ESX/ESXi host with a management interface, a vMotion interface, and a couple of network interfaces dedicated to virtual machine (VM) traffic.

VLAN Behaviors with VMware ESX/ESXi

As you can see from this highly-simplified diagram, there are three basic types of network interfaces that ESX/ESXi uses: management interfaces, VMkernel interfaces, and VM networking interfaces. Each of them is configured separately and, in many configurations, quite differently.

For example, the management interface (in VMware ESX this is the Service Console interface, in VMware ESXi this is a separate instance of a VMkernel interface) is typically configured to connect to an access port with access to a single VLAN. In Cisco switch configurations, the interface configuration would look something like this:

interface GigabitEthernet 1/1
  switchport mode access
  switchport access vlan 27
  spanning-tree portfast

There might be other commands present, but these are the basic recommended settings. This makes the management interfaces on an ESX/ESXi host look and behave exactly like the network interfaces on pretty much every other system in the data center. Although VMware HA/DRS cluster design considerations might lead you toward certain Layer 2 boundaries, there’s nothing stopping you from putting management interfaces from different VMware ESX/ESXi hosts in different VLANs (and therefore different IP subnets) and still conducting vMotion operations between them.

So, if it’s not the management interfaces that are gating the practicality of vMotion in today’s data centers, it must be the VM networking interfaces, right? Not exactly. Although the voices speaking up against vMotion in this online discussion often cite VLAN and IP addressing concerns as the primary problem with vMotion—stating, rightfully so, that the IP address of a migrated virtual machine cannot and should not change—these individuals are overlooking the recommended configuration for VM networking interfaces in a VMware ESX/ESXi environment.

Physical network interfaces in a VMware ESX/ESXi host that will be used for VM networking traffic are most commonly configured to act as 802.1Q VLAN trunks. For example, the Cisco switch configuration for a port connected to a network interface being used for VM networking traffic would look something like this:

interface GigabitEthernet1/10
  switchport trunk encapsulation dot1q
  switchport mode trunk
  switchport trunk allowed vlan 1-499
  switchport trunk native vlan 499
  spanning-tree portfast trunk

As before, some commands might be missing or some additional commands might be present, but this gives you the basic configuration. In this configuration, the VM networking interfaces are actually capable of supporting multiple VLANs simultaneously. (More information on the configuration of VLANs with VMware ESX/ESXi can be found in this aging but still very applicable article.)

In practical use what this means is that any given VMware ESX/ESXi host could have VMs running on it that exist on a completely different and separate VLAN than the management interface. In fact, any given VMware ESX/ESXi host might have multiple VMs running across multiple separate and distinct VLANs. And as long as the ESX/ESXi hosts are properly configured to support the same VLANs, you can easily vMotion VMs between ESX/ESXi hosts when those VMs are in different VLANs. So, the net effect is that ESX/ESXi can easily support multiple VLANs at the same time for VM networking traffic, and this means that vMotion’s practical use isn’t gated by some inherent limitation of the VM networking interfaces themselves.

Where in the world, then, does this Layer 2 adjacency thing keep coming from? If it’s not management interfaces (it isn’t), and it’s not VM networking interfaces (it isn’t), then what is it? It’s the VMkernel interface that is configured to support vMotion. In order to vMotion a VM from one ESX/ESXi host to another, each host’s vMotion-enabled VMkernel interface has to be in the same IP subnet (i.e., in the same Layer 2 VLAN or broadcast domain). Going back to Cisco switch configurations, a vMotion-enabled VMkernel port will be configured very much like a management interface:

interface GigabitEthernet 1/5
  switchport mode access
  switchport access vlan 37
  spanning-tree portfast

This means that the vMotion-enabled VMkernel port is just an access port (no 802.1Q trunking) in a single VLAN.

Is this really a limitation on the practical use of vMotion? Hardly. In my initial rebuttal of the claims against vMotion, I pointed out that because it is only the VMkernel interface that must share Layer 2 adjacency, all this really means is that a vMotion domain is limited to the number of vMotion-enabled VMkernel interfaces you can put into a single Layer 2 VLAN/broadcast domain. VMs are unaffected, as you saw earlier, as long as the VM networking interfaces are correctly configured to support the appropriate VLAN tags, and the management interfaces do not, generally speaking, have any significant impact.

Factor in the consolidation ratio, as I did in my earlier post, and you’ll see that it’s possible to support very large numbers of VMs spread across as many different VLANs as you want with a small number of ESX/ESXi hosts. Consider 200 ESX/ESXi hosts—whose vMotion-enabled VMkernel interfaces would have to share a broadcast domain—with a consolidation ratio of 12:1. That’s 2,400 VMs that can be supported across as many different VLANs simultaneously as we care to configure. Do you want 400 of those VMs in one VLAN while 300 are in a different VLAN? No problem, you only need to configure the physical switches and the virtual switches appropriately.

Let’s summarize the key points here:

  1. Only the vMotion-enabled VMkernel interface needs to have Layer 2 adjacency within a data center.
  2. Neither management interfaces nor VM networking interfaces require Layer 2 adjacency.
  3. VM networking interfaces are easily capable of supporting multiple VLANs at the same time.
  4. When the VM networking interfaces on multiple ESX/ESXi hosts are identically configured, VMs on different VLANs can be migrated with no network disruption even though the ESX/ESXi hosts themselves might all be in the same VLAN.
  5. Consolidation ratios multiply the number of VMs that can be supported with Layer 2-adjacent vMotion interfaces.

Based on this information, I hope it’s clearer that vMotion is, in fact, quite practical for today’s data centers. Yes, there are design considerations that come into play, especially when it comes to long-distance vMotion (which requires stretched VLANs between multiple sites). However, this is still an emerging use case; the far broader use case is within a single data center. Within a single data center, all the key points I provided to you apply, and there is no practical restriction to leveraging vMotion.

If you still disagree (or even if you agree!), feel free to speak up in the comments. Courteous and professional comments (with full disclosure, where applicable) are always welcome.


The quite popular GigaOm site recently published an article titled “The VMotion Myth”, in which the author, Alex Benik, argues that vMotion and the live migration of virtual machines (VMs) are more myth than operational reality.

In his article, Benik states that the ability to dynamically move workloads around inside a single data center or between two data centers is, in his words, “far from an operational reality today”. While I’ll grant you that inter-data center vMotion isn’t the norm, vMotion within a data center is very much an operational reality of today. I believe that Benik’s article is based on some incorrect information and incomplete viewpoints, and I’d like to clear things up a bit.

To quote from Benik’s article:

Currently, moving a VM from a one physical machine to another has two important constraints. First, both machines must share the same storage back end, typically a Fibre Channel/iSCSI SAN or network-attached storage. Second, the physical machines must reside in the same VLAN or subnet. This means that inside a single data center, one can only move a VM across a relatively small number of physical machines. Not exactly what the marketing guys would have you believe.

Benik is correct in that shared storage is required in order to use vMotion. The use of shared storage in VMware environments is quite common for this very reason. Note also that shared storage is required in order to leverage other VMware-specific functionality such as VMware High Availability (HA), VMware Distributed Resource Scheduler (DRS), and VMware Fault Tolerance (FT).

On the second point—regarding VLAN requirements—Benik is not entirely correct. You see, by their very nature VMware ESX/ESXi hosts tend to have multiple network interfaces. The majority of these network interfaces, in particular those that carry the traffic to and from VMs, are what are called trunk ports. Trunk ports are special network interfaces that carry multiple VLANs at the same time. Because these network interfaces carry multiple VLANs simultaneously, VMware ESX/ESXi hosts can support VMs on multiple VLANs simultaneously. (For a more exhaustive discussion of VLANs with VMware ESX/ESXi, see this collection of networking articles I’ve written.)

But that’s only part of the picture. It is true that there is one specific type of interface on a VMware ESX/ESXi host that must reside in the same VLAN on all the hosts between which live migration is required. This port is a VMkernel interface that has been enabled for vMotion. Unless those vMotion-enabled VMkernel interfaces share connectivity on the same VLAN, vMotion cannot take place. I suppose it is upon this that Benik bases his statement.

However, the requirement that a group of physical systems share a common VLAN on a single interface is hardly a limitation. First, let’s assume that you are using an 8:1 consolidation ratio; this is an extremely conservative ratio considering that most servers have 8 cores (thus resulting in a 1:1 VM-to-core ratio). Still, assuming an 8:1 consolidation ratio and a maximum of 253 VMware ESX/ESXi hosts on a single Class C IP subnet, that’s enough physical systems to drive over 2,000 VMs (2,024 VMs, to be precise). And remember that these 2,024 VMs can be distributed across any number of VLANs themselves because the network interfaces that carry their traffic are capable of supporting multiple VLANs simultaneously. This means that the networking group can continue to use VLANs for breaking up broadcast domains and segmenting traffic, just like they do in the physical world.

Bump the consolidation ratio up to 16:1 (still only 2:1 VM-to-core ratio on an 8 core server) and the number of VMs that can be supported with a single VLAN for vMotion is over 4,000. How is this a limitation again? Somehow I think that we’d need to pay more attention to CPU, RAM, and storage usage before being concerned about the fact that vMotion requires Layer 2 connectivity between hosts. And this isn’t even taking into consideration that organizations might want multiple vMotion domains!

Clearly, the “limitation” that a single interface on each physical host share a common VLAN isn’t a limitation at all. Yes, it is a design consideration to keep in mind. But I would hardly consider it a limitation and I definitely don’t think that it’s preventing customers from using vMotion in their data centers today. No, Mr. Benik, there’s no vMotion myth—only vMotion reality.

