Networking

You are currently browsing articles tagged Networking.

Good friend (and now colleague) John Troyer asked me via e-mail if I had a “reading list” for networking and network virtualization. I didn’t have a formally-prepared “reading list,” so I just shared with John some of the resources that I personally use. John’s response was: “You should blog this.” So, here it is—a list of resources for networking and network virtualization.

  • Brent Salisbury, aka @networkstatic, has plenty of good material on his site.
  • Jason Edelman (@jedelman8 on Twitter) is also worth reading.
  • Network Heresy is a great site. In the spirit of transparency, note that Network Heresy is maintained by folks from VMware/Nicira, such as Dr. Bruce Davie, Martin Casado, and others.
  • Brad Hedlund (@bradhedlund if you’d like to follow him on Twitter) is another great networking and network virtualization resource. Since Brad works with me at VMware, the focus of most of his material is on network virtualization.
  • Ivan Pepelnjak (@ioshints) is a legend among networking folk—I’ve learned a ton from reading his stuff.
  • Greg Ferro (@etherealmind) also shares tons of great networking material on his site.
  • I’ve also found some useful material at Nick Buraglio’s site.
  • Finally, we’re hoping to make VMware’s new Network Virtualization blog a useful destination for networking and network virtualization material. There are also some additional sites on the blogroll there.

Naturally, any list like this is incomplete (my apologies if you weren’t included, I’m sure it was an oversight not an intentional omission!), so I invite readers to share resources they’ve found useful in the comments below. Courteous comments are always welcome, although I do ask for disclosure of vendor affiliations where applicable.

Tags: ,

Welcome to Technology Short Take #33, the latest in my irregularly-published series of articles discussing various data center technology-related links, articles, rants, thoughts, and questions. I hope that you find something useful here. Enjoy!

Networking

  • Tom Nolle asks the question, “Is virtualization reality even more elusive than virtual reality?” It’s a good read; the key thing that I took away from it was that SDN, NFV, and related efforts are great, but what we really need is something that can pull all these together in a way that customers (and providers) reap the benefits.
  • What happens when multiple VXLAN logical networks are mapped to the same multicast group? Venky explains it in this post. Venky also has a great write-up on how the VTEP (VXLAN Tunnel End Point) learns and creates the forwarding table.
  • This post by Ranga Maddipudi shows you how to use App Firewall in conjunction with VXLAN logical networks.
  • Jason Edelman is on a roll with a couple of great blog posts. First up, Jason goes off on a rant about network virtualization, briefly hitting topics like the relationship between overlays and hardware, the role of hardware in network virtualization, the changing roles of data center professionals, and whether overlays are the next logical step in the evolution of the network. I particularly enjoyed the snippet from the post by Bill Koss. Next, Jason dives a bit deeper on the relationship between network overlays and hardware, and shares his thoughts on where it does—and doesn’t—make sense to have hardware terminating overlay tunnels.
  • Another post by Tom Nolle explores the relationship—complicated at times—between SDN, NFV, and the cloud. Given that we define the cloud (sorry to steal your phrase, Joe) as elastic, pooled resources with self-service functionality and ubiquitous access, I can see where Tom states that to discuss SDN or NFV without discussing cloud is silly. On the flip side, though, I have to believe that it’s possible for organizations to make a gradual shift in their computing architectures and processes, so one almost has to discuss these various components individually, because to tie them all together makes it almost impossible. Thoughts?
  • If you haven’t already introduced yourself to VXLAN (one of several draft protocols used as an overlay protocol), Cisco Inferno has a reasonable write-up.
  • I know Steve Jin, and he’s a really smart guy. I must disagree with some of his statements regarding what software-defined networking is and is not and where it fits, written back in April. I talked before about the difference between network virtualization and SDN, so no need to mention that again. Also, the two key flaws that Steve identifies—single point of failure and scalability—aren’t flaws with SDN/network virtualization, but rather flaws in an implementation of said technologies, IMHO.

Servers/Hardware

  • Correction from the last Technology Short Take—I incorrectly stated that the HP Moonshot offerings were ARM-based, and therefore wouldn’t support vSphere. I was wrong. The servers (right now, at least) are running Intel Atom S1260 CPUs, which are x86-based and do offer features like Intel VT-x. Thanks to all who pointed this out, and my apologies for the error!
  • I missed this on the #vBrownBag series: designing HP Virtual Connect for vSphere 5.x.

Security

Cloud Computing/Cloud Management

  • Hyper-V as hypervisor with OpenStack Compute? Sure, see here.
  • Cody Bunch, who has been focusing quite a bit on OpenStack recently, has a nice write-up on using Razor and Chef to automate an OpenStack build. Part 1 is here; part 2 is here. Good stuff—keep it up, Cody!
  • I’ve mentioned in some of my OpenStack presentations (see SpeakerDeck or Slideshare) that a great place to start if you’re just getting started is DevStack. Here, Brent Salisbury has a nice write-up on using DevStack to install OpenStack Grizzly.

Operating Systems/Applications

  • Boxen, a tool created by GitHub to manage their OS X Mountain Lion laptops for developers, looks interesting. Might be a useful tool for other environments, too.
  • If you use TextMate2 (I switched to BBEdit a little while ago after being a long-time TextMate user), you might enjoy this quick post by Colin McNamara on Puppet syntax highlighting using TextMate2.

Storage

  • Anyone have more information on Jeda Networks? They’ve been mentioned a couple of times on GigaOm (here and here), but I haven’t seen anything concrete yet. Hey, Stephen Foskett, if you’re reading: get Jeda Networks to the next Tech Field Day.
  • Tim Patterson shares some code from Luc Dekens that helps check VMFS version and block sizes using PowerCLI. This could come in quite handy in making sure you know how your datastores are configured, especially if you are in the midst of a migration or have inherited an environment from someone else.

Virtualization

  • Interested in using SAML and Horizon Workspace with vCloud Director? Tom Fojta shows you how.
  • If you aren’t using vSphere Host Profiles, this write-up on the VMware SMB blog might convince you why you should and show you how to get started.
  • Michael Webster tackles the question: is now the best time to upgrade to vSphere 5.1? Read the full post to see what Michael has to say about it.
  • Duncan points out an easy error to make when working with vSphere HA heartbeat datastores in this post. Key takeaway: sometimes the fix is a lot simpler than we might think at first. (I know I’m guilty of making things more complicated than they need to be at times. Aren’t we all?)
  • Jon Benedict (aka “Captain KVM”) shares a script he wrote to help provide high availability for RHEV-M.
  • Chris Wahl has a nice write-up on using log shipping to protect your vCenter database. It’s a bit over a year old (surprised I missed it until now), and—as Chris points out—log shipping doesn’t protect the database (primary and secondary copies) against corruption. However, it’s better than nothing (which I suspect it what far too many people are using).

Other

  • If you aspire to be a writer—whether that be a blogger, author, journalist, or other—you might find this article on using the DASH method for writing to be helpful. The six tips at the end of the article are especially helpful, I think.

Time to wrap this up for now; the rest will have to wait until the next Technology Short Take. Until then, feel free to share your thoughts, questions, or rants in the comments below. Courteous comments are always welcome!

Tags: , , , , , , , , , , , , , ,

In an earlier post, I provided an introduction to policy routing as implemented in recent versions of Ubuntu Linux (and possibly other distributions as well), and I promised that in a future post I would provide a practical application of its usage. This post looks at that practical application: how—and why—you would use Linux policy routing in an environment running OVS and a Linux hypervisor (I’ll assume KVM for the purposes of this post).

Before I get into the “how,” let’s first discuss the “why.” Let’s assume that you have a KVM+OVS environment and are leveraging tunnels (GRE or other) for some guest domain traffic. Recall from my post on traffic patterns with Open vSwitch that tunnel traffic is generated by the OVS process itself, and therefore is controlled by the Linux host’s IP routing table with regard to which interfaces that tunnel traffic will use. But what if you need the tunnel traffic to be handled differently than the host’s management traffic? What if you need a default route for tunnel traffic that uses one interface, but a different default route for your separate management network that uses its own interface? This is why you would use policy routing in this configuration. Using source routing (i.e., policy routing based on the source of the traffic), you could easily define a table for tunnel traffic that has its own default route while still allowing management traffic to use the host’s default routing table.

Let’s take a look at how it’s done. In this example, I’ll make the following assumptions:

  • I’ll assume that you’re running host management traffic through OVS, as I outlined here. I’ll use the name mgmt0 to refer to the management interface that’s running through OVS for host management traffic. We’ll use the IP address 192.168.100.10 for the mgmt0 interface.
  • I’ll assume that you’re running tunnel traffic through an OVS interface interface named tep0. (This helps provide some consistency with my walk-through on using GRE tunnels with OVS.) We’ll use the IP address 192.168.200.10 for the tep0 interface.
  • I’ll assume that the default gateway on each subnet uses the .1 address on that subnet.

With these assumptions out of the way, let’s look at how you would set this up.

First, you’ll create a custom policy routing table, as outlined here. I’ll use the name “tunnel” for my new table:

echo 200 tunnel >> /etc/iproute2/rt_tables

Next, you’ll need to modify /etc/network/interfaces for the tep0 interface so that a custom policy routing rule and custom route are installed whenever this interface is brought up. The new configuration stanza would look something like this:

(If the configuration stanza doesn’t appear above, click here.)

Finally, you’ll want to ensure that mgmt0 is properly configured in /etc/network/interfaces. No special configuration is required there, just the use of the gateway directive to install the default route. Ubuntu will install the default route into the main table automatically, making it a “system-wide” default route that will be used unless a policy routing rule dictates otherwise.

With this configuration in place, you now have a system that:

  • Can communicate via mgmt0 with other systems in other subnets via the default gateway of 192.168.100.1.
  • Can communicate via tep0 to establish tunnels with other hypervisors in other subnets via the 192.168.200.1 gateway.

This configuration requires only the initial configuration (which could, quite naturally, be automated via a tool like Puppet) and does not require using additional routes as the environment scales to include new subnets for other hypervisors (either for management or tunnel traffic). Thus, organizations can use recommended practices for building scalable L3 networks with reasonably-sized L2 domains without sacrificing connectivity to/from the hypervisors in the environment.

(By the way, this is something that is not easily accomplished in the vSphere world today. ESXi has only a single routing table for all VMkernel interfaces, which means that management traffic, vMotion traffic, VXLAN traffic, etc., are all bound by that single routing table. To achieve full L3 connectivity, you’d have to install specific routes into the VMkernel routing table on each ESXi host. When additional subnets are added for scale, each host would have to be touched to add the additional route.)

Hopefully this gives you an idea of how Linux policy routing could be effectively used in environments leveraging virtualization, OVS, and overlay protocols. Feel free to add your thoughts, idea, corrections, or questions in the comments below. Courteous comments are always welcome! (Please disclose vendor affiliations where applicable.)

Tags: , , , , ,

In this post, I’m going to introduce you to policy routing as implemented in recent versions of Ubuntu Linux (and possibly other Linux distributions as well, but I’ll be using Ubuntu 12.04 LTS). Policy routing actually allows us a great deal of flexibility in how we direct traffic out of a Linux host; I’ll discuss a rather practical application of this configuration in a future blog post. For now, though, let’s just focus on how to configure policy routing.

There are a couple parts involved in policy routing:

  • Policy routing tables: Linux comes with three by default: local (which cannot be modified or deleted), main, and default. Somewhat unintuitively, routes added to the system without a routing table specified go to the main table, not the default table.
  • Policy routing rules: Again, Linux comes with three rules, one for each of the default routing tables.

In order for us to leverage policy routing for our purposes, we need to do three things:

  1. We need to create a custom policy routing table.
  2. We need to create one or more custom policy routing rules.
  3. We need to populate the custom policy routing table with routes.

Let’s look at each of these steps separately.

Creating a Custom Policy Routing Table

The first step is to create a custom policy routing table. Each table is represented by an entry in the file /etc/iproute2/rt_tables, so creating a new table is generally accomplished using a command like this:

echo 200 custom >> /etc/iproute2/rt_tables

This creates the table with the ID 200 and the name “custom”. You’ll reference this name later as you create the rules and populate the table with routes, so make note of it. Because this entry is contained in the rt_tables file, it will be persistent across reboots.

Creating Policy Routing Rules

The next step is to create the policy routing rules that will tell the system which table to use to determine the correct route. In this particular case, I’m going to use the source address (i.e., the originating address for the traffic) as the determining factor in the rule. This is a common application of policy routing, and for that reason it’s often referred to as source routing.

To create the policy routing rule, use this command:

ip rule add from <source address> lookup <table name>

Let’s say that we wanted to create a rule that told the system to use the “custom” table we created earlier for all traffic originating from the source address 192.168.30.200. The command would look like this:

ip rule add from 192.168.30.200 lookup custom

You can see all the policy routing rules that are currently in effect using this command:

ip rule list

As I mentioned in the beginning of this article, there are default rules that govern the use of the local, main, and default tables (these are the built-in tables). Once you’ve added your rule, you should see it listed there as well.

There is a problem here, though: rules created this way are ephemeral and will disappear when the system is restarted (or when the networking is restarted). To make the rules persist, add a line like this to /etc/network/interfaces:

post-up ip rule add from 192.168.30.200 lookup custom

You’d want to place this line in the configuration stanza that configures the interface with the address 192.168.30.200. With this line in place, the rule should persist across reboots or across network restarts.

Populating the Routing Table

Once we have the custom policy routing table created and a rule defined that directs the system to use it, we need to populate the table with the correct routes. The generic command to do this is the ip route add command, but with a specific table parameter added.

Using our previous example, let’s say we wanted to add a default route that was specific to traffic originating from 192.168.30.200. We’ve already created a custom policy routing table, and we have a rule that directs the system to use that table for traffic originating from that address. To add a new default route specifically for that interface, you’d use this command:

ip route add default via 192.168.30.1 dev eth1 table custom

Naturally, you’d want to substitute the correct default gateway for 192.168.30.1 and the correct interface for eth1 in the above command, but this should give you the right idea. Of course, you don’t have to use default routes; you could install specific routes into the custom policy routing table as well. This also works on VLAN sub-interfaces, so you could create per-VLAN routing tables:

ip route add default via 192.168.30.1 dev eth0.30 table vlan30

This command installs a default route for the 192.168.30.x interface on VLAN 30, using a table named “vlan30″ (note that the table needs to created before you can add routes to it, as far as I can tell).

As with the policy routing tables, routes added this way are not persistent, so you’ll want to make them persistent by adding a line like this to your /etc/network/interfaces configuration file:

post-up ip route add default via 192.168.30.1 dev eth1 table custom

This will ensure that the appropriate routes are added to the appropriate policy routing table when the corresponding network interface is brought up.

Summary

There’s a great deal more functionality possible in policy routing, but this at least gives you the basics you need to understand how it works. In a future post, I’ll provide a specific use case where this functionality could be put to work. In the meantime, feel free to share any corrections, clarifications, questions, or thoughts in the comments below.

Tags: , ,

In other articles, I’ve talked about how to use Open vSwitch (OVS) with VLANs to place guest domains (VMs) into a particular VLAN. In this article, I want to show you how to pass VLAN tags all the way into the guest domain—in other words, how to do VLAN trunking to guest domains using OVS. To do this, we’re going to leverage the OVS-libvirt integration I referenced in this post on using VLANs with OVS and libvirt.

For this to work, you must have an operating system in the guest domain that is capable of recognizing and using the VLAN tags that are being passed to it by OVS. In this article, I’ll use Ubuntu 12.04 as the OS in the guest domain. For other operating systems, the commands and/or procedures to configure VLAN support appropriately will probably differ, so keep that in mind.

There are two parts to making this work:

  1. Configuring OVS (manually or via libvirt) to pass VLAN tags to the guest OS.
  2. Configuring the guest domain’s installed OS to take advantage of the VLAN tags being passed up by OVS.

Let’s look at each of these parts separately. We’ll start with configuring OVS, either manually or via libvirt, to pass the VLAN tags up to the guest domain.

Configuring OVS to Pass VLAN Tags to the Guest Domain

There are two ways to accomplish this: you can do it manually, or you can do it via OVS integration with recent builds of libvirt.

Manually Configuring OVS

To configure OVS manually, you would need to:

  1. Identify which vnet port you want to configure for VLAN trunking
  2. Configure the vnet port to trunk the VLANs.

To identify which vnet port needs to be modified, you’ll want to figure out the guest domain interface(s) that is/are connected to the vnet port. You can do this by using this command (substitute the desired vnet port name in place of vnet0 in the following command):

ovs-vsctl list interface vnet0

In the output of the command, look for the external_ids line; it will contain an entry called “attached-mac”, and that represents the MAC address of the interface in the guest domain OS attached to this particular vnet port. You can compare this to the output of ip addr list or ifconfig -a in Ubuntu to find a matching MAC address in the guest domain. Correlating the two values allows you to determine which guest domain is attached to which vnet port, and then you can modify the correct vnet port appropriately.

You’d modify the vnet port using this command:

ovs-vsctl set port vnet0 trunks=20,30,40

You’d want to substitute the appropriate values for vnet0 and the VLAN IDs that you want passed up to the guest domain. Once you’ve made the change, you can verify the changes using this command (replacing vnet0 with the correct port):

ovs-vsctl list port vnet0

Note that if you want the guest domain to receive both untagged (native VLAN) traffic as well as tagged (trunked) traffic, there is an additional setting you must set:

ovs-vsctl set port vnet0 vlan_mode=native-untagged

With this setting in place, the OS installed into the guest domain will be able to communicate over the untagged (native) VLAN as well as using VLAN tags.

Using libvirt Integration

If the manual method of configuring OVS seems a bit cumbersome, using the libvirt integration makes it much easier.

Basically, you’ll follow the configuration outlined in this blog post to create a libvirt network that corresponds to an OVS bridge. Here’s an example of the XML code to accomplish this task:

Of particular interest for what we’re trying to accomplish here is the very last section, the portgroup named “vlan-all.” Note that for this specific portgroup, the vlan element has a property that specifies it is a trunk, and then there are multiple tag elements that list each VLAN ID that will be trunked across this network into the guest domain.

Using this configuration, when we create the guest domain and specify that it is attached to the network named “vlan-all” (matching the portgroup in the libvirt network definition), libvirt will automatically configure OVS appropriately (it will set the trunks value for that domain’s OVS port).

However, it will not configure the OVS port to allow untagged traffic as well (only tagged traffic will be passed). If you want the guest domain to receive untagged traffic also, you must set the vlan_mode value manually as outlined above.

Configuring the Guest Domain to Use VLAN Tags

Once you’ve followed the steps outlined above and have OVS configured correctly, then you’re ready to configure the OS in the guest domain. Keep in mind that I’m using Ubuntu 12.04 in this post, but you’re welcome to use any operating system that supports VLAN tags.

Assuming that eth0 is the interface in the guest domain that is receiving tagged traffic from OVS, this snippet in /etc/network/interfaces will create and configure a VLAN interface:

Technically, the “raw-vlan-device” line isn’t needed because the parent device name is in the name of the VLAN device, but I like to include it for completeness and ease of debugging. (Your mileage may vary, of course.) The number on the end of the eth0 (for example, eth0.20) corresponds to the VLAN ID (VLAN 20, in this case) being passed up by OVS.

You can repeat this configuration for multiple VLAN interfaces.

Use Case

I’ll have to admit that I can’t immediately think of some useful use cases for this sort of configuration. At first glance, you might think that it would be useful in situations where you need logical separation, but I think there are better ways than VLANs to accomplish this task (and those ways are probably simpler). I primarily set out to document this in order to better solidify my knowledge of how OVS works and is configured. However, I’d be happy to hear from others on what they think might be interesting or useful use cases for this sort of configuration. Feel free to add your thoughts in the comments below. Courteous comments are always welcome!

Tags: , , , ,

This blog post kicks off a new series of posts describing my journey to become more knowledgeable about the Nicira Network Virtualization Platform (NVP). NVP is, in my opinion, an awesome platform, but there hasn’t been a great deal of information shared about the product, how it works, how you configure it, etc. That’s something I’m going to try to address in this series of posts. In this first post, I’ll start with a high-level description of the NVP architecture. Don’t worry—more in-depth information will come in future posts.

Before continuing, it might be useful to set some context around NVP and NSX. This series of posts will focus on NVP—a product that is available today and is currently in use in production. The architecture I’m describing here will also be applicable to NSX, which VMware announced in early March. Because NSX will leverage NVP’s architecture, spending some time with NVP now will pay off with NSX later. Make sense?

Let’s start with a figure. The diagram below graphically illustrates the NVP architecture at a high level:

High-level NVP architecture diagram

The key components of the NVP architecture include:

  • A scale-out controller cluster: The NVP controllers handle computing the network topology and providing configuration and flow information to create logical networks. The controllers support a scale-out model for high availability and increased scalability. The controller cluster supplies a northbound REST API that can be consumed by cloud management platforms such as OpenStack or CloudStack, or by home-grown cloud management systems.
  • A programmable virtual switch: NVP leverages Open vSwitch (OVS), an independent open source project with contributors from across the industry, to fill this role. OVS communicates with the NVP controller clusters to receive configuration and flow information.
  • Southbound communications protocols: NVP uses two open communications protocols to communicate southbound to OVS. For configuration information, NVP leverages OVSDB; for flow information, NVP uses OpenFlow. The management (OVSDB) communication between the controller cluster and OVS is encrypted using SSL.
  • Gateways: Gateways provide the “on-ramp” to enter or exit NVP logical networks. Gateways can provide either L2 gateway services (to bridge NVP logical networks onto physical networks) as well as L3 gateway services (to route between NVP logical networks and physical networks). In either case, the gateways are also built using a scale-out model that provides high availability and scalability for the L2 and L3 gateway services they host.
  • Encapsulation protocol: To provide full independence and isolation of logical networks from the underlying physical networks, NVP uses encapsulation protocols for transporting logical network traffic across physical networks. Currently, NVP supports both Generic Routing Encapsulation (GRE) and Stateless Transport Tunneling (STT), with additional encapsulation protocols planned for future releases.
  • Service nodes: To offload the handling of BUM (Broadcast, Unknown Unicast, and Multicast) traffic, NVP can optionally leverage one or more service nodes. Note that service nodes are optional; customers can choose to have BUM traffic handled locally on each hypervisor node. (Note that service nodes are not shown in the diagram above.)

Now that you have an idea of the high-level architecture, let me briefly outline how the rest of this series will be organized. The basic outline of this series will roughly correspond to how NVP would be deployed in a real-world environment.

  1. In the next post (or two), I’ll be focusing on getting the controller cluster built and diving a bit deeper into the controller cluster architecture.
  2. Once the controller cluster is up and running, I’ll take a look at getting NVP Manager up and running. NVP Manager is an application that consumes the northbound REST APIs from the controller cluster in order to view and manage NVP logical networks and NVP components. In most cases, this function is part of a cloud management platform (such as OpenStack or CloudStack), but using NVP Manager here allows me to focus on NVP instead of worrying about the details of the cloud management platform itself.
  3. The next step will be to bring hypervisor nodes into NVP. I’ll focus on using nodes running KVM, but keep in mind that Xen is also supported by NVP. If time (and resources) permit, I may try to look at bringing up Xen-based hypervisor nodes as well. Because NVP leverages OVS as the edge virtual switch, I’ll naturally be discussing some OVS-related tasks and topics as well.
  4. Following the addition of hypervisor nodes into NVP, I’ll look at creating a simple logical network, and we’ll examine how this logical network works with the underlying physical network.
  5. To add more flexibility to our logical network, we need to be able to bring physical resources into NVP logical networks. To enable that functionality, we’ll need to add gateways and gateway services to our configuration, so I’ll discuss gateways and L2 gateway services, how they work, and how we add them to an NVP configuration.
  6. The next step is to enable L3 (routing) functionality within NVP, and that is enabled by L3 gateway services. I’ll spend some time talking about the L3 gateway services, their architecture, adding them to NVP, and including L3 functionality in an NVP logical network. I’ll also explore distributed L3 routing, where the L3 routing is actually distributed across hypervisors in an NVP environment (this is a new feature just added in NVP 3.1).
  7. Now that we have both L2 and L3 gateway services in NVP, I’ll take a look at building more intricate logical networks.

Beyond that, it’s hard to say where the series will go. I’ll likely also take a look at some of NVP’s security features, and examine a few more complex NVP use cases. If there are additional topics you’d like to see beyond what I’ve outlined above, please feel free to speak up in the comments below.

I’m excited about this journey to learn NVP in more detail, and I’m looking forward to taking all of you along with me. Ready? Let’s go!

Tags: , , , , ,

In this post, I want to provide some additional insight on how the use of Open vSwitch (OVS) affects—or doesn’t affect, in some cases—how a Linux host directs traffic through physical interfaces, OVS internal interfaces, and OVS bridges. This is something that I had a hard time understanding as I started exploring more advanced OVS configurations, and hopefully the information I share here will be helpful to others.

To help structure this discussion, I’m going to walk through a few different OVS configurations and scenarios. In these scenarios, I’ll use the following assumptions:

  • The physical host has four interfaces (eth0, eth1, eth2, and eth3)
  • The host is running Linux with KVM, libvirt, and OVS installed

Scenario 1: Simple OVS Configuration

In this first scenario let’s look at a relatively simple OVS configuration, and examine how Linux host and guest domain traffic moves into or out of the network.

Let’s assume that our OVS configuration looks something like this (this is the output from ovs-vsctl show):

bc6b9e64-11d6-415f-a82b-5d8a61ed3fbd
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "eth0"
            Interface "eth0"
    Bridge "br1"
        Port "br1"
            Interface "br1"
                type: internal
        Port "eth1"
            Interface "eth1"
    ovs_version: "1.7.1"

This is a pretty simple configuration; there are two bridges, each with a single physical interface. Let’s further assume, for the purposes of this scenario, that eth2 has an IP address and is working properly to communicate with other hosts on the network. The eth3 interface is shutdown.

So, in this scenario, how does traffic move into or out of the host?

  1. Traffic from a guest domain: Traffic from a guest domain will travel through the OVS bridge to which it is attached (you’d see an additional “vnet0″ port and interface appear on that bridge when you start the guest domain). So, a guest domain attached to br0 would communicate via eth0, and a guest domain attached to br1 would communicate via eth1. No real surprises here.

  2. Traffic from the Linux host: Traffic from the Linux host itself will not communicate over any of the configured OVS bridges, but will instead use its native TCP/IP stack and any configured interfaces. Thus, since eth2 is configured and operational, all traffic to/from the Linux host itself will travel through eth2.

The interesting point (to me, at least) about #2 above is that this includes traffic from the OVS process itself. In other words, if the OVS process(es) need to communicate across the network, they won’t use the bridges—they’ll use whatever interfaces the Linux host uses to communicate. This is one thing that threw me off: because OVS is itself a Linux process, when OVS needs to communicate across the network it will use the Linux network stack to do so. In this scenario, then, OVS would not communicate over any configured bridge, but instead using eth2. (This makes perfect sense now, but I recall that it didn’t earlier. Maybe it’s just me.)

Scenario 2: Adding Bonding

In this second scenario, our OVS configuration changes only slightly:

bc6b9e64-11d6-415f-a82b-5d8a61ed3fbd
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "bond0"
            Interface "eth0"
            Interface "eth1"
    ovs_version: "1.7.1"

In this case, we’re now leveraging a bond that contains two physical interfaces (eth0 and eth1). (By the way, I have a write-up on configuring OVS and bonds, if you need/want more information.) The eth2 interface still has an IP address assigned and is up and communicating properly. The physical eth3 interface is shutdown.

How does this affect the way in which traffic is handled? It doesn’t, really. Traffic from guest domains will still travel across br0 (since this is the only configured OVS bridge), and traffic from the Linux host—including traffic from OVS itself—will still use whatever interfaces are determined by the host’s TCP/IP stack. In this case, that would be eth2.

Scenario 3: The Isolated Bridge

Let’s look at another OVS configuration, the so-called “isolated bridge”. This is a configuration that is commonly found in implementations using NVP, OpenStack, and others, and it’s a configuration that I recently discussed in my post on GRE tunnels and OVS.

Here’s the configuration:

bc6b9e64-11d6-415f-a82b-5d8a61ed3fbd
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "bond0"
            Interface "eth0"
            Interface "eth1"
    Bridge "br-int"
        Port "br-int"
            Interface "br-int"
                type: internal
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.1.100"}
    ovs_version: "1.7.1"

As with previous configurations, we’ll assume that eth2 is up and operational, and eth3 is shutdown. So how does traffic get directed in this configuration?

  1. Traffic from guest domains attached to br0: This is as before—traffic will go out one of the physical interfaces in the bond, according to the bonding configuration (active-standby, LACP, etc.). Nothing unusual here.

  2. Traffic from the Linux host: As before, traffic from processes on the Linux host will travel out according to the host’s TCP/IP stack. There are no changes from previous configurations.

  3. Traffic from guest domains attached to br-int: Now, this is where it gets interesting. Guest domains attached to br-int (named “br-int” because in this configuration the isolated bridge is often called the “integration bridge”) don’t have any physical interfaces they can use; they can only use the GRE tunnel. Here’s the “gotcha”, so to speak: the GRE tunnel is created and maintained by the OVS process, and therefore it uses the host’s TCP/IP stack to communicate across the network. Thus, traffic from guest domains attached to br-int would hit the GRE tunnel, which would travel through eth2.

I’ll give you a second to let that sink in.

Ready now? Good! The key to understanding #3 is, in my opinion, understanding that the tunnel (a GRE tunnel in this case, but the same would apply to a VXLAN or STT tunnel) is created and maintained by the OVS process. Thus, because it is created and maintained by a process on the Linux host (OVS itself), the traffic for the tunnel is directed according to the host’s TCP/IP stack and IP routing table(s). In this configuration, the tunnels don’t travel through any of the configured OVS bridges.

Scenario 4: Leveraging an OVS Internal Interface

Let’s keep ramping up the complexity. For this scenario, we’ll use an OVS configuration that is the same as in the previous scenario:

bc6b9e64-11d6-415f-a82b-5d8a61ed3fbd
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "bond0"
            Interface "eth0"
            Interface "eth1"
    Bridge "br-int"
        Port "br-int"
            Interface "br-int"
                type: internal
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.1.100"}
    ovs_version: "1.7.1"

The difference, this time, is that we’ll assume that eth2 and eth3 are both shutdown. Instead, we’ve assigned an IP address to the br0 interface on bridge br0. OVS internal interfaces, like br0, can appear as “physical” interfaces to the Linux host, and therefore can be assigned IP addresses and used for communication. This is the approach I used in describing how to run host management across OVS.

Here’s how this configuration affects traffic flow:

  1. Traffic from guest domains attached to br0: No change here. Traffic from guest domains attached to br0 will continue to travel across the physical interfaces in the bond (eth0 and eth1, in this case).

  2. Traffic from the Linux host: This time, the only interface that the Linux host has is the br0 internal interface. The br0 internal interface is attached to br0, so all traffic from the Linux host will travel across the physical interfaces attached to the bond (again, eth0 and eth1).

  3. Traffic from guest domains attached to br-int: Because Linux host traffic is directed through br0 by virtue of using the br0 internal interface, this means that tunnel traffic is also directed through br0, as dictated by the Linux host’s TCP/IP stack and IP routing table(s).

As you can see, assigning an IP address to an OVS internal interface has a real impact on the way in which the Linux host directs traffic through OVS. This has both positive and negative impacts:

  • One positive impact is that it allows for Linux host traffic (such as management or tunnel traffic) to take advantage of OVS bonds, thus gaining some level of NIC redundancy.
  • A negative impact is that OVS is now “in band,” so upgrades to OVS will be disruptive to all traffic moving through OVS—which could potentially include host management traffic.

Let’s take a look at one final scenario.

Scenario 5: Using Multiple Bridges and Internal Interfaces

In this configuration, we’ll use an OVS configuration that is very similar to the configuration I showed in my post on GRE tunnels with OVS:

bc6b9e64-11d6-415f-a82b-5d8a61ed3fbd
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "mgmt0"
            Interface "mgmt0"
                type: internal
        Port "bond0"
            Interface "eth0"
            Interface "eth1"
    Bridge "br1"
        Port "br1"
            Interface "br1"
                type: internal
        Port "tep0"
            Interface "tep0"
                type: internal
        Port "bond1"
            Interface "eth2"
            Interface "eth3"
    Bridge "br-int"
        Port "br-int"
            Interface "br-int"
                type: internal
        Port "gre0"
            Interface "gre0"
                type: gre
                options: {remote_ip="192.168.1.100"}
    ovs_version: "1.7.1"

In this configuration, we have three bridges. br0 uses a bond that contains eth0 and eth1; br1 uses a bond that contains eth2 and eth3; and br-int is an isolated bridge with no physical interfaces. We have two “custom” internal interfaces, mgmt0 (on br0) and tep0 (on br1), to which IP addresses have been assigned and which are successfully communicating across the network. We’ll assume that mgmt0 and tep0 are on different subnets, and that tep0 is assigned to the 192.168.1.0/24 subnet.

How does traffic flow in this scenario?

  1. Traffic from guest domains attached to br0: The behavior here is as it has been in previous configurations—guest domains attached to br0 will communicate across the physical interfaces in the bond.

  2. Traffic from the Linux host: As it has been in previous scenarios, traffic from the Linux host is driven by the host’s TCP/IP stack and IP routing table(s). Because mgmt0 and tep0 are on different subnets, traffic from the Linux host will go out either br0 (for traffic moving through mgmt0) or br1 (for traffic moving through tep0), and thus will utilize the corresponding physical interfaces in the bonds on those bridges.

  3. Traffic from guest domains attached to br-int: Because the GRE tunnel is on the 192.168.1.0/24 subnet, traffic for the GRE tunnel—which is created and maintained by the OVS process on the Linux host itself—will travel through tep0, which is attached to br1. Thus, the physical interfaces eth2 and eth3 would be leveraged for the GRE tunnel traffic.

Summary

The key takeaway from this post, in my mind, is understanding where traffic originates, and separating the idea of OVS as a switching mechanism (to handle guest domain traffic) as well as a Linux host process itself (to create and maintain tunnels between hosts).

Hopefully this information is helpful. I am, of course, completely open to your comments, questions, and corrections, so feel free to speak up in the comments below. Courteous comments are always welcome!

Tags: , ,

I’m back with another “how to” article on Open vSwitch (OVS), this time taking a look at using GRE (Generic Routing Encapsulation) tunnels with OVS. OVS can use GRE tunnels between hosts as a way of encapsulating traffic and creating an overlay network. OpenStack Quantum can (and does) leverage this functionality, in fact, to help separate different “tenant networks” from one another. In this write-up, I’ll walk you through the process of configuring OVS to build a GRE tunnel to build an overlay network between two hypervisors running KVM.

Naturally, any sort of “how to” such as this always builds upon the work of others. In particular, I found a couple of Brent Salisbury’s articles (here and here) especially useful.

This process has 3 basic steps:

  1. Create an isolated bridge for VM connectivity.
  2. Create a GRE tunnel endpoint on each hypervisor.
  3. Add a GRE interface and establish the GRE tunnel.

These steps assume that you’ve already installed OVS on your Linux distribution of choice. I haven’t explicitly done a write-up on this, but there are numerous posts from a variety of authors (in this regard, Google is your friend).

We’ll start with an overview of the topology, then we’ll jump into the specific configuration steps.

Reviewing the Topology

The graphic below shows the basic topology of what we have going on here:

Topology overview

We have two hypervisors (CentOS 6.3 and KVM, in my case), both running OVS (an older version, version 1.7.1). Each hypervisor has one OVS bridge that has at least one physical interface associated with the bridge (shown as br0 connected to eth0 in the diagram). As part of this process, you’ll create the other internal interfaces (the tep and gre interfaces, as well as the second, isolated bridge to which VMs will connect. You’ll then create a GRE tunnel between the hypervisors and test VM-to-VM connectivity.

Creating an Isolated Bridge

The first step is to create the isolated OVS bridge to which the VMs will connect. I call this an “isolated bridge” because the bridge has no physical interfaces attached. (Side note: this idea of an isolated bridge is fairly common in OpenStack and NVP environments, where it’s usually called the integration bridge. The concept is the same.)

The command is very simple, actually:

ovs-vsctl add-br br2

Yes, that’s it. Feel free to substitute a different name for br2 in the command above, if you like, but just make note of the name as you’ll need it later.

To make things easier for myself, once I’d created the isolated bridge I then created a libvirt network for it so that it was dead-easy to attach VMs to this new isolated bridge.

Configuring the GRE Tunnel Endpoint

The GRE tunnel endpoint is an interface on each hypervisor that will, as the name implies, serve as the endpoint for the GRE tunnel. My purpose in creating a separate GRE tunnel endpoint is to separate hypervisor management traffic from GRE traffic, thus allowing for an architecture that might leverage a separate management network (which is typically considered a recommended practice).

To create the GRE tunnel endpoint, I’m going to use the same technique I described in my post on running host management traffic through OVS. Specifically, we’ll create an internal interface and assign it an IP address.

To create the internal interface, use this command:

ovs-vsctl add-port br0 tep0 -- set interface tep0 type=internal

In your environment, you’ll substitute br2 with the name of the isolated bridge you created earlier. You could also use a different name than tep0. Since this name is essentially for human consumption only, use what makes sense to you. Since this is a tunnel endpoint, tep0 made sense to me.

Once the internal interface is established, assign it with an IP address using ifconfig or ip, whichever you prefer. I’m still getting used to using ip (more on that in a future post, most likely), so I tend to use ifconfig, like this:

ifconfig tep0 192.168.200.20 netmask 255.255.255.0

Obviously, you’ll want to use an IP addressing scheme that makes sense for your environment. One important note: don’t use the same subnet as you’ve assigned to other interfaces on the hypervisor, or else you can’t control that the GRE tunnel will originate (or terminate) on the interface you specify. This is because the Linux routing table on the hypervisor will control how the traffic is routed. (You could use source routing, a topic I plan to discuss in a future post, but that’s beyond the scope of this article.)

Repeat this process on the other hypervisor, and be sure to make note of the IP addresses assigned to the GRE tunnel endpoint on each hypervisor; you’ll need those addresses shortly. Once you’ve established the GRE tunnel endpoint on each hypervisor, test connectivity between the endpoints using ping or a similar tool. If connectivity is good, you’re clear to proceed; if not, you’ll need to resolve that before moving on.

Establishing the GRE Tunnel

By this point, you’ve created the isolated bridge, established the GRE tunnel endpoints, and tested connectivity between those endpoints. You’re now ready to establish the GRE tunnel.

Use this command to add a GRE interface to the isolated bridge on each hypervisor:

ovs-vsctl add-port br2 gre0 -- set interface gre0 type=gre \
options:remote_ip=<GRE tunnel endpoint on other hypervisor>

Substitute the name of the isolated bridge you created earlier here for br2 and feel free to use something other than gre0 for the interface name. I think using gre as the base name for the GRE interfaces makes sense, but run with what makes sense to you.

Once you repeat this command on both hypervisors, the GRE tunnel should be up and running. (Troubleshooting the GRE tunnel is one area where my knowledge is weak; anyone have any suggestions or commands that we can use here?)

Testing VM Connectivity

As part of this process, I spun up an Ubuntu 12.04 server image on each hypervisor (using virt-install as I outlined here), attached each VM to the isolated bridge created earlier on that hypervisor, and assigned each VM an IP address from an entirely different subnet than the physical network was using (in this case, 10.10.10.x).

Here’s the output of the route -n command on the Ubuntu guest, to show that it has no knowledge of the “external” IP subnet—it knows only about its own interfaces:

ubuntu:~ root$ route -n
Kernel IP routing table
Destination  Gateway       Genmask        Flags Metric Ref Use Iface
0.0.0.0      10.10.10.254  0.0.0.0        UG    100    0   0   eth0
10.10.10.0   0.0.0.0       255.255.255.0  U     0      0   0   eth0

Similarly, here’s the output of the route -n command on the CentOS host, showing that it has no knowledge of the guest’s IP subnet:

centos:~ root$ route -n
Kernel IP routing table
Destination  Gateway        Genmask        Flags Metric Ref Use Iface
192.168.2.0  0.0.0.0        255.255.255.0  U     0      0   0   tep0
192.168.1.0  0.0.0.0        255.255.255.0  U     0      0   0   mgmt0
0.0.0.0      192.168.1.254  0.0.0.0        UG    0      0   0   mgmt0

In my case, VM1 (named web01) was given 10.10.10.1; VM2 (named web02) was given 10.10.10.2. Once I went through the steps outlined above, I was able to successfully ping VM2 from VM1, as you can see in this screenshot:

VM-to-VM connectivity over GRE tunnel

(Although it’s not shown here, connectivity from VM2 to VM1 was obviously successful as well.)

“OK, that’s cool, but why do I care?” you might ask.

In this particular context, it’s a bit of a science experiment. However, if you take a step back and begin to look at the bigger picture, then (hopefully) something starts to emerge:

  • We can use an encapsulation protocol (GRE in this case, but it could have just as easily been STT or VXLAN) to isolate VM traffic from the physical network and from other VM traffic. (Think multi-tenancy.)
  • While this process was manual, think about some sort of controller (an OpenFlow controller, perhaps?) that could help automate this process based on its knowledge of the VM topology.
  • Using a virtualized router or virtualized firewall, I could easily provide connectivity into or out of this isolated (encapsulated) private network. (This is probably something I’ll experiment with later.)
  • What if we wrapped some sort of orchestration framework around this, to help deploy VMs, create networks, add routers/firewalls automatically, all based on the customer’s needs? (OpenStack Networking, anyone?)

Anyway, I hope this is helpful to someone. As always, I welcome feedback and suggestions for improvement, so feel free to speak up in the comments below. Vendor disclosures, where appropriate, are greatly appreciated. Thanks!

Tags: , , , , ,

Welcome to Technology Short Take #32, the latest installment in my irregularly-published series of link collections, thoughts, rants, raves, and miscellaneous information. I try to keep the information linked to data center technologies like networking, storage, virtualization, and the like, but occasionally other items slip through. I hope you find something useful.

Networking

  • Ranga Maddipudi (@vCloudNetSec on Twitter) has put together two blog posts on vCloud Networking and Security’s App Firewall (part 1 and part 2). These two posts are detailed, hands-on, step-by-step guides to using the vCNS App firewall—good stuff if you aren’t familiar with the product or haven’t had the opportunity to really use it.
  • The sentiment behind this post isn’t unique to networking (or networking engineers), but that was the original audience so I’m including it in this section. Nick Buraglio climbs on his SDN soapbox to tell networking professionals that changes in the technology field are part of life—but then provides some specific examples of how this has happened in the past. I particularly appreciated the latter part, as it helps people relate to the fact that they have undergone notable technology transitions in the past but probably just don’t realize it. As I said, this doesn’t just apply to networking folks, but to everyone in IT. Good post, Nick.
  • Some good advice here on scaling/sizing VXLAN in VMware deployments (as well as some useful background information to help explain the advice).
  • Jason Edelman goes on a thought journey connecting some dots around network APIs, abstractions, and consumption models. I’ll let you read his post for all the details, but I do agree that it is important for the networking industry to converge on a consistent set of abstractions. Jason and I disagree that OpenStack Networking (formerly Quantum) should be the basis here; he says it shouldn’t be (not well-known in the enterprise), I say it should be (already represents work created collaboratively by multiple vendors and allows for different back-end implementations).
  • Need a reasonable introduction to OpenFlow? This post gives a good introduction to OpenFlow, and the author takes care to define OpenFlow as accurately and precisely as possible.
  • SDN, NFV—what’s the difference? This post does a reasonable job of explaining the differences (and the relationship) between SDN and NFV.

Servers/Hardware

  • Chris Wahl provides a quick overview of the HP Moonshot servers, HP’s new ARM-based offerings. I think that Chris may have accidentally overlooked the fact that these servers are not x86-based; therefore, a hypervisor such as vSphere is not supported. Linux distributions that offer ARM support, though—like Ubuntu, RHEL, and SuSE—are supported, however. The target market for this is massively parallel workloads that will benefit from having many different cores available. It will be interesting to see how the support of a “Tier 1″ hardware vendor like HP affects the adoption of ARM in the enterprise.

Security

  • Ivan Pepelnjak talks about a demonstration of an attack based on VM BPDU spoofing. In vSphere 5.1, VMware addressed this potential issue with a feature called BPDU Filter. Check out how to configure BPDU Filter here.

Cloud Computing/Cloud Management

  • Check out this post for some vCloud Director and RHEL 6.x interoperability issues.
  • Nick Hardiman has a good write-up on the anatomy of an AWS CloudFormation template.
  • If you missed the OpenStack Summit in Portland, Cody Bunch has a reasonable collection of Summit summary posts here (as well as materials for his hands-on workshops here). I was also there, and I have some session live blogs available for your pleasure.
  • We’ve probably all heard the “pets vs. cattle” argument applied to virtual machines in a cloud computing environment, but Josh McKenty of Piston Cloud Computing asks whether it is now time to apply that thinking to the physical hosts as well. Considering that the IT industry still seems to be struggling with applying this line of thinking to virtual systems, I suspect it might be a while before it applies to physical servers. However, Josh’s arguments are valid, and definitely worth considering.
  • I have to give Rob Hirschfeld some credit for—as a member of the OpenStack Board—acknowledging that, in his words, “we’ve created such a love fest for OpenStack that I fear we are drinking our own kool aide.” Open, honest, transparent dealings and self-assessments are critically important for a project like OpenStack to succeed, so kudos to Rob for posting a list of some of the challenges facing the project as adoption, visibility, and development accelerate.

Operating Systems/Applications

Nothing this time around, but I’ll stay alert for items to add next time.

Storage

  • Nigel Poulton tackles the question of whether ASIC (application-specific integrated circuit) use in storage arrays elongates the engineering cycles needed to add new features. This “double edged sword” argument is present in networking as well, but this is the first time I can recall seeing the question asked about modern storage arrays. While Nigel’s article specifically refers to the 3PAR ASIC and its relationship to “flash as cache” functionality, the broader question still stands: at what point do the drawbacks of ASICs begin to outweight the benefits?
  • Quite some time ago I pointed readers to a post about Target Driven Zoning from Erik Smith at EMC. Erik recently announced that TDZ works after a successful test run in a lab. Awesome—here’s hoping the vendors involved will push this into the market.
  • Using iSER (iSCSI Extensions for RDMA) to accelerate iSCSI traffic seems to offer some pretty promising storage improvements (see this article), but I can’t help but feel like this is a really complex solution that may not offer a great deal of value moving forward. Is it just me?

Virtualization

  • Kevin Barrass has a blog post on the VMware Community site that shows you how to create VXLAN segments and then use Wireshark to decode and view the VXLAN traffic, all using VMware Workstation.
  • Andre Leibovici explains how Horizon View Multi-VLAN works and how to configure it.
  • Looking for a good list of virtualization and cloud podcasts? Look no further.
  • Need Visio stencils for VMware? Look no further.
  • It doesn’t look like it has changed much from previous versions, but nevertheless some people might find it useful: a “how to” on virtualization with KVM on CentOS 6.4.
  • Captain KVM (cute name, a take-off of Captain Caveman for those who didn’t catch it) has a couple of posts on maximizing 10Gb Ethernet on KVM and RHEV (the KVM post is here, the RHEV post is here). I’m not sure that I agree with his description of LACP bonds (“2 10GbE links become a single 20GbE link”), since any given flow in a LACP configuration can still only use 1 link out of the bond. It’s more accurate to say that aggregate bandwidth increases, but that’s a relatively minor nit overall.
  • Ben Armstrong has a write-up on how to install Hyper-V’s integration components when the VM is offline.
  • What are the differences between QuickPrep and Sysprep? Jason Boche’s got you covered.

I suppose that’s enough information for now. As always, courteous comments are welcome, so feel free to add your thoughts in the comments below. Thanks for reading!

Tags: , , , , , , , , , , , ,

Is there a difference between network virtualization and Software-Defined Networking (SDN)? If so, what is the relationship between them? Is one a subset of the other? This is a topic that is increasingly being discussed and debated. So, in a similar fashion to my post on network overlays vs. network virtualization, I thought I’d weigh in with some thoughts. This post, like all of my posts, is intended to help spark a conversation, so I encourage you to share your thoughts in the comments after you’ve finished reading.

Last week I had the opportunity to join John Troyer on the VMware Communities podcast. My purpose, as John put it when he invited me, was to “gently introduce” the community to the idea of network virtualization, which is where I now spend most of my time since joining VMware in early February. One of the topics we discussed on the podcast was the relationship between SDN and network virtualization, which sparked this blog post by Keith Townsend, one of the attendees. I encourage you to read Keith’s full blog post, but I think the key point he makes is right here:

How is this different from Software Defined Networking or SDN? I think VMware (who Scott works for) would like you to consider SDN as just the abstraction of the control plane from the physical plane…I believe the industry outside of VMware is defining SDN in a broader sense. (emphasis mine)

I’ll get to the definition of SDN in just a moment, but first let’s look at the definition of network virtualization. In my earlier post on network overlays vs. network virtualization, I provided a definition (taken from a presentation Dan Wendlandt gave at the OpenStack Summit in Portland) for network virtualization: “A faithful and accurate reproduction of the physical network that is fully isolated and provides both location independence and physical network state independence.” Several of the readers of that particular blog post then came up with a definition of a virtualized network: “A virtualized network is one which can be instantiated, operated, and removed without physical asset interaction by the network manager.”

These definitions are, in my humble opinion, reasonably precise and accurate. They are easy to understand and easy to explain to others who aren’t familiar with the concept of network virtualization.

With this definition in hand, let’s compare network virtualization to SDN. To compare network virtualization to SDN, we must first define SDN. So—how do we define SDN?

There are two definitions we can use for SDN:

  1. We can use the original definition when SDN was first coined in 2009.
  2. We can use the definition currently in use within the industry.

Let’s look at the first scenario, and let’s examine SDN and network virtualization in the context of the original definition of SDN. I’ve had some discussions with Martin Casado, the so-called “Father of SDN” (he’s going to hate me for using that term—sorry Martin!), about what SDN meant when it was first coined. According to Martin, the term SDN originally referred to a change in the network architecture to include a) decoupling the distribution model of the control plane from the data plane; and b) generalized rather than fixed function forwarding hardware.

Bruce Davie (Principal Engineer at VMware and long-time networking guru) recently talked with some of the other creators of OpenFlow in preparation for his presentation at ONS 2013. According to the research that Bruce did, the term “SDN” was first coined in 2009 to refer specifically to the work being done on OpenFlow. Their original definition of SDN is consistent with Martin’s definition–it refers to the decoupling of the control plane from the data plane.

Therefore, if we look at the original definition of SDN—not VMware’s definition, but the definition of those involved in the creation of the term in 2009—it becomes fairly clear that SDN is not the same as network virtualization. Instead, SDN is a tool that can be leveraged in order to create network virtualization. SDN is a tool or a technique that can be used in the creation of virtualized networks and in satisfying the definition of network virtualization. This subtle distinction, by the way, is one that Bruce addressed in his recent ONS 2013 presentation.

Now, let’s talk about the second scenario, and let’s examine SDN and network virtualization in the context of the current industry definition of SDN. The problem that we have here is that “SDN,” like “cloud,” can really be made to mean just about anything. Think about it: existing protocols like OSPF and BGP (which run in software) manipulate the forwarding rules in the data plane—does that make them SDN, too? What about NX-OS, JUNOS, or EOS? These are all software—does that make them SDN? What about having APIs on switches? What about virtualized load balancers? Or virtualized firewalls? Are these SDN too? What about products that existed long before the SDN term was coined in 2009? Are they now SDN? No, we generally don’t accept all the vast assortment of software that’s involved in networking today as SDN, because SDN originated as a term to refer to the separation of the control plane from the data plane. I think it’s telling when Martin Casado indicates that he doesn’t even know what SDN means anymore, the term is probably being overused. SDN-washing, anyone?

Therefore, we can’t take the current industry definition of SDN, because SDN currently means all things to all people. If your company is doing “something cool” in networking, it’s probably going to be called SDN. There is no clear, crisp, understandable definition.

This lack of clarity around “what is SDN vs. what isn’t SDN” is part of the reason, I think, that VMware has converged on the use of network virtualization. This isn’t just about differentiating itself from all the other SDN players out there, but it’s also about giving customers something concrete and precise they can understand.

Now, if we want to—as an industry—come up with a clearer and more precise definition of SDN, then we can revisit this discussion. In fact, I encourage you to speak up in the comments below with what you might consider would be a useful, clear, precise definition of SDN. Clear definitions and better understanding enable meaningful communication, which can only be beneficial for everyone.

Tags: , ,

« Older entries