Thinking Out Loud: DIY Network Virtualization?

Following on the heels of this week’s VMware NSX announcement at VMworld 2013, I had someone contact me about a statement this person had heard. The statement claimed NSX was nothing more than a collection of tools, and that it was possible to get the equivalent of NSX using completely free (as in speech and as in beer) open source tools—specifically, IPTables, StrongSwan, OpenVPN, and Open vSwitch (OVS). Basically, the statement was that it was possible to create do-it-yourself network virtualization.

That got me thinking: is it true? After considering it for a while, I’d say it’s probably possible; I’m not enough of an expert with these specific tools to say it can’t be done. I will say that it would likely be difficult, beyond the reach of most organizations and individuals, and would still suffer from a number of operational drawbacks. Here’s why.

What are the core components of a network virtualization solution? In my view, there are at least three core components any network virtualization solution needs:

  1. Logically centralized knowledge of the network topology; this component should provide programmatic access to this information
  2. Programmable edge virtual switches in the hypervisor
  3. An encapsulation protocol to isolate logical networks from the physical network

Let’s compare this list with the DIY solution:

  1. I don’t see any component that is capable of building and/or maintaining knowledge of the network topology, and certainly no way to programmatically access this information. This has some pretty serious implications, which I’ll describe below.
  2. OVS fills the need for a programmable edge virtual switch quite nicely, considering that it was expressly designed for this purpose (and is itself leveraged by NSX).
  3. You could potentially leverage either StrongSwan or OpenVPN as an encapsulation protocol. Both of these solutions use encryption, so you’d have to accept the computational overhead of using encryption within your data center for hypervisor-to-hypervisor connectivity, but OK—I suppose these count. Neither of these solutions provides any way to distinguish or mark traffic inside the tunnel, which also has some implications we need to explore (see the sketch below).
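
To make that missing piece concrete, here is a minimal sketch (not how NSX does it; the bridge name, port name, and remote IP are made up) of an OVS VXLAN tunnel port that carries a logical network ID in the tunnel header, which is exactly the field an OpenVPN or StrongSwan tunnel lacks:

    # Minimal sketch: an OVS VXLAN tunnel port whose 24-bit VNI ("key") rides
    # in the tunnel header, so one tunnel can keep many logical networks
    # separate. Requires Open vSwitch; br-int and the remote IP are examples.
    import subprocess

    subprocess.run(
        "ovs-vsctl add-port br-int vx0 -- set interface vx0 type=vxlan "
        "options:remote_ip=192.0.2.20 options:key=flow",
        shell=True, check=True)

    # An OpenVPN or StrongSwan tunnel, by contrast, is just an (encrypted)
    # pipe between hypervisors: there is no per-frame network ID, so tenant
    # separation has to come from somewhere else (e.g., VLANs).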

OK, so the DIY solution is missing a couple of key components. What implications does this have?

  • Without any centralized knowledge of the network topology, there is nothing to handle programming OVS. Therefore, every single change must be manually performed (a sketch of what that per-VM work looks like follows this list). Provisioning a new VM? You must manually configure OVS, OpenVPN, StrongSwan, and possibly IPTables to handle connectivity to and from that VM. Moving a VM from one host to another? That requires manual reconfiguration. Live migration? That will require manual reconfiguration. Terminating a VM? That will require manual reconfiguration.
  • Without programmatic access to the solution, it can’t be integrated into any other sort of management framework. Want to use it with OpenStack? CloudStack? It’s probably not going to work. Want to use it with a custom CMP you’ve built? It might work, but only after extensive integration work (and a lot of scripts).
  • It’s my understanding that both StrongSwan and OpenVPN will leverage standard IP routing technologies to direct traffic appropriately through the tunnels. What happens when you have multiple logical networks with overlapping IP address space? How will StrongSwan and/or OpenVPN respond? Because neither StrongSwan nor OpenVPN has any way of identifying or marking traffic inside the tunnel (think of VXLAN’s 24-bit VNID or STT’s 64-bit Context ID), how do we distinguish one tenant’s traffic from another tenant’s traffic? Can we even support multi-tenancy? Do we have to fall back to using VLANs?
  • Do you really want to incur the computational overhead of using encryption for all intra-DC traffic?
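
To illustrate the first bullet above, here is a rough, hypothetical sketch of the per-VM glue you would end up running by hand (or scripting yourself) on each hypervisor; the bridge, port names, VLAN tags, and firewall rules are invented, and the VPN reconfiguration step is only hinted at because it varies by tool:

    # Hypothetical per-VM steps in the DIY approach. Something like this has
    # to be executed on the right hypervisor for every provision, migration,
    # and termination event, with nothing centralized to drive it.
    import subprocess

    def sh(cmd):
        subprocess.run(cmd, shell=True, check=True)

    def provision_vm(vif, tenant_vlan):
        # Attach the VM's interface to OVS and tag it for its tenant.
        sh(f"ovs-vsctl add-port br-int {vif} tag={tenant_vlan}")
        # Permit the new workload's traffic through the host firewall.
        sh(f"iptables -A FORWARD -i {vif} -j ACCEPT")
        # ...plus editing and reloading the OpenVPN/StrongSwan configuration
        # so the other hypervisors can reach the new endpoint.

    def deprovision_vm(vif):
        # All of it must be undone here (and redone elsewhere) when the VM
        # moves or is terminated.
        sh(f"iptables -D FORWARD -i {vif} -j ACCEPT")
        sh(f"ovs-vsctl del-port br-int {vif}")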

Of course, this list doesn’t even begin to address other operational concerns: multiple hypervisor support, support for multiple operating systems (or even multiple Linux distributions), support for physical workloads, physical top-of-rack (ToR) switch integration, high availability for various components, and the supportability of the overall solution.

As you can see, there are some pretty significant operational concerns there. Manual reconfiguration for just about any VM-related task? That doesn’t sound like a very good approach. Sure, it might be technically feasible to build your own network virtualization solution, but what benefits does it bring to the business?

Granted, I’m not an expert with some of the open source tools mentioned, so I could be wrong. If you are an expert with one of these tools, and I have misrepresented the functionality the tool is capable of providing, please speak up in the comments below. Likewise, if you feel I’m incorrect in any of my conclusions, I’d love to hear your feedback. Courteous comments are always welcome!


  1. Lennie

    I don’t see any reason to get VPN software in the mix other than inter-DC.

    The Linux kernel actually supports standard GRE tunnels, VXLAN, and soon STT. I don’t remember off the top of my head if NVGRE is also supported or planned.
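
    As a quick illustration (device names and addresses below are made up), those kernel-native tunnels can be created straight from iproute2, with no VPN software involved:

        # Sketch: kernel VXLAN and GRE devices via iproute2 (run as root).
        import subprocess

        def sh(cmd):
            subprocess.run(cmd, shell=True, check=True)

        # VXLAN device with VNI 100, using the standard UDP port 4789.
        sh("ip link add vxlan100 type vxlan id 100 dev eth0 dstport 4789")
        sh("ip link set vxlan100 up")

        # Plain (unencrypted) GRE tunnel between two hypervisors.
        sh("ip tunnel add gre1 mode gre local 192.0.2.10 remote 192.0.2.20")
        sh("ip link set gre1 up")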

    Ask yourself a different question: what is missing from open source IaaS and (public or private) cloud systems like OpenStack, which support using OVS and overlays?

    Because a system like OpenStack is where you would start.

    The central administration is already there: when you deploy VMs and virtual networks, it first creates entries in the OpenStack database and asks an agent on the hypervisor host to make entries in OVS for the tunnel.
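
    As a rough illustration of that programmatic path (placeholder credentials and endpoint, using the python-neutronclient bindings of that era), creating a logical network looks something like this; Neutron records it in its database and the OVS agents on the hypervisors do the per-host plumbing:

        # Sketch: define a logical network through Neutron's API instead of
        # touching OVS by hand on each host.
        from neutronclient.v2_0 import client

        neutron = client.Client(username="demo", password="secret",
                                tenant_name="demo",
                                auth_url="http://controller:5000/v2.0")

        net = neutron.create_network(
            {"network": {"name": "tenant-net", "admin_state_up": True}})
        neutron.create_subnet(
            {"subnet": {"network_id": net["network"]["id"],
                        "ip_version": 4,
                        "cidr": "10.0.0.0/24"}})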

    Really, it’s kind of a strange suggestion/request.

  2. slowe

    Lennie, I agree regarding VPN software, but that was the argument presented, so I included it. Personally, with OVS in the mix—as well as Linux kernel support—you have encapsulation protocols. What’s still missing, though, is the logically centralized knowledge of the topology and the programmatic access to the solution. This means you’re still dealing with significant operational concerns.

  3. Lennie

    I do have something to say about OpenStack first: OpenStack is the DIY solution, because OpenStack is more of a framework or building blocks of separate projects you can combine the way you want, not a product. Or you can let others do that for you.

    It all depends on your needs; most of it already exists in open source form.

    For example, OpenStack provides that programmatic access you talk about.

    The really missing piece in the open source world is the code to prevent flooding of ARP requests and other broadcast packets. The hooks in the Linux kernel already exist and OVS already supports the use of an OpenFlow controller.

    An OpenFlow controller could be extended to keep a central registry and that would solve that.

    Depending on your needs you might also want to allocate network resources (read: divide and monitor available bandwidth). I’m not sure if OpenStack has the hooks for that yet.

    Am I missing something?

  4. slowe

    Lennie, what you’re now describing is _very_ different from the environment that was originally proposed. So, let’s be clear: I’m not saying that it’s not possible to build your own solution. However, building a solution using the tools that were outlined is extremely difficult and fraught with operational concerns.

    Now, if we want to change the picture and add OpenStack and an OpenFlow controller, then the picture changes. Now you have to ask yourself:
    - How do I provide high availability for my OpenFlow controller?
    - If I use a cluster of controllers, how do I ensure consensus among the controllers regarding the network topology? How do I provide scale-out performance to the controller? These are not trivial questions.
    - What OpenStack Neutron plug-in will I use? Does it support all the features of my OpenFlow controller?

    Again, my point is not to say that it can’t be done, but rather that for the vast, VAST majority of folks, the technical expertise required to build an appropriately reliable and scalable solution is beyond their reach. (I haven’t even brought up the issue of supportability yet—who do you call when something goes wrong?)

    All that being said, I do agree that there is tremendous progress being made in the open source world, something that I am glad to see.

  5. Lennie

    I’m not talking about full-blown OpenFlow, just setting up the OVS switch with a few default flows so it would only send the broadcast traffic to the OpenFlow controller and handle it there.

    The OpenFlow controller could just be a slightly hacked version of the ovs-controller that is included with OVS, running on the hypervisor host. All you’ll need to have is that central registry. You wouldn’t need a Neutron plug-in to talk to the OpenFlow controller.

    The central registry could just be something simple like a key-value database, maybe Redis; such a key-value database already supports expiration. Even if the central registry is down, the only inconvenience would be that you’d get flooding.
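
    Something like this toy sketch (the key names, TTL, and registry host are invented): the agent on each hypervisor registers its VMs’ MACs with expiring keys, and the controller consults the registry instead of flooding; if the registry is down, you simply fall back to flooding.

        # Toy sketch of a central MAC registry in Redis with expiring entries.
        import redis

        registry = redis.Redis(host="registry.example.com", port=6379)

        def register_vm(mac, hypervisor_ip, ttl=300):
            # Refreshed periodically by the agent on each hypervisor host.
            registry.setex("mac:" + mac, ttl, hypervisor_ip)

        def lookup(mac):
            # Called by the (hacked ovs-controller style) controller when a
            # broadcast or ARP packet is punted to it.
            value = registry.get("mac:" + mac)
            return value.decode() if value else None  # None means flood

    The "few default flows" could be as little as one ovs-ofctl rule matching dl_dst=ff:ff:ff:ff:ff:ff with an action of "controller".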

    I hope I’m not oversimplifying and forgetting something. :-)

  6. Jason Edelman (@jedelman8)

    Hi Scott,

    One thing to bring up here for the DIY crowd. Brent Salisbury (@networkstatic) is working to integrate OVSDB into the OpenDaylight (ODP) SDN controller project. This will be another tool in the tool belt for managing OVS. Once Neutron support is also added to ODP, it’ll be a solid foundation for testing (and DIY networking) offering OF, OVSDB, and OpenStack support. As you said though, clustering, HA, etc. are still points of concern and should not be forgotten about.

    Reference: https://wiki.opendaylight.org/view/Project_Proposals:OVSDB-Integration

    FWIW, I was actually surprised to not see any mention of higher level network services for what you defined as the core components of network virtualization. Can’t forget about full L2-L7 services :) otherwise, I think you are talking about an overlay solution. Probably splitting hairs, but I remember you writing an overlay vs. NV post!

    -Jason

  7. Ian

    It seems to me that not having a topology product listed in the “tools list” isn’t so much an omission as an indication of how many really capable databases there are in the FOSS realm that can do this pretty much in their sleep, giving you both your consolidated registry of virtual circuits and end-points and your programmatic access to that data. Yes, it is important to this system, but it isn’t something new; it is just called using a database, and we’ve pretty much got that figured out.

    So, using your list, you need a database with a CRUD API capable of storing graphs or AVP tables (or both), a forwarding plane that is programmable, and an encapsulation protocol that is both native and provides for embedding meta-data in the encapsulation headers.

    Then most of your other challenges just become constraints of the paradigm that VMware is selling (and the rest of the dynamic data center world is accepting), which mostly boils down to using bridging for VM to VM communication instead of routing, using IPv4 for the outer layer encapsulation addressing instead of IPv6, and using broadcasts in the forwarding plane instead of using multicasts in the orchestration domain to disseminate all manner of network information to both hosts and guests.

    We’ve been building networks for quite a while now and we generally know that bridging, IPv4, and broadcast messaging are all poor ideas, or at least ideas that have run their course. Also, pretending that overlay networks and VPNs are different things (particularly when the proper, technical classification for things like VXLAN, NVGRE, and STT is Dynamic Multipoint VPN) or that dynamic datacenters don’t have more in common with mobile access networks and campus WLANs than static datacenters is somewhat disingenuous. Just because you are running Ethernet inside the encapsulation protocol and not PPP doesn’t change the basic design pattern that says you have a private address space carried through an address space you don’t control. And just because your mobile end-points don’t leave the building (usually) doesn’t mean you don’t have to keep track of them just the same as a mobile end-point with a radio. In a mobile world, all relationships are person-to-person, all connections are unicast, and the directory is out-of-band, and that is true whether there are radios or hypervisors at the edge.

    So what your questioner might have been thinking is that IPv6 with ESP in transport mode encapsulating IPv4 (or, if you must, Ethernet), tied together with an orchestrated distributed route information server running on every host and ToR L2-4 network node, and hypervisor network functionality that terminates the IPv6 and ESP, obviates every challenge you’ve laid out and gives you a stateless packet-switched network with virtual circuit-switching in the encapsulated L4 payload that is identified by a manifest composed of IPv6 Options headers, ESP SPI, ESP sequence number, and ESP Next Header values.

    Yes, there are some Netfilter API calls that have to be tied up into a pretty interface for an orchestration system to use Linux as the forwarder (and many bar room debates about whether you should), and there are gaps in the hypervisor networking that have to be filled, which also means that you have to get both cloud orchestration frameworks and the OpenFlow controllers to see the network as more than just virtual L1 interfaces and data-links. But doing it is worth the (small technical) effort, since that kind of system is, from a networking perspective, vastly superior to anything based on bridging and switching, and necessary to do anything more advanced than tossing legacy applications into OS virtualization containers and pointing the Internet at them.

    In the end, if you abide by the “Cloud is Infrastructure with an API” worldview, then your network is just infrastructure with an API that you use to populate a directory of end-point to end-point relationships that your forwarding plane consults when it gets a new connection; SDN is just a smart, distributed router with a big forwarding database that pays attention to what else is going on and does the right thing.

  8. Hein Bloed

    Obviously you never heard of BigSwitch (http://www.bigswitch.com/) and
    Floodlight (http://www.projectfloodlight.org/floodlight/).

  9. Lennie

    If you are going to mention them, you should probably also mention:
    http://www.opendaylight.org/

    But those are large projects which use OpenFlow, not something as simple as what the person who talked to Scott was probably thinking about.

  10. slowe

    Jason, I didn’t forget the L4-L7 services; they are essential to true network virtualization. Note that I said “at least three core components”. In the context of this particular discussion, I didn’t feel that bringing L4-L7 services into the mix made sense, since the argument focused more on infrastructure/plumbing than anything else. Thanks!

    Ian, wow! That’s a pretty lengthy comment. Thanks for taking the time to share your thoughts and views; I appreciate it. A couple of things stood out to me. First, you make mention of how L2 bridging, IPv4 instead of IPv6, and broadcast messaging are poor design decisions. I’ll make mention that part of the purpose of a network virtualization solution is to faithfully reproduce the physical network so as to enable applications to move into a logical network without modification. There is nothing inherent to VMware’s abstraction model that requires L2 bridging for VMs; it’s there for workloads that need/want it. Likewise, the underlying protocols (IPv4 or IPv6) are just mechanisms; IPv4 was selected because it is almost universally deployed and well understood. Likewise, VMware’s use of broadcasts is limited to what the workloads generate; the solution itself does not require broadcast in the control plane. All that being said, your points are certainly valid, and there are technologies out there that can be used for DIY network virtualization. My key point, though, is not that it is impossible, but that it is highly impractical for the vast majority of organizations out there.

    Hein, I have heard of Big Switch and Floodlight, thanks. I’m not an expert with them, so I can’t tell you if they can provide the necessary scale and high availability that such a component requires. Can you shed some light on how Floodlight, as an open source OpenFlow controller, handles scale and high availability? This might be something that all readers would be interested in knowing. I’d certainly appreciate any information you can share.

    Lennie, along with Floodlight, OpenDaylight (when it is finally released) may also be an option for building your own OpenFlow-enabled network. As you pointed out, though, this makes what we are discussing a different creature than that with which we started. Thanks!

  11. Lennie

    Scott, on the topic of L4-L7 services, OpenStack already has a number of these done. I believe you would need a few dedicated machines for that, called ‘network nodes’, if I’m not mistaken.

    I’m pretty sure Big Switch as a commercial solution can deliver anything people want; their Floodlight project is more of an entry-level or basic OpenFlow solution, I believe.

    Seems a company also donated a LISP mapping service to the OpenDaylight project for WAN scenarios:

    http://searchsdn.techtarget.com/news/2240203720/ConteXtream-adds-LISP-based-NFV-orchestration-to-OpenDaylight-project

    Let’s just say there is a lot of stuff happening in open source, and it can deliver something which fits the needs of a lot of people, but it won’t be a packaged solution when you download it straight from the site. It will take knowledge and time to pick the right components, create the right architecture, and set it all up.

    There is nothing preventing someone from creating an open source ‘packaged solution’ based on these components, though. Just look at what the Ubuntu or Debian developers did with OpenStack. These are the ‘integrators’ of the ‘open source turnkey solution’. That will take time.

  12. Lennie

    Brent is doing interesting work on OpenDaylight and will be giving a talk at the OpenDaylight mini-summit at LinuxCon:

    http://www.opendaylight.org/blogs/2013/08/opendaylight-developer-spotlight-brent-salisbury-0
    http://www.opendaylight.org/blogs/2013/08/get-first-glimpse-opendaylight-september-18th-new-orleans

    I hope they record a video and post it online, like on http://video.linux.com/

  13. Susan Bilder

    Even if you can’t build a DIY virtual network with the specific set of tools listed, there are, or will soon be, a set of tools to allow you to do this. The question is, are you better off building your own DIY network, or working with a vendor? For your home network, or for vast institutional networks with strict requirements and even stricter change control – DIY is feasible. Easy to back out of on the one hand, and large internal support teams on the other. For the rest of us in the middle, DIY isn’t a workable option.

  14. Lennie

    Today I learned that OpenStack already has a system in Havana to handle the ARP-flooding problem: https://blueprints.launchpad.net/neutron/+spec/l2-population
