Virtualization

You are currently browsing articles tagged Virtualization.

Welcome to part 9 in the Learning NVP/NSX blog series, in which I’ll discuss adding a gateway service to a logical network.

If you are just now joining me for this series, let me bring you up to speed real quick:

This installation in the series builds upon all the previous articles; in particular, I assume that you have a logical network created and configured and that you have an NSX gateway appliance up running and added to the NSX domain. In this post, I’ll show you how to add a logical gateway service to your network so that you can provide routed (layer 3) connectivity into and out of your logical network.

Before I start, I think it’s important to distinguish between a gateway appliance and a gateway service. A gateway appliance is a physical installation (or a VM; it’s supported as a VM in certain configurations) of the NSX gateway software; this is what I showed you how to set up in part 6 of the series. A gateway service, on the other hand, is a logical construct. Typically, you would have multiple gateway appliances. Then you would create a gateway service that would be instantiated across multiple gateway appliances for redundancy and availability. Gateway services can be either layer 2 (provided bridged connectivity between a logical network and VLANs on the physical network) or layer 3 (providing routed—with or without NAT—connectivity between a logical network and the physical network).

It’s also important to understand the distinction between a gateway service and a logical router. A gateway service is an NSX construct; a logical router, on the other hand, is usually a construct of the cloud management platform (like OpenStack). A single gateway service can host many logical routers.

In this series, I’m only going to focus on layer 3 gateway services (primarily because I have limited resources in my environment and can’t run both layer 2 and layer 3 gateway services).

To create a layer 3 gateway service, you’ll follow these steps from within NSX Manager (formerly NVP Manager, which I showed you how to set up in part 3):

  1. From the menu across the top of the NSX Manager page, select Network Components > Services > Gateway Services. This will take you to a page titled “Network Components Query Results,” where NSX Manager has precreated and executed a query for the list of gateway services. Your list will be empty, naturally.

  2. Click the Add button. This will open the Create Gateway Service dialog.

  3. Select “L3 Gateway Service” from the list. Other options in this list include “L2 Gateway Service” (to create a layer 2 gateway service) and “VTEP L2 Gateway Service” (to integrate a third-party top-of-rack [ToR] switch into NSX). Click Next, or click on the “2. Basics” button on the left.

  4. Provide a display name for the new layer 3 gateway service, then click Next (or click on “3. Transport Nodes” on the left). You can optionally add tags here as well, in case you wanted to associate additional metadata with this logical object in NSX.

  5. On the Transport Nodes screen, click Add Gateway to select a gateway appliance (which is classified as a transport node within NSX; hypervisors are also transport nodes) to host this layer 3 gateway service.

  6. From the Edit Gateway dialog box that pops up, you’ll need to select a transport node, a device ID, and a failure zone ID. The first option, the transport node, is pretty straightforward; this is a gateway appliance on which to host this gateway service. The device ID is the bridge (recall that NSX gateway appliances, by default, create OVS bridges to map to their interfaces) connected to the external network. The failure zone ID lets you “group” gateway appliances with regard to their relationship to the gateway service. For example, you could choose to host a gateway service on a gateway from failure zone 1 as well as failure zone 2. (Failure zones are intended to help represent different failure domains within your data center.)

  7. Once you’ve added at least two gateway appliances as transport nodes for your gateway service, click Save to create the gateway service and return to NSX Manager. That’s it—you’re done!

Note that this is more of an implementation task than an operational task. In other words, you’d generally deploy your gateway services when you first set up NSX and your cloud management platform, or when you are adding capacity to your environment. This isn’t something that you have to do when a cloud tenant (customer) needs a logical router; that’s handled automatically through the integration between NSX and the cloud management platform. As I mentioned earlier, a single layer 3 gateway service could host many logical routers.

In the next installation of the series, I’ll walk you through setting up an NSX service node to offload packet replication (for broadcast, unknown unicast, and multicast traffic).

As always, your feedback is welcome and encouraged, so feel free to speak up in the comments below.

Tags: , , , ,

Welcome to Technology Short Take #38, another installment in my irregularly-published series that collects links and thoughts on data center-related technologies from around the web. But enough with the introduction, let’s get on to the content already!

Networking

  • Jason Edelman does some experimenting with the Python APIs on a Cisco Nexus 3000. In the process, he muses about the value of configuration management tool chains such as Chef and Puppet in a world of “open switch” platforms such as Cumulus Linux.
  • Speaking of Cumulus Linux…did you see the announcement that Dell has signed a reseller agreement with Cumulus Networks? I’m pretty excited about this announcement, and I hope that Cumulus sees great success as a result. There are a variety of write-ups about the announcement; so good, many not so good. The not-so-good variety typically refers to Cumulus’ product as an SDN product when technically it isn’t. This article on Barron’s by Tiernan Ray is a pretty good summary of the announcement and some of its implications.
  • Pete Welcher has launched a series of articles discussing “practical SDN,” focusing on the key leaders in the market: NSX, DFA, and the yet-to-be-launched ACI. In the initial installation of the series, he does a good job of providing some basics around each of the products, although (as would be expected of a product that hasn’t launched yet) he has to do some guessing when it comes to ACI. The series continues with a discussion of L2 forwarding and L3 forwarding across the various products. Definitely worth reading, in my opinion.
  • Nick Buraglio takes away all your reasons for not collecting flow-based data from your environment with his write-up on installing nfsen and nfdump for NetFlow and/or sFlow collection.
  • Terry Slattery has a nice write-up on new network designs that are ideally suited for SDN. If you are looking for a primer on “next-generation” network designs, this is worth reviewing.
  • Need some Debian packages for Open vSwitch 2.0? Here’s another article from Nick Buraglio—he has some information to help you out.

Servers/Hardware

Nothing this time, but check back next time.

Security

Nothing from my end. Maybe you have something you’d like to share in the comments?

Cloud Computing/Cloud Management

  • Christian Elsen (who works in Integration Engineering at VMware) has a nice series of articles going on using OpenStack with vSphere and NSX. The series starts here, but follow the links at the bottom of that article for the rest of the posts. This is really good stuff—he includes the use of the NSX vSwitch with vSphere 5.5, and talks about vSphere OpenStack Virtual Appliance (VOVA) as well. All in all, well worth a read in my opinion.
  • Maish Saidel-Keesing (one of my co-authors on the first edition of VMware vSphere Design and also a super-sharp guy) recently wrote an article on how adoption of OpenStack will slow the adoption of SDN. While I agree that widespread adoption of OpenStack could potentially retard the evolution of enterprise IT, I’m not necessarily convinced that it will slow the adoption of SDN and network virtualization solutions. Why? Because, in part, I believe that the full benefits of something like OpenStack need a good network virtualization solution in order to be realized. Yes, some vendors are writing plugins for Neutron that manipulate physical switches. But for developers to get true isolation, application portability, the ability to re-create production environments in development—all that is going to require network virtualization.
  • Here’s a useful OpenStack CLI cheat sheet for some commonly-used commands.

Operating Systems/Applications

  • If you’re using Ansible (a product I haven’t had a chance to use but I’m closely watching), but I came across this article on an upcoming change to the SSH transport that Ansible uses. This change, referred to as “ssh_alt,” promises a significant performance increase for Ansible. Good stuff.
  • I don’t think I’ve mentioned this before, but Forbes Guthrie (my co-author on the VMware vSphere Design books and an already great guy) has a series going on using Linux as a domain controller for a vSphere-based lab. The series is up to four parts now: part 1, part 2, part 3, and part 4.
  • Need (or want) to increase the SCSI timeout for a KVM guest? See these instructions.
  • I’ve been recommending that IT pros get more familiar with Linux, as I think its influence in the data center will continue to grow. However, the problem that I sometimes face is that experienced folks tend to share these “super commands” that ordinary folks have a hard time decomposing. However, this site should make that easier. I’ve tried it—it’s actually pretty handy.

Storage

  • Jim Ruddy (an EMCer, former co-worker of mine, and an overall great guy) has a pretty cool series of articles discussing the use of EMC ViPR in conjunction with OpenStack. Want to use OpenStack Glance with EMC ViPR using ViPR’s Swift API support? See here. Want a multi-node Cinder setup with ViPR? Read how here. Multi-node Glance with ViPR? He’s got it. If you’re new to ViPR (who outside of EMC isn’t?), you might also find his articles on deploying EMC ViPR, setting up back-end storage for ViPR, or deploying object services with ViPR to also be helpful.
  • Speaking of ViPR, EMC has apparently decided to release it for free for non-commercial use. See here.
  • Looking for more information on VSAN? Look no further than Cormac Hogan’s extensive VSAN series (up to Part 14 at last check!). The best way to find this stuff is to check articles tagged VSAN on Cormac’s site. The official VMware vSphere blog also has a series of articles running; check out part 1 and part 2.

Virtualization

  • Did you happen to see this news about Microsoft Hyper-V Recovery Manager (HRM)? This is an Azure-hosted service that can be roughly compared to VMware’s Site Recovery Manager (SRM). However, unlike SRM (which is hosted on-premise), HRM is hosted by Microsoft Azure. As the article points out, it’s important to understand that this doesn’t mean your VMs are replicated to Azure—it’s just the orchestration portion of HRM that is running in Azure.
  • Oh, and speaking of Hyper-V…in early January Microsoft released version 3.5 of their Linux Integration Services, which primarily appears to be focused on adding Linux distribution support (CentOS/RHEL 6.5 is now supported).
  • Gregory Gee has a write-up on installing the Cisco CSR 1000V in VirtualBox. (I’m a recent VirtualBox convert myself; I find the vboxmanage command just so very handy.) Note that I haven’t tried this myself, as I don’t have a Cisco login to get the CSR 1000V code. If any readers have tried it, I’d love to hear your feedback. Gregory also has a few other interesting posts I’m planning to review in the next few weeks as well.
  • Sunny Dua, who works with VMware PSO in India, has a series of blog posts on architecting vSphere environments. It’s currently up to five parts; I don’t know how many more (if any) are planned. Here are the links: part 1 (clusters), part 2 (vCenter SSO), part 3 (storage), part 4 (design process), and part 5 (networking).

It’s time to wrap up now before this gets any longer. If you have any thoughts or tidbits you’d like to share, I welcome any and all courteous comments. Join (or start) the conversation!

Tags: , , , , , , , , , , , ,

A couple of weeks ago I had the privilege of joining Richard Campbell on RunAs Radio to talk VMware NSX and network virtualization. If you’d like to hear us get geeky about network virtualization and where it might take our industry, head over and listen to episode 346. I’d love to hear your feedback!

Tags: , , ,

Welcome to Technology Short Take #37, the latest in my irregularly-published series in which I share interesting articles from around the Internet, miscellaneous thoughts, and whatever else I feel like throwing in. Here’s hoping you find something useful!

Networking

  • Ivan does a great job of describing the difference between the management, control, and data planes, as well as providing examples. Of course, the distinction between control plane protocols and data plane protocols isn’t always perfectly clear.
  • You’ve heard me talk about snowflake servers before. In this post on why networking needs a Chaos Monkey, Mike Bushong applies to the terms to networks—a snowflake network is an intricately crafted network that is carefully tailored to utilize a custom subset of networking features unique to your environment. What is the fix—if one exists—to snowflake networks? Designing your network for resiliency and unleashing a Chaos Monkey on it is one way, as Mike points out. A fan of network virtualization might also say that decomposing today’s complex physical networks into multiple simple logical networks on top of a simpler physical transport network—similar to Mike’s suggestion of converging on a smaller set of reference architectures—might also help. (Of course, I am a fan of network virtualization, since I work with/on VMware NSX.)
  • Martijn Smit has launched a series of articles on VMware NSX. Check out part 1 (general introduction) and part 2 (distributed services) for more information.
  • The elephants and mice post at Network Heresy has sparked some discussion across the “blogosphere” about how to address this issue. (Note that my name is on the byline for that Network Heresy post, but I didn’t really contribute all that much.) Jason Edelman took up the idea of using OpenFlow to provide a dedicated core/spine for elephant flows, while Marten Terpstra at Plexxi talks about how Plexxi’s Affinities could be used to help address the problem of elephant flows. Peter Phaal speaks up in the comments to Marten’s article about how sFlow can be used to rapidly detect elephant flows, and points to a demo taking place during SC13 that shows sFlow tracking elephant flows on SCinet (the SC13 network).
  • Want some additional information on layer 2 and layer 3 services in VMware NSX? Here’s a good source.
  • This looks interesting, but I’m not entirely sure how I might go about using it. Any thoughts?

Servers/Hardware

Nothing this time around, but I’ll keep my eyes peeled for something to include next time!

Security

I don’t have anything to share this time—feel free to suggest something to include next time.

Cloud Computing/Cloud Management

Operating Systems/Applications

  • I found this post on getting the most out of HAProxy—in which Twilio walks through some of the configuration options they’re using and why—to be quite helpful. If you’re relatively new to HAProxy, as I am, then I’d recommend giving this post a look.
  • This list is reasonably handy if you’re not a Terminal guru. While written for OS X, most of these tips apply to Linux or other Unix-like operating systems as well. I particularly liked tip #3, as I didn’t know about that particular shortcut.
  • Mike Preston has a great series going on tuning Debian Linux running under vSphere. In part 1, he covered installation, primarily centered around LVM and file system mount options. In part 2, Mike discusses things like using the appropriate virtual hardware, the right kernel modules for VMXNET3, getting rid of unnecessary hardware (like the virtual floppy), and similar tips. Finally, in part 3, he talks about a hodgepodge of tips—things like blacklisting other unnecessary kernel drivers, time synchronization, and modifying the Linux I/O scheduler. All good stuff, thanks Mike!

Storage

  • “Captain KVM,” aka Jon Benedict, takes on the discussion of enterprise storage vs. open source storage solutions in OpenStack environments. One good point that Jon makes is that solutions need to be evaluated on a variety of criteria. In other words, it’s not just about cost nor is it just about performance. You need to use the right solution for your particular needs. It’s nice to see Jon say that if your needs are properly met by an open source solution, then “by all means stick with Ceph, Gluster, or any of the other cool software storage solutions out there.” More vendors need to adopt this viewpoint, in my humble opinion. (By the way, if you’re thinking of using NetApp storage in an OpenStack environment, here’s a “how to” that Jon wrote.)
  • Duncan Epping has a quick post about a VMware KB article update regarding EMC VPLEX and Storage DRS/Storage IO Control. The update is actually applicable to all vMSC configurations, so have a look at Duncan’s article if you’re using or considering the use of vMSC in your environment.
  • Vladan Seget has a look at Microsoft ReFS.

Virtualization

I’d better wrap it up here so this doesn’t get too long for folks. As always, your courteous comments and feedback are welcome, so feel free to start (or join) the discussion below.

Tags: , , , , , , ,

In part 7 of the Learning NVP series, I mentioned that I was planning to transition this series from NVP to NSX through an upgrade. I had an existing NVP installation running (all virtually) inside an OpenStack cloud, and I would just upgrade that to NSX 4.0.0. Here’s a quick update on that plan and the NVP-to-NSX transition.

As I mentioned, I have an installation of NVP 3.1.1 running successfully in a nested (virtualized) environment. (Yes, it is possible to run all of NVP completely virtualized, though we don’t support that for production environments.) Starting with NVP 3.1.x, NVP offered an “Update Coordinator” that coordinated and orchestrated the upgrade of the various components within an NVP domain. Since I was running NVP 3.1.1, I could just use the Update Coordinator to upgrade my installation and walk you (the readers) through the process along the way.

Using the Update Coordinator (which is built into NVP Manager), an NVP upgrade would typically look something like this:

  • You’d log into NVP Manager and go to the Update Coordinator screen.
  • If you hadn’t already, you’d upload the update files (appliance update files and OVS update files) to NVP Manager.
  • Once all the update files were uploaded, you’d select the version to which you’re upgrading and kick it off.
  • NVP Manager itself is upgraded first.
  • Next, the Update Coordinator pushes out the appliance update files (sometimes called NUB files because of their .nub extension) out to all the appliances (service node, gateways, and controllers).
  • Next, the non-hypervisor transport nodes are upgraded (this is the service nodes and gateways).
  • Following that, the hypervisors need to be upgraded, though this isn’t handled by the Update Coordinator. (You could, of course, leverage a tool like Puppet or Chef or similar to help automate this process.)
  • After you’ve verified that the hypervisors have been updated, then the Update Coordinator upgrades the controller nodes.
  • Following the successful upgrade of the controller nodes, there is a cleanup phase and then you’re all set.

This is really high-level and I’m glossing over some details, naturally. Because an NVP upgrade is a pretty big deal—it could have an effect on the network connectivity of all the VMs and hypervisors within the NVP domain—it typically involves lots of planning, lots of testing, proper backups of all the components, and so on. However, since this was a lab environment and not a real production environment, just running through the Update Coordinator should have been fine.

As it turns out, though, I ran into a few problems—not problems with NVP, but problems with how I had deployed it. Basically, I didn’t do my due diligence and read the documentation.

When I first deployed the virtualized NVP appliances, I selected VMs that had a 10GB root disk. While this was enough to get NVP up and running, it turns out that it is not enough space to perform an upgrade. Specifically, it’s not enough space to do an upgrade on the controllers; the transport nodes upgraded successfully. After the installation of the controllers, I was left with only a couple gigabytes of free space remaining. A fair portion of that is taken up then by the appliance update file, and this did not leave enough to actually perform the controller software upgrade.

Unfortunately, there was no easy workaround. Because the NVP controller cluster is scale out and highly available, I could have taken the controllers out (one at a time), rebuilt them with more disk space, and then re-joined the cluster—a rolling upgrade, if you will. However, because NVP 3.1.1 is a much older build of NVP, it wasn’t possible to rebuild the controllers with a matching software version (not easily, anyway).

So, long story short: instead of wasting cycles trying to fix a deployment issue that is completely my fault (and, by the way, completely documented—had I paid closer attention to the documentation I wouldn’t find myself in this position), I’m simply going to rebuild my lab environment from scratch using NSX 4.0.0. I had really hoped to be able to walk you through the upgrade process, but sadly it just doesn’t make sense to do so.

This will be the last post titled “Learning NVP”; moving forward, all future posts will be titled “Learning NSX.” The next post will discuss adding a gateway service to a logical network; this builds on information from part 5 (creating a logical network) and part 6 (adding a gateway appliance).

As always, your feedback is welcome and encouraged, so feel free to speak up in the comments below.

Tags: , , , ,

In this post I’m going to expand a little bit on using libvirt to connect Linux containers (created using LXC) to Open vSwitch (OVS). I made brief mention of this in my post on using LXC with libvirt, but did not provide any details. This post aims to provide those details.

I’m assuming that you’re already familiar with LXC, OVS, and libvirt. If you aren’t familiar with these projects, I suggest you have a look back at other articles I’ve written about them in the past. One of the easiest ways to do that is to browse articles tagged LXC, tagged OVS, and/or tagged Libvirt. Further, I’m using Ubuntu 12.04 LTS in my environment, so if you’re using another Linux distribution please note that some commands and/or package names might be different.

The basic process for connecting a Linux container to OVS using libvirt looks something like this:

  1. Create one or more virtual networks in libvirt to “front-end” OVS.
  2. Create your container(s) using standard LXC user-space tools.
  3. Create libvirt XML definitions for your container(s).
  4. Start the container(s) using virsh.

Steps 2, 3, and 4 were covered in my previous post on using LXC and libvirt, so I won’t repeat them here. Step 1 is the focus here. (If you are a long-time reader and/or well-versed with libvirt and OVS, there isn’t a great deal of new information here; I just wanted to present it in the context of LXC for the sake of completeness.)

To create a libvirt virtual network to front-end OVS, you need to create an XML definition that you can use with virsh to define the virtual network. Here’s an example XML definition:

(If you can’t see the code block above, please click here.)

A few notes about this XML definition:

  • You normally wouldn’t include the UUID, as that is generated automatically by libvirt. If you were using this XML to create the virtual network from scratch, I would recommend just deleting the UUID line.
  • The network is named “bridged”, and points to the OVS bridge named br-ex. In this particular case, br-ex is a simple OVS bridge that contains a single physical interface.
  • This particular virtual network only has a single portgroup configured for untagged traffic. If you wanted to provide a virtual network that supported multiple VLANs, you could add more portgroups with the VLAN tags as I describe in my post on using VLANs with OVS and libvirt. You’d then modify the container’s XML definition to point to the appropriate portgroup, and in this way you could easily support running multiple containers across multiple VLANs on a single host.
  • A libvirt virtual network can only point to a single bridge, so if you wanted to support both bridged (as shown here) as well as tunneled connectivity (perhaps as described in my post on LXC, OVS, and GRE tunnels), you would need to create a second XML definition that creates a separate virtual network. You could then modify the container’s XML definition to point to the new network you just created.

In the bullets above, I mentioned modifying the container’s XML definition. In particular, I’m referring to the <interface type='network‘> portion of the container’s XML definition. To use a libvirt network for a container’s network connectivity, you’d specify <source network='bridged'/> (replacing “bridged” with whatever the name of your virtual network is; I’m using the name provided in the sample XML code above). For multiple interfaces in the container, simply supply multiple <interface type='network'> entries in the container’s XML definition, and configure the source network for each of them appropriately.

Hopefully this post provides some additional details and information on using libvirt to connect Linux containers to OVS. If you have any questions, or if you have more information to share on this topic, please feel free to speak up in the comments below. I encourage and welcome all courteous feedback!

Tags: , , , , ,

In this post, I’ll discuss how you could use Open vSwitch (OVS) and GRE tunnels to connect bare metal workloads. While OVS is typically used in conjunction with a hypervisor such as KVM or Xen, you’re certainly not restricted to only using it on hypervisors. Similarly, while GRE tunnels are commonly used to connect VMs or containers, you’re definitely not restricted from using them with bare metal workloads as well. In this post, I’ll explore how you would go about connecting bare metal workloads over GRE tunnels managed by OVS.

This post, by the way, was sparked in part by a comment on my article on using GRE tunnels with OVS, in which the reader asked: “Is there a way to configure bare Linux (Ubuntu)…with OVS installed…to serve as a tunnel endpoint…?” Hopefully this post helps answer that question. (By the way, the key to understanding how this works is in understanding OVS traffic patterns. If you haven’t yet read my post on examining OVS traffic patterns, I highly recommend you go have a look right now. Seriously.)

Once you have OVS installed (maybe this is helpful?), then you need to create the right OVS configuration. That configuration can be described, at a high level, like this:

  • Assign an IP address to a physical interface. This interface will be considered the “tunnel endpoint,” and therefore should have an IP address that is correct for use on the transport network.
  • Create an OVS bridge that has no physical interfaces assigned.
  • Create an OVS internal interface on this OVS bridge, and assign it an IP address for use inside the GRE tunnel(s). This interface will be considered the primary interface for the OS instance.
  • Create the GRE tunnel for connecting to other tunnel endpoints.

Each of these areas is described in a bit more detail in the following sections.

Setting Up the Transport Interface

When setting up the physical interface—which I’ll refer to as the transport interface moving forward, since it is responsible for transporting the GRE tunnel across to the other endpoints—you’ll just need to use an IP address and routing entries that enable it to communicate with other tunnel endpoints.

Let’s assume that we are going to have tunnel endpoints on the 192.168.1.0/24 subnet. On the bare metal OS instance, you’d configure a physical interface (I’ll assume eth0, but it could be any physical interface) to have an IP address on the 192.168.1.0/24 subnet. You could do this automatically via DHCP or manually; the choice is yours. Other than ensuring that the bare metal OS instance can communicate with other tunnel endpoints, no additional configuration is required. (I’m using “required” as in “necessary to make it work.” You may want to increase the MTU on your physical interface and network equipment in order to accommodate the GRE headers in order to optimize performance, but that isn’t required in order to make it work.)

Once you have the transport interface configured and operational, you can move on to configuring OVS.

Configuring OVS

If you’ve been following along at home with all of my OVS-related posts (you can browse all posts using the OVS tag), you can probably guess what this will look like (hint: it will look a little bit like the configuration I described in my post on running host management through OVS). Nevertheless, I’ll walk through the configuration for the benefit of those who are new to OVS.

First, you’ll need to create an OVS bridge that has no physical interfaces—the so-called “isolated bridge” because it is isolated from the physical network. You can call this bridge whatever you want. I’ll use the name br-int (the “integration bridge”) because it’s commonly used in other environments like OpenStack and NVP/NSX.

To create the isolated bridge, use ovs-vsctl:

ovs-vsctl add-br br-int

Naturally, you would substitute whatever name you’d like to use in the above command. Once you’ve created the bridge, then add an OVS internal interface; this internal interface will become the bare metal workload’s primary network interface:

ovs-vsctl add-port br-int mgmt0 -- set interface mgmt0 type=internal

You can use a name other than mgmt0 if you so desire. Next, configure this new OVS internal interface at the operating system level, assigning it an IP address. This IP address should be taken from a subnet “inside” the GRE tunnel, because it is only via the GRE tunnel that you’ll want the workload to communicate.

The following commands will take care of this part for you:

ip addr add 10.10.10.30/24 dev mgmt0
ip link set mgmt0 up

The process of ensuring that the mgmt0 interface comes up automatically when the system boots is left as an exercise for the reader (hint: use /etc/network/interfaces).

At this point, the bare metal OS instance will have two network interfaces:

  • A physical interface (we’re assuming eth0) that is configured for use on the transport network. In other words, it has an IP address and routes necessary for communication with other tunnel endpoints.
  • An OVS internal interface (I’m using mgmt0) that is configured for use inside the GRE tunnel. In other words, it has an IP address and routes necessary to communicate with other workloads (bare metal, containers, VMs) via the OVS-hosted GRE tunnel(s).

Because the bare metal OS instance sees two interfaces (and therefore has visibility into the routes both “inside” and “outside” the tunnel), you may need to apply some policy routing configuration. See my introductory post on Linux policy routing if you need more information.

The final step is establishing the GRE tunnel.

Establishing the GRE Tunnel

The commands for establishing the GRE tunnel have been described numerous times, but once again I’ll walk through the process just for the sake of completeness. I’m assuming that you’ve already completed the steps in the previous section, and that you are using an OVS bridge named br-int.

First, add the GRE port to the bridge:

ovs-vsctl add-port br-int gre0

Next, configure the GRE interface on that port:

ovs-vsctl set interface gre0 type=gre options:remote_ip=<IP address of remote tunnel endpoint>

Let’s say that you’ve assigned 192.168.1.10 to the transport interface on this system (the bare metal OS instance), and that the remote tunnel endpoint (which could be a host with multiple containers, or a hypervisor running VMs) has an IP address of 192.168.1.15. On the bare metal system, you’d configure the GRE interface like this:

ovs-vsctl set interface gre0 type=gre options:remote_ip=192.168.1.15

On the remote tunnel endpoint, you’d configure the GRE interface like this:

ovs-vsctl set interface gre0 type=gre options:remote_ip=192.168.1.10

In other words, each GRE interface points to the transport IP address on the opposite end of the tunnel.

Once the configuration on both ends is done, then you should be able to go into the bare metal OS instance and ping an IP address inside the GRE tunnel. For example, I used this configuration to connect a bare metal Ubuntu 12.04 instance, a container running on an Ubuntu host, and a KVM VM running on an Ubuntu host (I had a full mesh topology with STP enabled, as described here). I was able to successfully ping between the bare metal OS instance, the container, and the VM, all inside the GRE tunnel.

Summary, Caveats, and Other Thoughts

While this configuration is interesting as a “proof of concept” that OVS and GRE tunnels can be used to connect bare metal OS instances and workloads, there are a number of considerations and/or caveats that you’ll want to think about before trying something like this in a production environment:

  • The bare metal OS instance has visibility both “inside” and “outside” the tunnel, so there isn’t an easy way to prevent the bare metal OS instance from communicating outside the tunnel to other entities. This might be OK—or it might not. It all depends on your requirements, and what you are trying to achieve. (In theory, you might be able to provide some isolation using network namespaces, but I haven’t tested this at all.)
  • If you want to create a full mesh topology of GRE tunnels, you’ll need to enable STP on OVS.
  • There’s nothing preventing you from attaching an OpenFlow controller to the OVS instances (including the OVS instance on the bare metal OS) and pushing flow rules down. This would eliminate the need for STP, since OVS won’t be in MAC learning mode. This means you could easily incorporate bare metal OS instances into a network virtualization-type environment. However…
  • There’s no easy way to provide a separation of OVS and the bare metal OS instance. This means that users who are legitimately allowed to make administrative changes to the bare metal OS instance could also make changes to OVS, which could easily “break” the configuration and cause problems. My personal view is that this is why you rarely see this sort of OVS configuration used in conjunction with bare metal workloads.

I still see value in explaining how this works because it provides yet another example of how to configure OVS and how to use OVS to help provide advanced networking capabilities in a variety of environments and situations.

If you have any questions, I encourage you to add them in the comments below. Likewise, if I have overlooked something, made any mistakes, or if I’m just plain wrong, please speak up below (courteously, of course!). I welcome all useful/pertinent feedback and interaction.

Tags: , , , , , ,

One of the cool things about libvirt is the ability to work with multiple hypervisors and virtualization technologies, including Linux containers using LXC. In this post, I’m going to show you how to use libvirt with LXC, including leveraging libvirt to help automate attaching containers to Open vSwitch (OVS).

If you aren’t familiar with Linux containers and LXC, I invite you to have a look at my introductory post on Linux containers and LXC. It should give you enough background to make this post make sense.

To use libvirt with an LXC container, there are a couple of basic steps:

  1. Create the container using standard LXC user-space tools.
  2. Create a libvirt XML definition for the container.
  3. Define the libvirt container domain.
  4. Start the libvirt container domain.

The first part, creating the container, is pretty straightforward:

lxc-create -t ubuntu -n cn-02

This creates a container using the Ubuntu template and calls it cn–01. As you may recall from my introductory LXC post, this creates the container’s configuration and root filesystem in /var/lib/lxc by default. (I’m assuming you are using Ubuntu 12.04 LTS, as I am.)

Once you have the container created, you next need to get it into libvirt. Libvirt uses a standard XML-based format for defining VMs, containers, networks, etc. At first, I thought this might be the most difficult section, but thanks to this page I was able to create a template XML configuration.

Here’s the template I was able to create:

(If you can’t see the embedded code above, please click here.)

Simply take this XML template and save it as something like lxc-template.xml or similar. Then, after you’ve created your container using lxc-create as above, you can easily take this template and turn it into a specific container configuration with only one command. For example, suppose you created a container named “cn–02″ (as I did with the command I showed earlier). If you wanted to customize the XML template, just use this simple Unix/Linux command:

sed 's/REPLACE/cn-02/g' lxc-template.xml > cn-02.xml

Once you have a container-specific libvirt XML configuration, then defining it in libvirt is super-easy:

virsh -c lxc:// define cn-02.xml

Then start the container:

virsh -c lxc:// start cn-02

And connect to the container’s console:

virsh -c lxc:// console cn-02

When you’re done with the container’s console, press Ctrl-] (that’s Control and right bracket at the same time); that will return you to your host.

Pretty handy, eh? Further, since you’re now controlling your containers via libvirt, you can leverage libvirt’s networking functionality as well—which means that you can easily create libvirt virtual networks backed by OVS and automatically attach containers to OVS for advanced networking configurations. You only need to create an OVS-backed virtual network like I describe in this post on VLANs with OVS and libvirt.

I still need to do some additional investigation and testing to see how the networking configuration in the container’s config file interacts with the networking configuration in the libvirt XML file. For example, how do you define multiple network interfaces? Can you control the name of the veth pairs that show up in the host? I don’t have any answers for these questions (yet). If you know the answers, feel free to speak up in the comments!

All courteous feedback and interaction is welcome, so I invite you to start (or join) the discussion via the comments below.

Tags: , , , , , ,

This post will show you how to use Open vSwitch (OVS) to connect LXC containers on different hosts together using GRE tunnels. This process is quite similar to using OVS and GRE tunnels to connect VMs, and even has some similarity to connecting network namespaces over a GRE tunnel.

If you’re not familiar with LXC, my introductory post on LXC might be useful. I also found this post on networking with LXC to be extremely helpful, as was this overview of LXC on Ubuntu 12.04. As you might guess, I used Ubuntu 12.04.3 LTS for my testing; note that if you are using a different distribution, the specific commands and/or file locations may differ.

At a high level, the process for creating this configuration looks like this:

  • Create your containers using virtual Ethernet (veth) networking.
  • Create a OVS bridge that does not contain any physical interfaces and attach a GRE interface configured to connect to a remote container host.
  • Attach the containers’ veth interfaces to this new OVS bridge.

Let’s take a look at each of these steps in a bit more detail.

Create the Containers

My LXC introduction post covers this in a fair amount of detail, but I’ll walk you through the steps again just for the sake of completeness. Additionally, I wanted to use containers with two separate interfaces: a “public” interface that could reach the outside world, and a “private” interface for inter-container communications.

To create the containers, you can simply use lxc-create to create the basic container. For example, if you are also using Ubuntu 12.04 64-bit for your testing, then creating a container with the same release and architecture would require only this command (note you might need to prepend sudo to this command, depending on your specific configuration):

lxc-create -t ubuntu -n <container name>

Once the base container is created, you can then edit the container configuration (which is found at /var/lib/lxc/container name/config by default) to add the second interface. You’d add this text to the configuration:

lxc.network.type = veth
lxc.network.flags = up
lxc.network.veth.pair = c1eth1
lxc.network.ipv4 = 10.10.10.10/24

In my specific testing, I left the container’s first network interface attached to the default lxcbr0 Linux bridge, which provides connectivity to external networks via NAT. If you wanted bridged connectivity, you’d need to set that up separately. Also, feel free to substitute a different name for c1eth1 above; this just provides a user-recognizable name for the host-facing side of the veth pair that provides connectivity for the container.

After you’ve created and configured the container(s) appropriately, start the container with lxc-start -n <container name> to ensure that it boots successfully and without any unexpected errors. Then shut it down and start it again with the console detached, like so:

lxc-start -d -n <container name>

Now that the container is up and running, you’re ready to configure OVS.

Create and Configure the OVS Bridge

If you’re not familiar with connecting things with GRE tunnels using OVS, I suggest you take a look here and here. You will probably also find this post on examining OVS traffic patterns to be helpful in understanding why you will configure OVS in this particular way.

Create the OVS bridge you’ll use for inter-container connectivity using ovs-vsctl, like this (feel free to substitute whatever name you want for “br-int” in the command below):

ovs-vsctl add-br br-int

Add a GRE interface to this bridge:

ovs-vsctl add-port br-int gre0

Then configure the GRE interface appopriately:

set interface gre0 type=gre options:remote_ip=192.168.1.100

The IP address specified above should point to a reachable interface on the remote host where the other container(s) is/are running. Also, you are welcome to use a different name for the port and interface than what I used (I used gre0), but be sure the port and interface match (as far as I know you can’t use one name for the port and a different name for the interface).

If you are connecting only two hosts with containers, then you’ll repeat this configuration on the second host, except that its GRE tunnel will point back to the first host (the first host specifies the IP address of the second host, the second host specifies the IP address of the first host).

If you are connecting more than two hosts, you’ll either want to a) manually build a non-looping topology of GRE tunnels, or b) build a full mesh of GRE tunnels and enable STP, as outlined here.

Once you’ve created the OVS bridge and the GRE interface, you’re ready for the final step: connecting the containers to OVS.

Attach the Containers to OVS

The final step is to connect the container’s veth interface to OVS. If you’re using an early enough version of OVS that still contains the Linux bridge compatibility code, then you can just specify the name of the OVS bridge with the lxc.network.link directive in the container’s configuration file. Newer versions of OVS, however, lack the Linux bridge compatibility layer, so this doesn’t work. In this case, you’ll need to manually attach the veth interfaces to OVS.

To manually attach the container’s veth interface to OVS, you must first identify the veth interface. This is why I used the lxc.network.veth.pair configuration directive in the container to assign a user-recognizable name. If you didn’t use that directive, you can use the ip link list command to list the interfaces.

Once you’ve identified the correct veth interface, simply add it to OVS with this command:

ovs-vsctl add-port br-int c1eth1

I used the c1eth1 name in the command above; you’d substitute the correct name for the appropriate veth interface for your container(s).

Once that is done, you should be able to ping over the GRE tunnel using the IP addresses assigned to the inter-container communication interfaces (which should be recognized as eth1 in the containers). Congratulations—you’ve just connected containers on two different hosts together over a GRE tunnel!

There are a couple of things to note about this configuration:

  • Note that we only used a single interface on the host. Obviously, we could have used more (for example, we could have use eth0 for host management and eth1 for host-to-host GRE tunnel connections), but using only a single host interface allows us to use this method in public cloud environments that only provide a single interface.
  • While the GRE tunnel provides encapsulation, it does not provide encryption.
  • The process of attaching the containers’ veth interface to OVS is manual right now, so that won’t scale in an environment of any size. I’m exploring ways to help automate that process. One possible avenue is the use of libvirt, so stay tuned for posts describing any progress I make in that area.

If you have any questions, corrections, suggestions, or clarifications, please feel free to post them in the comments below. All feedback is welcome and appreciated.

Tags: , , , , ,

In this post, I’m going to provide a brief introduction to working with Linux containers via LXC. Linux containers are getting a fair amount of attention these days (perhaps due to Docker, which leverages LXC on the back-end) as a lightweight alternative to full machine virtualization such as that provided by “traditional” hypervisors like KVM, Xen, or ESXi.

Both full machine virtualization and containers have their advantages and disadvantages. Full machine virtualization offers greater isolation at the cost of greater overhead, as each virtual machine runs its own full kernel and operating system instance. Containers, on the other hand, generally offer less isolation but lower overhead through sharing certain portions of the host kernel and operating system instance. In my opinion full machine virtualization and containers are complementary; each offers certain advantages that might be useful in specific situations.

Now that you have a rough idea of what containers are, let’s take a closer look at using containers with LXC. I’m using Ubuntu 12.04.3 LTS for my testing; if you’re using something different, keep in mind that certain commands may differ from what I show you here.

Installing LXC is pretty straightforward, at least on Ubuntu. To install LXC, simply use apt-get:

apt-get install lxc

Once you have LXC installed, your next step is creating a container. To create a container, you’ll use the lxc-create command and supply the name of the container template as well as the name you want to assign to the new container:

lxc-create -t <template> -n <container name>

You’ll need Internet access to run this command, as it will download (via your configured repositories) the necessary files to build a container according to the template you specified on the command line. For example, to use the “ubuntu” template and create a new container called “cn–01″, the command would look like this:

lxc-create -t ubuntu -n cn-01

Note that the “ubuntu” template specified in this command has some additional options supported; for example, you can opt to create a container with a different release of Ubuntu (it defaults to the latest LTS) or a different architecture (it defaults to the host’s architecture).

Once you have at least one container created, you can list the containers that exist on your host system:

lxc-list

This will show you all the containers that have been created, grouped according to whether the container is stopped, frozen (paused), or running.

To start a container, use the lxc-start command:

lxc-start -n <container name>

Using the lxc-start command as shown above is fine for initial testing of your container, to ensure that it boots up as you expect. However, you won’t want to run your containers long-term like this, as the container “takes over” your console with this command. Instead, you want the container to run in the background, detached from the console. To do that, you’ll add the “-d” parameter to the command:

lxc-start -d -n <container name>

This launches your container in the background. To attach to the console of the container, you can use the lxc-console command:

lxc-console -n <container name>

To escape out of the container’s console back to the host’s console, use the “Ctrl-a q” key sequence (press Ctrl-a, release, then press q).

You can freeze (pause) a container using the lxc-freeze command:

lxc-freeze -n <container name>

Once frozen, you can unfreeze (resume) a container just as easily with the lxc-unfreeze command:

lxc-unfreeze -n <container name>

You can also make a clone (a copy) of a container:

lxc-clone -o <existing container> -n <new container name>

On Ubuntu, LXC is configured by default to start containers in /var/lib/lxc. Each container will have a directory there. In a container’s directory, that container’s configuration will be stored in a file named config. I’m not going to provide a comprehensive breakdown of the settings available in the container’s configuration (this is a brief introduction), but I will call out a few that are worth noting in my opinion:

  • The lxc.network.type option controls what kind of networking the container will use. The default is “veth”; this uses virtual Ethernet pairs. (If you aren’t familiar with veth pairs, see my post on Linux network namespaces.)
  • The lxc.network.veth.pair configuration option controls the name of the veth interface created in the host. By default, a container sees one side of the veth pair as eth0 (as would be expected), and the host sees the other side as either a random name (default) or whatever you specify here. Personally, I find it useful to rename the host interface so that it’s easier to tell which veth interface goes to which container, but YMMV.
  • lxc.network.link specifies a bridge to which the host side of the veth pair should be attached. If you leave this blank, the host veth interface is unattached.
  • The configuration option lxc.rootfs specifies where the container’s root file system is stored. By default it is /var/lib/lxc/<container name>/rootfs.

There are a great deal of other configuration options, naturally; check out man 5 lxc.conf for more information. You may also find this Ubuntu page on LXC to be helpful; I certainly did.

I’ll have more posts on Linux containers in the future, but this should suffice to at least help you get started. If you have any questions, any suggestions for additional resources other readers should consider, or any feedback on the post, please add your comment below. I’d love to hear from you (courteous comments are always welcome).

Tags: , , , ,

« Older entries § Newer entries »