OVS


Welcome to part 14 of the Learning NSX blog series, in which I discuss the ability for VMware NSX to do Layer 3 routing in logical networks. This post also takes a closer look at a very cool feature within VMware NSX known as distributed logical routing, in the context of an OpenStack environment that’s been integrated with VMware NSX. (Although NSX isn’t necessarily tied to OpenStack, I’ll assume you’re using OpenStack just to simplify the discussion.)

If you’re new to this series, you can find links to all the articles on my Learning NVP/NSX page. Ideally, I’d recommend you read all the articles, but if you’re just interested in some of the high-level concepts you probably don’t need to do that. For those interested in the deep technical details, I’d suggest catching up on the series before proceeding.

Overview of Logical Routing

One of the features of VMware NSX that can be useful, depending on customer requirements, is the ability to create complex network topologies. For example, creating a multi-tier network topology like the one shown below is easily accomplished via VMware NSX:

Sample network topology

Note that this topology has two tenant-specific routing entities—these are logical routers. A logical router is an abstraction created and maintained by VMware NSX on behalf of your cloud management platform (like OpenStack, which I’ll assume you’re using here). These logical entities perform the routing process just like a physical router would (forwarding traffic based on a routing table, changing the source and destination MAC address, maintaining an ARP cache of MAC addresses, decrementing the TTL, etc.). Of course, they are not exactly the same as physical routers; you can’t, for example, connect two logical routers directly to each other.

Logical routers also act as the logical boundary between one or more logical networks and an external network. Logical routers can be connected to multiple logical networks (each logical network with its own logical router interface), but can only be connected to a single external network. Thus, you can’t use a logical router as a transit path between two external networks (two VLANs, for example).

Now that you have a good understanding of logical routing, let’s take a closer look at the various components inside VMware NSX.

Components of Logical Routing

The components are pretty straightforward. In addition to the logical router abstraction that I’ve discussed already, you also have logical router ports (naturally, these are the ports on a logical router that connect it to a logical network or an external network), network address translation (NAT) rules (for handling address translation tasks), and a routing table (for…well, routing).

You can see all of these components in NSX Manager. Once you’re logged into NSX Manager, select Network Components > Logical Layer > Logical Routers, then click on a specific logical router from the list. This will display the screen shown below:

Logical router detail in NSX Manager

A few things to note here:

  • You’ll note that the logical router has a port whose attachment is listed as “L3GW”. This denotes an attachment to a Layer 3 Gateway Service, an entity I described in part 9 of the series. This Layer 3 Gateway Service is itself composed of two NSX gateway appliances; part 6 in the series discussed how to add a gateway appliance to your installation. The relationship between logical router, Layer 3 Gateway Service, and gateway appliance can be confusing for some; I plan to discuss that in more detail in the next post.
  • This particular logical router is not configured as a distributed logical router. This means that the actual routing function resides on a Layer 3 Gateway Service. The routing functionality is instantiated in a highly available configuration on two different gateway appliances within the Layer 3 Gateway Service.
  • NAT Synchronization is set to on; this refers to keeping NAT state synchronized between the active and standby routing functions instantiated on the gateway appliances.
  • As noted under Replication Mode, this router uses an NSX service node (refer to part 10 for more details on service nodes) for packet replication/BUM traffic.
  • You might notice that one of the logical router ports is assigned the IP address 169.254.169.253 (and you’ll also note a corresponding “no NAT” rule and routing table entries for that same network). Astute readers will recognize this as the network for Automatic Private IP Addressing (APIPA), also known as IPv4 Link-Local Addresses per RFC 3927. This exists to support an OpenStack-specific feature known as the metadata service, and is created automatically by OpenStack. (I’ll talk more about OpenStack later in this post.)

All of these components and settings are accessible via the NSX API, and since NSX Manager is completely an API client (it merely consumes NSX APIs and does not provide standalone functionality outside of some logging features), you could create, modify, and delete any of the logical routing components directly within NSX Manager. (Or, if you were so inclined, you could make the API calls yourself to do these tasks.) Typically, though, these tasks would be handled via integration between NSX and your cloud management platform, like OpenStack.

One key component of NSX’s logical routing functionality that you can’t see in NSX Manager is how the routing is actually implemented in the data plane. As with most features in NSX, the actual data plane implementation is handled via Open vSwitch (OVS) and a set of flow rules pushed down by the NSX controllers. These flow rules control the flow of traffic within and between logical networks (logical switches in NSX). You can see some of the flow rules in OVS using the ovs-dpctl dump-flows command, which will produce output something like what’s shown in this screenshot (note that the addresses are highlighted because I used grep to show only the flows matching a certain IP address):

List of flows in OVS
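
If you’d like to reproduce that kind of filtering yourself, a small shell helper along these lines would work. This is only a minimal sketch: flows_for_ip is a name I’ve made up for illustration, and 10.1.1.5 is a placeholder address, not one taken from the screenshot.

```shell
# Hypothetical helper: keep only the datapath flows that mention a given IP.
# It reads flow text on stdin, so it works with live output or a saved capture.
# -F treats the address as a fixed string (dots are not wildcards);
# -w avoids matching longer addresses such as 10.1.1.50 when asked for 10.1.1.5.
flows_for_ip() {
    grep -wF -- "$1"
}

# Usage on a hypervisor running OVS (typically requires root):
#   ovs-dpctl dump-flows | flows_for_ip 10.1.1.5
```

Keep in mind that ovs-dpctl dump-flows shows only the flows currently cached in the datapath, so you may need to generate some traffic between the VMs first.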


These flow rules include actions like re-writing source and destination MAC addresses and decrementing the TTL, both tasks carried out by “normal” routers when routing traffic between networks. These flow rules also provide some insight into the differences between a logical router and a distributed logical router. While both are logical entities, the way in which the data plane is implemented is different for each:

  • For a logical router, the flow rules will direct traffic to the appropriate gateway appliance in the Layer 3 Gateway Service. The logical router is actually instantiated on a gateway appliance, so all routed traffic must go to the logical router, get “routed” (routing table consulted, source and destination MAC re-written, TTL decremented, NAT rules applied, etc.), then get sent on to the final destination (which might be a VM on a hypervisor in NSX or might be a physical network outside of NSX).
  • For a distributed logical router, the flow rules will direct traffic either to the appropriate gateway appliance in the Layer 3 Gateway Service or to the destination hypervisor directly. Why the “either/or”? If the traffic is north/south traffic—that is, traffic being routed out of a logical network onto the physical network—then it must go to the gateway appliance (which, as I have mentioned before, is where traffic is unencapsulated and placed onto the physical network). However, if the traffic is east/west traffic—traffic that is moving from one server on a logical network to another server on a logical network—then the traffic is “routed” directly on the source hypervisor and then sent across an encapsulated connection to the hypervisor where the destination VM resides.

In both cases, there is only one logical router. For a non-distributed logical router, the data plane is instantiated on a gateway appliance only. For a distributed logical router, the data plane is instantiated both on the local hypervisors as well as on a gateway appliance. (This is assuming you’ve set an uplink on the logical router, meaning you have a north/south connection. If you haven’t set an uplink, then the routing functionality is instantiated on the hypervisors only.)
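
To make the “either/or” a little more concrete, here is a purely conceptual model of the decision described above, written as a shell function. It is not an NSX command or API; the function name and the labels it uses are my own invention for illustration.

```shell
# Conceptual model only: where does the routing step happen?
# $1: router type  ("centralized" or "distributed")
# $2: traffic type ("north-south" or "east-west")
routing_location() {
    case "$1:$2" in
        # East/west traffic on a distributed router is routed on the source hypervisor.
        distributed:east-west)
            echo "source-hypervisor" ;;
        # Everything else is routed on a gateway appliance in the Layer 3 Gateway Service.
        centralized:east-west|centralized:north-south|distributed:north-south)
            echo "gateway-appliance" ;;
        *)
            echo "unknown router or traffic type" >&2
            return 1 ;;
    esac
}

# Usage:
#   routing_location distributed east-west
#   routing_location centralized east-west
```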

This should provide a good overview of how logical routing is implemented in VMware NSX, but there’s one more aspect I want to cover: logical routers in OpenStack with NSX.

Logical Routers in OpenStack

As you work with OpenStack Networking—Neutron, as it’s commonly called—you’ll find that the abstractions Neutron uses map really well to the abstractions that NSX uses. So, to create a logical router in NSX, you just create a logical router in OpenStack. Attaching an OpenStack logical router to a logical network tells NSX to create the logical switch port, create the logical router port, and connect the two ports together.

In OpenStack, there are a number of different ways to create a logical router:

  • OpenStack Dashboard (Horizon)
  • Command-line interface (CLI)
  • OpenStack Orchestration (Heat) template
  • API calls directly

When using the web-based Dashboard user interface, you can only create centralized logical routers, not distributed logical routers. The Dashboard UI also doesn’t provide any way of knowing if a logical router is distributed or not; for that, you’ll need the CLI (the command is provided shortly).

On a system with the neutron CLI client installed, you can create a logical router like this:

neutron router-create <router name>

This creates a centralized logical router. If you want to create a distributed logical router, it’s as simple as this:

neutron router-create <router name> --distributed True

The neutron router-show command will return output about the specified logical router; that output will tell you if it is a distributed logical router.
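
If you only care about that one field, you could filter the table that neutron router-show prints. The helper below is a minimal sketch: router_distributed_field is a made-up name, and it assumes the usual two-column “Field | Value” table layout with a row named “distributed”.

```shell
# Hypothetical helper: read `neutron router-show` table output on stdin and
# print the value from the "distributed" row, if there is one.
router_distributed_field() {
    awk -F'|' '$2 ~ /^ *distributed *$/ { gsub(/ /, "", $3); print $3 }'
}

# Usage:
#   neutron router-show <router name> | router_distributed_field
```

If the row isn’t present at all, the helper prints nothing.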

The neutron CLI client also offers commands to update a logical router’s routing table (to add or remove static routes, for example), or to connect a logical router to an external network (to set an uplink, in other words).
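
As a rough illustration, setting an uplink and attaching a logical network might look like the sketch below. The router, subnet, and external network names are placeholders I’ve made up, and the run wrapper (also my own addition) only prints the commands unless you set DRY_RUN=0.

```shell
# Placeholder names; substitute your own router, subnet, and external network.
ROUTER="tenant-router"
SUBNET="app-subnet"
EXT_NET="ext-net"

# Print each command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

uplink_and_attach() {
    # Set the uplink (external gateway) for the logical router.
    run neutron router-gateway-set "$ROUTER" "$EXT_NET"
    # Attach the logical router to a logical network via one of its subnets.
    run neutron router-interface-add "$ROUTER" "$SUBNET"
}

# Usage:
#   uplink_and_attach              # show what would be run
#   DRY_RUN=0 uplink_and_attach    # actually run the commands
```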

If you want to create a logical router as part of a stack created via OpenStack Orchestration (Heat), you can define a distributed logical router in a HOT-formatted YAML template.
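
As a rough sketch of what such a template might look like, the shell function below prints a minimal HOT template. It assumes the distributed flag can be passed through the router resource’s value_specs property; the resource name logical_router, the router name, and the function name are all placeholders I’ve made up, and whether the flag is honored will depend on your Neutron plugin.

```shell
# Hypothetical helper: print a minimal HOT template for a distributed logical router.
# Assumption: the "distributed" attribute is passed through value_specs.
print_router_template() {
    cat <<'EOF'
heat_template_version: 2013-05-23

resources:
  logical_router:
    type: OS::Neutron::Router
    properties:
      name: distributed-router
      value_specs:
        distributed: true
EOF
}

# Usage:
#   print_router_template > router.yaml
#   heat stack-create -f router.yaml <stack name>
```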

OpenStack Heat also offers resource types for setting the router’s external gateway and creating router interfaces (logical router ports). If you aren’t familiar with OpenStack Heat, you might find this introduction useful.

That wraps up this post on logical routing with VMware NSX. As always, I welcome your courteous feedback, so feel free to speak up in the comments below. In the next post, I’ll spend a bit of time discussing logical routers, gateway services, and gateway appliances. See you next time!


Welcome to Technology Short Take #41, the latest in my series of random thoughts, articles, and links from around the Internet. Here’s hoping you find something useful!

Networking

  • Network Functions Virtualization (NFV) is a networking topic that is starting to get more and more attention (some may equate “attention” with “hype”; I’ll allow you to draw your own conclusion there). In any case, I liked how this article really hit upon what I personally feel is something many people are overlooking in NFV. Many vendors are simply rushing to provide virtualized versions of their solution without addressing the orchestration and automation side of the house. I’m looking forward to part 2 on this topic, in which the author plans to share more technical details.
  • Rob Sherwood, CTO of Big Switch, recently published a reasonably in-depth look at “modern OpenFlow” implementations and how they can leverage multiple tables in hardware. Some good information in here, especially on OpenFlow basics (good for those of you who aren’t familiar with OpenFlow).
  • Connecting Docker containers to Open vSwitch is one thing, but what about using Docker containers to run Open vSwitch in userspace? Read this.
  • Ivan knocks centralized SDN control planes in this post. It sounds like Ivan favors scale-out architectures, not scale-up architectures (which are typically what is seen in centralized control plane deployments).
  • Looking for more VMware NSX content? Anthony Burke has started a new series focusing on VMware NSX in pure vSphere environments. As far as I can tell, Anthony is up to 4 posts in the series so far. Check them out here: part 1, part 2, part 3, and part 4. Enjoy!

Servers/Hardware

  • Good friend Simon Seagrave is back to the online world again with this heads-up on a potential NIC issue with an HP Proliant firmware update. The post also contains a link to a fix for the issue. Glad to see you back again, Simon!
  • Tom Howarth asks, “Is the x86 blade server dead?” (OK, so he didn’t use those words specifically. I’m paraphrasing for dramatic effect.) The basic premise of Tom’s position is that new technologies like server-side caching and VSAN/Ceph/Sanbolic (turning direct-attached storage into shared storage) will dramatically change the landscape of the data center. I would generally agree, although I’m not sure that I agree with Tom’s statement that “complexity is reduced” with these technologies. I think we’re just shifting the complexity to a different place, although it’s a place where I think we can better manage the complexity (and perhaps mask it). What do you think?

Security

Cloud Computing/Cloud Management

  • Juan Manuel Rey has launched a series of blog posts on deploying OpenStack with KVM and VMware NSX. He has three parts published so far; all good stuff. See part 1, part 2, and part 3.
  • Kyle Mestery brought to my attention (via Twitter) this list of the “best newly-available OpenStack guides and how-to’s”. It was good to see a couple of Cody Bunch’s articles on the list; Cody’s been producing some really useful OpenStack content recently.
  • I haven’t had the opportunity to use SaltStack yet, but I’m hearing good things about it. It’s always helpful (to me, at least) to be able to look at products in the context of solving a real-world problem, which is why seeing this post with details on using SaltStack to automate OpenStack deployment was helpful.
  • Here’s a heads-up on a potential issue with the vCAC 6.0.1.1 upgrade—the upgrade apparently changes some configuration files. The linked blog post provides more details on which files get changed. If you’re looking at doing this upgrade, read this to make sure you aren’t adversely affected.
  • Here’s a post with some additional information on OpenStack live migration that you might find useful.

Operating Systems/Applications

  • RHEL7, Docker, and Puppet together? Here’s a post on just such a use case (oh, I forgot to mention OpenStack’s involved, too).
  • Have you ever walked through a spider web because you didn’t see it ahead of time? (Not very fun.) Sometimes I feel that way with certain technologies or projects—like there are connections there with other technologies, projects, trends, etc., that aren’t quite “visible” just yet. That’s where I am right now with the recent hype around containers and how they are going to replace VMs. I’m not so sure I agree with that just yet…but I have more noodling to do on the topic.

Storage

  • “Server SAN” seems to be the name that is emerging to describe various technologies and architectures that create pools of storage from direct-attached storage (DAS). This would include products like VMware VSAN as well as projects like Ceph and others. Stu Miniman has a nice write-up on Server SAN over at Wikibon; if you’re not familiar with some of the architectures involved, that might be a good place to start. Also at Wikibon, David Floyer has a write-up on the rise of Server SAN that goes into a bit more detail on business and technology drivers, friction to adoption, and some recommendations.
  • Red Hat recently announced they were acquiring Inktank, the company behind the open source scale-out Ceph project. Jon Benedict, aka “Captain KVM,” weighs in with his thoughts on the matter. Of course, there’s no shortage of thoughts on the acquisition—a quick web search will prove that—but I find it interesting that none of the “big names” in storage social media had anything to say (not that I could find, anyway). Howard? Stephen? Chris? Martin? Bueller?

Virtualization

  • Doug Youd pulled together a nice summary of some of the issues and facts around routed vMotion (vMotion across layer 3 boundaries, such as across a Clos fabric/leaf-spine topology). It’s definitely worth a read (and not just because I get mentioned in the article, either—although that doesn’t hurt).
  • I’ve talked before—although it’s been a while—about Hyper-V’s choice to rely on host-level NIC teaming in order to provide network link redundancy to virtual machines. Ben Armstrong talks about another option, guest-level NIC teaming, in this post. I’m not so sure that using guest-level teaming is any better than relying on host-level NIC teaming; what’s really needed is a more full-featured virtual networking layer.
  • Want to run nested ESXi on vCHS? Well, it’s not supported…but William Lam shows you how anyway. Gotta love it!
  • Brian Graf shows you how to remove IP pools using PowerCLI.

Well, that’s it for this time around. As always, I welcome all courteous comments, so feel free to share your thoughts, ideas, rants, links, or feedback in the comments below.


Welcome to Technology Short Take #39, in which I share a random assortment of links, articles, and thoughts from around the world of data center-related technologies. I hope you find something useful—or at least something interesting!

Networking

  • Jason Edelman has been talking about the idea of a Common Programmable Abstraction Layer (CPAL). He introduces the idea, then goes on to explore—as he puts it—the power of a CPAL. I can’t help but wonder if this is the right level at which to put the abstraction layer. Is the abstraction layer better served by being integrated into a cloud management platform, like OpenStack? Naturally, the argument then would be, “Not everyone will use a cloud management platform,” which is a valid argument. For those customers who won’t use a cloud management platform, I would then ask: will they benefit from a CPAL? I mean, if they aren’t willing to embrace the abstraction and automation that a cloud management platform brings, will abstraction and automation at the networking layer provide any significant benefit? I’d love to hear others’ thoughts on this.
  • Ethan Banks also muses on the need for abstraction.
  • Craig Matsumoto of SDN Central helps highlight a recent (and fairly significant) development in networking protocols—the submission of the Generic Network Virtualization Encapsulation (Geneve) proposal to the IETF. Jointly authored by VMware, Microsoft, Red Hat, and Intel, this new protocol proposal attempts to bring together the strengths of the various network virtualization encapsulation protocols out there today (VXLAN, STT, NVGRE). This is interesting enough that I might actually write up a separate blog post about it; stay tuned for that.
  • Lee Doyle provides an analysis of the market for network virtualization, which includes some introductory information for those who might be unfamiliar with what network virtualization is. I might contend that Open vSwitch (OVS) alone isn’t an option for network virtualization, but that’s just splitting hairs. Overall, this is a quick but worthy read if you are trying to get started in this space.
  • Don’t think this “software-defined networking” thing is going to take off? Read this, and then let me know what you think.
  • Chris Marget has a nice dissection of how bash completion works, particularly with regard to the Cumulus Networks implementation.

Servers/Hardware

  • Via Kevin Houston, you can get more details on the Intel E7 v2 and new blade servers based on the new CPU. x86 marches on!
  • Another interesting tidbit regarding hardware: it seems as if we are now seeing the emergence of another round of “hardware offloads.” The first round came about around 2006 when Intel and AMD first started releasing their hardware assists for virtualization (Intel VT and AMD-V, respectively). That technology was only “so-so” at first (VMware ESX continued to use binary translation [BT] because it was still faster than the hardware offloads), but it quickly matured and is now leveraged by every major hypervisor on the market. This next round of hardware offloads seems targeted at network virtualization and related technologies. Case in point: a relatively small company named Netronome (I’ve spoken about them previously, first back in 2009 and again a year later), recently announced a new set of network interface cards (NICs) expressly designed to provide hardware acceleration for software-defined networking (SDN), network functions virtualization (NFV), and network virtualization solutions. You can get more details from the Netronome press release. This technology is actually quite interesting; I’m currently talking with Netronome about testing it with VMware NSX and will provide more details as that evolves.

Security

  • Ben Rossi tackles the subject of security in a software-defined world, talking about how best to integrate security into SDN-driven architectures and solutions. It’s a high-level article and doesn’t get into a great level of detail, but does point out some of the key things to consider.

Cloud Computing/Cloud Management

  • “Racker” James Denton has some nice articles on OpenStack Neutron that you might find useful. He starts out with discussing the building blocks of Neutron, then goes on to discuss building a simple flat network, using VLAN provider networks, and Neutron routers and the L3 agent. And if you need a breakdown of provider vs. tenant networks in Neutron, this post is also quite handy.
  • Here’s a couple (first one, second one) of quick walk-throughs on installing OpenStack. They don’t provide any in-depth explanations of what’s going on, why you’re doing what you’re doing, or how it relates to the rest of the steps, but you might find something useful nevertheless.
  • Thinking of building your own OpenStack cloud in a home lab? Kevin Jackson—who along with Cody Bunch co-authored the OpenStack Cloud Computing Cookbook, 2nd Edition—has three articles up on his home OpenStack setup. (At least, I’ve only found three articles so far.) Part 1 is here, part 2 is here, and part 3 is here. Enjoy!
  • This post attempts to describe some of the core (mostly non-technical) differences between OpenStack and OpenNebula. It is published on the OpenNebula.org site, so keep that in mind as it is (naturally) biased toward OpenNebula. It would be quite interesting to me to see a more technically-focused discussion of the two approaches (and, for that matter, let’s include CloudStack as well). Perhaps this already exists—does anyone know?
  • CloudScaling recently added a Google Compute Engine (GCE) API compatibility module to StackForge, to allow users to leverage the GCE API with OpenStack. See more details here.
  • Want to run Hyper-V in your OpenStack environment? Check this out. Also from the same folks is a version of cloud-init for Windows instances in cloud environments. I’m testing this in my OpenStack home lab now, and hope to have more information soon.

Operating Systems/Applications

Storage

Virtualization

  • Brendan Gregg of Joyent has an interesting write-up comparing virtualization performance between Zones (apparently referring to Solaris Zones, a form of OS virtualization/containerization), Xen, and KVM. I might disagree that KVM is a Type 2 hardware virtualization technology, pointing out that Xen also requires a Linux-based dom0 in order to function. (The distinction between a Type 1 that requires a general purpose OS in a dom0/parent partition and a Type 2 that runs on top of a general purpose OS is becoming increasingly blurred, IMHO.) What I did find interesting was that they (Joyent) run a ported version of KVM inside Zones for additional resource controls and security. Based on the results of his testing—performed using DTrace—it would seem that the “double-hulled virtualization” doesn’t really impact performance.
  • Pete Koehler—via Jason Langer’s blog—has a nice post on converting in-guest iSCSI volumes to native VMDKs. If you’re in a similar situation, check out the post for more details.
  • This is interesting. Useful, I’m not so sure about, but definitely interesting.
  • If you are one of the few people living under a rock who doesn’t know about PowerCLI, Alan Renouf is here to help.

It’s time to wrap up; this post has already run longer than usual. There was just so much information that I want to share with you! I’ll be back soon-ish with another post, but until then feel free to join (or start) the conversation by adding your thoughts, ideas, links, or responses in the comments below.


Welcome to Technology Short Take #38, another installment in my irregularly-published series that collects links and thoughts on data center-related technologies from around the web. But enough with the introduction, let’s get on to the content already!

Networking

  • Jason Edelman does some experimenting with the Python APIs on a Cisco Nexus 3000. In the process, he muses about the value of configuration management tool chains such as Chef and Puppet in a world of “open switch” platforms such as Cumulus Linux.
  • Speaking of Cumulus Linux…did you see the announcement that Dell has signed a reseller agreement with Cumulus Networks? I’m pretty excited about this announcement, and I hope that Cumulus sees great success as a result. There are a variety of write-ups about the announcement; some good, many not so good. The not-so-good variety typically refers to Cumulus’ product as an SDN product when technically it isn’t. This article on Barron’s by Tiernan Ray is a pretty good summary of the announcement and some of its implications.
  • Pete Welcher has launched a series of articles discussing “practical SDN,” focusing on the key leaders in the market: NSX, DFA, and the yet-to-be-launched ACI. In the initial installment of the series, he does a good job of providing some basics around each of the products, although (as would be expected of a product that hasn’t launched yet) he has to do some guessing when it comes to ACI. The series continues with a discussion of L2 forwarding and L3 forwarding across the various products. Definitely worth reading, in my opinion.
  • Nick Buraglio takes away all your reasons for not collecting flow-based data from your environment with his write-up on installing nfsen and nfdump for NetFlow and/or sFlow collection.
  • Terry Slattery has a nice write-up on new network designs that are ideally suited for SDN. If you are looking for a primer on “next-generation” network designs, this is worth reviewing.
  • Need some Debian packages for Open vSwitch 2.0? Here’s another article from Nick Buraglio—he has some information to help you out.

Servers/Hardware

Nothing this time, but check back next time.

Security

Nothing from my end. Maybe you have something you’d like to share in the comments?

Cloud Computing/Cloud Management

  • Christian Elsen (who works in Integration Engineering at VMware) has a nice series of articles going on using OpenStack with vSphere and NSX. The series starts here, but follow the links at the bottom of that article for the rest of the posts. This is really good stuff—he includes the use of the NSX vSwitch with vSphere 5.5, and talks about vSphere OpenStack Virtual Appliance (VOVA) as well. All in all, well worth a read in my opinion.
  • Maish Saidel-Keesing (one of my co-authors on the first edition of VMware vSphere Design and also a super-sharp guy) recently wrote an article on how adoption of OpenStack will slow the adoption of SDN. While I agree that widespread adoption of OpenStack could potentially retard the evolution of enterprise IT, I’m not necessarily convinced that it will slow the adoption of SDN and network virtualization solutions. Why? Because, in part, I believe that the full benefits of something like OpenStack need a good network virtualization solution in order to be realized. Yes, some vendors are writing plugins for Neutron that manipulate physical switches. But for developers to get true isolation, application portability, the ability to re-create production environments in development—all that is going to require network virtualization.
  • Here’s a useful OpenStack CLI cheat sheet for some commonly-used commands.

Operating Systems/Applications

  • If you’re using Ansible (a product I haven’t had a chance to use but I’m closely watching), you may be interested in this article I came across on an upcoming change to the SSH transport that Ansible uses. This change, referred to as “ssh_alt,” promises a significant performance increase for Ansible. Good stuff.
  • I don’t think I’ve mentioned this before, but Forbes Guthrie (my co-author on the VMware vSphere Design books and an all-around great guy) has a series going on using Linux as a domain controller for a vSphere-based lab. The series is up to four parts now: part 1, part 2, part 3, and part 4.
  • Need (or want) to increase the SCSI timeout for a KVM guest? See these instructions.
  • I’ve been recommending that IT pros get more familiar with Linux, as I think its influence in the data center will continue to grow. However, the problem that I sometimes face is that experienced folks tend to share these “super commands” that ordinary folks have a hard time decomposing. Fortunately, this site should make that easier. I’ve tried it, and it’s actually pretty handy.

Storage

  • Jim Ruddy (an EMCer, former co-worker of mine, and an overall great guy) has a pretty cool series of articles discussing the use of EMC ViPR in conjunction with OpenStack. Want to use OpenStack Glance with EMC ViPR using ViPR’s Swift API support? See here. Want a multi-node Cinder setup with ViPR? Read how here. Multi-node Glance with ViPR? He’s got it. If you’re new to ViPR (who outside of EMC isn’t?), you might also find his articles on deploying EMC ViPR, setting up back-end storage for ViPR, or deploying object services with ViPR to also be helpful.
  • Speaking of ViPR, EMC has apparently decided to release it for free for non-commercial use. See here.
  • Looking for more information on VSAN? Look no further than Cormac Hogan’s extensive VSAN series (up to Part 14 at last check!). The best way to find this stuff is to check articles tagged VSAN on Cormac’s site. The official VMware vSphere blog also has a series of articles running; check out part 1 and part 2.

Virtualization

  • Did you happen to see this news about Microsoft Hyper-V Recovery Manager (HRM)? This is an Azure-hosted service that can be roughly compared to VMware’s Site Recovery Manager (SRM). However, unlike SRM (which is hosted on-premises), HRM is hosted by Microsoft Azure. As the article points out, it’s important to understand that this doesn’t mean your VMs are replicated to Azure; it’s just the orchestration portion of HRM that is running in Azure.
  • Oh, and speaking of Hyper-V…in early January Microsoft released version 3.5 of their Linux Integration Services, which primarily appears to be focused on adding Linux distribution support (CentOS/RHEL 6.5 is now supported).
  • Gregory Gee has a write-up on installing the Cisco CSR 1000V in VirtualBox. (I’m a recent VirtualBox convert myself; I find the vboxmanage command just so very handy.) Note that I haven’t tried this myself, as I don’t have a Cisco login to get the CSR 1000V code. If any readers have tried it, I’d love to hear your feedback. Gregory also has a few other interesting posts I’m planning to review in the next few weeks as well.
  • Sunny Dua, who works with VMware PSO in India, has a series of blog posts on architecting vSphere environments. It’s currently up to five parts; I don’t know how many more (if any) are planned. Here are the links: part 1 (clusters), part 2 (vCenter SSO), part 3 (storage), part 4 (design process), and part 5 (networking).

It’s time to wrap up now before this gets any longer. If you have any thoughts or tidbits you’d like to share, I welcome any and all courteous comments. Join (or start) the conversation!


I’ve discussed using Open vSwitch (OVS) with Linux Containers (LXC) in a couple of previous posts (here and here). In this post, I’m going to show you one way to have your containers automatically connected to OVS on startup without having to use libvirt.

I tested this configuration using Ubuntu 12.04 with the Linux 3.8.0 kernel and an alpha release of LXC 1.0.0 from the precise-backports repository. The version of LXC in the 12.04 repositories (version 0.7.5, if I recall correctly) isn’t new enough to support the specific feature I’m describing here, so plan accordingly.

If you aren’t familiar with LXC, I’d suggest you first read my LXC introductory post. You’ll probably also find the post on using LXC, OVS, and GRE tunnels useful, as some of the information there is applicable here also.

Ready? Let’s get started.

Configuring the Container

You can create your container using the standard LXC tools:

lxc-create -n cn-01 -t ubuntu

As you may already know, this will create a container named “cn-01” based on the Ubuntu template. The configuration for this container will be found, by default, at /var/lib/lxc/cn-01/config. Unless you’ve changed the configuration of your system, this container will be configured to use virtual Ethernet network interfaces and be attached to the default LXC bridge.

The changes required to make the container connect to OVS are, fortunately, quite minimal:

  1. First, remove or comment out the lxc.network.link line. This is the configuration parameter that causes the container to attach to the default LXC bridge (normally called “lxcbr0”).
  2. Add a configuration line to run a script after creating the network interfaces. In my examples here, I’ll assume the script is called “ovsup” and is stored in the /etc/lxc/ directory. The configuration parameter should look something like this:
lxc.network.script.up = /etc/lxc/ovsup

(Note that there is also a corresponding lxc.network.script.down configuration parameter, but I won’t be using it in this example.)
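
If you’d rather script these two changes than make them by hand, here’s a rough sketch. The function name ovs_enable_config is hypothetical, and the sketch prints the modified configuration to standard output rather than editing the file in place, so you can review the result before using it. It also assumes the container has a single network section, since the script.up line is simply appended at the end.

```shell
#!/bin/sh
# Sketch only: apply the two configuration changes described above.
# ovs_enable_config is a made-up helper name; /etc/lxc/ovsup is the
# script path assumed in the text.
ovs_enable_config() {
    config="$1"
    # Change 1: comment out any lxc.network.link line.
    sed 's/^[[:space:]]*lxc\.network\.link/#&/' "$config"
    # Change 2: add the hook that runs after the interfaces are created.
    echo "lxc.network.script.up = /etc/lxc/ovsup"
}

# Example:
#   ovs_enable_config /var/lib/lxc/cn-01/config > /tmp/cn-01.config
```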

Once you’ve made these changes to the container’s configuration, then you’re ready to create the actual script.

Creating the Network Attachment Script

Your script (the one referenced on the lxc.network.script.up line in the container’s configuration file) should look something like this:

(If you can’t see the code block above, please click here.)

LXC passes five parameters to the script when it is called:

  1. The name of the container
  2. The configuration section of the container’s configuration (“net” in this case)
  3. Either “up” or “down”, depending on which configuration option is calling the script (lxc.network.script.up passes “up”, lxc.network.script.down passes “down”)
  4. The type of networking (“veth” in this case)
  5. The name of the interface (randomly generated unless you have included lxc.network.veth.peer in the container’s configuration)

This simple script doesn’t really need anything other than the interface name, so it only uses parameter 5 (the $5 in the script). The script first ensures that the appropriate OVS bridge exists (creating it if necessary), then deletes the interface from the OVS bridge (if it exists) and adds it back to the OVS bridge.
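
Here’s a minimal sketch of a script that matches this description; it isn’t necessarily identical to the script referenced above. The bridge name br0 is an assumption (use whatever OVS bridge you want your containers on), and the RUN variable is something I’ve added purely so you can set RUN=echo and see the commands without executing them.

```shell
#!/bin/sh
# Sketch of /etc/lxc/ovsup: attach the container's host-side veth to OVS.
BRIDGE="${BRIDGE:-br0}"   # assumed bridge name, not taken from the post
RUN="${RUN:-}"            # set RUN=echo for a dry run

attach_port() {
    iface="$1"
    # Make sure the bridge exists (no error if it already does).
    $RUN ovs-vsctl --may-exist add-br "$BRIDGE"
    # Remove any stale port with this name, then add it back.
    $RUN ovs-vsctl --if-exists del-port "$BRIDGE" "$iface"
    $RUN ovs-vsctl --may-exist add-port "$BRIDGE" "$iface"
}

# LXC passes the interface name as the fifth parameter.
if [ -n "${5:-}" ]; then
    attach_port "$5"
fi
```

Remember to make the script executable (chmod +x /etc/lxc/ovsup), or LXC won’t be able to run it.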

(Note: If you are using the lxc.network.script.down configuration parameter, you could eliminate the line to delete the port from the OVS bridge and place it in the down script instead. Or, you could write logic into the script to see if “down” is being called and delete the port. There are a variety of ways to approach the situation.)
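
As a sketch of the second approach, a single script could be referenced by both lxc.network.script.up and lxc.network.script.down and branch on the third parameter. This assumes the down hook receives the same five parameters described above; the bridge name br0 and the RUN dry-run variable are again my own assumptions.

```shell
#!/bin/sh
# Sketch: one hook script for both "up" and "down".
BRIDGE="${BRIDGE:-br0}"   # assumed bridge name
RUN="${RUN:-}"            # set RUN=echo for a dry run

handle_hook() {
    action="$1"   # third LXC parameter: "up" or "down"
    iface="$2"    # fifth LXC parameter: host-side interface name
    case "$action" in
        up)
            $RUN ovs-vsctl --may-exist add-br "$BRIDGE"
            $RUN ovs-vsctl --may-exist add-port "$BRIDGE" "$iface"
            ;;
        down)
            $RUN ovs-vsctl --if-exists del-port "$BRIDGE" "$iface"
            ;;
        *)
            echo "unexpected action: $action" >&2
            return 1
            ;;
    esac
}

if [ "$#" -ge 5 ]; then
    handle_hook "$3" "$5"
fi
```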

Using this configuration, when you start the container the host-side virtual Ethernet interface created by LXC will be automatically added to OVS, and your container will have whatever network connectivity is dictated by the OVS configuration. This could include tunneled connectivity (as described here) or bridged connectivity.

If you have any questions, feedback, or corrections, please feel free to speak up in the comments below. I encourage reader interaction!


For the last couple of years, I’ve been sharing my annual “projects list” and then grading myself on the progress (or lack thereof) on the projects at the end of the year. For example, I shared my 2012 project list in early January 2012, then gave myself grades on my progress in early January 2013.

In this post, I’m going to grade myself on my 2013 project list. Here’s the project list I posted just under a year ago:

  1. Continue to learn German.
  2. Reinforce base Linux knowledge.
  3. Continue using Puppet for automation.
  4. Reinforce data center networking fundamentals.

So, how did I do? Here’s my assessment of my progress:

  1. Continue to learn German: I have made some progress here, though certainly not as much as I wanted. I’ve incorporated the use of Memrise, which has been helpful, but I’m still behind where I’d like to be. If anyone has suggestions for additional tools, I’m open to your feedback. Grade: D (below average)

  2. Reinforce base Linux knowledge: I’ve been suggesting to VMUG attendees that they needed to learn Linux, as it’s popping up all over the place in all sorts of roles. In my original 2013 project list, I said that I was going to focus on RHEL and RHEL variants, but over the course of the year ended up focusing more on Debian and Ubuntu instead (due to more up-to-date packages and closer alignment with OpenStack). Despite that shift in focus, I think I’ve made decent progress here. There’s always room to grow, of course. Grade: B (above average)

  3. Continue using Puppet for automation: I’ve made reasonable progress here, expanding my use of Puppet to include managing Debian/Ubuntu software repositories (see here and here for examples), managing SSH keys, managing Open vSwitch (OVS) via a third-party module, and—most recently—exploring the use of Puppet with OpenStack (no blog posts—yet). There’s still quite a bit I need to learn (some of my manifests don’t work quite as well as I’d like), but I did make progress here. Grade: C (average)

  4. Reinforce data center networking fundamentals: Naturally, my role at VMware has me spending a great deal of time on how network virtualization affects DC networking, and this translated into some progress on this project. While I gained solid high-level knowledge on a number of DC networking topics, I originally thought I needed more low-level “in the weeds” knowledge. In that regard, I don’t feel like I did well; on the flip side, though, I’m not sure whether I really needed that low-level knowledge. This highlights a key struggle for me personally: how to balance deep, “in the weeds” knowledge with high-level knowledge. Suggestions on how others have overcome this challenge are welcome. Grade: C (average)

In summary: not bad, but could have been better!

What’s not reflected in this project list is the progress I made with understanding OpenStack, or my deepened level of knowledge of OVS (just browse articles tagged OVS for an idea of what I’ve been doing in that area).

Over the next week or two, I’ll be reflecting on my progress with my 2013 projects and thinking about what projects I should be taking on in 2014. In the meantime, I would love to hear any feedback, suggestions, or thoughts on projects I should consider, technologies that should be incorporated, or learning techniques I should leverage. Feel free to speak up in the comments below.


In this post, I’m going to show you how to manage Open vSwitch (OVS) using the popular open source configuration management tool Puppet. This is not the first time I’ve written about this topic; in the past I showed you how to automate OVS configuration with Puppet via a hack utilizing some RHEL-OVS integrations. This post, however, focuses on the use of an actual Puppet module that will manage the configuration of OVS, a much cleaner solution—in my view, at least—than leveraging the file-based integrations I discussed earlier.

The Puppet module I’ll be using and discussing in this post is the L23Network module (found here on GitHub). This is an extremely flexible and useful module, capable not only of configuring and managing network interfaces but also of managing the configuration of OVS. The latter functionality will be the primary focus of this article (with one exception).

The L23Network module is pretty well-documented, so I won’t bother regurgitating the documentation here. Instead, I’ll just try to provide some specific examples, and tie those examples back to some of the various OVS configurations I’ve shown you in earlier posts.

First, let’s get “the one exception” I mentioned earlier out of the way. In OVS environments, you’ll often need to bring up a physical interface without assigning that interface an IP address. For example, consider a physical interface that is providing bridged connectivity to guest domains (VMs) on an OVS bridge. You’ll want the interface to be up, but the interface does not need an IP address. Using the L23Network module, you can accomplish that with this piece of code in your manifest:

l23network::l3::ifconfig {'eth1': ipaddr => 'none'}

Now that eth1 is up, you could create a bridge to which to attach it with this code:

l23network::l2::bridge {'br-ex': }

And then you could actually attach eth1 like this:

l23network::l2::port {'eth1': bridge => 'br-ex'}
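
For comparison, here’s a rough sketch of the manual commands that produce the same end state as the three resources above. This is not a description of what the L23Network module does internally; the function name and the RUN dry-run variable are my own assumptions.

```shell
#!/bin/sh
# Sketch: manual equivalent of "eth1 up with no IP, attached to br-ex".
RUN="${RUN:-}"   # set RUN=echo to print the commands instead of running them

bridge_physical_port() {
    bridge="$1"
    iface="$2"
    $RUN ip link set "$iface" up                          # up, no IP address
    $RUN ovs-vsctl --may-exist add-br "$bridge"           # create the bridge
    $RUN ovs-vsctl --may-exist add-port "$bridge" "$iface" # attach the port
}

# Example (as root): bridge_physical_port br-ex eth1
```

The obvious difference is that Puppet keeps enforcing this state on every run, while the manual commands are a one-time change.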

You could then provide multi-VLAN bridged connectivity to guest domains via libvirt as I explained in my post on using VLANs with libvirt and OVS. (Or, if you are using LXC with libvirt and OVS, you could provide multi-VLAN bridged connectivity to containers.)

The L23Network module can also work with other types of interfaces, not just physical interfaces. Want to create an internal interface, perhaps to use as a tunnel endpoint for a GRE tunnel as I described here? Use this snippet of Puppet code:

l23network::l2::port {'tep0': bridge => 'br-tun', type => 'internal'}

You could then assign the newly-created tep0 interface an IP address on your transport network like this:

l23network::l3::ifconfig {'tep0': ipaddr => '10.1.1.1/24'}
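
If you want to check the result against a hand-built configuration, the sketch below shows roughly equivalent manual commands for the internal interface and its IP address. Again, this only illustrates the desired end state, not the module’s internals; add_internal_port and RUN are hypothetical names.

```shell
#!/bin/sh
# Sketch: manual equivalent of the tep0 internal port plus its IP address.
RUN="${RUN:-}"   # set RUN=echo to print the commands instead of running them

add_internal_port() {
    bridge="$1"
    port="$2"
    cidr="$3"
    $RUN ovs-vsctl --may-exist add-br "$bridge"
    # Create the port and mark its interface as internal in one transaction.
    $RUN ovs-vsctl --may-exist add-port "$bridge" "$port" \
        -- set interface "$port" type=internal
    $RUN ip addr add "$cidr" dev "$port"
    $RUN ip link set "$port" up
}

# Example (as root): add_internal_port br-tun tep0 10.1.1.1/24
```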

(In theory, you could also use the L23Network module to create an internal interface so as to run host management through OVS, but then you could run into issues communicating with the Puppet server over the same interfaces the Puppet server is configuring.)

I haven’t yet used L23Network to create/manage patch ports or GRE ports, but the documentation indicates the module is capable of doing so. This is an area that I plan to explore in a bit more detail in the near future (in my copious free time).

Based on the snippets I’ve given you above, it should be pretty straightforward to see how to combine these various pieces to fully configure and manage OVS instances across a large number of systems. However, if you have any questions, feel free to post them in the comments below. I also welcome all other courteous feedback; you are encouraged to start (or join) the conversation.


In this post I’m going to expand a little bit on using libvirt to connect Linux containers (created using LXC) to Open vSwitch (OVS). I made brief mention of this in my post on using LXC with libvirt, but did not provide any details. This post aims to provide those details.

I’m assuming that you’re already familiar with LXC, OVS, and libvirt. If you aren’t familiar with these projects, I suggest you have a look back at other articles I’ve written about them in the past. One of the easiest ways to do that is to browse articles tagged LXC, tagged OVS, and/or tagged Libvirt. Further, I’m using Ubuntu 12.04 LTS in my environment, so if you’re using another Linux distribution please note that some commands and/or package names might be different.

The basic process for connecting a Linux container to OVS using libvirt looks something like this:

  1. Create one or more virtual networks in libvirt to “front-end” OVS.
  2. Create your container(s) using standard LXC user-space tools.
  3. Create libvirt XML definitions for your container(s).
  4. Start the container(s) using virsh.

Steps 2, 3, and 4 were covered in my previous post on using LXC and libvirt, so I won’t repeat them here. Step 1 is the focus here. (If you are a long-time reader and/or well-versed with libvirt and OVS, there isn’t a great deal of new information here; I just wanted to present it in the context of LXC for the sake of completeness.)

To create a libvirt virtual network to front-end OVS, you need to create an XML definition that you can use with virsh to define the virtual network. Here’s an example XML definition:

(If you can’t see the code block above, please click here.)

A few notes about this XML definition:

  • You normally wouldn’t include the UUID, as that is generated automatically by libvirt. If you were using this XML to create the virtual network from scratch, I would recommend just deleting the UUID line.
  • The network is named “bridged”, and points to the OVS bridge named br-ex. In this particular case, br-ex is a simple OVS bridge that contains a single physical interface.
  • This particular virtual network only has a single portgroup configured for untagged traffic. If you wanted to provide a virtual network that supported multiple VLANs, you could add more portgroups with the VLAN tags as I describe in my post on using VLANs with OVS and libvirt. You’d then modify the container’s XML definition to point to the appropriate portgroup, and in this way you could easily support running multiple containers across multiple VLANs on a single host.
  • A libvirt virtual network can only point to a single bridge, so if you wanted to support both bridged (as shown here) as well as tunneled connectivity (perhaps as described in my post on LXC, OVS, and GRE tunnels), you would need to create a second XML definition that creates a separate virtual network. You could then modify the container’s XML definition to point to the new network you just created.
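
Here’s a sketch of a definition consistent with the notes above, wrapped in a small shell function so the network name and bridge can be swapped out. It is not necessarily identical to the embedded example: the portgroup name “untagged” is my assumption, and the UUID is deliberately left out so libvirt generates one.

```shell
#!/bin/sh
# Sketch: print a libvirt network definition that front-ends an OVS bridge.
# ovs_network_xml is a hypothetical helper name.
ovs_network_xml() {
    net_name="$1"
    bridge="$2"
    cat <<EOF
<network>
  <name>${net_name}</name>
  <forward mode='bridge'/>
  <bridge name='${bridge}'/>
  <virtualport type='openvswitch'/>
  <portgroup name='untagged' default='yes'/>
</network>
EOF
}

# Example (as root; depending on your default connection URI you may
# need to add "-c lxc://" to the virsh commands):
#   ovs_network_xml bridged br-ex > bridged.xml
#   virsh net-define bridged.xml
#   virsh net-start bridged
#   virsh net-autostart bridged
```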

In the bullets above, I mentioned modifying the container’s XML definition. In particular, I’m referring to the <interface type='network'> portion of the container’s XML definition. To use a libvirt network for a container’s network connectivity, you’d specify <source network='bridged'/> (replacing “bridged” with whatever the name of your virtual network is; I’m using the name provided in the sample XML code above). For multiple interfaces in the container, simply supply multiple <interface type='network'> entries in the container’s XML definition, and configure the source network for each of them appropriately.
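
To make that concrete, here’s a small sketch that prints an interface stanza for a given network and, optionally, a portgroup. The helper name interface_xml and the portgroup name in the example are hypothetical; the network name “bridged” is the one used in this post.

```shell
#!/bin/sh
# Sketch: print an <interface> stanza for a container's libvirt XML.
interface_xml() {
    network="$1"
    portgroup="${2:-}"   # optional portgroup name
    if [ -n "$portgroup" ]; then
        src="<source network='${network}' portgroup='${portgroup}'/>"
    else
        src="<source network='${network}'/>"
    fi
    printf "<interface type='network'>\n  %s\n</interface>\n" "$src"
}

# Examples:
#   interface_xml bridged            # default portgroup
#   interface_xml bridged vlan-10    # hypothetical VLAN portgroup
```

Call the function once per interface and paste each stanza into the devices section of the container’s XML definition.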

Hopefully this post provides some additional details and information on using libvirt to connect Linux containers to OVS. If you have any questions, or if you have more information to share on this topic, please feel free to speak up in the comments below. I encourage and welcome all courteous feedback!


In this post, I’ll discuss how you could use Open vSwitch (OVS) and GRE tunnels to connect bare metal workloads. While OVS is typically used in conjunction with a hypervisor such as KVM or Xen, you’re certainly not restricted to only using it on hypervisors. Similarly, while GRE tunnels are commonly used to connect VMs or containers, you’re definitely not restricted from using them with bare metal workloads as well. In this post, I’ll explore how you would go about connecting bare metal workloads over GRE tunnels managed by OVS.

This post, by the way, was sparked in part by a comment on my article on using GRE tunnels with OVS, in which the reader asked: “Is there a way to configure bare Linux (Ubuntu)…with OVS installed…to serve as a tunnel endpoint…?” Hopefully this post helps answer that question. (By the way, the key to understanding how this works is in understanding OVS traffic patterns. If you haven’t yet read my post on examining OVS traffic patterns, I highly recommend you go have a look right now. Seriously.)

Once you have OVS installed (maybe this is helpful?), then you need to create the right OVS configuration. That configuration can be described, at a high level, like this:

  • Assign an IP address to a physical interface. This interface will be considered the “tunnel endpoint,” and therefore should have an IP address that is correct for use on the transport network.
  • Create an OVS bridge that has no physical interfaces assigned.
  • Create an OVS internal interface on this OVS bridge, and assign it an IP address for use inside the GRE tunnel(s). This interface will be considered the primary interface for the OS instance.
  • Create the GRE tunnel for connecting to other tunnel endpoints.

Each of these areas is described in a bit more detail in the following sections.

Setting Up the Transport Interface

When setting up the physical interface—which I’ll refer to as the transport interface moving forward, since it is responsible for transporting the GRE tunnel across to the other endpoints—you’ll just need to use an IP address and routing entries that enable it to communicate with other tunnel endpoints.

Let’s assume that we are going to have tunnel endpoints on the 192.168.1.0/24 subnet. On the bare metal OS instance, you’d configure a physical interface (I’ll assume eth0, but it could be any physical interface) to have an IP address on the 192.168.1.0/24 subnet. You could do this automatically via DHCP or manually; the choice is yours. Other than ensuring that the bare metal OS instance can communicate with other tunnel endpoints, no additional configuration is required. (I’m using “required” as in “necessary to make it work.” You may want to increase the MTU on your physical interface and network equipment to accommodate the GRE headers and optimize performance, but that isn’t required to make it work.)
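
If you do decide to raise the MTU, here’s a quick sketch of the arithmetic. It assumes an IPv4 outer header (20 bytes), a basic GRE header with no key or checksum (4 bytes), and the inner Ethernet header that OVS carries inside the tunnel (14 bytes). If you configure a GRE key, add 4 more bytes. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch: transport MTU needed so full-size inner packets fit unfragmented.
# Overhead assumed: 20 (outer IPv4) + 4 (basic GRE) + 14 (inner Ethernet).
gre_transport_mtu() {
    inner_mtu="$1"
    echo $(( inner_mtu + 20 + 4 + 14 ))
}

# Example manual configuration of the transport interface (as root):
#   ip addr add 192.168.1.10/24 dev eth0
#   ip link set eth0 up
#   ip link set eth0 mtu "$(gre_transport_mtu 1500)"
```

Keep in mind that every device along the transport path has to support the larger MTU, not just the endpoints.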

Once you have the transport interface configured and operational, you can move on to configuring OVS.

Configuring OVS

If you’ve been following along at home with all of my OVS-related posts (you can browse all posts using the OVS tag), you can probably guess what this will look like (hint: it will look a little bit like the configuration I described in my post on running host management through OVS). Nevertheless, I’ll walk through the configuration for the benefit of those who are new to OVS.

First, you’ll need to create an OVS bridge that has no physical interfaces—the so-called “isolated bridge” because it is isolated from the physical network. You can call this bridge whatever you want. I’ll use the name br-int (the “integration bridge”) because it’s commonly used in other environments like OpenStack and NVP/NSX.

To create the isolated bridge, use ovs-vsctl:

ovs-vsctl add-br br-int

Naturally, you would substitute whatever name you’d like to use in the above command. Once you’ve created the bridge, then add an OVS internal interface; this internal interface will become the bare metal workload’s primary network interface:

ovs-vsctl add-port br-int mgmt0 -- set interface mgmt0 type=internal

You can use a name other than mgmt0 if you so desire. Next, configure this new OVS internal interface at the operating system level, assigning it an IP address. This IP address should be taken from a subnet “inside” the GRE tunnel, because it is only via the GRE tunnel that you’ll want the workload to communicate.

The following commands will take care of this part for you:

ip addr add 10.10.10.30/24 dev mgmt0
ip link set mgmt0 up

The process of ensuring that the mgmt0 interface comes up automatically when the system boots is left as an exercise for the reader (hint: use /etc/network/interfaces).

At this point, the bare metal OS instance will have two network interfaces:

  • A physical interface (we’re assuming eth0) that is configured for use on the transport network. In other words, it has an IP address and routes necessary for communication with other tunnel endpoints.
  • An OVS internal interface (I’m using mgmt0) that is configured for use inside the GRE tunnel. In other words, it has an IP address and routes necessary to communicate with other workloads (bare metal, containers, VMs) via the OVS-hosted GRE tunnel(s).

Because the bare metal OS instance sees two interfaces (and therefore has visibility into the routes both “inside” and “outside” the tunnel), you may need to apply some policy routing configuration. See my introductory post on Linux policy routing if you need more information.

The final step is establishing the GRE tunnel.

Establishing the GRE Tunnel

The commands for establishing the GRE tunnel have been described numerous times, but once again I’ll walk through the process just for the sake of completeness. I’m assuming that you’ve already completed the steps in the previous section, and that you are using an OVS bridge named br-int.

First, add the GRE port to the bridge:

ovs-vsctl add-port br-int gre0

Next, configure the GRE interface on that port:

ovs-vsctl set interface gre0 type=gre options:remote_ip=<IP address of remote tunnel endpoint>

Let’s say that you’ve assigned 192.168.1.10 to the transport interface on this system (the bare metal OS instance), and that the remote tunnel endpoint (which could be a host with multiple containers, or a hypervisor running VMs) has an IP address of 192.168.1.15. On the bare metal system, you’d configure the GRE interface like this:

ovs-vsctl set interface gre0 type=gre options:remote_ip=192.168.1.15

On the remote tunnel endpoint, you’d configure the GRE interface like this:

ovs-vsctl set interface gre0 type=gre options:remote_ip=192.168.1.10

In other words, each GRE interface points to the transport IP address on the opposite end of the tunnel.
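
The two steps (adding the port and configuring the interface) can also be combined into a single ovs-vsctl invocation, which is handy if you’re scripting this. The sketch below wraps that in a function; add_gre_tunnel and the RUN dry-run variable are hypothetical names of my own.

```shell
#!/bin/sh
# Sketch: create a GRE port and point it at the remote tunnel endpoint.
RUN="${RUN:-}"   # set RUN=echo to print the command instead of running it

add_gre_tunnel() {
    bridge="$1"
    port="$2"
    remote_ip="$3"
    # "--" chains both operations into one OVS database transaction.
    $RUN ovs-vsctl --may-exist add-port "$bridge" "$port" \
        -- set interface "$port" type=gre "options:remote_ip=${remote_ip}"
}

# On the bare metal system (192.168.1.10):
#   add_gre_tunnel br-int gre0 192.168.1.15
# On the remote tunnel endpoint (192.168.1.15):
#   add_gre_tunnel br-int gre0 192.168.1.10
```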

Once the configuration on both ends is done, then you should be able to go into the bare metal OS instance and ping an IP address inside the GRE tunnel. For example, I used this configuration to connect a bare metal Ubuntu 12.04 instance, a container running on an Ubuntu host, and a KVM VM running on an Ubuntu host (I had a full mesh topology with STP enabled, as described here). I was able to successfully ping between the bare metal OS instance, the container, and the VM, all inside the GRE tunnel.

Summary, Caveats, and Other Thoughts

While this configuration is interesting as a “proof of concept” that OVS and GRE tunnels can be used to connect bare metal OS instances and workloads, there are a number of considerations and/or caveats that you’ll want to think about before trying something like this in a production environment:

  • The bare metal OS instance has visibility both “inside” and “outside” the tunnel, so there isn’t an easy way to prevent the bare metal OS instance from communicating outside the tunnel to other entities. This might be OK—or it might not. It all depends on your requirements, and what you are trying to achieve. (In theory, you might be able to provide some isolation using network namespaces, but I haven’t tested this at all.)
  • If you want to create a full mesh topology of GRE tunnels, you’ll need to enable STP on OVS.
  • There’s nothing preventing you from attaching an OpenFlow controller to the OVS instances (including the OVS instance on the bare metal OS) and pushing flow rules down. This would eliminate the need for STP, since OVS won’t be in MAC learning mode. This means you could easily incorporate bare metal OS instances into a network virtualization-type environment. However…
  • There’s no easy way to provide a separation of OVS and the bare metal OS instance. This means that users who are legitimately allowed to make administrative changes to the bare metal OS instance could also make changes to OVS, which could easily “break” the configuration and cause problems. My personal view is that this is why you rarely see this sort of OVS configuration used in conjunction with bare metal workloads.
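
To illustrate the full mesh point from the list above, here’s a sketch that enables STP on the bridge and then creates one GRE port for every endpoint other than the local one. The function name, the gre0/gre1/… port naming, and the third endpoint address in the example are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: build this host's side of a full mesh of GRE tunnels.
RUN="${RUN:-}"   # set RUN=echo to print the commands instead of running them

build_mesh() {
    bridge="$1"
    local_ip="$2"
    shift 2
    # Enable STP first so the mesh never forms an unprotected loop.
    $RUN ovs-vsctl set bridge "$bridge" stp_enable=true
    n=0
    for peer in "$@"; do
        if [ "$peer" != "$local_ip" ]; then
            $RUN ovs-vsctl --may-exist add-port "$bridge" "gre${n}" \
                -- set interface "gre${n}" type=gre "options:remote_ip=${peer}"
            n=$((n + 1))
        fi
    done
}

# Example, run on 192.168.1.10 with the same endpoint list on every host:
#   build_mesh br-int 192.168.1.10 192.168.1.10 192.168.1.15 192.168.1.20
```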

I still see value in explaining how this works because it provides yet another example of how to configure OVS and how to use OVS to help provide advanced networking capabilities in a variety of environments and situations.

If you have any questions, I encourage you to add them in the comments below. Likewise, if I have overlooked something, made any mistakes, or if I’m just plain wrong, please speak up below (courteously, of course!). I welcome all useful/pertinent feedback and interaction.


One of the cool things about libvirt is the ability to work with multiple hypervisors and virtualization technologies, including Linux containers using LXC. In this post, I’m going to show you how to use libvirt with LXC, including leveraging libvirt to help automate attaching containers to Open vSwitch (OVS).

If you aren’t familiar with Linux containers and LXC, I invite you to have a look at my introductory post on Linux containers and LXC. It should give you enough background to make this post make sense.

To use libvirt with an LXC container, there are a few basic steps:

  1. Create the container using standard LXC user-space tools.
  2. Create a libvirt XML definition for the container.
  3. Define the libvirt container domain.
  4. Start the libvirt container domain.

The first part, creating the container, is pretty straightforward:

lxc-create -t ubuntu -n cn-02

This creates a container using the Ubuntu template and calls it cn-02. As you may recall from my introductory LXC post, this creates the container’s configuration and root filesystem in /var/lib/lxc by default. (I’m assuming you are using Ubuntu 12.04 LTS, as I am.)

Once you have the container created, you next need to get it into libvirt. Libvirt uses a standard XML-based format for defining VMs, containers, networks, etc. At first, I thought this might be the most difficult section, but thanks to this page I was able to create a template XML configuration.

Here’s the template I was able to create:

(If you can’t see the embedded code above, please click here.)

Simply take this XML template and save it as something like lxc-template.xml or similar. Then, after you’ve created your container using lxc-create as above, you can easily take this template and turn it into a specific container configuration with only one command. For example, suppose you created a container named “cn-02” (as I did with the command I showed earlier). If you wanted to customize the XML template, just use this simple Unix/Linux command:

sed 's/REPLACE/cn-02/g' lxc-template.xml > cn-02.xml
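
In case the template isn’t visible, here’s a sketch of what such a template might contain, along with the same sed substitution wrapped in a function. This is my own minimal reconstruction, not the exact template referenced above: the memory size, the “default” network, and the emulator path (typical for Ubuntu) are all assumptions you should adjust.

```shell
#!/bin/sh
# Sketch: write a minimal libvirt LXC domain template, then render it.
write_lxc_template() {
    cat > "$1" <<'EOF'
<domain type='lxc'>
  <name>REPLACE</name>
  <memory>524288</memory>
  <os>
    <type>exe</type>
    <init>/sbin/init</init>
  </os>
  <vcpu>1</vcpu>
  <devices>
    <emulator>/usr/lib/libvirt/libvirt_lxc</emulator>
    <filesystem type='mount'>
      <source dir='/var/lib/lxc/REPLACE/rootfs'/>
      <target dir='/'/>
    </filesystem>
    <interface type='network'>
      <source network='default'/>
    </interface>
    <console type='pty'/>
  </devices>
</domain>
EOF
}

# Replace every REPLACE placeholder with the container name.
render_container_xml() {
    sed "s/REPLACE/$2/g" "$1"
}

# Example:
#   write_lxc_template lxc-template.xml
#   render_container_xml lxc-template.xml cn-02 > cn-02.xml
```

Note that the placeholder appears twice (the domain name and the root filesystem path), which is why the substitution uses the g flag.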

Once you have a container-specific libvirt XML configuration, then defining it in libvirt is super-easy:

virsh -c lxc:// define cn-02.xml

Then start the container:

virsh -c lxc:// start cn-02

And connect to the container’s console:

virsh -c lxc:// console cn-02

When you’re done with the container’s console, press Ctrl-] (that’s Control and right bracket at the same time); that will return you to your host.

Pretty handy, eh? Further, since you’re now controlling your containers via libvirt, you can leverage libvirt’s networking functionality as well—which means that you can easily create libvirt virtual networks backed by OVS and automatically attach containers to OVS for advanced networking configurations. You only need to create an OVS-backed virtual network like I describe in this post on VLANs with OVS and libvirt.

I still need to do some additional investigation and testing to see how the networking configuration in the container’s config file interacts with the networking configuration in the libvirt XML file. For example, how do you define multiple network interfaces? Can you control the name of the veth pairs that show up in the host? I don’t have any answers for these questions (yet). If you know the answers, feel free to speak up in the comments!

All courteous feedback and interaction is welcome, so I invite you to start (or join) the discussion via the comments below.

