General

This category contains general posts about blogging or this site.

I had the pleasure of attending the OpenStack Summit in Portland, OR last week. It was my first time at the OpenStack Summit, and it was great to meet lots of folks in the OpenStack community as well as be exposed to some more in-depth and detailed OpenStack information. While I was there I tried to liveblog as many sessions as I was able; here are links to the various session liveblogs that I managed to publish. Enjoy!

Getting From Grizzly to Havana, a DevOps Upgrade Pattern
Nicira NVP Deep Dive
Considerations for Building a Private Cloud, Folsom Update
Building HA OpenStack with Puppet in 20 Minutes
OpenStack Capacity Planning
Networking in the Cloud, an SDN Primer
OpenStack Back to the Enterprise, Keep Calm and Boldly Go On
OpenStack High Availability in Grizzly and Beyond

If anyone has any other liveblog sessions that should be added to this list, drop me a comment and let me know.


This is a session titled “More Reliable, More Resilient, More Redundant: OpenStack High Availability in Grizzly and Beyond.” The presenter is Florian Haas from Hastexo (@hastexo). Florian is one of the founders of Hastexo, which is a services firm that provides OpenStack services, among other things.

There are four things that need to be addressed when discussing high availability:

  • Infrastructure
  • Storage
  • Compute
  • Networking

The presentation starts with a discussion of the infrastructure layer. Changes from Folsom to Grizzly are relatively few. Examples of infrastructure services include the databases (typically MySQL, sometimes PostgreSQL), AMQP (the message queue; this could be RabbitMQ, ZeroMQ, etc.), and API services. With regard to infrastructure HA, there are 5 types of infrastructure nodes:

  1. Cloud controller
  2. API node
  3. Network node
  4. Compute node
  5. Storage controller

The cloud controller runs services that underpin OpenStack. It runs a relational database, an AMQP server, registry services, etc. The ability for an implementation to use active/passive or active/active depends largely on the specific back-end applications in use. Largely, cloud controllers will be deployed active/passive (this is due, in part at least, to the persistent data storage found in the relational databases).

API nodes are fundamentally stateless (locally); they interact with the AMQP message bus. For most API services, we can use active/active scale-out approaches.
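To make the active/active idea a bit more concrete, here is a minimal sketch (mine, not the presenter's) of round-robin selection across stateless API nodes that skips any node marked unhealthy. The `ApiPool` class and the node names are hypothetical; a real deployment would put a load balancer in front of the API services rather than code like this.

```python
from itertools import cycle


class ApiPool:
    """Round-robin over stateless API nodes, skipping any marked unhealthy."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._ring = cycle(self.nodes)

    def mark_down(self, node):
        self.healthy.discard(node)

    def mark_up(self, node):
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self):
        # Try each node at most once per call; any healthy node can serve
        # the request because no session state lives on the node itself.
        for _ in range(len(self.nodes)):
            node = next(self._ring)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy API nodes")


pool = ApiPool(["api-1", "api-2", "api-3"])  # hypothetical node names
pool.mark_down("api-2")
picks = [pool.pick() for _ in range(4)]
print(picks)
```

Because the nodes hold no local state, losing one simply removes it from the rotation; nothing has to fail over.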

The network node is, according to the presenter, “interesting.” This node takes care of routing between tenant networks, provider networks, and upstream networks. The network node typically also runs the DHCP agent to provide IP addresses to tenant systems. We’ll come back to the network node shortly.

The compute node is the hypervisor itself, where guest VMs/instances actually execute.

The storage controller depends greatly on what kind of back-end block storage is in use. The block storage server (Cinder server) might have no local storage (which might be the case if you were running with Ceph, for example) or it might have lots of local storage.

All five of these infrastructure node types can use the same high availability stack, but with a few minor differences. The “recommended” high availability stack is Pacemaker. Hastexo has reference configurations for using Pacemaker with all OpenStack infrastructure services. According to the presenter, “it’s not rocket science—you can do it.”

The presenter next shows a diagram of an architecture using Pacemaker and Corosync for high availability of the cloud controller. From there he shows an example of using Pacemaker/Corosync for high availability of the stateless API services.

From here, he moves on to discuss HA for compute in a bit more detail. Grizzly addresses guest HA. He shows a hack where you “override” a couple of parameters to “trick” Nova into restarting VMs on another host after the first host fails. Unfortunately, this hack breaks live migration and is unsafe with Cinder volumes (although the Cinder issue is fixable).

In Grizzly, Nova has a nova evacuate command, but this command actually only works after a host has failed. This evacuation functionality isn’t present in Horizon (the Dashboard).

Another interesting feature is VM Ensembles. This feature allows you to group guests in a resilient fashion. The presenter provides an example of 6 VMs (2 each of three tiers), and the desire to make sure that the redundant nodes aren’t running on the same compute node. (For VMware users, this would be analogous to anti-affinity rules.) Unfortunately, this feature did not make it into the Grizzly release (it should be in Havana). A workaround would be to use a scheduler filter.
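As a rough illustration of the anti-affinity idea (this is my own sketch, not the VM Ensembles implementation or an actual Nova scheduler filter), the following places the VMs of each group on distinct compute nodes. The function name, VM names, and host names are all hypothetical.

```python
def place_with_anti_affinity(groups, hosts):
    """Assign each VM to a host so that no two VMs of a group share a host.

    groups: dict of group name -> list of VM names
    hosts:  list of compute node names
    """
    placement = {}
    load = {h: 0 for h in hosts}
    for group, vms in groups.items():
        if len(vms) > len(hosts):
            raise ValueError(f"group {group!r} has more VMs than hosts")
        used = set()
        for vm in vms:
            # Pick the least-loaded host this group is not already using.
            candidates = [h for h in hosts if h not in used]
            host = min(candidates, key=lambda h: load[h])
            placement[vm] = host
            used.add(host)
            load[host] += 1
    return placement


# Six VMs, two per tier, as in the presenter's example.
tiers = {
    "web": ["web-1", "web-2"],
    "app": ["app-1", "app-2"],
    "db": ["db-1", "db-2"],
}
hosts = ["compute-1", "compute-2", "compute-3"]
placement = place_with_anti_affinity(tiers, hosts)
for vm, host in placement.items():
    print(vm, "->", host)
```

The only rule enforced is that the redundant members of a tier never land on the same compute node, so a single host failure cannot take out a whole tier.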

Looking in more detail at Networking, the same HA architecture (Pacemaker/Corosync) worked reasonably well in Folsom for Quantum-server and L3 agent. It didn’t work so well for the DHCP agent. However, this active/passive approach didn’t scale well. Grizzly helps address this by running multiple DHCP agents and multiple L3 agents to provide better scale-out support (via the Quantum scheduler).

Finally, the presenter moves on to discuss storage. Grizzly has dramatically expanded the amount of support within Cinder for block storage options; the presenter highly recommends upgrading to Grizzly if you need expanded Cinder support. He briefly mentions new Cinder drivers for NFS, GlusterFS, 3Par, LeftHand, EMC, and others. Pacemaker/Corosync is needed for the Cinder volume server. There is a hack required (need to override a host value in the database) in order to provide high availability for the Cinder volume server. It’s possible that this hack will be fixed in Havana (a bug has already been filed to fix it).

A few other tidbits:

  • Libvirt watchdog support in Nova and Glance is coming
  • Heat will bring some additional high availability features (Heat is now an integrated project and will be fully supported with Havana, as will Ceilometer)
  • The RabbitMQ library (Kombu) has gained the ability to have a list of queue hosts to which to connect (instead of just a single host)
  • ZeroMQ (peer-to-peer messaging) is another interesting option
  • MySQL/Galera is firming up to provide write-set replication (which will enable synchronous replication)
  • Some changes are occurring within RBD when you use it for both Glance and Cinder (the presenter did not elaborate)
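The point above about having a list of queue hosts can be sketched generically. This is not Kombu's API; `connect_first_available` and `fake_connect` are hypothetical stand-ins that only show the idea of trying each host in order until one accepts the connection.

```python
def connect_first_available(hosts, connect):
    """Try each queue host in order and return the first that connects."""
    failed = []
    for host in hosts:
        try:
            return host, connect(host)
        except ConnectionError:
            failed.append(host)
    raise ConnectionError(f"all queue hosts failed: {failed}")


def fake_connect(host):
    # Stand-in for a real AMQP client; pretends that mq-1 is down.
    if host == "mq-1":
        raise ConnectionError("mq-1 is down")
    return f"connection to {host}"


host, conn = connect_first_available(["mq-1", "mq-2", "mq-3"], fake_connect)
print(host, conn)
```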

At this point, the presenter wrapped up the session by opening up for questions and answers.


This is a session titled “OpenStack Back to the Enterprise: Keep Calm and Boldly Go.” The session is led by Florian Otel (@florianotel on Twitter), with HP Cloud Services in EMEA. The purpose of this talk is to share some of the “lessons learned” in how to position OpenStack to enterprise customers and overcome their objections.

Florian starts out the presentation with a slide that says, “This is a business, not a science project.” He re-iterates that this session is about making business sense. He also assures us that this presentation won’t be a glitzy marketing session, either—it will be real, nitty gritty, “in the trenches” knowledge learned when positioning OpenStack to enterprise customers. Finally, Florian acknowledges that his presentation will probably be a bit biased toward service provider-type use cases.

The presentation goes on to display a picture of Geoffrey Moore, who wrote a book titled “Crossing the Chasm”. Florian ties this to the adoption curves of various technologies and Moore’s assertion in his book that “we need to be very mindful of the customers in the market”. Specifically, marketing to the early adopters (on the left edge of the bell curve) and the mainstream (the bulge of the bell curve) is very different.

Next, the presenter shows us a picture of Clayton Christensen, who wrote (among other books) “The Innovator’s Dilemma.” The conclusion drawn in the book is that there are two types of innovation: sustaining innovation and disruptive innovation.

Florian ties these two thoughts together in a chart that combines the adoption curve with the adoption/evolution of OpenStack as a disruptive innovation.

So how does one pitch OpenStack to an enterprise organization? Florian shares this quote: “Never try to sell a meteor to a dinosaur. It wastes your time and annoys the dinosaur.”

If that’s not the right way, then what is? Florian makes the “dreaded Linux-OpenStack comparison,” combining it with models and charts from Moore’s “Crossing the Chasm.” Florian posits that a key adoption point is that the underlying platform—be it Linux or OpenStack—must “become irrelevant.” He points to Comcast’s demo (which is powered by OpenStack) and asks, “Did anyone see OpenStack there?”

Florian goes to another quote from Moore stating that applications have an advantage over platforms when it comes to crossing the chasm. Moore believes that “platforms must be garbed in application clothing” in order to cross the chasm. In other words, “mind the gap” between applications and platforms.

The next slide in Florian’s presentation says this: “The more I love the idea, the less money it makes!!” The key point to take away is that as technologists we often “fall in love” with a technology/project/platform, but we need to be able to articulate the value of this technology/project/platform in some way other than “it’s a really cool technology.” This aligns very closely with my own thinking—we need to adopt some practicality if we want to see the technology/project/platform we love so much actually succeed.

Florian now moves from the abstract and theoretical into a more concrete discussion of various use cases for HPCS (HP Cloud Services). These use cases include archival, collaboration, “cloud bursting,” dev/test PaaS, and production applications. He delves in a bit deeper on one particular use case, to which he refers as “Dropbox for the enterprise.”

Next the presenter shares a warning: “All good ideas must die—so that great ideas might live.” Good use cases are going to die and pass away, but new (potentially even better) use cases will emerge. We mustn’t get “tied” to our existing use cases.

There are fundamentally three different areas where a company can focus:

  • Operational excellence
  • Product leadership
  • Customer intimacy

Florian says he believes that one lesson HP learned is that customer intimacy is critically important. He didn’t say, but I suspect that customer intimacy is important at earlier stages of market adoption (going back to the bell curve of market adoption), while other areas of focus might be more important at other stages of adoption.

According to Florian, it’s called bleeding edge for a reason. Be ready to help your customers that hurt themselves. It’s also important to “not get in your own way.” Be willing to admit when you’re wrong, press the Reset button, and press forward with customer needs in the forefront of your vision.

The secret to success is, according to Florian, simple: “Just learn to use OpenStack the way Hendrix uses his guitar.”


This is a session titled “Networking in the cloud: An SDN primer.” It’s led by Ben Cherian, Chief Strategy Officer for Midokura. Ben indicates he’ll try to remain as impartial as possible while attempting to describe and define SDN.

Ben asserts that the basic driver pushing folks toward SDN is that the current state of networking in the cloud is too complex and too manual. As a result, the first question that people start asking is, how can I automate this? Telecom has gone through this before, and data networking is in a similar state of flux and development.

The presenter next discusses Almon Strowger, who invented the first electromechanical switch. His work–which might have been driven by some level of paranoia–led to automated phone switching and the rotary phone. Almon Strowger wanted to address privacy concerns and intentional human errors, and along the way he also reduced unintentional human errors, improved connection speeds, and lowered operational costs.

Ben indicates that the cloud—in particular, networking—is in a similar position. It’s time for the “Almon Strowger” of cloud networking to solve the challenges that keep networking from scaling.

The presenter next covers the concept of a control plane and the data plane. Abstraction is a key tenet of computer science, and when the control plane and the data plane reside on the same physical device (which is typically the case in traditional networks today), there is no abstraction. Abstraction exists in other areas of computing, but not in networking. To bring that abstraction into networking, a simple example is to separate the control plane into a separate controller that exists apart from the data plane. This is basic SDN.
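Here is a toy sketch of that separation, with a `Controller` that holds the network-wide view and a `Switch` that only forwards according to the table it has been given. The class names, addresses, and port names are my own illustrative assumptions and do not model OpenFlow or any vendor's controller.

```python
class Switch:
    """Data plane: forwards using only the table it was given."""

    def __init__(self, name):
        self.name = name
        self.table = {}  # destination address -> output port

    def forward(self, dst):
        # No local decision-making: unknown destinations are dropped.
        return self.table.get(dst, "drop")


class Controller:
    """Control plane: holds the network-wide view and programs switches."""

    def __init__(self):
        self.switches = {}

    def register(self, switch):
        self.switches[switch.name] = switch

    def install_route(self, switch_name, dst, port):
        self.switches[switch_name].table[dst] = port


controller = Controller()
edge = Switch("edge-1")
controller.register(edge)
controller.install_route("edge-1", "10.0.0.5", "port-2")

print(edge.forward("10.0.0.5"))
print(edge.forward("10.0.0.9"))
```

The switch never computes a route itself; changing network behavior means changing what the controller installs, which is the abstraction being described.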

Ben sees three use cases for SDN:

  1. IaaS
  2. Data center fabrics
  3. Carrier/WAN use cases

Looking at these in reverse order, examples of SDN technologies that you could see in the carrier/WAN space would be a “hybrid control plane” leveraging either BGP (a distributed control plane) or OpenFlow (a centralized control plane). In the data center fabric space, this involves the use of OpenFlow to manage multiple switches. Finally, in the IaaS space, you see software-based solutions and overlays. Examples of companies in this IaaS space are Midokura, VMware/Nicira, and others.

Some key requirements for IaaS “cloud networking”/SDN:

  • Multi-tenancy
  • L2 isolation
  • L3 isolation
  • Scalable control plane
  • NAT (floating IP)
  • ACLs
  • Stateful (L4) firewall
  • VPN
  • BGP gateway
  • RESTful API
  • Integration with CMS (like OpenStack)

Looking at this list of requirements, Ben feels like you can eliminate the technologies leveraged in the carrier/WAN and data center fabric SDN use cases, because these technologies don’t properly address all the IaaS/cloud networking requirements.

Ben next reviews a networking diagram that shows how these various requirements translate into cloud networks.

The candidate models to address these requirements are:

  • Traditional network
  • Hop-by-hop OpenFlow
  • Edge-to-edge IP overlays

Using traditional network models, VLANs become a constraint (only 4096 VLANs available means only 4096 tenants). This constrains L2 isolation. L3 isolation is constrained by VRFs, which are not fault tolerant, require expensive hardware, and don’t scale well.
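The 4096 figure follows from the 12-bit VLAN ID field in an 802.1Q tag. A quick bit of arithmetic, assuming one VLAN per tenant as in the paragraph above (in practice IDs 0 and 4095 are reserved, so the usable count is slightly lower):

```python
VLAN_ID_BITS = 12  # width of the VLAN ID field in an 802.1Q tag

total_ids = 2 ** VLAN_ID_BITS
reserved_ids = 2  # IDs 0 and 4095 are reserved
usable_ids = total_ids - reserved_ids

print(total_ids)   # 4096
print(usable_ids)  # 4094
```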

If you wanted to use an OpenFlow fabric, the issue is more about the limitations of the physical switch itself. Storing the state in physical switches has issues with scale, the switches are not fast enough to update, and updates are not atomic. There is also the issue of provisioning the physical switches that will then be controlled by OpenFlow. This approach also doesn’t address the other “higher level” concerns of cloud networking.

Edge-to-edge IP overlays are the method that Ben (and his company) prefers. Isolation is provided without VLANs, providing additional scalability. Only IP connectivity is required. You can use a scalable IGP (iBGP, OSPF) to build a reliable multi-path underlay network. This sort of thinking is inspired by a research paper by Microsoft regarding VL2 (no link provided).

Trends that support this sort of solution:

  • Faster packet processing on x86 servers at the edge
  • Clos (fat tree/leaf-spine) networks for the underlay
  • Merchant silicon brings down the cost of efficient IP switches
  • Optical intra-DC networks for plentiful bandwidth

The presenter also alludes to the use of configuration management tools (Chef, Puppet) on merchant silicon-based switches.

Ben next shows a diagram that illustrates an overlay network and an underlay network, and he walks through how traffic flows work (both from the overlay perspective as well as the underlay perspective).
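The overlay/underlay relationship can be sketched roughly as follows. This is my own simplified model, not Midokura's implementation or any real tunneling protocol: a tenant frame is wrapped in an outer packet addressed between hypervisor hosts, and the tenant ID carried in the wrapper provides isolation without VLANs. The `VM_LOCATION` table, tenant IDs, and IP addresses are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantFrame:
    """What the tenant's VMs see (the overlay view)."""
    src_vm: str
    dst_vm: str
    payload: bytes


@dataclass(frozen=True)
class UnderlayPacket:
    """What the physical IP network sees (the underlay view)."""
    src_host_ip: str
    dst_host_ip: str
    tenant_id: int
    inner: TenantFrame


# Hypothetical (tenant ID, VM name) -> hypervisor IP mapping that a
# control plane would distribute to the edges.
VM_LOCATION = {
    (100, "vm-a"): "192.0.2.10",
    (100, "vm-b"): "192.0.2.20",
    (200, "vm-a"): "192.0.2.30",  # same VM name, different tenant
}


def encapsulate(tenant_id, frame):
    # The underlay only needs IP reachability between the two hosts.
    return UnderlayPacket(
        src_host_ip=VM_LOCATION[(tenant_id, frame.src_vm)],
        dst_host_ip=VM_LOCATION[(tenant_id, frame.dst_vm)],
        tenant_id=tenant_id,
        inner=frame,
    )


def decapsulate(packet, expected_tenant_id):
    # The receiving edge refuses traffic tagged for another tenant.
    if packet.tenant_id != expected_tenant_id:
        raise PermissionError("tenant mismatch")
    return packet.inner


frame = TenantFrame("vm-a", "vm-b", b"hello")
packet = encapsulate(100, frame)
print(packet.src_host_ip, "->", packet.dst_host_ip)
```

From the overlay perspective the traffic goes from vm-a to vm-b; from the underlay perspective it is just IP traffic between two hosts.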

Naturally, Ben believes that overlays are the right approach—but you still need a scalable control plane.

At this point, he opens the session up for questions and answers.


This is a liveblog of a session titled “OpenStack Capacity Planning.” The presenter starts out with a shout-out to the OpenStack Operations Guide that was recently written.

Here’s how to make capacity planning easy and simple:

  • A blank check backed by limitless funds
  • Unlimited time
  • A well-organized team of geniuses
  • Perfectly clear expectations that never change (up front & in writing)

Don’t have all that? Well, then you have to worry about capacity planning.

To start with capacity planning, the presenter suggests that the absolute best place to start is DevStack. Using DevStack allows you to test various capacity planning scenarios easily and quickly. However, the presenter warns against trying to use DevStack in production.

Next, you need to answer the question: Public cloud or private cloud? The answer to this question will drive a lot of the follow-up questions that you must answer. If the answer to this first question is private cloud, then it’s typically easier to do capacity planning because you’ll generally have a better idea of what sort of applications and workloads will be deployed. The presenter also feels that limited hardware/networking/storage choices in private cloud deployments make capacity planning easier (although I’m not sure I agree). Deployment is also easier and can be a bit more leisurely due to lower growth rates.

For a public cloud, capacity planning is much trickier. You’ll have to design against a generic use case/workload because you won’t know the types of applications and workloads that your customers will actually put on this public cloud. In public cloud environments, you’ll likely be standing up new compute nodes constantly, so the speed and frequency of deployment is a key issue. Tools for fast, reliable provisioning and configuration management are very important.

The presenter mentions a few tools in this space: Crowbar, Puppet, Razor, Cobbler, Chef, CFEngine, and Fuel.

Monitoring is often an afterthought, but it really shouldn’t be—it should be done up front as much as possible. Here you can look at tools like Nagios, but there is a lot of debate in this space within the open source community. One of the reasons monitoring is important is to help establish some trending information, which helps in forecasting capacity needs.
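As one simple way to turn trending data into a forecast (my own sketch, not something shown in the session), you can fit a straight line to periodic usage samples and estimate when the line reaches capacity. The sample values and the capacity figure below are invented for illustration, and a linear fit is only a crude assumption about growth.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def weeks_until_full(samples, capacity):
    """samples: one usage reading per week, oldest first.

    Returns the number of weeks after the last sample at which the fitted
    line reaches capacity, or None if usage is not growing.
    """
    xs = list(range(len(samples)))
    slope, intercept = linear_fit(xs, samples)
    if slope <= 0:
        return None
    week_full = (capacity - intercept) / slope
    return max(0.0, week_full - xs[-1])


vcpus_in_use = [100, 110, 120, 130, 140]  # made-up weekly readings
remaining = weeks_until_full(vcpus_in_use, capacity=200)
print(remaining)
```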

What about the hypervisor you’re going to use? The presenter feels like this is an important decision. Pick the best hypervisor for your workload. The presenter prefers KVM, but there are others (he notably doesn’t mention vSphere, but does mention Hyper-V, which is interesting). He recommends against mixed hypervisor/heterogeneous environments. The hypervisor decision will also drive some storage decisions down the road (among other things).

At this point, the presenter circles back to DevStack, and recommends the use of DevStack to “test the heck out of” your selected hypervisor and anticipated workload. This isn’t necessarily for performance benchmarks, but rather to validate that everything works as expected.

Networking is the next big topic. The presenter recommends being very intentional about network selections, as he says that it is extremely difficult to switch between networking architectures (like from FlatDHCP to Quantum). Also be sure to watch out for bandwidth requirements and design accordingly. Naturally, IP addressing will be another concern you’ll need to consider.

Software-defined networking (SDN) is what the presenter recommends for any sizable deployments. He’s partial to NEC’s OpenFlow solution.

Compute density is a key factor in capacity planning. You’ll need to incorporate physical CPU cores, RAM, oversubscription ratio, and instance storage (whether ephemeral storage is local or shared). Using shared storage, like NFS or Ceph, means recovering VMs is much easier. Naturally, this means you’ll need to balance IOPS against storage capacity (GB/TB). Based on the speaker’s comments, I’d guess that he heavily favors Ceph.
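A back-of-the-envelope density calculation might look like the sketch below. All of the numbers (core count, RAM, oversubscription ratios, flavor size) are assumptions I picked for illustration; the presenter did not give specific figures.

```python
def instances_per_host(cores, ram_gb, cpu_ratio, ram_ratio,
                       flavor_vcpus, flavor_ram_gb):
    """Return (max instances, limiting resource) for one compute node."""
    by_cpu = int(cores * cpu_ratio // flavor_vcpus)
    by_ram = int(ram_gb * ram_ratio // flavor_ram_gb)
    limit = "cpu" if by_cpu <= by_ram else "ram"
    return min(by_cpu, by_ram), limit


# Hypothetical host: 16 physical cores, 96 GB RAM, 4:1 CPU oversubscription,
# no RAM oversubscription, running instances with 2 vCPUs and 4 GB RAM.
density = instances_per_host(
    cores=16, ram_gb=96, cpu_ratio=4.0, ram_ratio=1.0,
    flavor_vcpus=2, flavor_ram_gb=4,
)
print(density)
```

In this made-up case RAM runs out before vCPUs do, which is exactly the kind of thing you want to know before ordering hardware.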

When it comes to storage, your options for object storage are Swift, Ceph, and Gluster. Bandwidth is a concern (moving data to/from the object storage platform). The presenter stresses the importance of testing to understand the impact of the workloads on the object storage platform. The same goes for persistent block storage.

Finally, the presenter touches on concerns over the scalability of the cloud controller (which hosts the API endpoints and database). At some point you’ll have to consider separating the API endpoints onto dedicated boxes and using a load balancer. The presenter also suggests considering the use of Nova cells to help with growth and partitioning the load on the cloud controllers. (Cells are something that I really need to understand a bit better.)

And that concludes the session.

Overall, I didn’t find this session as useful as I would have liked—I expected this to be about ongoing capacity planning, not implementation design considerations. However, others might have come into the session with different expectations and might have found this session helpful. The key takeaway for me is that, as I saw from a similar session yesterday, it’s just as important when designing an OpenStack implementation to consider the resource demands, workload needs, and similar requirements. Just because it’s “cloud” doesn’t mean that it doesn’t still require some knowledge of how these components work under the covers.


This is a liveblog of a session on using Puppet to build a highly available (HA) installation of OpenStack. The presenter is Boris Renski of Mirantis.

Boris believes that you need to know many things in order to successfully create the build architecture for an OpenStack deployment:

  • Linux
  • Networking
  • Virtualization
  • Python
  • Ruby
  • Puppet
  • Cobbler (or Razor)
  • mCollective/Salt

Boris introduces Fuel, which is an automation library for OpenStack (that is supposedly easy enough for a goat to use—a play on Mirantis’ geographical location and the Borat movies).

Fuel essentially includes the following items:

  • An OpenStack reference architecture with HA
  • Puppet manifests for deploying OpenStack
  • Cobbler-based bare metal provisioning
  • OpenStack packages
  • Support for CentOS, RHEL, and Ubuntu
  • Support for OpenStack Essex, Folsom, and (in May) Grizzly
  • A detailed configuration guide

Fuel supports a number of different deployment configurations: single node (pretty straightforward, much like DevStack); multi-node (including compact Swift and standalone Swift); and multi-node HA (with compact and standalone Swift and Quantum elements). “Swift compact” is for when Swift will be used only as back-end storage for VMs. “Quantum compact” is running Quantum on the controller node, even with high availability.

Fuel was specifically created in the form of a library so that users could easily modify and adapt the scripts to fit their particular OpenStack deployment. This gives users more flexibility when using Fuel to deploy OpenStack.

For the HA architecture:

  • They use HAProxy for most of the OpenStack services
  • For the message queue, they use an active/active RabbitMQ cluster
  • For the database, they use an active/active Galera MySQL cluster (this forces a minimum of three physical nodes)
  • The architecture uses keepalived for VIP (virtual IP) management

The overall process for deploying OpenStack with Fuel goes like this:

  1. Build the Fuel “master node,” which runs Cobbler and Puppet master
  2. Enter hardware info into Cobbler
  3. Cobbler installs the base OS (CentOS, RHEL, Ubuntu)
  4. Puppet picks up the node and installs/configures OpenStack

Next, Boris goes into a more light-hearted section on how he taught a goat to use Fuel. For us humans, this means the “Fuel portal,” which provides step-by-step instructions on using Fuel. They (Mirantis) also created “Fuel Web,” which is an easy GUI for Fuel. A private beta for Fuel Web is starting today.

Boris now turns the stage over to Roman, who shows a live demo of using Fuel to turn up a 2-node OpenStack deployment. Overall, Fuel looks like a very interesting and useful tool.

Looking ahead, what’s on the Fuel roadmap? Roman wants to add screens for the management of disks and NICs, which don’t exist in Fuel Web today. There’s also no support for Cinder in the web UI today, which is another item they’d like to add in future releases. They are also considering some level of monitoring and performance metrics for the OpenStack environments deployed using Fuel. Finally, they want to extend Fuel to help with OpenStack upgrades as well.

Fuel is available for download at http://fuel.mirantis.com.


This is a live blog for the session titled “Considerations for Building a Private Cloud, Folsom Update,” led by Ryan Richard of Rackspace (@rackninja on Twitter). As with other sessions here at the 2013 OpenStack Summit, this session is totally full, with people standing in the back, sitting on the floor along the sides, and seated on the floor across the front.

This session is about design considerations for building a private cloud with OpenStack. The focus will be the Folsom release. This session is based on experience after running Folsom for 6 months. Ryan intends to be able to provide a Grizzly-based version of this talk at the next Summit, after running Grizzly for 6 months.

First he tackles the question, “what is a private cloud”?

  • Are you looking for elastic or traditional virtualization? It most likely won’t be both.
  • Multi-tenant (or, more likely, multi-application)
  • Size (this talk will be limited to discussions of up to 100 nodes)
  • Private endpoints (the management endpoints aren’t accessible from the Internet)
  • Limited inbound connectivity
  • Customized for specific workloads

Ryan’s first recommendation is “Build with the end in mind.” He looks at how deploying the “m1.tiny” flavor would create a mismatch between CPU and RAM utilization, in that 48 vCPUs will be utilized but only a fraction of the host’s RAM would be allocated. The “m1.medium” flavor (4GB RAM, 2 vCPUs) creates a more balanced workload, whereas the larger flavors imbalance utilization the other way.
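To see the mismatch in numbers, here is a small sketch of my own. The 48 vCPUs and the m1.medium size come from the talk; the 96 GB of host RAM is an assumption, and the m1.tiny size (1 vCPU, 512 MB) is the stock default flavor definition.

```python
def fill_host(host_vcpus, host_ram_mb, flavor_vcpus, flavor_ram_mb):
    """Pack one flavor onto a host and report how much of each resource is used."""
    count = min(host_vcpus // flavor_vcpus, host_ram_mb // flavor_ram_mb)
    return {
        "instances": count,
        "cpu_used": count * flavor_vcpus / host_vcpus,
        "ram_used": count * flavor_ram_mb / host_ram_mb,
    }


HOST_VCPUS = 48            # schedulable vCPUs, from the talk's example
HOST_RAM_MB = 96 * 1024    # assumption: 96 GB of RAM per host

tiny = fill_host(HOST_VCPUS, HOST_RAM_MB, flavor_vcpus=1, flavor_ram_mb=512)
medium = fill_host(HOST_VCPUS, HOST_RAM_MB, flavor_vcpus=2, flavor_ram_mb=4096)

print("m1.tiny  ", tiny)
print("m1.medium", medium)
```

Under these assumptions m1.tiny exhausts the vCPUs while leaving three quarters of the RAM idle, whereas m1.medium fills both at the same time.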

What this tells me is that capacity planning is just as important with a private cloud deployment as it is with a “traditional virtualization” solution. Ryan’s recommendations around capacity are:

  • Don’t use a disk size of 0.
  • For public cloud offerings, you can limit the number of flavors. For private cloud offerings, you can create “customized” flavors for specific workloads. Find a balance for your organization.
  • Don’t forget about network utilization.

It’s important to remember that it’s easy to add compute nodes, but you can’t change the fixed network (without Quantum) once instances are running. Quantum helps address this. (Note that Ryan indicates it’s possible to create multiple networks using the CLI or API in Folsom without Quantum, but the dashboard doesn’t respect or recognize it.)

This limit on the fixed network means that it is critically important to size the number of addresses available by calculating the number of instances that could be spun up within the cloud.
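A quick way to do that sizing is sketched below using Python's standard `ipaddress` module. The node count, instances per node, base address, and the three reserved addresses (network, broadcast, gateway) are all assumptions for the sake of the example.

```python
import ipaddress


def fixed_network_for(max_instances, base="10.0.0.0", reserved=3):
    """Smallest IPv4 network at `base` with room for every instance.

    `reserved` covers the network, broadcast, and gateway addresses
    (an assumption; adjust for your environment).
    """
    needed = max_instances + reserved
    host_bits = max(1, (needed - 1).bit_length())
    return ipaddress.ip_network(f"{base}/{32 - host_bits}")


# Hypothetical cloud: 100 compute nodes, up to 24 instances on each.
net = fixed_network_for(100 * 24)
print(net, net.num_addresses)
```

Since the fixed network can't be changed later, it is worth rounding up generously rather than sizing it to today's instance count.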

It’s easy to add nodes or networks on the host network side, but you can’t change the fixed network once you go into production (not without destroying your instances). It’s also easy to add more floating networks.

Ryan now switches gears to talk about images and storage. (There’s a session tomorrow in C123 at 1:50 PM.) Rackspace is going with virtio, qcow2 disk format, bare container, cloud-init with dynamic partitioning. I’m not 100% familiar with all of these terms, so you might expect to see some posts soon on some of these. There is a small performance hit with qcow2, so be sure to quantify that when building images (or re-using images that someone else has built).

Snapshotting is another sizing consideration that can’t really be adequately predicted (it’s hard to know if your users will be using snapshots or not). Ryan recommends qcow2 for snapshots. To help maximize the use of host caching of images, try to streamline the number of images.

A few more Glance tips:

  • Watch for network utilization. Glance could consume an entire 1Gb NIC.
  • Consider RAID 5 for large sequential reads/writes.
  • Disk bandwidth is more important than disk IOPS.
  • Reduce the number of images to improve host cache functionality.

Storage on the compute nodes, on the other hand, requires a different view. Build for random I/O (RAID 10 or SSD or both).

A few architectural examples and thoughts:

  • For 1–20 servers, a single controller, a single API, and a single network (1 to 2 Gbps) are probably sufficient. If you need HA, you’ll need to increase those numbers appropriately.
  • For 20–100 servers, take the same architecture but add load balancing for APIs (for HA and scalability), use Swift (or CloudFiles or S3) for Glance, consider the use of availability zones, and consider dedicated networks for management, Cinder, VM-to-VM traffic, etc. You might also want to consider a dedicated system for gathering compute node metrics.

Some other general performance considerations:

  • Watch random IO and try to get as much random IO off local storage onto Cinder where possible.
  • Review hypervisor best practices.

A few other “lessons learned”:

  • Floating IPs must be associated with the “public_interface”
  • Each piece of OpenStack has its own architecture.
  • Folsom is stable.
  • Migration (live, block) works but scenarios exist where it doesn’t. Try not to rely on these mechanisms where possible, especially if you’re building an “elastic cloud” as opposed to “traditional virtualization.”
  • OpenStack is changing often, so keep up to date with the current state of the projects.
  • Don’t do heterogeneous nodes.

A few other operational updates:

  • There are some new nova hypervisor calls
  • New image types in Glance (including VMDK)
  • The policy.json file
  • Other things coming in Grizzly: cells, Quantum, and better AD/LDAP support

At this point Ryan opens the session up for Q&A.


This is a “201 level” technical deep dive on Nicira Network Virtualization Platform (NVP). It was supposed to be led by Brad Hedlund, but due to unforeseen circumstances Brad was unable to make it to the Summit. Instead, Dan Wendlandt is leading the session. Dan, if you aren’t familiar, was the former Quantum PTL. I’m already familiar with all of this stuff (naturally), but wanted to sit in this session so that I could liveblog it for others’ benefit.

Dan starts the session with an explanation of network virtualization and why it’s important. He provides a technical definition of network virtualization as a “faithful reproduction of physical networks” that is “fully isolated” and provides “location independence” while being “physical network state independent”. This is NOT the same as simply running network software inside a VM. (While that is valuable, that’s not the same as network virtualization.)

NVP 1.0 was released in July 2011; the latest version, NVP 3.0, was released recently. The end of April will see NVP 3.1.

The only requirement from the physical network is IP connectivity. Open vSwitch (OVS) is used in all the various components, like Service Nodes and L2/L3 Gateways, and a set of out-of-band controllers manage all the components. Dan contrasts the “non-virtualized” (more physically-oriented) and “virtualized” (more logically oriented) views of NVP and the functionality it presents.

Starting at the “bottom” of the stack:

  • NVP wants to treat your network as a pool of resource capacity to be sliced on-demand for tenants.
  • NVP relies only on commodity features (specifically, L3 forwarding).
  • Configuration is done once (when the equipment is racked).
  • No humans in the loop when provisioning occurs.
  • Have the flexibility to choose/change architecture without it impacting the logical abstractions you’ve created for your workloads.

Next Dan shows that you could use a very scalable leaf-spine L3 network design, but you could also leverage existing network architectures—NVP only requires L3 connectivity.

The session then moves on to discuss Open vSwitch (OVS), which forms the basis for many of the Quantum plugins in OpenStack. It’s important to understand OVS is really like a generic “engine” that can be programmed to provide various levels of functionality, from “dumb” L2 learning to more sophisticated OpenFlow-based and tunnel-oriented networking.

This leads into a discussion of why L2/L3 tunneling as employed by NVP is important in satisfying the definition of network virtualization provided earlier. In order to truly isolate networks, NVP leverages tunneling protocols to encapsulate the traffic and provide the isolation and decoupling properties.

It is important to note, as Dan does, that a tunneling protocol alone is not network virtualization.
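
To make the encapsulation idea concrete, here is a minimal sketch of how wrapping a tenant’s frame in an outer header keeps logical networks apart. This is a toy model, not NVP’s or OVS’s actual wire format: the `logical_net_id` field, the class names, and the addresses are all assumptions made for illustration. It shows only the encapsulation piece, which (as Dan notes) is not network virtualization by itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InnerFrame:
    """A tenant's L2 frame, exactly as the VM sent it."""
    src_mac: str
    dst_mac: str
    payload: bytes


@dataclass(frozen=True)
class TunnelPacket:
    """The frame wrapped in an outer header the physical network can route."""
    outer_src_ip: str    # hypervisor sending the packet
    outer_dst_ip: str    # hypervisor receiving the packet
    logical_net_id: int  # hypothetical key naming the logical network
    inner: InnerFrame


def encapsulate(frame, logical_net_id, src_host_ip, dst_host_ip):
    """Wrap a tenant frame; the physical network only sees the outer IPs."""
    return TunnelPacket(src_host_ip, dst_host_ip, logical_net_id, frame)


def deliver(packet, local_ports):
    """Pick the local port by (logical network, MAC), not by MAC alone.

    local_ports maps (logical_net_id, mac) -> port name. Returns None
    when no port on this host belongs to that logical network.
    """
    return local_ports.get((packet.logical_net_id, packet.inner.dst_mac))


# Two tenants reuse the same MAC address on the same hypervisor.
ports = {
    (100, "aa:bb:cc:00:00:01"): "tenant-a-vm1",
    (200, "aa:bb:cc:00:00:01"): "tenant-b-vm1",
}
frame = InnerFrame("aa:bb:cc:00:00:02", "aa:bb:cc:00:00:01", b"hello")
pkt_a = encapsulate(frame, 100, "10.0.0.1", "10.0.0.2")
pkt_b = encapsulate(frame, 200, "10.0.0.1", "10.0.0.2")
print(deliver(pkt_a, ports))
print(deliver(pkt_b, ports))
```

The physical network routes on the outer IP addresses only, which is why plain L3 connectivity is enough; the tenant’s own addressing travels untouched inside the packet.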

Next Dan moves on to describing the NVP controllers, which are x86-based software; they manage OVS southbound and expose a northbound API (to Quantum, for example). NVP controllers are NEVER in the data plane. The NVP controllers leverage a clustered/distributed architecture that provides both high availability (HA) and scale-out functionality.

As mentioned earlier, the controllers also provide the NVP API northbound to applications and/or cloud management systems (CMSes, like OpenStack and CloudStack). In an OpenStack environment, the NVP API is consumed by Quantum to create logical networks and logical network services.

Dan next provides an overview of the communication flow between the various components in a Quantum+NVP deployment.

The session now shifts focus to gateways, which provide the mechanism to get into or out of logical networks created by NVP. An L2 gateway allows you to map physical systems or VLANs into a logical switch managed/created by NVP. (Kudos to Dan, who actually goes with a very complex example, mapping in a remote VLAN across the WAN as an adjacent L2 segment, in order to explain L2 gateways.) He also explains the use of L3 gateways, which provide routed access into and out of the NVP logical networks. One of the interesting things about L3 gateways is that they are highly available (using failure zones) and use a scale-out architecture (meaning that multiple L3 gateway appliances are supported).

Service nodes are used to handle broadcast and multicast packet replication. Using service nodes allows you to offload these tasks, if you wish, from the hypervisor compute nodes. Like the L3 gateways, the service nodes are built to provide high availability and scale-out functionality.

Dan wraps up the session with a review of the management and operations tools that NVP provides. NVP provides tunnel status, has a “port connectivity” tool that shows VM-to-VM connectivity, and can actively inject traffic into flows to test network connectivity.

NVP isn’t just about scale. It’s about:

  • Data plane performance
  • Fast, reliable high availability
  • Rich logical capabilities (ACLs, QoS, statistics)
  • Ability to bring physical workloads into logical networks, onboard remote customers
  • Rich management and operations tools

At this point, Dan opens the session for questions and answers.

Tags:

This session is led by Rob Hirschfeld of Dell and—as the session title implies—focuses on orchestrating OpenStack upgrades. Rob, in case you were not aware, has been extensively involved in OpenStack and is a founder (co-founder?) of the Crowbar project.

Rob recommends reviewing Greg’s talk on this same topic from the last Summit because, unfortunately, things haven’t changed that much from then until now. He also recommends participating in the Reference Architecture (Tuesday @ 11:00 AM) and Interop (Tuesday @ 5:20 PM) sessions.

There is also a project, led by Piston, called Grenade, that is upgrade-focused. Grenade, in Rob’s view, doesn’t adequately address multi-node upgrades, which is the focus of this session. The goal, of course, is to enable upgrades while still enabling operators to keep their clouds up and running.

There are 3 basic ways to address the problem:

  1. On the fly: This is the “continuous deployment” model
  2. Split/Migrate/Replace: Upgrade nodes in staged groups/sets
  3. Rolling Upgrade: Nodes upgraded individually by a system-wide orchestration system that supports multiple versions
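
As a minimal sketch of the third option, the loop below upgrades nodes one at a time and stops at the first node that fails a health check. It is an illustration only, not something Rob showed and not how Crowbar or Grenade work: the node dictionaries, `fake_upgrade`, and `fake_health_check` are hypothetical stand-ins for real orchestration calls.

```python
def rolling_upgrade(nodes, target_version, upgrade_node, is_healthy):
    """Upgrade nodes one at a time; halt at the first unhealthy node.

    Returns (names_upgraded, failed_name_or_None). Nodes not yet reached
    keep running the old version, so the cloud is mixed-version while
    the loop runs.
    """
    upgraded = []
    for node in nodes:
        if node["version"] == target_version:
            continue  # already done, e.g. on a re-run after a failure
        upgrade_node(node, target_version)
        if not is_healthy(node):
            return upgraded, node["name"]
        upgraded.append(node["name"])
    return upgraded, None


# Stand-ins for real orchestration calls (hypothetical).
def fake_upgrade(node, version):
    node["version"] = version


def fake_health_check(node):
    return node["name"] != "compute-2"  # pretend compute-2 fails to come back


cluster = [
    {"name": "controller-1", "version": "grizzly"},
    {"name": "compute-1", "version": "grizzly"},
    {"name": "compute-2", "version": "grizzly"},
    {"name": "compute-3", "version": "grizzly"},
]
done, failed = rolling_upgrade(cluster, "havana", fake_upgrade, fake_health_check)
print(done, failed)
```

Because the loop halts partway, some nodes end up on the new release while others remain on the old one. That mixed state is exactly why this model needs an orchestration system, and services, that support multiple versions at once.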

DevOps is crucial here in order to support these models, especially the continuous deployment and rolling upgrade models. As OpenStack grows and matures, it grows in complexity and interdependencies (Rob gives an example of the dependencies if one wanted to upgrade Nova), and this increases the need for automation and orchestration.

According to Rob, it’s crucial for operators (or teams like Rob’s at Dell that focus on deployments) to be able to communicate in “near real-time” with developers so that issues that affect deployments and upgrades can be addressed as part of the development cycle, not at the end of the development cycle. Addressing issues quickly means it’s easier to catch the individual commits and easier to fix potential issues. Working only at the major release level (like Grizzly to Havana) means there are more combinations to test and more changes at one time that need to be tested.

Rob uses a graphic (titled the “Migration Cube”) which charts where the continuous deployment model fits along the axes of small steps vs. large steps, level of risk, and server vs. client upgrades. His point is that continuous deployment offers the least risk and the least disruption. By so doing, it also provides the most profit.

He applies the “pets vs. cattle” analogy to your servers, which is a different take (normally it’s applied to VMs/instances). In that regard, it’s a lot like the idea of “snowflake servers” vs. “phoenix servers” (I’ve mentioned this a few times in various presentations)—you don’t want your servers to be carefully nurtured, lovingly crafted masterpieces. You want them to be easily repeatable, commodity resources. You want to treat them like cattle.

So how do we get the community to the continuous deployment model?

  1. Servers and agents must be version tolerant.
  2. Client protocols must be testable and documented.
  3. Ensure non-destructive migrations.
  4. Fast-fail on client, but version tolerant on server.
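
The fourth point can be sketched roughly as follows: the server keeps accepting older message formats, while the client refuses to proceed the moment it sees a server that doesn’t offer its version. This is a toy model under stated assumptions; the version numbers, the field names (`name`, `display_name`), and the function names are hypothetical and don’t come from any OpenStack API.

```python
class VersionMismatch(Exception):
    """Raised by the client as soon as it sees an incompatible server."""


# Hypothetical: the server still speaks v1 after moving to v2.
SERVER_VERSIONS = (1, 2)


def server_handle(message):
    """Version-tolerant server: accept every format it still supports."""
    version = message.get("version", 1)  # treat unversioned messages as v1
    if version == 1:
        name = message["name"]
    elif version == 2:
        name = message["display_name"]  # field renamed in the newer format
    else:
        return {"ok": False, "supported": SERVER_VERSIONS}
    return {"ok": True, "name": name}


def client_connect(client_version, advertised_versions):
    """Fast-fail client: raise immediately instead of limping along."""
    if client_version not in advertised_versions:
        raise VersionMismatch(
            f"client speaks v{client_version}, "
            f"server offers {list(advertised_versions)}"
        )
    return client_version


# An old agent and a new agent can both talk to the same server.
print(server_handle({"version": 1, "name": "compute-1"}))
print(server_handle({"version": 2, "display_name": "compute-1"}))

# A client that is too new stops right away with a clear error.
try:
    client_connect(3, SERVER_VERSIONS)
except VersionMismatch as err:
    print("refusing to continue:", err)
```

Tolerance on the server side is what lets agents be upgraded gradually; failing fast on the client side keeps an incompatible pairing from producing subtle errors later.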

Upgrades need to stop being the “caboose”; they have to be treated as more important and handled up front.

At this point, Rob wraps up his session and opens it up for questions and answers.

Tags:

You might have seen that the second edition of VMware vSphere Design was recently released. How would you like to win a free copy?

Starting today (no, this isn’t an April Fool’s joke) and running through April 15, you can enter to win a signed print copy of VMware vSphere Design, 2nd Edition! I’m giving away a free, signed copy of the book to five lucky individuals who enter and win in this book giveaway contest.

Here’s how to enter to win one of the 5 free, signed copies of the book: simply post a comment on this site telling me why you would find this book useful. Yes, that’s it!

There are a few contest rules you’ll want to note:

  • You’ll need to use a valid e-mail address when you post your comment, because if you’re a winner I’ll be contacting you via e-mail to get your shipping information.
  • If I notify you via e-mail that you’ve been selected to win a copy but you fail to respond within 48 hours with your shipping address, I’ll select a new winner in your place.
  • Because I’m paying for the shipping out of my own pocket, I’ll have to limit this contest to continental US residents only. (Sorry folks—my personal budget can’t sustain sending print books worldwide.)

Aside from the limitation to continental US residents only, this contest is open to anyone, so don’t hesitate—enter today!

Tags:
