Cisco UCS Virtualization-Optimized CNAs

There’s some great information being shared in the comments to my “More on Cisco UCS” post. So good, in fact, that I thought it entirely appropriate to bring that information into the limelight with a full-blown post.

If you look back at the diagram that’s included in that UCS post, toward the bottom you’ll see a very small blurb about some Cisco UCS Network Adapters that are optimized for efficiency and performance, compatibility, and virtualization. In a nutshell, the idea here is that there are three different CNA families targeted at different markets: high-performance Ethernet, compatibility with existing driver stacks, and virtualization. Users will choose the CNA that best suits their needs. For the purposes of this post, I’d like to discuss the virtualization-optimized CNA.

The idea here is that the virtualization-optimized CNA (what is being referred to as “Palo”) will leverage a number of different technologies in virtualized environments:

  • It will utilize SR-IOV (Single Root I/O Virtualization), a PCI SIG standard for allowing a physical network adapter to present multiple virtual adapters to upper-level software, in this case the hypervisor. This eliminates the need for the hypervisor to manage the physical network adapter and allows VMs to attach directly to one of the SR-IOV virtual adapters (or, as Brad Hedlund put it in this comment to my original article, an “SR-IOV slice of the adapter”).
  • It will utilize Intel I/O Acceleration Technology (Intel I/OAT) to minimize bottlenecks in the hardware and allow the server to better cope with massive dataflows like those generated by 10GbE adapters.
  • It will use Intel Virtual Machine Device Queues (VMDq) to improve traffic management within the server and decrease the processing burden on the VMM, i.e., the hypervisor.

Together, these technologies can be referred to as Intel VT-c. The virtualization-optimized drivers will also take advantage of Intel VT-d to provide hardware-assisted DMA remapping and better protection and performance of direct-assigned devices.
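
To make the SR-IOV piece a bit more concrete, here is a minimal sketch of how a host sees the virtual functions an SR-IOV adapter exposes. This is plain Python reading the standard Linux sysfs layout; nothing in it is UCS- or “Palo”-specific, and the interface name is just a placeholder.

    #!/usr/bin/env python
    # Minimal sketch: list the SR-IOV virtual functions (VFs) that a
    # physical NIC exposes, using the standard Linux sysfs layout.
    # The interface name "eth0" is a placeholder assumption.
    import os

    def list_vfs(iface="eth0"):
        dev = "/sys/class/net/%s/device" % iface
        try:
            total = open(os.path.join(dev, "sriov_totalvfs")).read().strip()
            active = open(os.path.join(dev, "sriov_numvfs")).read().strip()
        except IOError:
            print("%s does not appear to support SR-IOV" % iface)
            return
        print("%s: %s of %s virtual functions enabled" % (iface, active, total))
        # Each enabled VF appears as a virtfnN symlink pointing at its own
        # PCI address, i.e., the "slice of the adapter" a VM can attach to.
        for entry in sorted(os.listdir(dev)):
            if entry.startswith("virtfn"):
                target = os.readlink(os.path.join(dev, entry))
                print("  %s -> %s" % (entry, os.path.basename(target)))

    if __name__ == "__main__":
        list_vfs()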

“OK,” you say. “But where is all this leading?” Good question! Let’s bring it all together.

Today, in the VMware space, virtual machines are connected to a vSwitch because connecting them directly to a physical adapter just isn’t practical. Yes, there is VMDirectPath, but for VMDirectPath to really work it needs more robust hardware support. Otherwise, you lose useful features like VMotion. (Refer back to my VMworld 2008 session notes from TA2644.) So, we have to manage physical switches and virtual switches—that’s two layers of management and two layers of switching. Along comes the Cisco Nexus 1000V. The 1000V helps to centralize management but we still have two layers of switching.

That’s where the “Palo” adapter comes in. Using VMDirectPath “Gen 2” (again, refer to my TA2644 notes) and the various hardware technologies I listed and described above, we now gain the ability to attach VMs directly to the network adapter and eliminate the virtual switching layer entirely. Now we’ve both centralized the management and eliminated an entire layer of switching. And no matter how optimized the code may be, the fact that the hypervisor doesn’t have to handle packets means it has more cycles to do other things. In other words, there’s less hypervisor overhead. I think we can all agree that’s a good thing.
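
To make that a bit more concrete, here is a rough sketch of the same direct-assignment idea on a stock Linux/KVM host. To be clear, this is not the ESX/VMDirectPath mechanism itself, just the generic sysfs plumbing behind handing a VM its own slice of an SR-IOV adapter; the PCI address and VF count are placeholder assumptions.

    #!/usr/bin/env python
    # Rough sketch (generic Linux/KVM, not ESX/VMDirectPath): carve virtual
    # functions (VFs) out of a physical NIC, then rebind one VF to the
    # vfio-pci driver so a VM can own it directly instead of going through
    # a software switch. The PCI address and VF count are assumptions.
    import os

    PF = "0000:81:00.0"   # PCI address of the physical function (assumed)
    NUM_VFS = "4"         # how many VFs to carve out of the adapter

    def write(path, value):
        with open(path, "w") as f:
            f.write(value)

    # 1. Ask the adapter to expose virtual functions.
    write("/sys/bus/pci/devices/%s/sriov_numvfs" % PF, NUM_VFS)

    # 2. Find the first VF and detach it from its normal network driver.
    vf = os.path.basename(os.readlink("/sys/bus/pci/devices/%s/virtfn0" % PF))
    write("/sys/bus/pci/devices/%s/driver/unbind" % vf, vf)

    # 3. Hand the VF to vfio-pci; the hypervisor can then map it straight
    #    into a guest, so that guest's traffic never touches a vSwitch.
    write("/sys/bus/pci/devices/%s/driver_override" % vf, "vfio-pci")
    write("/sys/bus/pci/drivers_probe", vf)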

As an aside: I’ve clashed with a couple of different people thus far because of differences of perspective with regard to UCS. Specifically, it’s been with people who insist that UCS isn’t a blade server. Granted, UCS as an overall system is not a blade server, but the B-Series blades are a significant part of that overall system—so to say that Cisco isn’t building blade servers really isn’t accurate. They are building blade servers, but these are blade servers with an as-yet-unseen level of integration with other technologies. If there is one area in which UCS stands apart from any other blade server-related solution on the market, it would be this level of integration, especially the integration with virtualization technology.

Chad Sakac of EMC touches on this very lightly in his latest post. Being who I am, though, I much prefer digging a bit deeper to better understand exactly what’s going on.

UCS experts, feel free to correct me or clarify my statements in the comments. Thanks!


14 comments

  1. Rob Leist

    This VMDirectPath stuff sounds great from a performance point of view. But it sounds like the VMs will now need hardware-specific device drivers again, which runs counter to one of the advantages of virtualization: separating the guest OS from the hardware. How would you VMotion a VM to different hardware? Will the management tool (vCenter) be responsible for putting the other device driver in the guest?

  2. Brad Hedlund

    Scott,

    You are spot on when you say Cisco UCS stands apart with its optimizations for virtualization. However, there is more to this story.

    In addition to the optimized and simplified VM networking aspects you have clearly articulated, note that the full height B-Series blade has an incredible 48 DIMM slots with (2) sockets.
    That equates to 384 GB of RAM in a (2) socket system (48x8GB). With 4 full height B-Series blades in one 6U enclosure, that’s an incredible 1536GB of RAM per 6U of rack space.

    I’ll save the details on how that’s done for a post on my own site… (coming soon).

    Cheers,
    Brad

  3. David Magda

    Why is Cisco’s UCS having SR-IOV a big deal? If it’s an open standard [1] [2], then everyone will probably have it eventually (good on Cisco for being first, I guess). The Xen guys discussed SR-IOV in a general way [3] in June 2008 [4]. You also mention a few other Intel technologies, which will be utilized by others as well.

    I guess the best way you can describe UCS is as some sort of “virtualization appliance”.

    Virtualization is great for low utilization tasks that only take up the proverbial 10% of a modern system’s CPU, and so you can consolidate many systems onto one, but many workloads don’t work that way in the Internet world [5].

    The fact that most virtualization solutions seem to be running after VMware (i.e., emulating the entire machine) smacks of a lack of creativity for those of us who like using more “lightweight” solutions [6]. Bringing together ten OS images onto one physical box still means you have to manage ten images, /plus/ the virtualization stack. This “application” from Cisco makes things slicker in some ways, but it’s still the same idea that VMware came up with in 1998. Can’t we do better? (This is the future, damn it! Where are the flying cars?! Why do we still have weather?! :)

    [1] http://www.pcisig.com/specifications/iov/single_root/
    [2] http://www.pcisig.com/specifications/iov/
    [3] http://www.xen.org/files/xensummitboston08/Xen-SR-IOV.pdf
    [4] http://www.xen.org/xensummit/xensummit_summer_2008.html
    [5] http://www.stdlib.net/%7Ecolmmacc/2006/03/23/niagara-vs-ftpheanetie-showdown/
    [6] http://en.wikipedia.org/wiki/Solaris_Containers

  4. slowe

    David,

    It’s not always the best technology that wins–you should know that!

    Perhaps OS virtualization (aka containers) would work better in some instances, but the vendors that are pushing OS virtualization apparently aren’t doing a good enough job extolling its virtues.

    See Nik Simpson’s comment on my earlier UCS article–the winner won’t be the vendor that incorporates all these standards, which, as you rightly point out, anyone can use. Rather, the winner will be the vendor that puts it all together and makes it transparent to the end-user community.

    Brad,

    I look forward to more information from your site!

    Rob,

    Check out the linked notes on TA2644, the session from VMworld 2008; I believe it helps to address some of your questions. There are hurdles to overcome, no doubt.

  5. David Magda

    My larger point is that just about every company is basically doing the same whole-system or paravirtualization. They’re either emulating what VMware did, or using hardware assist (Intel VT, AMD-V) to set up separate partitions. This is useful and a wonderful tool to have.

    What I’m whining about is that it is generally the only option available for most things, and it’s a hammer that makes everything look like a nail. Yes Sun (and to some extent IBM) do have other offerings, but given the resources available to other companies (e.g., Microsoft), you’d think that there would be a few more options available.

    Perhaps it’s just that we’re in the early stages of this virt. bandwagon, and the players out there are simply taking baby steps in this area. Certainly IBM has experience in these things; it may just be that everyone else is content with the status quo or doesn’t want to rock the gravy boat that is the Wintel domination.

  6. Nate

    A couple of quick things that pop to mind that I wonder if anyone knows the answers to yet:

    Is Cisco actually going to implement the standards in this instance, or some pre-standard version of their own that will come around to bite us all later (examples: CDP vs. LLDP, or Cisco PoE vs. standard PoE)?

    Wasn’t one of the benefits of virtualizing the NIC that it could be shared like the proc? Will this new slice-and-dice method of splitting the connection be able to dynamically adapt to needs? Example: if Server A is chugging and Server B is idle, will Server A get the bandwidth it desires, or will it sit unused and dedicated to Server B? Will there be prioritization so that Server A always gets more than Server B, but Server B gets at least X amount? I’ve been investigating HP’s Flex technology and have the same questions there.

    What will be the HA/redundancy options? Having a piece of a link presented as a single link to the VM sounds good from a management standpoint, but what if you want to ensure that virtual link has two completely separate physical paths? When you use a virtual switch you can do this with an EtherChannel or a LAG (or, in VMware land, their NIC teaming). Will there be a similarly available option in the slice-and-dice realm? Again, I’ve been trying to figure out the same thing for HP’s take on this as well.

  7. Nate

    David, I’m not sure it’s lack of creativity as much as it is going after a target market. Most folks looking at virtualization do so because of consolidation. They have a lot of very different systems that they want to get onto fewer pieces of hardware. To me virtualization has never been about simplicity; it’s been about space, plain and simple. There may be marketing hype out there claiming it’s all going to make things simpler (and to be honest, for certain things like deployment maybe it does), but you never add another layer without adding complexity. Even something like containers adds complexity. At the moment P2V is a big deal, so that generally means running a full virtual OS. Now, in the future the need for P2V should diminish as machines are built from the start in virtual land, but a lot of folks are still going to want differences between their systems, even virtual ones. To me the shared-kernel model seems best suited to the hosting company that has many identical systems it just wants essentially walled off from stomping on each other. In the enterprise market it is less likely to have as many identical setups running at one time. You need different OSes at different release levels with different configurations, etc. So I agree there is a place for the type of virtualization you discuss in the market, but I don’t think it cancels the need for the likes of VMware, Hyper-V, and Xen.

  8. Jagane Sundar, Founder, Thinsy Corporation

    I agree that virtualization-aware network hardware is a great enhancement to the platform. However, a necessary part of this will be standards for the virtual network interface visible to the guest OS.

    Such a standard would ensure that a Virtual Machine running on Cisco UCS could potentially be moved (live or cold migrated) to another Virtualization platform.

    Today, you can’t run a VMware VM on a Xen hypervisor, and vice versa. This is because there is no standard for the devices emulated by each of these virtualization platforms. New virtualization-aware hardware enhancements could potentially exacerbate this situation. For example, it may be impossible to run a VMware+Cisco UCS VM on another setup, say VMware+HP blades. This induces vendor lock-in, which is detrimental to customers and, eventually, to the virtualization industry.

  9. Randy

    While cross-virtualization migration might be nice from a customer standpoint, actually implementing it would be a ton of work and would eliminate the ability for any of the solutions to differentiate themselves with new features and better performance. Ultimately you would be limited by the lowest common denominator. Let the virtualization vendors build their own stacks and pick the best one when it is all over; you can do non-live migration using clunky import tools, so there really isn’t much lock-in anyway.

    Also, David Magda is way off with “Bringing together ten OS images onto one physical box still means you have to manage ten images, /plus/ the virtualization stack”. Have you ever used linked clones or templates? You can create all 10 VMs from the same image and only store changes above that in a seamless and transparent way.

    Also, on OS container virtualization: there is a reason this hasn’t caught on in a big way. It just isn’t good enough. Sure, you can run a lot of Citrix sessions or have lots of web server sessions, but when you start doing real work, you need to change things that you can’t change when you are ultimately only using one kernel. How do you deal with software that requires a conflicting system-level API? And the obvious one: OS containers give NO isolation from OS crashes or reboots. How do you do a kernel update for the one consumer who needs it? You bring down the box and kill all sessions. With VMs, you have much better granularity. One of your systems is hacked (with root credentials, even), another one kernel panics, and the other 10 keep on ticking, since the hypervisor is unaffected. Also, I don’t think any of the OS container virtualization solutions allow VMotion (and they can’t, since all possible state is not encapsulated anywhere, and you don’t even know the internal state of the hardware devices since they are physical, not virtual).

    Using VMs over OS containers has had some drawbacks (more memory/disk overhead), but with technologies like linked clones, deduplication, and memory page sharing, these are essentially already dealt with in ESX. There is perhaps some more overhead, but this is often outweighed by better isolation: your VM’s CPU/IO workload cannot starve mine, unlike on a regular OS with containers. Also, the cost of virtualization overhead is falling quite fast since CPU vendors have been focusing on this, and VT exit on Nehalem servers is something like 5x faster than ever before.

  10. Jagane Sundar, Founder, Thinsy Corporation

    My point is this: if you push vendor-specific VM enhancements far enough, you get technology that is very similar to Citrix/container virtualization.

    This leads to exactly the problems with Citrix-like solutions that you have highlighted above.

    The end result will severely undermine the VM value proposition.
    Application virtualization vendors such as Google Apps and force.com will look a lot more attractive. (Yes, app virt vendors do indeed compete with VM vendors.)

  11. Clarke

    Just got to talking with my Cisco engineers… I understand the basic concept here is to populate a chassis with nearly solid-state B-Series blades and, in effect, an Ethernet-based backplane capable of massive I/O bandwidth. That way, you could have “conventional” servers as well as HPC clusters existing within the same platform. How you push differing frames over that I/O bus is effectively up to you. With the addition of Nexus 5000 switches, you could run both regular Ethernet and FCoE out of the same physical adapter on the Palos and have the Nexus switch split the FC frames out of the FCoE and pass them off to a standard FC SAN or DAS. Theoretically, you could even do the same with iSCSI and get SCSI split off to a DAS. Or, potentially, a virtualized or encapsulated InfiniBand for HPC. The technology here allows Cisco to come up with a single interconnect, so that as consumers we really wouldn’t have to worry about buying blades with 10GbE daughtercards, FC daughtercards, etc. That future-proofs your investment somewhat, and it should really help with things like VMotion, since you’re talking about a homogeneous blob of hardware rather than ensuring individual blades in a chassis have like hardware configurations.

  12. Joe

    What I still don’t get about SR-IOV and the new CNAs is: will it now securely allow me to collapse network zones onto one physical NIC/cable, or will I still have to use multiple CNAs per network zone (data, management, storage, backup, outer DMZ, inner DMZ, etc.) and just be able to run either FC or lossless 10Gb Ethernet down the same card?

    If all these new CNAs and DCE allow multiple VLANs down the same cable, how is this more secure than what I already do with multiple NICs per VLAN? What are the new security protocols used? What’s different?

    Otherwise, all I see is an expensive adapter and no reason to ever update my Ethernet NICs and HBAs.

  13. slowe

    Joe,

    That’s a great question! With regard to putting all the traffic on the same wire, keep in mind that FCoE is Layer 2 only, with (as I understand it) a modified Ethernet frame. This should further differentiate FCoE traffic from “standard” IP traffic and provide a greater separation of traffic than is possible using VLANs. I could be wrong, but that’s my understanding.
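
    To make that distinction concrete, here is a tiny sketch showing how FCoE and IP traffic are told apart at Layer 2 by the EtherType field in the Ethernet header, independent of any VLAN tag. The EtherType values come from the relevant standards; the frames below are hand-built purely for illustration.

        #!/usr/bin/env python
        # Tiny sketch: classify an Ethernet frame by its EtherType field.
        # FCoE (0x8906) and IPv4 (0x0800) are distinct at Layer 2 even
        # before any VLAN separation. Frames are hand-built examples.
        import struct

        ETHERTYPES = {
            0x0800: "IPv4",
            0x8906: "FCoE",
            0x8914: "FCoE Initialization Protocol (FIP)",
        }

        def classify(frame):
            # Bytes 12-13 of the frame hold the EtherType (or a VLAN TPID).
            (etype,) = struct.unpack("!H", frame[12:14])
            if etype == 0x8100:
                # 802.1Q-tagged: the real EtherType sits 4 bytes further in.
                (etype,) = struct.unpack("!H", frame[16:18])
            return ETHERTYPES.get(etype, "unknown (0x%04x)" % etype)

        # Two fake headers: same MAC addresses, different EtherTypes.
        dst = b"\x00\x1b\x21\x00\x00\x01"
        src = b"\x00\x1b\x21\x00\x00\x02"
        print(classify(dst + src + b"\x08\x00"))   # IPv4
        print(classify(dst + src + b"\x89\x06"))   # FCoE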

    As far as SR-IOV goes, I don’t have any strong details on that topic just yet. I’ll have to do some digging and see what I can find. Given that the creation and management of virtual functions (VFs) is all done in hardware, my initial assessment is that there won’t be a lot of room for security issues there.

    At this point it’s quite unclear what the “best practice” is going to be for handling multiple security zones (outer DMZ, inner DMZ, etc.) on a converged fabric.

  14. Joe

    Thanks Scott

    Don’t get me wrong, I can see the benefits of a) having this done on a card/chip with the processor cycles taken out of the hypervisor, b) standardising transmission protocols/cables/adapters, and c) lossless Ethernet. That’s all good news, but it would be great to know if it gets rid of the very real issue most architects face of having to load virtual servers with NICs to properly separate network zones (and no, VLANs aren’t secure, before anyone says they are!).

    Is there anything on the Nexus side that does this in software? I was hoping not, since you’d want this to all be a hardware function and something that SR-IOV could/would provide.

    Look forward to whatever you find out re: “best practice” Scott
