Monday, August 29, 2011


This is the session blog for the Monday general session. I’m fortunate enough to have arrived in time to get a seat at the blogger/press/analyst tables. While the network connectivity is good, the power is—unfortunately—not so good.

The general session started with an impressive lightshow across the front of the conference that depicts the change of computing with the advent of virtualization and cloud computing. It was visually appealing and interesting.

At the conclusion of the visual show, Rick Jackson, Chief Marketing Officer for VMware, takes the stage to kick off the general session. Rick indicates that there are about 19,000 people here at VMworld 2011 this week; attendance is down, understandably, due to Hurricane Irene’s effect on the East Coast of the United States and the resulting impact on air travel.

Rick indicates that the Hands-On Labs for VMworld 2011 are completely hosted on public clouds: Switch SuperNAP, Colt, and Terremark all provide public cloud services for this year’s labs. The labs are built on vSphere 5.0 and vCloud Director 1.5. Both Paul Maritz and Carl Eschenbach will be speaking later in this session; and tomorrow morning VMware CTO Steve Herrod will be doing a technology keynote to demonstrate what VMware’s working on.

Rick also confirms that VMworld 2012 will be back in San Francisco (yay!), being held from August 27 to 30, 2012. At this point, Rick introduces Paul Maritz, CEO of VMware, and gives him the stage.

Paul gives some statistics:

  • One VM being deployed every six seconds (that’s faster than babies being born in the US)
  • 20 million VMs running on VMware vSphere
  • More VMs in flight using vMotion than there are aircraft in flight
  • Greater than 800,000 vSphere administrators (that’s the population of San Francisco)
  • Greater than 68,000 VMware Certified Professionals (across 146 countries)
  • More than 1,650 ISV partners and more than 3,000 apps certified on VMware

So, given all this success, where does VMware go from here? This sets Paul up to give VMware’s vision and explain the various forces that are at work in the transformation of IT in this “unfolding cloud era.” Paul takes us on a journey from his early days in IT and how the industry transformed during the client-server era and now into the cloud era. For the most part, this is the same material that we’ve seen in previous conferences, but with one notable addition: a strange focus on data fabrics (the relational database, for example). Maritz says that the relational database as a data fabric simply cannot handle the scale of traffic that the cloud era demands.

Maritz spends some time talking about the tasks that need to be completed to help us move into the cloud era, and ties that to vSphere versions that have been delivered by VMware in recent years (4.0 in 2009, 4.1 in 2010, 5.0 in 2011). The delivery of vSphere 5 is a key part of the first task to be completed: modernize infrastructure and operations.

VMware is also aggressively targeting public cloud-based services running on vCloud Director, and Maritz announces a couple of new vCloud partners. Not leaving out the sizable SMB market, Paul Maritz also describes VMware’s commitment to that market with a new release of vSphere Essentials, and he touches on VMware Go, a SaaS-based service that helps SMB customers get their infrastructure set up and running.

The second task we must address to move into the cloud era is to handle the migration or transition of existing apps to new and renewed apps. This is the core of VMware’s vFabric push: to build new frameworks, provide new platforms, and supply new data fabrics that are capable of handling the scale and volume that the new cloud era needs. SQLFire takes the extraordinary scalability of GemFire and enables people to use it with the more traditional SQL query language. VMware is also announcing vFabric Data Director, a new way of automatically provisioning and managing databases on vSphere. The first “example” or “implementation” of vFabric Data Director is vFabric Postgres, a vSphere- and vFabric-optimized version of Postgres to be used with vFabric Data Director and vSphere. The third aspect of vFabric and VMware’s push to modernize applications is CloudFoundry, a new Platform-as-a-Service (PaaS) offering. CloudFoundry supports node.js, Ruby, and Spring. Scala support has been added by the open source community. To help with adoption, VMware has created a local version of CloudFoundry that can run on a local laptop.

The third task to move into the cloud era is addressing end-user access. To that end, VMware is announcing VMware View 5.0, with improvements in bandwidth usage, greater availability of View clients (clients for just about any device), and greater integration with VoIP/unified communications providers and services. View is, of course, only part of the strategy; there’s also Horizon, VMware’s offering to manage users and applications across traditional applications and “cloud era” applications. Horizon is no longer a single product, but a collection of products that allow IT to associate information and applications to people instead of devices. Maritz also makes references to MVP, VMware’s Mobile Virtualization Platform. Virtual phones? We shall see.

At this point, Carl Eschenbach is brought onto the stage to transition into a discussion about moving to the cloud era from the perspective of three different customers who have made this journey themselves.

My battery is now running down, so I’m wrapping up this session blog.


This is a session blog for VSP3205, titled “Tech Preview: vStorage APIs for VM and Application Granular Data Management.” The presenters are Satyam Vaghani and Vijay Ramachandran, both with VMware.

This session will be repeated tomorrow at 1PM and Wednesday (didn’t catch the time). As usual, the presentation starts out with the VMware disclaimer, followed by a quick review of the “state of the union” with vStorage APIs for Array Integration (VAAI) and vStorage APIs for Storage Awareness (VASA). Both of these features enable greater communication and information exchange between vSphere and the underlying storage arrays. They are attempts to “bridge the gap” between vSphere’s logical view and the array’s physical view. While these features do help, there is still more to be done. What is really needed is a general framework to leverage future storage array features and enhancements.

The presenters share some quotes and comments from customers, where the feedback primarily revolves around greater granularity (e.g., being able to fail over a single VM, or offer differentiated services on a per-VM/per-application basis). The existing VAAI and VASA features aren’t granular enough to meet these requests/demands from customers. VMware needs more granular data management.

The reason VMware can’t currently provide more granular data management is that vSphere manages VMs and VMDKs, while the arrays operate on LUNs, RAID groups, or volumes. This again reinforces the need for a larger framework that allows more integration with arrays and at the same time offers granular data management.

Here are the ideal “wish list” requirements:

  • Ability for VMware to offload per VMDK-level operations to storage systems
  • Build a framework for future features and enhancements
  • No disruption to existing VM creation workflows
  • It needs to be highly scalable

Next we see the VMware vision. They want to move to an application-centric view of the world. Let the applications specify the policies (the requirements), and let the virtualization and storage layers provision against those policies. In addition, the unit of management should be the same between vSphere and the array. That is, the unit of management needs to be the VMDK.

RDMs can help accomplish some of these goals, but the management overhead is tremendous.

Now Satyam takes the stage to show VMware’s technology direction. Satyam reinforces the disconnect between vSphere (which operates at/on the virtual storage layer/fabric) and the arrays (which operate at/on the physical storage layer/fabric). The physical and virtual fabrics never exchange information, which means that information like QoS and hardware-based data services cannot be effectively leveraged.

So the goal is to enable storage systems to natively store VMDKs as distinct entities and provide granular VMDK data services. This interaction would be done via a policy-based interface where vSphere acts as an arbitrator of services between the virtual fabric and the physical fabric. This functionality requires new storage access methods and new vStorage APIs, and it will fundamentally change how storage is provisioned and managed in vSphere environments.

VMware decided that VMFS was not a good option for storage system differentiation and new storage solutions. Likewise, NAS was not an ideal solution, because most data services are not at file granularity. Object-based storage was considered, but VMware felt the protocol (part of T10) was not successful.

Satyam next shows a demonstration of a VM volume, which is a representation of a VMDK on a storage system. The demonstration was using a future build of ESXi and a future build of an EMC storage array with VM volume awareness. This demo shows how, for the first time, the virtualization layer and the storage layer are working on the same management objects. It operates the same way across both SAN and NAS. Lots of questions arise from this demonstration: Does this mean millions of VM volumes? Are we re-inventing storage systems? Is this even feasible?

To accomplish the goal of enabling VM volumes, VMware has the idea of an IO Demultiplexer, or IO Demux. The IO Demux is a special I/O channel from the host to the entire storage system. Behind the IO Demux will reside thousands of VM volumes. To handle the capacity management issue (i.e., preventing the vSphere administrator from creating too many VM volumes and overrunning the entire array), VMware introduces the idea of the capacity pool (CP). The capacity pool is not visible in the data path; it is not a LUN or mount point. CPs can span arrays, or could even span datacenters. The VM admin/user can carve VM volumes out of the CP until they run out of space. (However, this doesn’t address IOPS requirements for VM volumes or the CP. How will IOPS requirements be handled?)
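To make the capacity pool idea concrete, here is a minimal sketch of the bookkeeping as I understand it from the session. Everything in it (the `CapacityPool` class, the `carve` method, the pool and volume names, the sizes) is hypothetical and for illustration only; none of it is a VMware API. It only models the space budget, so it leaves the IOPS question above unanswered.

```python
from dataclasses import dataclass, field


@dataclass
class CapacityPool:
    """Hypothetical model of a capacity pool: a space budget, not a LUN or mount point."""
    name: str
    capacity_gb: int
    volumes: dict = field(default_factory=dict)  # VM volume name -> size in GB

    @property
    def free_gb(self) -> int:
        # Space still available for new VM volumes.
        return self.capacity_gb - sum(self.volumes.values())

    def carve(self, volume_name: str, size_gb: int) -> None:
        """Allocate a VM volume, refusing once the pool would be overrun."""
        if size_gb <= 0:
            raise ValueError("size_gb must be positive")
        if volume_name in self.volumes:
            raise ValueError(f"{volume_name} already exists")
        if size_gb > self.free_gb:
            raise RuntimeError(f"pool {self.name} has only {self.free_gb} GB free")
        self.volumes[volume_name] = size_gb


# Illustrative usage: the third request exceeds the remaining space and is refused.
pool = CapacityPool(name="gold-pool", capacity_gb=100)
pool.carve("web01.vmdk", 40)
pool.carve("db01.vmdk", 50)
try:
    pool.carve("app01.vmdk", 20)
    overrun_blocked = False
except RuntimeError:
    overrun_blocked = True
```

The point of the sketch is that the pool is purely an accounting object: the VM admin keeps carving volumes until the budget is exhausted, and nothing about the pool itself appears in the data path.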

Profiles enable the application to communicate its specific QoS requirements to the storage system. (Is this how we handle IOPS?) Profiles will have fixed and customizable attributes (snapshots are allowed, snapshot retention is a certain value, snapshot frequency is a certain value, etc.). (Let’s hope that these attributes are implemented on a more granular basis than the current VASA implementation.) VM admins/users can customize attributes on a per-VM (i.e., per VM volume) basis.
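As a rough sketch of how fixed versus customizable attributes might behave, consider the following. The attribute names and the split between fixed and customizable are my own assumptions based on the examples given in the session; the `StorageProfile` class and `customize` function are hypothetical and do not correspond to any shipping interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StorageProfile:
    """Hypothetical profile; attribute names are illustrative only."""
    name: str
    snapshots_allowed: bool          # assumed fixed by the storage admin
    snapshot_retention_days: int     # assumed customizable per VM volume
    snapshot_frequency_hours: int    # assumed customizable per VM volume


# Assumption: only these attributes may be overridden per VM volume.
CUSTOMIZABLE = {"snapshot_retention_days", "snapshot_frequency_hours"}


def customize(profile: StorageProfile, **overrides) -> StorageProfile:
    """Return a per-VM-volume copy of the profile, rejecting non-customizable changes."""
    not_allowed = set(overrides) - CUSTOMIZABLE
    if not_allowed:
        raise ValueError(f"attributes are not customizable: {sorted(not_allowed)}")
    return replace(profile, **overrides)


# Illustrative usage: one VM volume keeps snapshots longer than the base profile.
gold = StorageProfile("gold", snapshots_allowed=True,
                      snapshot_retention_days=7, snapshot_frequency_hours=4)
db_volume_profile = customize(gold, snapshot_retention_days=30)
```

The base profile stays untouched, which matches the idea that the storage administrator defines the profile once and the VM admin adjusts only the permitted attributes for an individual VM volume.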

Satyam next moves into a demonstration with more detail on the IO demultiplexer. In this demo, which uses prototype EMC equipment, we see multiple IO demultiplexers using multiple storage protocols.

Following this demo, Satyam shows a prototype vCenter Server UI from VMware showing capacity pools with various storage capabilities (profiles).

Next there is a demo of prototype storage from NetApp showing the awareness of the capacity pool from vCenter Server and using an NFS-based IO demultiplexer.

From there we move into a CLI-based demonstration of the same capabilities using a Dell EqualLogic array.

The next demonstration flips back into a prototype of EMC Unisphere to show off storage profiles, where the storage administrator has defined multiple storage profiles at the storage array.

Putting everything together, VM volumes look a lot like profile-driven storage in vSphere 5 today: the storage profiles are defined at the array level (instead of at the vSphere level) and the destination “datastore” is a capacity pool instead of a LUN or NFS mount point. After creating a VM using this process, we flip over to EMC Unisphere to see the individual VM volumes created on the array and to look at the properties of each VM volume. Satyam also shows demos of the same operation on prototype NetApp and Dell EqualLogic arrays.

The next demonstration shows an IBM XIV prototype that supports VM volumes and shows a VM being cloned at the hardware level, on a per-VM/per-VM volume basis.

The session wraps up with a review of the four major components: IO demultiplexer, capacity pools, storage profiles, and VM volumes. Satyam closes the session with a mention of the participating vendors: Dell, VMware, HP, NetApp, IBM, and EMC.


This is the session almost-live-blog (no wireless signal available in the session room) of VSP1682, VMware vSphere Clustering Q&A. The panelists are Duncan Epping, Frank Denneman, and Chris Colotti, all of VMware.

Lots of notable names were present in the session—Jason Boche, Kendrick Coleman, Mike Foley, Andy Banta, fellow co-authors Forbes Guthrie and Maish Saidel-Keesing were among the ones I noticed, and I’m sure that there were even more that I didn’t notice. Clearly this session has received a lot of attention, due in no small part I’m sure to Duncan and Frank’s successful vSphere 5 Clustering book.

As the session gets started and they open the floor for questions, the audience is a bit reluctant to get started with participation. The first question from the audience is in regard to a large-scale HA environment: what is the actual usable amount of bandwidth that you can/should allocate to an environment? Duncan answers with an observation that I would echo: very few environments are network constrained, even in 1Gbps network environments. Frank expands upon that discussion with a mention that higher network speeds for vMotion will affect DRS and how it will help DRS keep the cluster workload balanced.

The second question regards shared storage and what best practices might exist. Duncan answers first; one key consideration is the vSphere version. For example, the sort of servers you use could affect your architecture because of HA dependencies in pre-vSphere 5 environments. The question is really more of a storage design question than a clustering question, to be honest; while storage design and clustering design are linked, they are relatively independent. Frank adds that EVC (Enhanced vMotion Compatibility) and LUN access are other considerations, especially with very large (32 host) clusters. Identical configuration on all the hosts will help the DRS algorithms more effectively balance the cluster workload.

The third question concerns VM swap files and the interaction with Storage DRS. Frank answers that; he mentions that by default Storage DRS has a built-in VMDK affinity rule that keeps all the VM’s virtual disks together. But what is the effect? Is there an effect on VM swap file performance or on Storage DRS? Frank mentions that he really doesn’t think there will be an impact. The attendee follows up with a clarifying question about placement of the swap files; Frank indicates that placing them on slower (cheaper) disks might be a viable approach. The discussion then evolves into the use of auto-tiering arrays with Storage DRS; the general recommendation is to disable I/O metrics when using Storage DRS with auto-tiering arrays.

The next question is whether Storage DRS depends on VAAI. Duncan answers; Storage DRS has no dependence on VAAI or VASA. That being said, VAAI would be preferable to help offload Storage vMotion operations invoked by Storage DRS. The audience member has an unclear understanding of VAAI and VASA; he believes that I/O metrics and information are passed from the storage array to vSphere by VAAI/VASA. That is, of course, incorrect; Duncan points this out by explaining (again) the use of the I/O injector to gather I/O information from the array. Future versions might leverage VASA/VAAI to gather information.

The next question is why someone should use HA in vSphere 5 if they have been reluctant to implement HA due to problems with previous versions. Duncan responds that earlier versions of vSphere had a strong reliance on DNS; this reliance on DNS was removed/addressed in vSphere 5. I would agree with Duncan’s response; name resolution was vitally important in HA environments prior to vSphere 5. Duncan mentions that the user will want to consider admission control policy and settings, but otherwise recommends that the user should enable HA. Duncan also goes into a more detailed description of storage heartbeating.

Next, a question is asked about Storage DRS and VMs with virtual disks that live on different datastores. (The example given is database workloads with OS disks and data disks on separate tiers of storage.) What impact/benefit will this have with Storage DRS? Frank explains that Storage DRS uses both capacity (space available) and I/O metrics to determine recommendations. As a result, setting the appropriate latency on the datastore and datastore cluster will help drive SLAs, and he suggests that using different tiers of disks might be unnecessary depending on the underlying disk structure. Or, as Duncan points out, you could use different Storage DRS datastore clusters.
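A simplified way to picture the two inputs Frank describes is a per-datastore check against a space threshold and a latency threshold. The sketch below is only that: the `Datastore` class, the `needs_rebalance` function, the datastore names, and the threshold values are all illustrative assumptions, and the real Storage DRS algorithm considers far more than this.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    name: str
    capacity_gb: float
    used_gb: float
    latency_ms: float  # observed I/O latency (illustrative numbers only)


def needs_rebalance(ds, space_threshold=0.80, latency_threshold_ms=15.0):
    """Return the reasons (if any) a datastore exceeds its thresholds."""
    reasons = []
    if ds.used_gb / ds.capacity_gb > space_threshold:
        reasons.append("space")
    if ds.latency_ms > latency_threshold_ms:
        reasons.append("latency")
    return reasons


# Hypothetical datastore cluster with one space-bound and one latency-bound member.
cluster = [
    Datastore("ds-os", 1000, 850, 6.0),
    Datastore("ds-data", 1000, 400, 22.0),
    Datastore("ds-spare", 1000, 300, 4.0),
]
flagged = {ds.name: needs_rebalance(ds) for ds in cluster if needs_rebalance(ds)}
```

In this toy model, tightening or loosening `latency_threshold_ms` per datastore cluster is what "setting the appropriate latency" amounts to: it changes which datastores are considered out of bounds and therefore candidates for a recommendation.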

Chris Colotti (who is acting as moderator) points out that Storage DRS is a new factor that will affect everyone’s designs.

The next question is why someone should use HA when they are using a load balancer. Duncan’s response is that, from an operational perspective, it helps address the failure of hardware nodes. Even with an F5 load balancer as this user has, someone still has to address the failure of the hardware node and restart VMs; HA can do that automatically even though hardware load balancers can help address traffic flows. Chris Colotti asks how many people are NOT using HA; a small number of people raise their hands.

Next, a user asks about the interaction of Microsoft clustering and vSphere HA/DRS. Duncan’s response is that most people disable HA/DRS for Microsoft clusters, and that VMware is working on technologies that could replace Microsoft clustering. In the end, though, the answer for Microsoft clustering is still that you have to disable HA/DRS on the VMs participating in the cluster.

The next question is about the interaction of multiple HA clusters accessing a single datastore and the impact of that design decision. Duncan mentions that he masks datastores on a cluster level, instead of exposing a datastore to multiple clusters. This user also asks about VM failure monitoring; not many people are using it (based on the number of hands raised when Duncan asks how many are using it). It’s not enabled by default, but it can help with guest OS failures. VMware has also opened the SDK for application monitoring. Frank adds that vCenter Server takes a guest OS console screenshot before resetting the VM, and Duncan answers a follow-up question about the interaction between VMware Tools and VM failure monitoring. The VM failure monitoring feature looks for network I/O, disk I/O, and the VMware Tools heartbeat to detect guest OS failure.
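The detection logic described here can be sketched as a check across the three signals. One assumption is mine rather than something stated in the session: that the guest is treated as failed only when all three signals are absent, so that a busy VM with a stopped VMware Tools service is not reset. The `GuestSample` class and `guest_looks_failed` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class GuestSample:
    """Hypothetical record of what was observed for a VM during one check window."""
    tools_heartbeat_seen: bool
    disk_io_seen: bool
    network_io_seen: bool


def guest_looks_failed(sample: GuestSample) -> bool:
    """Assumption: treat the guest as failed only when all three signals are absent."""
    return not (sample.tools_heartbeat_seen
                or sample.disk_io_seen
                or sample.network_io_seen)


# Illustrative usage: a hung guest versus a busy guest whose Tools service stopped.
hung_guest = GuestSample(tools_heartbeat_seen=False, disk_io_seen=False, network_io_seen=False)
tools_stopped = GuestSample(tools_heartbeat_seen=False, disk_io_seen=True, network_io_seen=False)
hung_result = guest_looks_failed(hung_guest)
tools_stopped_result = guest_looks_failed(tools_stopped)
```

Checking disk and network activity alongside the heartbeat is what guards against a false positive when only the heartbeat goes quiet.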

The next user asks about handling storage failures with vSphere HA; is there a way to handle storage array failure with vSphere HA? Duncan points out that HA will not address storage array failure. The customer mentions that EMC VPLEX will address the concern, but there are not any VMware software-based solutions to this need.

With regard to EVC, Frank highly recommends always enabling EVC on a cluster; there is no performance impact and it will help ensure processor compatibility between old and new hosts. Keep in mind, though, that enabling EVC on a cluster with VMs already running could require some downtime in order for the CPU masks to get applied.
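The compatibility idea behind EVC can be pictured as exposing only the CPU features that every host in the cluster shares. The sketch below illustrates that idea with a plain set intersection; it is not how EVC is implemented, since EVC works from predefined baseline modes rather than computing an arbitrary intersection. The host names and feature lists are made up, and `common_baseline` is a hypothetical helper.

```python
def common_baseline(hosts: dict) -> set:
    """Return the CPU features present on every host (set intersection)."""
    feature_sets = list(hosts.values())
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


# Hypothetical cluster mixing an older and a newer CPU generation.
hosts = {
    "old-host": {"sse3", "ssse3", "sse4.1"},
    "new-host": {"sse3", "ssse3", "sse4.1", "sse4.2", "aes"},
}
baseline = common_baseline(hosts)
masked_on_new = hosts["new-host"] - baseline  # features hidden from VMs on the newer host
```

A VM that has only ever seen the baseline features can move between the old and new hosts, which is why a running VM that already sees the extra features may need a power cycle before the mask takes effect.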

The next question is about VMFS-5 and the 2TB limit. What is the “sweet spot” with regard to datastore sizing? (I personally would say the answer depends on many different factors, not the least of which is the underlying storage architecture.) Frank turns the question around and asks how many VMs the attendee is comfortable placing on a single datastore. Many of the attendees are still using only a few VMs per LUN; I’m curious to know why this is the case. Is it I/O driven? Is it based on time to back up and time to recover? That’s not clear to me, and if any readers would like to provide some feedback in the comments that would be great.

The next user asks about array-based snapshots and Storage DRS; how do the two interact? Replication, snapshots, deduplication, and thin provisioned LUNs are all affected by Storage DRS. You will definitely want to adjust operational procedures and how you use array-based features like snapshots and replication with Storage DRS. In these cases, Frank recommends using Storage DRS for initial placement only, with I/O metrics disabled and migrations not enabled.

The next user asks if Storage DRS and SRM work together; Duncan mentions that it breaks replication but doesn’t break SRM. I thought I recalled that you couldn’t use datastore clusters at all with SRM, but I don’t remember where I saw that. Anyone have more information on that?

There were a few more questions, but I had to shut down and prepare for my next session, which is my vSphere design session (VSP1926).
