Partner Exchange 2010 Session TECHBC0320

This is a liveblog for VMware Partner Exchange session TECHBC0320, “How VMware Leverages Microsoft Volume Shadow Services for Virtual Machine Snapshots”. The presenter is Paul Vasquez, who works in VMware’s Technical Alliances Organization with a focus on backups.

The session starts with an overview of VMware snapshots, followed by a quick overview of the Microsoft Volume Shadow Copy Service (VSS).

Vasquez is careful to distinguish VMware snapshots from array-based snapshots, which is good since that seems to confuse a number of people. VMware snapshots can include the state of memory (optional), settings, and disk. Snapshots are taken at the VM level, and up to 32 snapshots can be taken. Over 20 snapshots can cause performance concerns and, in Vasquez’s words, “can cause undesirable results”.

In general, a snapshot will include all disks although there are ways to exclude disks from a snapshot.

Operations involving VMware snapshots include taking a snapshot (self-explanatory), reverting to a snapshot (reverts the VM to the snapshot state; the delta file remains until the snapshot is deleted), and deleting a snapshot (the delta file is removed and the VM continues running in its current state).

Some use cases for snapshots include: rollback capability for testing patches or updates; rollback for failed software installation; protection against unwanted results of OS reconfigurations or testing; backups (for creating consistent copies of a VM); and replication.

The delta file grows as needed; the longer a snapshot is kept and the more the VM changes, the larger the delta file becomes. Vasquez cautions attendees to plan datastore sizes to account for VM snapshots and the delta file growth caused by changes to those VMs.
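To make the sizing advice concrete, here is a rough back-of-the-envelope sketch. It was not shown in the session; the function names, the linear growth model, the cap at the base disk size, and the 20% headroom are all my own simplifying assumptions.

```python
def estimate_delta_gb(base_disk_gb, daily_change_gb, days_kept):
    """Rough estimate of how large one disk's delta file might get."""
    if min(base_disk_gb, daily_change_gb, days_kept) < 0:
        raise ValueError("inputs must be non-negative")
    # Assumption: growth is linear in changed data and, as a
    # simplification, never exceeds the size of the base disk.
    return min(daily_change_gb * days_kept, base_disk_gb)


def datastore_needed_gb(vms, headroom=0.2):
    """Sum base disks plus estimated deltas, then add spare headroom.

    `vms` is a list of (base_disk_gb, daily_change_gb, days_kept) tuples.
    The 20% default headroom is an arbitrary illustrative value.
    """
    total = 0.0
    for base_gb, change_gb, days in vms:
        total += base_gb + estimate_delta_gb(base_gb, change_gb, days)
    return total * (1 + headroom)
```

For example, a 100 GB disk changing about 5 GB per day with a snapshot kept for 3 days would be budgeted at roughly 15 GB of delta on top of the base disk under these assumptions.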

A good question was raised about read I/Os and the impact of snapshots (does

The presentation now moves on to a discussion of VSS. VSS involves three kinds of components: requestors, writers, and providers. The requestor asks a provider for a shadow copy, and the writer supplies the requestor with information about how its application’s data should be handled. Providers are included with Windows and are responsible for intercepting I/O requests to create and represent volume shadow copies on the file system. There are also third-party providers. In the context of this discussion (VSS integration with VMware snapshots), VMware Tools is the requestor.

A wide range of applications provide VSS support, including Exchange, SQL, SharePoint, Active Directory, BITS, DHCP, and WINS. The vssadmin list providers command will show all the providers. (Note that you won’t see VMware Tools listed when you run this command; it is dynamically loaded only at snapshot time and then unloaded.)

The vssadmin list writers command will show a list of writers.
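If you want to check writer health from a script, something like the following sketch could work. The sample text below is illustrative only (the second writer name is made up), and I am assuming the usual "Writer name / State / Last error" layout; real output contains more fields and may differ between Windows versions.

```python
import re
import subprocess

# Illustrative sample only; the second writer is hypothetical.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Example Application Writer'
   State: [8] Failed
   Last error: Timed out
"""


def parse_writers(text):
    """Turn 'vssadmin list writers' style text into a list of dicts."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'$", line)
        if match:
            current = {"name": match.group(1)}
            writers.append(current)
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip().lower()] = value.strip()
    return writers


def unhealthy(writers):
    """Names of writers whose last error is anything but 'No error'."""
    return [w["name"] for w in writers if w.get("last error") != "No error"]


def list_writers_live():
    """Run the real command (Windows, elevated prompt) and parse it."""
    result = subprocess.run(["vssadmin", "list", "writers"],
                            capture_output=True, text=True, check=True)
    return parse_writers(result.stdout)
```

Calling unhealthy(parse_writers(SAMPLE_OUTPUT)) picks out only the hypothetical failed writer.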

The general flow of operation with VSS runs like this:

  1. The requestor requests a shadow copy.
  2. The writer is told to freeze all I/O.
  3. The provider creates a shadow copy.
  4. The writer is told to “thaw,” or resume, I/O to the application.
  5. The requestor now has access to the shadow copy.
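The ordering in the steps above matters: the provider should only create the shadow copy while the writers are holding I/O. Here is a toy model of that sequence. It does not call any real VSS API; the Writer and Provider classes and the run_flow function are hypothetical names of my own.

```python
class Writer:
    """Toy stand-in for an application's VSS writer."""

    def __init__(self, name):
        self.name = name
        self.frozen = False

    def freeze(self):
        self.frozen = True

    def thaw(self):
        self.frozen = False


class Provider:
    """Toy provider that refuses to work unless every writer is frozen."""

    def create_shadow_copy(self, writers):
        if not all(w.frozen for w in writers):
            raise RuntimeError("writers must be frozen first")
        return "shadow-copy-1"


def run_flow(writers, provider):
    """Walk through the five steps and return (shadow_copy, event_log)."""
    log = ["requestor: request shadow copy"]
    for w in writers:
        w.freeze()
        log.append(f"{w.name}: freeze")
    try:
        copy = provider.create_shadow_copy(writers)
        log.append("provider: shadow copy created")
    finally:
        # Always thaw, even if the provider fails, so I/O can resume.
        for w in writers:
            w.thaw()
            log.append(f"{w.name}: thaw")
    log.append("requestor: shadow copy available")
    return copy, log
```

Running run_flow([Writer("exchange")], Provider()) produces the events in the same order as the numbered list.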

The writer can support multiple enumerations, or different ways of coordinating the creation of the shadow copy. Exchange, for example, supports Full (backs up databases, logs, and checkpoints; truncates logs), Copy (backs up databases, logs, and checkpoints; does not truncate logs), Incremental (backs up and truncates logs), and Differential (backs up logs but does not truncate them). Of these, VMware uses the Copy enumeration when requesting shadow copies. Reportedly, this is to avoid interfering with backup applications that aren’t aware that logs were truncated. In addition, when VMware calls VSS, all writers are engaged, so it’s not possible to choose which VSS writers are engaged (you can’t engage VSS for Exchange but not SQL within the same VM, for example).
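The four Exchange enumerations described above are easy to mix up, so here they are as a small lookup table. This is just my restatement of the paragraph as data; the BackupType class and the helper function are hypothetical names, not part of any VMware or Microsoft API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupType:
    backs_up: tuple
    truncates_logs: bool


# Taken directly from the descriptions given in the session.
EXCHANGE_BACKUP_TYPES = {
    "Full": BackupType(("databases", "logs", "checkpoints"), True),
    "Copy": BackupType(("databases", "logs", "checkpoints"), False),
    "Incremental": BackupType(("logs",), True),
    "Differential": BackupType(("logs",), False),
}

# The only enumeration the VMware Tools requestor uses, per the session.
VMWARE_TOOLS_TYPE = "Copy"


def leaves_logs_alone(type_name):
    """True if this backup type does not truncate transaction logs."""
    return not EXCHANGE_BACKUP_TYPES[type_name].truncates_logs
```

Copy is the only type that captures databases, logs, and checkpoints while leaving the logs untouched, which fits the stated reason for choosing it.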

In the future, VMware Tools will offer granular control over which VSS enumeration is used. Granular control over which VSS writers can be engaged is also planned.

Vasquez now moves into a discussion of how VMware snapshots and VSS integrate. VSS integration comes into play when a VMware snapshot is taken. Obviously, the VM must be powered on for VSS integration (the guest OS must be running in order for VSS to be operational).

Some form of quiescing is always used when a snapshot is taken (unless the VM is powered off). The VMware Sync driver provides a crash-consistent copy of the VM but doesn’t interact with applications. This option is available in vSphere 4.0 and can be used when no VSS support from the application is available. Obviously, there is VSS support (hence this session), and there are pre- and post-quiesce scripts that can be used to create homebrew solutions as well. Both VSS and the Sync driver can be enabled using VMware Tools.

VSS support is enabled in VMware ESX 3.5 Update 2 or higher.

Going back to the VSS flow described earlier, there is one additional step: the VMware snapshot is taken before the writer resumes I/O. After the VMware snapshot is taken, the shadow copy created by the provider is discarded because it is no longer needed. Once again, Vasquez reminds attendees that the VMware Tools requestor only supports the Copy enumeration.
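As a sketch of how the generic VSS sequence changes when VMware Tools is the requestor, the snippet below inserts the VMware snapshot step just before the thaw and appends the discard step. The step strings and the function name are my own; placing the discard at the very end (after the thaw) is an assumption, since the session only said it happens after the VMware snapshot is taken.

```python
BASE_FLOW = [
    "requestor requests shadow copy",
    "writers freeze I/O",
    "provider creates shadow copy",
    "writers thaw I/O",
    "requestor can access shadow copy",
]


def with_vmware_snapshot(flow):
    """Return a copy of the flow with the VMware-specific steps added."""
    steps = list(flow)
    thaw_at = steps.index("writers thaw I/O")
    # The VMware snapshot is taken while application I/O is still held.
    steps.insert(thaw_at, "VMware snapshot is taken")
    # Assumption: the discard is shown last; the session did not give
    # its exact position relative to the thaw.
    steps.append("provider shadow copy is discarded")
    return steps
```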

An attendee asked if any plans were in place to do quiescing at the VMFS layer (presumably to assist with hardware-based snapshots); Vasquez responded that some form of VMFS quiescing would be helpful, but there are challenges that currently make it very difficult to achieve.

(Vasquez also commented on the end-of-life policy for the ESX Service Console, but I’ll hold on mentioning what was said until I verify the confidentiality of the statement.)

Some additional things to remember:

  • VMware Tools build must be 110268 or higher.
  • VMware Tools must be running and VSS must be functioning properly.
  • VSS Service must be set to Manual or Automatic.
  • ESX 3.5 Update 2 is required for VSS support.
  • Be sure VSS support is installed with VMware Tools.
  • Try not to keep VMware snapshots around for a long time. Manage snapshots carefully.
  • The Sync driver can be used as a fallback in the event VSS support fails.
  • The VSS snapshot has a 10-second timeout. In rare cases, obtaining the VSS shadow copy can fail.
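The checklist above lends itself to a simple preflight check. The sketch below only encodes the requirements as listed; it does not query a real VM or host, and every function and parameter name is hypothetical. Treating an exceeded timeout as a reason to fall back to the Sync driver is my reading of the last two bullets.

```python
MIN_TOOLS_BUILD = 110268
ALLOWED_START_MODES = {"manual", "automatic"}
VSS_TIMEOUT_SECONDS = 10


def preflight(tools_build, tools_running, vss_start_mode,
              vss_component_installed, host_is_esx35u2_or_later):
    """Return a list of problems that would prevent VSS quiescing."""
    problems = []
    if tools_build < MIN_TOOLS_BUILD:
        problems.append("VMware Tools build is older than 110268")
    if not tools_running:
        problems.append("VMware Tools is not running")
    if vss_start_mode.lower() not in ALLOWED_START_MODES:
        problems.append("VSS service must be Manual or Automatic")
    if not vss_component_installed:
        problems.append("VSS support not installed with VMware Tools")
    if not host_is_esx35u2_or_later:
        problems.append("host is older than ESX 3.5 Update 2")
    return problems


def choose_quiesce(problems, vss_elapsed_seconds=None):
    """Pick 'vss' when possible, otherwise fall back to the Sync driver."""
    if problems:
        return "sync driver"
    if (vss_elapsed_seconds is not None
            and vss_elapsed_seconds > VSS_TIMEOUT_SECONDS):
        return "sync driver"
    return "vss"
```

A VM that passes every check and completes the shadow copy within the timeout gets "vss"; anything else falls back to the crash-consistent Sync driver.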

Most of the information in this presentation can be found in the current vSphere documentation and in Microsoft’s VSS documentation. (I’ll update this post with URLs when possible.)

And that’s it for the session.


9 comments

  1. brandon

    I’m very anxious to hear what was said about the service console end of life policy. We’re planning on moving to ESXi 4.0 with the upgrade to vSphere looming here soon. The biggest concerns aren’t coming from me or my partner in crime to support the environment, rather it comes from everyone else who has any say and some from management. Part of our justification for the move, aside from the security aspects alone, are that we need to learn to operate without the service console. We might as well start now. I don’t know if there is a time line or what it is that he said, but I’m extremely anxious to hear it, because if it confirms it is indeed going away in my lifetime, it will only add to the justifications for moving to ESXi.

  2. slowe

    Brandon,

    Without breaking any NDAs, I can tell you that it will be within your lifetime. Get prepared.

  3. Paul

    Brandon,
    I had the opportunity to attend a vSphere design workshop, and VMware’s stance for installation is now 1) ESXi embedded whenever possible 2) ESXi installable (choosing the hardware appropriate version for HP, Dell, or IBM) and a distant 3rd option is ESX “classic”.

    That being said, there are a couple of factors that might prevent someone using ESXi such as hardware compatibility or an agent that is required to be installed on the ESX server such as a management or backup agent.

  4. Russell

    I have to say, troubleshooting ESX without a service console is among the most painful things I have ever attempted to do. I’m a little annoyed that I forgot to bring this up during PTAB since this is a hot button for myself and a couple fairly large customers.

  5. brandon

    Thanks for your reply Scott. Btw your Mastering VMware vSphere 4 book was great :). I read it cover to cover. Odd how I can read technical stuff like that and enjoy it, while other ‘entertainment’ material dries out my eye sockets.

  6. Paul

    Russell – I absolutely agree. Troubleshooting is much more cumbersome with ESXi. However, I can also appreciate VMware’s position with wanting to minimize the possible attack surface. Of course, with most security enhancements there are trade-offs. As for me, I’ll more than likely continue to use “classic”, and consider managing them as ESXi hosts.

  7. stuart

    Hi Scott, great article. I agree that with the new vSphere APIs there will be a lot of interest in moving towards image-based backups. I think it should be noted that, as it stands today, you will see no application quiescing in Windows 2008 when leveraging the VMware Tools VSS provider. I blogged about this here: http://virtualy-anything.blogspot.com/2010/02/understanding-vss-in-vmware-backups.html

  8. Russell

    I don’t buy the attack surface argument for a lot of reasons that really go outside the scope of this blog post commentary. Hopefully Scott can provide a post in which we can have a solid discussion; I don’t want to hijack his main point (VSS) but I do think it’s an important discussion.
