How to Provision VMs Using NetApp FlexClones

When properly implemented and configured, VMware Virtual Infrastructure can make provisioning new servers a task that takes only minutes.  In fact, in my own lab (running equipment that is, admittedly, several years old and woefully underpowered), I can provision new servers running Windows Server 2003 R2 in less than 10 minutes.  That’s pretty impressive.

As impressive as those numbers may be (and I’m sure some readers out there can beat them), leveraging vendor-specific storage functionality can cut provisioning times much further.  For example, leveraging NetApp FlexClones could allow us to provision new VMs in seconds.  Let’s take a quick look at how that’s done.

In this article, I’m going to discuss how to use FlexClones for provisioning new VMs in a VMware VI3 environment.  This is not an exhaustive treatise on the subject, but rather an introduction to the process and some of the configuration that needs to take place in your environment.  (Disclaimer:  Use this stuff at your own risk.)

Configuring ESX Server

First, we need to change the configuration of ESX Server to enable it to see the FlexClones on the SAN.  The change we need to make is to enable resignaturing; that is, to enable ESX to recognize an existing VMFS datastore even if it is presented on a different LUN ID than the LUN ID it had when it was created.  When a VMFS datastore is created, ESX (or VirtualCenter) places a signature in the datastore that contains the LUN ID (among other information).  If this datastore is then presented back out with a LUN ID that doesn’t match the LUN ID in the signature, then it won’t be recognized by ESX Server.  Since we’ll be using FlexClones to make identical copies of VMFS datastores (including their signatures) and then present them out as new LUNs (with different LUN IDs than the original), we need to enable resignaturing in order for ESX Server to see the new LUNs.

There are two ways to enable resignaturing:

  • From the command line, type “esxcfg-advcfg -s 1 /LVM/EnableResignature” (you must be root)
  • From VirtualCenter, select the ESX Server, go to the Configuration tab, select Advanced Settings, choose LVM from the list on the left, and then change the value of LVM.EnableResignature to 1

Once this change is set, ESX will recognize LUNs in FlexClones as “snap-XXXXXXXX-name”.  You can easily rename them once they have been added to VirtualCenter.
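If you prefer to script the command-line option, the setting can be wrapped in a couple of small shell functions.  This is only a sketch: the function names and the ADVCFG override variable are my own inventions for illustration (the override exists purely so the logic can be dry-run with “echo” away from an ESX host), and it assumes you are running as root in the service console.

```shell
#!/bin/bash
# Sketch: read or change LVM.EnableResignature from the ESX service console.
# ADVCFG is a hypothetical override; set ADVCFG=echo for a dry run.
ADVCFG="${ADVCFG:-esxcfg-advcfg}"

set_resignature() {
    # $1 must be 0 (disabled) or 1 (enabled); anything else is rejected.
    case "$1" in
        0|1) "$ADVCFG" -s "$1" /LVM/EnableResignature ;;
        *)   echo "usage: set_resignature 0|1" >&2; return 1 ;;
    esac
}

get_resignature() {
    # -g reads the current value of an advanced setting.
    "$ADVCFG" -g /LVM/EnableResignature
}
```

Calling “set_resignature 1” before presenting the cloned LUNs and “get_resignature” afterwards is an easy way to confirm what state the host was left in.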

Please note that this process can introduce some oddities in your storage discovery/creation process.  Make sure that you have the LUN properly recognized and configured for access by all applicable hosts before you start placing VMs on the LUN.

Creating/Preparing VMs for Cloning

One advantage that VirtualCenter’s cloning has over this technique is that the process of preparing a VM for cloning is fully automated: VirtualCenter handles all that behind the scenes, launching SysPrep for Windows guests or using open source software for other guests.  All an administrator has to do is make sure that SysPrep is installed properly on the VirtualCenter server.

In this process, the guest OS preparation has to be done manually, and the placement of VMs onto the VMFS datastores has to be considered.  Since we will be making exact copies of the VMFS datastores, all VMs on that datastore will also be copied.  If you are sure that one of the cloned VMs will never be started up from the cloned VMFS, then you can leave it alone, but any guest OS that will be started up in the cloned datastore will need to be prepared first.  Again, for Windows guests, this means running SysPrep to generate new SIDs and reseal the operating system to factory defaults.

Let’s say you wanted to be able to quickly provision servers running Windows Server 2003 using FlexClones.  You’d first need to create a new VM and the accompanying VMDK files, placing them on a VMFS datastore that either a) is empty and will contain only this VM; or b) contains only VMs that will never be powered on after they are cloned.  You’d then need to install Windows Server 2003 on that VM, install VMware Tools (not required but highly recommended), install any applicable patches or third-party software packages, and finally run SysPrep to prepare it for cloning.  After all those steps have been done, you can create the FlexClone.

Creating FlexClones on the Storage System

Please note that there is a tremendous amount of additional information pertaining to the use of Snapshots in VMware environments that I have not covered here.  I highly recommend TR 3428 from NetApp, which covers this information in detail, including best practices for volume configuration, Snapshot reserve, fractional reserve, etc.

Now, having said all that, and assuming that you’ve followed some of these guidelines, here’s how we go about creating FlexClones on the storage system.  (This assumes you’ve built a VM and prepared it for cloning as described in the previous section.)

  1. Logged into the storage system with appropriate permissions, take a Snapshot of the FlexVol containing the LUN that has the VMFS datastore you want cloned.  You can call this Snapshot something like “clone_base_snapshot”, but be sure to use a name that makes sense to you and helps you understand the purpose of this Snapshot.  The command to do this would be:
    snap create fvol_master clone_base_snapshot

    This creates a Snapshot of the FlexVol “fvol_master” named “clone_base_snapshot”.

  2. Create a FlexClone based on the Snapshot you just created:
    vol clone create fvol_clone1 -b fvol_master clone_base_snapshot

    This creates a new FlexVol named “fvol_clone1”, which is based on the Snapshot named “clone_base_snapshot” in the FlexVol “fvol_master”.

  3. Because this is an exact copy of the original flexible volume, including LUNs and LUN maps, Data ONTAP will spit out some messages about LUNs being taken offline and such.  To fix this, unmap the LUN(s) in the new FlexClone and remap them with different LUN IDs:
    lun unmap /vol/fvol_clone1/lun_name igroupname
    lun map /vol/fvol_clone1/lun_name igroupname 3

    Obviously, substitute the appropriate LUN ID for the “3” in the above command line.  This remaps the LUN to the specified igroup with a new LUN ID and, assuming you’ve enabled resignaturing, makes the LUN (which is a VMFS datastore) visible to ESX Server and VirtualCenter.

  4. Unless you want Snapshots of the FlexClone, disable scheduled Snapshots on the FlexClone using the “snap sched” command:
    snap sched fvol_clone1 0

    This disables scheduled Snapshots, but manual Snapshots are still allowed.  (To disable all Snapshots, you’d need to set the nosnap volume option.)

At this point, you now have the original VMFS datastore and any virtual machines contained therein (contained in the LUN on the original FlexVol), as well as an exact copy of that VMFS datastore (contained in the LUN on the FlexClone).
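Since steps 1 through 4 are the same every time apart from the names and the LUN ID, they lend themselves to a small generator.  The sketch below only prints the Data ONTAP commands so you can review them before pasting them into the storage system; the function name “clone_commands” and its argument order are hypothetical, and the values in the example call are the illustrative names used above.

```shell
#!/bin/bash
# Sketch: print the Data ONTAP commands from steps 1-4 for one clone.
# Nothing is executed here; the output is meant to be reviewed first.

clone_commands() {
    if [ $# -ne 6 ]; then
        echo "usage: clone_commands MASTER_VOL SNAPSHOT CLONE_VOL LUN_NAME IGROUP LUN_ID" >&2
        return 1
    fi
    local master_vol="$1" snapshot="$2" clone_vol="$3"
    local lun_name="$4" igroup="$5" lun_id="$6"

    echo "snap create $master_vol $snapshot"                     # step 1
    echo "vol clone create $clone_vol -b $master_vol $snapshot"  # step 2
    echo "lun unmap /vol/$clone_vol/$lun_name $igroup"           # step 3
    echo "lun map /vol/$clone_vol/$lun_name $igroup $lun_id"     # step 3
    echo "snap sched $clone_vol 0"                               # step 4
}

# Example call using the names from the steps above.
clone_commands fvol_master clone_base_snapshot fvol_clone1 lun_name igroupname 3
```

For additional clones from the same base Snapshot you would skip the first printed line, since the Snapshot already exists.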

Registering the VMs

The VMs (comprised of the VMX, VMXF, NVRAM, and all VMDK files) were cloned along with the LUN and the FlexVol, but VMware doesn’t know they are there. In order for the VMs to be usable, we must first register them.

  1. Log into one of the ESX servers as root.  You may either SSH in as a normal user and su to root, or login at the console as root.
  2. Use the vmware-cmd utility to register the VMs.  Let’s assume that you called the FlexClone “san-lun-clone1” in VirtualCenter, and that a VM called “win1” exists on that VMFS datastore.  The command to use would look something like this:
    vmware-cmd -s register /vmfs/volumes/san-lun-clone1/win1/win1.vmx
  3. For each VM on the datastore that needs to be recognized by ESX (and has been properly prepared in advance, as noted above), repeat this process.  With a little work, it should be fairly easy to write a script that finds all the *.vmx files on a datastore and registers them.  (Anyone care to take up that challenge?)
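As a starting point for that script, here is a minimal sketch that registers every .vmx file under a single datastore.  The function name and the REGISTER_CMD override are hypothetical (set REGISTER_CMD to “echo” to see what would be registered without touching anything), and it registers everything it finds, so it assumes every VM on the datastore has been prepared as described earlier.

```shell
#!/bin/bash
# Sketch: register every .vmx file found under one datastore.
# REGISTER_CMD is a hypothetical override; use REGISTER_CMD=echo for a dry run.
REGISTER_CMD="${REGISTER_CMD:-vmware-cmd -s register}"

register_datastore() {
    # Strip any trailing slash so the path is handled consistently.
    local datastore_path="${1%/}"
    if [ ! -d "$datastore_path" ]; then
        echo "no such datastore path: $datastore_path" >&2
        return 1
    fi
    # The trailing slash makes find descend even when the datastore name
    # is a symbolic link; the quoted pattern keeps the shell from
    # expanding *.vmx before find sees it.
    find "$datastore_path/" -name '*.vmx' | while IFS= read -r vmx; do
        # REGISTER_CMD is left unquoted on purpose so it splits into
        # the command and its arguments.
        $REGISTER_CMD "$vmx"
    done
}
```

With the example names above, the call would be “register_datastore /vmfs/volumes/san-lun-clone1”.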

At this point, you now have the following:

  • The original SAN LUN, with all the VMs stored there
  • A cloned SAN LUN, with all the same data as the original but occupying far less space than a traditional copy
  • VMs registered and ready for use from both SAN LUNs

Having already enabled resignaturing, created and prepared the VMs, and taken the base snapshot, you could now easily create additional clones by simply creating the FlexClone and registering the VMs.  If you were to have a script that automated that process (perhaps using SSH shared keys or RSH to access the NetApp storage system from ESX), that entire process could be fairly easily automated.  I’ll leave that automation as an exercise for enterprising readers.

As a matter of best practice, please note that leaving resignaturing enabled (i.e., leaving LVM.EnableResignature set to 1) may lead to problems if LUNs are inadvertently re-signed. For long-term operation, I would advise users to disable resignaturing once cloned LUNs have been re-signed and are visible in the VI Client.
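One way to make sure resignaturing never stays enabled by accident is to switch it on only around the rescan and let a shell trap switch it off again, even if the rescan fails.  Treat the following as a sketch under assumptions: “with_resignature”, ADVCFG, and RESCAN_CMD are names I made up for illustration, and vmhba32 is only an example adapter name that you would replace with your own.

```shell
#!/bin/bash
# Sketch: enable resignaturing only for the duration of a rescan.
# ADVCFG and RESCAN_CMD are hypothetical overrides for dry runs.
ADVCFG="${ADVCFG:-esxcfg-advcfg}"
RESCAN_CMD="${RESCAN_CMD:-esxcfg-rescan vmhba32}"

with_resignature() {
    # Run in a subshell so the EXIT trap fires as soon as this block ends.
    (
        # Always set the option back to 0, whatever happens below.
        trap '"$ADVCFG" -s 0 /LVM/EnableResignature' EXIT
        "$ADVCFG" -s 1 /LVM/EnableResignature || exit 1
        # Left unquoted on purpose so it splits into command and arguments.
        $RESCAN_CMD
    )
}
```

The function returns the exit status of the rescan, so a calling script can still tell whether the cloned LUNs were actually picked up.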

In future articles, we’ll take a closer look at the question of “Should I use FlexClones?” instead of “How do I use FlexClones?”.


  1. Rick Kessler (Ardence Employee)

    Are you planning to do an analysis on whether FlexClone or Ardence would be a better way to save on the storage?

  2. slowe

    Rick,

    First, let me thank you for openly disclosing your affiliation. I appreciate your honesty and transparency.

    Second, I hadn’t planned on performing that kind of analysis, but let’s talk directly and see if we can arrange to get the Ardence software running in my test lab. At that point, we can do some comparisons and see what results we get. You can e-mail me via scott dot lowe at scottlowe dot org. Thanks!

  3. Aaron

    I really appreciate your insights on the Netapp/VMware integration. I’m curious about your thoughts on using lun clone vs flex clone for deploying VMs. It seemed that using lun clones and RDM would be a good way to save space and money (since we don’t own Flex clone licenses) while still providing good performance and snapshot backups.

  4. slowe

    Aaron,

    It’s funny you ask that–I’m currently working on an article that discusses LUN clones vs. FlexClones and when to use one or the other. For VM provisioning where the VMs will stick around for a while, I’d avoid LUN clones. For short-term VM provisioning, LUN clones will be OK. I’ll have more details in my upcoming article.

    You’ll also note that I referenced the use of LUN clones here as well:

    http://blog.scottlowe.org/2006/12/30/recovering-data-inside-vms-using-netapp-snapshots/

    In that scenario, we are only keeping the LUN clone around for a short time period, so using a LUN clone makes perfect sense. Again, I’ll have more details in my upcoming article. Stay tuned!

  5. Mike Zimmerman

    I have a question regarding storage presented to the ESX server and eventually presented as storage to the vm’s on that ESX server. I am using Netapp iscsi storage and have mapped 5 LUN’s total to one IBM blade. One LUN is the ESX LUN which the server boots to and is using for storage and the other four are for four vm’s in which I am going to create on the ESX server. The ESX server boots to its LUN fine and under the “>Configuration>Storage Adapters” tab I can see that all 5 LUN’s are present and accounted for, but when I go to “>Configuration>Storage” none of the LUN’s are present. One more piece of info, I did try changing the LVM option to 1 in the VIC to no success. Anyone run into this problem before or have any suggestions?

    Thanks,
    Mike

  6. slowe

    Mike! Good to hear from you.

    It sounds to me like you just need to create VMFS datastores on the other LUNs. The Storage section of the Configuration tab only shows you configured VMFS datastores. Try using the “Add Storage…” link up in the corner to add VMFS datastores on the other LUNs (which are already being recognized and listed in the Storage Adapters section).

    Setting LVM.EnableResignature to 1 isn’t needed in this instance. It’s only necessary if you are going to take a Snapshot of one of the volumes, then create a FlexClone of that volume and re-present that back to the ESX servers with a different LUN ID.

    Hope this helps!

  7. Mike Zimmerman

    OK, that makes sense. Thanks for the advice! I actually am using FlexClones for the VM LUNs, but it looks like the LUN that I cloned wasn’t created as a VMFS datastore. I am probably going to see if I can find another LUN that has been created as a VMFS datastore and clone it, or make my own rhel4 VM LUN and then clone it. Is there a way to use the “Add Storage” option without a loss of data on the LUN?

    Thanks,
    Mike

  8. slowe

    Mike,

    If you are using FlexClones and the clones are of FlexVols containing LUNs that were formatted as VMFS datastores, then enabling the LVM.EnableResignature option (setting it to 1) would be required. Also, keep in mind that the LUNs in the cloned volumes will, by default, be offline and will need to be remapped and put back online again in order for hosts to see them. Because they can’t be remapped into the igroup with the same LUN ID, they’ll have to be presented back as a different LUN ID, and that’s where resignaturing comes in. Check to make sure the LUNs in the cloned FlexVols are mapped and online.

    And no, there is no way to use “Add Storage” without reformatting the LUN. If the LUN was already created as a VMFS datastore, then the resignaturing option should take care of it.

    Good luck!

  9. Mike Zimmerman

    Thanks!
    It looks like the gold LUN I used to create the clones wasn’t made as a VMFS datastore after all, so I am going back and creating my own “gold” LUN. Thanks again for your help and your quick replies!!

    Mike

  10. Andy Archer

    I think it would make sense to add to here that LVM.EnableResignature set to 1 is probably not a good setting for long term use in case it unintentionally resigns VMFSs in the future. I would advocate resetting LVM.EnableResignature to 0 once the LUN in question has been resigned.

  11. slowe

    Andy,

    Excellent point, you are absolutely correct. I will revise the article to suggest setting LVM.EnableResignature back to 0 after this process is complete. Thanks!

  12. Chauncey Willard

    Scott, in your original post, you mentioned creating a script to register all the .vmx files that exist on a datastore.
    I wrote this one for a client and use it often:

    #!/bin/bash
    # this script registers every virtual machine found
    # on this server. By Chauncey Willard
    # Run this on each ESX host at the remote site

    # rescan the scsi bus to find new luns
    esxcfg-rescan vmhba32

    # create a text file that lists all files that end with .vmx
    # which should be all virtual machines
    # (the pattern is quoted so the shell does not expand it before find runs)
    find / -name "*.vmx" >/home/all_vmxs.txt

    # While loop that runs through every line of the file created above
    while read -r LINE
    do
    # registers each virtual machine with VC
    # (quoted so paths containing spaces stay in one piece)
    /usr/bin/vmware-cmd -s register "$LINE"
    done </home/all_vmxs.txt

    That’s it. Works fine. tested many times. Chauncey

  13. mjerom

    Thx, this article helped me a lot.
    I finally wrote a script that can complete all the tasks:

    ###########################################
    #
    # Purpose:
    # Deploy a master image in a virtual environment (ESX) via FlexClone
    #
    # Prerequisites:
    # VMware ESX server
    # FlexVol with the master OS (Sysprepped) + igroup "esx"
    # NetApp NAS with FlexClone
    # SSH configured between the two
    #
    ## ssh memo
    ## netapp
    ##secureadmin setup ssh
    ##secureadmin enable ssh2
    ##
    ## from ESX
    ##esxcfg-firewall -e sshClient
    ##vi /etc/ssh/ssh_config
    ##change Ciphers aes256-cbc,aes128-cbc
    ##to Ciphers aes256-cbc,aes128-cbc,3des-cbc
    ##ssh @IPNetApp
    ##accept the key, then exit (^D)
    ##ssh-keygen -t dsa -b 1024
    ##mount @IPNetApp:/vol/vol0 /mnt
    ##mkdir -p /mnt/etc/sshd/root/.ssh
    ##more /root/.ssh/id_dsa.pub >> /mnt/etc/sshd/root/.ssh/authorized_keys
    ##vi /mnt/etc/sshd/root/.ssh/authorized_keys (remove root@esx..)
    ##umount /mnt
    ##

    ###########################################

    # Global VARIABLES
    NB_CLONE="100";
    # NetApp VARIABLES
    IP_NETAPP="172.18.16.254";
    MASTER_VOL_NAME="vol1";
    MASTER_SNAPSHOT_NAME="snap_master";
    CLONE_VOL_NAME="clone";
    IGROUP_NAME="esx";
    # ESX VARIABLES
    MASTER_VM_NAME="master-vm";
    #NEW_RESSOURCE_POOL="";
    DATASTORE_NEW_NAME_PREFIX="clone";
    NEW_VM_NAME_PREFIX="clone";
    PATH_2_VMX_PREFIX="/vmfs/volumes";

    ###########################################
    # NetApp side
    #

    # Create Master Snapshot
    ssh $IP_NETAPP "snap create -V $MASTER_VOL_NAME $MASTER_SNAPSHOT_NAME"

    i="0"
    while (( $i < $NB_CLONE )); do

    # Clone from Master Snapshot
    # (a per-iteration variable keeps the prefix in CLONE_VOL_NAME intact,
    # so the volumes are named clone0, clone1, ...)
    CLONE_VOL="$CLONE_VOL_NAME$i";
    ssh $IP_NETAPP "vol clone create $CLONE_VOL -b $MASTER_VOL_NAME $MASTER_SNAPSHOT_NAME";

    # Set the Snapshot Scheduler to 0
    ssh $IP_NETAPP "snap sched $CLONE_VOL 0";
    # Map Cloned Lun
    CLONED_LUN_PATH="/vol/$CLONE_VOL/lun0";
    ID_CLONED_LUN=$((i+1));
    ssh $IP_NETAPP "lun map $CLONED_LUN_PATH $IGROUP_NAME $ID_CLONED_LUN";

    # Bring Cloned Lun Online
    ssh $IP_NETAPP "lun online $CLONED_LUN_PATH";

    i=$((i+1));
    done

    ###########################################
    # ESX side
    #

    # Enable LVM resignature
    /usr/sbin/esxcfg-advcfg -s 1 /LVM/EnableResignature

    # Rescan Storage Adapters
    for HBA in `vmware-vim-cmd hostsvc/summary/hba | grep vmhba | awk '{print $1}'`; do
    vmware-vim-cmd hostsvc/storage/hba_rescan $HBA
    done

    # Refresh Storage
    vmware-vim-cmd hostsvc/storage/refresh

    # Set Cloned Datastore’s name
    vmware-vim-cmd hostsvc/datastore/listsummary | egrep name | awk -F\" '{print $2}'
    echo "Name(s) of the cloned datastore(s), separated by spaces"
    # read the names into an array, one entry per clone
    read -a DATASTORE_CLONED_NAME

    # Change Cloned Datastore’s Name & Register the Cloned VM inside
    i=0
    while (( $i < $NB_CLONE )); do
    DATASTORE_NEW_NAME="$DATASTORE_NEW_NAME_PREFIX$i";
    NEW_VM_NAME="$NEW_VM_NAME_PREFIX$i";
    PATH_2_VMX="$PATH_2_VMX_PREFIX/$DATASTORE_NEW_NAME/$MASTER_VM_NAME/$MASTER_VM_NAME.vmx";
    vmware-vim-cmd hostsvc/datastore/rename "${DATASTORE_CLONED_NAME[$i]}" $DATASTORE_NEW_NAME;
    vmware-vim-cmd solo/registervm $PATH_2_VMX $NEW_VM_NAME $NEW_RESSOURCE_POOL;
    i=$((i+1));
    done

    # Disable LVM resignature
    /usr/sbin/esxcfg-advcfg -s 0 /LVM/EnableResignature

    sorry for the french comments.