May 2007


When properly implemented and configured, VMware Virtual Infrastructure can make provisioning new servers a task that takes only minutes.  In fact, in my own lab (running equipment that is, admittedly, several years old and woefully underpowered), I can provision new servers running Windows Server 2003 R2 in less than 10 minutes.  That’s pretty impressive.

As impressive as those numbers may be (and I’m sure there are readers out there with even more impressive numbers), if we leverage some vendor-specific storage functionality we can achieve some really impressive times.  For example, leveraging NetApp FlexClones could allow us to provision new VMs in seconds.  Let’s take a quick look at how that’s done.

In this article, I’m going to discuss how to use FlexClones for provisioning new VMs in a VMware VI3 environment.  This is not an exhaustive treatise on the subject, but rather an introduction to the process and some of the configuration that needs to take place in your environment.  (Disclaimer:  Use this stuff at your own risk.)

Configuring ESX Server

First, we need to change the configuration of ESX Server to enable it to see the FlexClones on the SAN.  The change we need to make is to enable resignaturing; that is, to enable ESX to recognize an existing VMFS datastore even if it is presented on a different LUN ID than the LUN ID it had when it was created.  When a VMFS datastore is created, ESX (or VirtualCenter) places a signature in the datastore that contains the LUN ID (among other information).  If this datastore is then presented back out with a LUN ID that doesn’t match the LUN ID in the signature, then it won’t be recognized by ESX Server.  Since we’ll be using FlexClones to make identical copies of VMFS datastores (including their signatures) and then present them out as new LUNs (with different LUN IDs than the original), we need to enable resignaturing in order for ESX Server to see the new LUNs.

There are two ways to enable resignaturing:

  • From the command line, type “esxcfg-advcfg -s 1 /LVM/EnableResignature” (you must be root)
  • From VirtualCenter, select the ESX Server, go to the Configuration tab, select Advanced Settings, choose LVM from the list on the left, and then change the value of LVM.EnableResignature to 1

Once this change is set, ESX will recognize LUNs in FlexClones as “snap-XXXXXXXX-name”.  You can easily rename them once they have been added to VirtualCenter.
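For the command-line route, a small wrapper can make the change and then read the value back to confirm it.  What follows is only a minimal sketch, assuming it is run as root in the ESX 3.x service console; the names set_resignature, run, and DRY_RUN are illustrative and not part of any VMware tool.  With DRY_RUN left at 1 (the default here), it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: toggle LVM resignaturing on an ESX 3.x host.
# DRY_RUN, run, and set_resignature are illustrative names, not VMware tooling.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # In dry-run mode, print the command; otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

set_resignature() {
    # $1 must be 0 (disable) or 1 (enable).
    case "$1" in
        0|1) ;;
        *) echo "usage: set_resignature 0|1" >&2; return 1 ;;
    esac
    # Set the option, then read it back with -g to confirm the new value.
    run esxcfg-advcfg -s "$1" /LVM/EnableResignature &&
    run esxcfg-advcfg -g /LVM/EnableResignature
}

set_resignature 1
```

Setting DRY_RUN=0 makes the script actually run the commands, and calling set_resignature 0 turns the option back off.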

Please note that this process can introduce some oddities in your storage discovery/creation process.  Make sure that you have the LUN properly recognized and configured for access by all applicable hosts before you start placing VMs on the LUN.

Creating/Preparing VMs for Cloning

One advantage that VirtualCenter’s cloning has over this technique is that the process of preparing a VM for cloning is fully automated:  VirtualCenter handles all that behind the scenes, launching SysPrep for Windows guests or using open source software for other guests.  All an administrator has to do is make sure that SysPrep is installed on VirtualCenter properly.

In this process, the guest OS preparation has to be done manually, and the placement of VMs onto the VMFS datastores has to be considered.  Since we will be making exact copies of the VMFS datastores, all VMs on that datastore will also be copied.  If you are sure that one of the cloned VMs will never be started up from the cloned VMFS, then you can leave it alone, but any guest OS that will be started up in the cloned datastore will need to be prepared first.  Again, for Windows guests, this means running SysPrep to generate new SIDs and reseal the operating system to factory defaults.

Let’s say you wanted to be able to quickly provision servers running Windows Server 2003 using FlexClones.  You’d need to first create a new VM and the accompanying VMDK files, placing them on a VMFS that either a) is empty and will contain only this VM; or b) contains only VMs that will never be powered on after they are cloned.  You’d then need to install Windows Server 2003 on that VM, install VMware Tools (not required but highly recommended), install any applicable patches or third-party software packages, and finally run SysPrep to prepare it for cloning.  After all those steps have been done, you can create the FlexClone.

Creating FlexClones on the Storage System

Please note that there is a tremendous amount of additional information pertaining to the use of Snapshots in VMware environments that I have not covered here.  I highly recommend TR 3428 from NetApp, which covers this information in detail, including best practices for volume configuration, Snapshot reserve, fractional reserve, etc.

Now, having said all that, and assuming that you’ve followed some of these guidelines, here’s how we go about creating FlexClones on the storage system.  (This assumes you’ve built a VM and prepared it for cloning as described in the previous section.)

  1. Log into the storage system with appropriate permissions and take a Snapshot of the FlexVol containing the LUN that has the VMFS datastore you want cloned.  You can call this Snapshot something like “clone_base_snapshot”, but be sure to use a name that makes sense to you and helps you understand the purpose of this Snapshot.  The command to do this would be:
    snap create fvol_master clone_base_snapshot

    This creates a Snapshot of the FlexVol “fvol_master” named “clone_base_snapshot”.

  2. Create a FlexClone based on the Snapshot you just created:
    vol clone create fvol_clone1 -b fvol_master clone_base_snapshot

    This creates a new FlexVol named “fvol_clone1”, which is based on the Snapshot named “clone_base_snapshot” in the FlexVol “fvol_master”.

  3. Because this is an exact copy of the original flexible volume, including LUNs and LUN maps, Data ONTAP will spit out some messages about LUNs being taken offline and such.  To fix this, unmap the LUN(s) in the new FlexClone and remap them with different LUN IDs:
    lun unmap /vol/fvol_clone1/lun_name igroupname
    lun map /vol/fvol_clone1/lun_name igroupname 3

    Obviously, substitute the appropriate LUN ID for the “3” in the above command line.  This remaps the LUN to the specified igroup with a new LUN ID and, assuming you’ve enabled resignaturing, makes the LUN (which is a VMFS datastore) visible to ESX Server and VirtualCenter.

  4. Unless you want Snapshots of the FlexClone, disable scheduled Snapshots on the FlexClone using the “snap sched” command:
    snap sched fvol_clone1 0

    This disables scheduled Snapshots, but manual Snapshots are still allowed.  (To disable all Snapshots, you’d need to set the no_snap volume option.)

At this point, you now have the original VMFS datastore and any virtual machines contained therein (contained in the LUN on the original FlexVol), as well as an exact copy of that VMFS datastore (contained in the LUN on the FlexClone).
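The four storage-side steps can be collected into one small helper.  This is only a minimal sketch, assuming a management host with a POSIX shell; the function name flexclone_commands and its argument order are hypothetical, and the volume, Snapshot, LUN, and igroup names in the example call are the same placeholders used in the steps above.  It deliberately just prints the Data ONTAP commands so they can be reviewed before being entered on the storage system.

```shell
#!/bin/sh
# Sketch: print the Data ONTAP commands for one FlexClone of a VMFS LUN.
# flexclone_commands is an illustrative helper, not a NetApp tool.
flexclone_commands() {
    if [ "$#" -ne 6 ]; then
        echo "usage: flexclone_commands parent snap clone lun igroup lun_id" >&2
        return 1
    fi
    parent="$1"    # parent FlexVol, e.g. fvol_master
    snap="$2"      # base Snapshot name, e.g. clone_base_snapshot
    clone="$3"     # new FlexClone volume, e.g. fvol_clone1
    lun="$4"       # LUN name inside the volume, e.g. lun_name
    igroup="$5"    # igroup the LUN is mapped to
    lun_id="$6"    # new LUN ID, different from the original's

    # Step 1: base Snapshot (only needed once; later clones can reuse it).
    echo "snap create $parent $snap"
    # Step 2: FlexClone backed by that Snapshot.
    echo "vol clone create $clone -b $parent $snap"
    # Step 3: remap the cloned LUN with a different LUN ID.
    echo "lun unmap /vol/$clone/$lun $igroup"
    echo "lun map /vol/$clone/$lun $igroup $lun_id"
    # Step 4: turn off scheduled Snapshots on the clone.
    echo "snap sched $clone 0"
}

flexclone_commands fvol_master clone_base_snapshot fvol_clone1 lun_name igroupname 3
```

Printing rather than executing keeps the sketch independent of how you reach the storage system (console, SSH, or RSH) and gives you a chance to check the LUN ID before anything changes.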

Registering the VMs

The VMs (comprised of the VMX, VMXF, NVRAM, and all VMDK files) were cloned along with the LUN and the FlexVol, but VMware doesn’t know they are there. In order for the VMs to be usable, we must first register them.

  1. Log into one of the ESX servers as root.  You may either SSH in as a normal user and su to root, or log in at the console as root.
  2. Use the vmware-cmd utility to register the VMs.  Let’s assume that you called the FlexClone “san-lun-clone1” in VirtualCenter, and that a VM called “win1” exists on that VMFS datastore.  The command to use would look something like this:
    vmware-cmd -s register /vmfs/volumes/san-lun-clone1/win1/win1.vmx
  3. For each VM on the datastore that needs to be recognized by ESX (and has been properly prepared in advance, as noted above), repeat this process.  With a little work, it should be fairly easy to write a script that finds all the *.vmx files on a datastore and registers them.  (Anyone care to take up that challenge?)
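As a starting point for that script, here is a minimal sketch, assuming it runs as root in the ESX service console and that every VM on the datastore has already been prepared for cloning.  The names register_all_vms and DRY_RUN are illustrative, and the datastore path in the example comment is the one assumed above.  With DRY_RUN left at 1 (the default here), it only prints the vmware-cmd lines it would run.

```shell
#!/bin/sh
# Sketch: register every .vmx file found under a VMFS datastore path.
# register_all_vms and DRY_RUN are illustrative names, not VMware tooling.
DRY_RUN="${DRY_RUN:-1}"

register_all_vms() {
    datastore="$1"   # e.g. /vmfs/volumes/san-lun-clone1
    if [ ! -d "$datastore" ]; then
        echo "no such datastore path: $datastore" >&2
        return 1
    fi
    # The pattern '*.vmx' matches win1.vmx but not win1.vmxf.
    find "$datastore" -name '*.vmx' | sort | while IFS= read -r vmx; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "vmware-cmd -s register $vmx"
        else
            # Keep vmware-cmd from reading the list of paths on stdin.
            vmware-cmd -s register "$vmx" </dev/null
        fi
    done
}

# Example call:
# register_all_vms /vmfs/volumes/san-lun-clone1
```

Because the sketch registers everything it finds, it is only appropriate for datastores where no unprepared VMs were cloned along with the prepared ones.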

At this point, you now have the following:

  • The original SAN LUN, with all the VMs stored there
  • A cloned SAN LUN, with all the same data as the original (but occupying far less space than a traditional copy)
  • VMs registered and ready for use from both SAN LUNs

Having already enabled resignaturing, created and prepared the VMs, and taken the base snapshot, you could now easily create additional clones by simply creating the FlexClone and registering the VMs.  If you were to have a script that automated that process (perhaps using SSH shared keys or RSH to access the NetApp storage system from ESX), that entire process could be fairly easily automated.  I’ll leave that automation as an exercise for enterprising readers.

As a matter of best practice, please note that leaving resignaturing enabled (i.e., leaving the LVM.EnableResignature setting at 1) may lead to problems if LUNs are inadvertently re-signed.  For long-term operation, I would advise users to disable resignaturing once cloned LUNs have been re-signed and are visible in the VI Client.

In future articles, we’ll take a closer look at the question of “Should I use FlexClones?” instead of “How do I use FlexClones?”.


Microsoft has been hyping up its Windows Hypervisor (“Viridian”) for quite some time now, talking about how the Viridian feature set is going to leapfrog functionality that is currently available on the market.  Now, in this article posted this morning, Microsoft has revealed that it has to pull three key features from Viridian in order to meet its delivery schedule.

Of the three features being cut, the most notable to me is live migration.  VMware users will know this better as VMotion, the ability to move a live VM from one physical host to another with no downtime.  This is a key enabling technology, and the fact that Microsoft won’t have this functionality is a huge win for VMware (and, to a lesser extent, XenSource).

But live migration isn’t the only feature on the chopping block:

The initial release of Viridian also won’t support on-the-fly, or “hot,” adding of memory, storage, processors or network cards. And it will support computers with a maximum of 16 processing cores—for example, eight dual-core chips or four quad-core chips.

I agree with the article’s author that limiting support to 16 cores isn’t a huge deal, at least not now.  A year from now?  Hard to say for sure if it will be an issue then, and if I were VMware or Xen I’d make sure my software didn’t have that limit.  Even if it doesn’t matter, let’s face it:  marketing is half the battle, and saying that your product can handle more horsepower than Microsoft’s product can’t hurt.

Likewise, “hot adding” resources isn’t a big deal today.  The article points out that Xen has this capability today; VMware does not.  (As an aside, I’m curious to know if Xen’s hot-add functionality is only supported for modified guests.  Anyone know?)  Note that the article is incorrect in suggesting that the lack of hot-add support is mitigated somewhat by live migration; that would be true only for adding resources to the host, not the guest.  Without hot-add support, we would still require downtime at the guest level.

It’s not all good for VMware, though.  In my humble opinion, VMware needs to stay vigilant and not discount Microsoft (or Xen) as a competitor.  If they get complacent, then they will most assuredly lose their leadership position.  VMware needs to continue to innovate and drive the market to new areas that were previously considered impossible (such as the experimental Record/Replay support in the newly released VMware Workstation 6.0).  What is possible now with Record/Replay?  How could this functionality be used in ways that people aren’t considering today?  Can we continue to add value to ACE and VDI deployments by combining these technologies?  It probably would be a good idea for them to release VDM (Virtual Desktop Manager), currently available only through a professional services engagement, to the public as a product as well.

VMware has a tremendous lead in the virtualization market, but smart, nimble, and powerful competitors are waiting for a misstep.  Here’s hoping there isn’t one.


Dead PowerBook

Back in September of 2003, I made the switch from Windows XP Professional on an HP laptop to Mac OS X (10.2, or “Jaguar”, at the time) on a 15-inch 1GHz PowerBook G4.  Over the next three years, I upgraded the laptop to Panther (using the “Archive and Install” method) and Tiger (using a clean build), and throughout it all the laptop performed without any issues.  I used it every day up until the day I purchased my Core 2 Duo-based MacBook Pro.  At that time, I gave the laptop to my daughter to use at college.

When I got home from work yesterday, my daughter came to me and said, “Dad, can you look at my laptop?  It won’t start up.”  Convinced that the problem was a loose nut behind the keyboard (think about that; you’ll get it in a minute), I set out to fix the laptop.  Quite some time later, I came to my final conclusion regarding the status of the laptop:  it was dead, the victim of a failed hard drive.

Well, a failed hard drive is never a desired event, but in the grander scheme of things this was actually a good thing.  After all, it could have been the motherboard, the display, or any number of other things that failed; had it been the display, the entire laptop would have been shot.  The hard drive, at least, is (relatively) inexpensive and straightforward to replace, so we can still salvage the laptop.

So, sometime in the next few days, I’ll need to purchase a new laptop hard drive and crack open the case of my PowerBook G4 to replace the failed drive and then re-install Tiger and all her applications.  (Fortunately for her, no critical data was stored on the laptop. Thank goodness for that, because she hadn’t been backing anything up.)

I’ve started looking for replacement hard drives, but does anyone have any recommended vendors or models to consider?  Price is more important than capacity or performance here…we have to do this on a budget.  Any suggestions or recommendations would be greatly appreciated.

UPDATE:  I found a new Hitachi 60GB 5400RPM drive from Other World Computing for about $68 (after tax and shipping) and installed it a couple of nights ago.  The process only took about 10 minutes and went flawlessly.  I’m now in the process of reinstalling all the software, patches, updates, etc., which is a far more time-consuming process.


HP c-Class Training

I’m in training this week in Canada for the HP c-Class BladeSystem.  That means two things:  1) I was unable to attend VMware TSX 2007 in Las Vegas (bummer!), and 2) blog posts may be a bit less frequent than normal.  (Of course, I don’t really have a “normal” blog posting routine, but you get the idea.)

The HP c-Class BladeSystem is HP’s latest blade offering, replacing the earlier p-Class enclosure and blades with a new enclosure (10U instead of 6U), new blades (both half-height and full-height blades, up to 16 half-height blades per enclosure), new interconnects (both Ethernet and SAN), and an entirely redesigned internal architecture.  In my opinion, it’s a pretty impressive offering.

With this new BladeSystem, HP offers the following:

  • Up to 16 half-height blades (the BL460c or BL465c) in 10U of rack space; each half-height blade can go up to dual socket/dual core configurations
  • Up to 8 full-height blades (the BL480c, the BL685c, or the BL860c) in 10U of rack space; the BL480c offers dual socket/dual core (and quad core) configurations; the BL685c offers quad socket/dual core configurations.  (The BL860c is an Itanium-based blade.)
  • Mix and match full-height/half-height blades within an enclosure as needs dictate (with a few caveats, of course)
  • A variety of interconnect options for both Ethernet (including a Cisco switch that runs IOS—great for guys like me who are familiar with IOS) and Fibre Channel (both Cisco MDS and Brocade SAN switches are available)
  • A variety of mezzanine expansion options that add capabilities to the blades, like additional NICs or Fibre Channel connectivity

With all these options and all the various ways these pieces can be put together, planning and deploying one or more of these enclosures really demands some professional services assistance.  That’s great for guys like me who work in the professional services arena, but it’s going to take some work to get customers to realize that this isn’t the typical server deployment.  Yes, the blade servers themselves are very similar to your typical rack-mount server, but when you combine the blade servers with the interconnects and the mezzanine cards, the whole becomes more complex than the sum of its parts.

It’s also going to place more of a burden on the sales team to properly position solutions based around the c-Class BladeSystem to customers.  In some ways, that will be a greater challenge than dealing with the customers.  (Those of you that work in technical pre-sales roles will appreciate that comment.)

This generation of the blade enclosures and the blade servers themselves also addresses one key concern I had with earlier generations and using them as a virtualization platform:  a lack of NICs.  With the c-Class, I can now provision quad socket/dual core AMD Opteron-based server blades (the BL685c) with up to 12 NICs (four onboard plus two quad-port NICs in the mezzanine slots), and still have room for Fibre Channel connectivity.  That’s pretty impressive and makes for quite a platform for VMware ESX Server, if you ask me.

I’ll post more information here as the class progresses.  If anyone else has any experience (good or bad) with the c-Class enclosures, I’d love to hear about it in the comments.

