blog.scottlowe.org

The weblog of an IT pro specializing in virtualization, storage, and servers

Archive for December, 2007

Hyper-V Architectural Issue

December 31st, 2007 by slowe

Thanks to a heads-up from Alessandro at virtualization.info, I’ve learned that Hyper-V will not boot VMs from a SCSI virtual disk.  Instead, Hyper-V virtual machines will boot only from an IDE virtual disk.

In case you’re wondering why this really matters—as I was when I first saw this headline—this post from Tony Voellm at Microsoft may shed some light on the issue:

The IDE controller implements a well-known IDE controller and this means there is extra processing before the I/O is sent to the disk. This processing occurs in vmwp.exe (a user mode process that exists for each started VM. More on this in a later post). Once the IDE emulation is complete the I/O is sent into the Root Partition’s I/O Stack. I/O completion requires a trip back to vmwp.exe.

In the comments to that article, Tony also confirms that Hyper-V VMs will be required to have the OS installed on a virtual IDE disk.

This latest revelation comes hard on the heels of a less-than-favorable review of Hyper-V by InfoWorld, where Hyper-V—which is supposed to compete with ESX Server—is instead compared with VMware’s hosted product VMware Server.  This is in addition to numerous delays and feature cuts.  I guess Hyper-V’s start is going to be rockier than I thought.

UPDATE:  Thanks to some information from Ben Armstrong in the comments, it appears that the performance impact from the use of virtual IDE disks may not be as significant as suspected.  Check out Ben’s comment below and read the blog posting referenced in his comment for more information.

Category: Virtualization | 3 Comments »

VMware HA Failover Capacity Changes

December 31st, 2007 by slowe

Continuing the discussion regarding VMware HA failover capacity started in this article and continued in this follow-up article, it appears that VMware has added the ability to modify the “slot size” used in calculating VMware HA failover capacity as part of ESX Server 3.5 and VirtualCenter 2.5.

Duncan Epping of Yellow Bricks alerted me to this VMware KB article in this posting on his site.  The PDF linked from the KB article discusses a new option for setting the default “slot size”.  To quote from Duncan’s site:

If no VM reservations are set in a cluster VMware HA assumes cluster-wide average CPU and memory reservation sizes of 256 Mhz and 256 MB to use in admission control calculations. Alternative values can be specified instead…
 
Add the das.vmMemoryMinMB = <value> and das.vmCpuMinMHz = <value> option/value pairs to the cluster’s settings where <value> represents the desired values in terms of MB and MHz. Higher values will reserve more space for failovers.

So this looks like it’s allowing us to specify how VMware HA should calculate the default slot size, but as in so many areas of VMware HA there is precious little documentation.  I like VMware HA; I really do.  But VMware needs to get somebody on the ball to document the exact configuration and operation of VMware HA so that this solution becomes less of the “black box” that it is today.  As it stands currently, many customers are forgoing the benefits of VMware HA because it can’t be reliably and consistently configured and debugged.

<aside>Cases in point: I was at a meeting before Christmas with a customer who is having problems with their VMware HA clusters and we can’t find anyone—inside or outside of VMware—that can speak definitively about VMware HA, how it should be configured, or how it operates.  Back in October, I blogged about problems with isolation response, and still haven’t gotten those problems resolved.  C’mon, VMware, don’t drop the ball!</aside>

If anyone can shed some light on these new settings—I plan on testing them in the lab as soon as possible—that would be very useful.  In the meantime, I encourage everyone to check out the PDF linked in the VMware KB article on VMware HA best practices.  And, just for fun, check out this white paper on the VMware HA VM failure monitoring functionality that’s new in ESX Server 3.5.
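
In the meantime, here is a rough Python sketch of how I picture the two settings quoted above feeding into a slot calculation.  Only the option names (das.vmMemoryMinMB and das.vmCpuMinMHz) and the 256 MHz / 256 MB defaults come from the quoted text.  Treating those values as a floor on the slot size, and counting slots per host as the smaller of the CPU and memory quotients, are my own assumptions, as are the host capacity figures.  This is not VMware's documented algorithm.

```python
# Defaults quoted from the VMware document: 256 MHz and 256 MB.
DEFAULT_CPU_MHZ = 256
DEFAULT_MEM_MB = 256


def slot_size(reservations, options=None):
    """Return an assumed (cpu_mhz, mem_mb) slot size.

    reservations: list of (cpu_mhz, mem_mb) tuples, one per VM (0 = none set).
    options: dict of cluster advanced options, e.g. {"das.vmMemoryMinMB": 1024}.
    ASSUMPTION: the das.* values act as a floor on the slot size.
    """
    options = options or {}
    min_cpu = options.get("das.vmCpuMinMHz", DEFAULT_CPU_MHZ)
    min_mem = options.get("das.vmMemoryMinMB", DEFAULT_MEM_MB)
    cpu = max([min_cpu] + [c for c, _ in reservations])
    mem = max([min_mem] + [m for _, m in reservations])
    return cpu, mem


def slots_per_host(host_cpu_mhz, host_mem_mb, size):
    """ASSUMPTION: a host offers as many slots as its scarcer resource allows."""
    cpu, mem = size
    return min(host_cpu_mhz // cpu, host_mem_mb // mem)


# Hypothetical host: 8000 MHz of CPU and 16384 MB of RAM, no VM reservations.
no_reservations = [(0, 0), (0, 0)]
default_size = slot_size(no_reservations)
bigger_size = slot_size(
    no_reservations,
    {"das.vmCpuMinMHz": 512, "das.vmMemoryMinMB": 1024},
)

print(default_size, slots_per_host(8000, 16384, default_size))
print(bigger_size, slots_per_host(8000, 16384, bigger_size))
```

If the sketch is anywhere near right, raising the values shrinks the number of slots per host, which would match the quoted statement that higher values reserve more space for failovers.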

Category: Virtualization | 6 Comments »

Hyper-V Off to a Rocky Start

December 27th, 2007 by slowe

While VMware was busy launching VMware Infrastructure 3 version 3.5, Microsoft was giving the virtualization industry an early Christmas present in the form of the first Hyper-V (formerly “Viridian” and “Windows Server Virtualization”) beta.  This beta was a bit of a surprise in that it was delivered slightly ahead of schedule; that in and of itself is a surprise given Microsoft’s track record with delivering products on time.  Can anyone say “WinFS” or “Vista”?

I applaud Jeff Woolsey and the Windows Server Virtualization team (their blog is here) for their efforts in delivering Hyper-V ahead of schedule.  Well, they’re ahead of schedule with regard to the last proposed release date, anyway.  Unfortunately, it appears that Hyper-V is off to a rocky start.

InfoWorld reviewed the Hyper-V beta back on December 19.  Here’s a brief excerpt from the article:

I wasn’t disappointed — everything worked right out of the box. From there, I had the system ready to handle virtual machines in a matter of minutes. A few minutes later, I ran into problems.

I encourage you to read the full article.  While the reviewer, Paul Venezia, has both good and bad things to say about the Hyper-V beta, he does reiterate that it is definitely a beta product.  This is important to remember when discussing Hyper-V—while we can discuss the product’s rough edges, we also need to remember that the product isn’t finished yet.

The most telling comment about the Hyper-V beta from the InfoWorld article is this one:

From what I’ve seen, Microsoft’s Hyper-V is roughly analogous to VMware Server 1.0, although not as polished. It doesn’t appear to be a significant challenge to VMware’s Virtual Infrastructure and ESX Server products, and given the fact that VMware Server is free, runs on Linux and Windows, and is considerably more mature, it’s questionable how many infrastructures will benefit from using Hyper-V over VMware Server. Hyper-V is certainly behind the curve, but shows that Microsoft sees the need to be competitive in this space. Only time will tell whether Microsoft can catch up to the virtualization leaders, or be forced to settle for a secondary role.

This quote got picked up by John Troyer over at the VMTN Blog, who also reiterates Hyper-V’s beta status.  I’m glad to see that John, at least, isn’t bashing Hyper-V.  That’s good, because if past performance is any indication, Microsoft starts out slow but ramps up quickly.  VMware will want to stay vigilant to keep ahead of the 800-pound gorilla.  Bashing the competition isn’t the best way to stay ahead of the competition.

If any readers have direct experience with the Hyper-V beta, please post your knowledge and thoughts in the comments below.  Thanks!

Category: Virtualization | 8 Comments »

Latest VDI Article Published

December 20th, 2007 by slowe

SearchVMware.com has published my latest VDI article, a closer look at the integration of the connection broker with VirtualCenter:

In my last article on virtual desktop infrastructure (VDI), I discussed the three main components of a VDI solution: the virtualization servers and supporting infrastructure, the hosted operating system (OS) instances, and the connection broker. In this article, I’d like to take a more detailed look at the connection broker and some of the functionality that brokers provide in a VDI deployment.

A few more VDI-centric articles are in the works, so stay tuned here and to SearchVMware.com.  Thanks for reading!

Category: Virtualization | No Comments »

A Short Break

December 18th, 2007 by slowe

There will be a short break in blogging here while I enjoy some vacation time out of town with my family.

I appreciate all the readers who visit the site (or read the feed) on a regular basis.  I’d like to take this time to thank each and every one of you for reading.  According to FeedBurner, we’re now up to about 800 readers subscribed to the feed!  Please continue to share the site with others who may also find it useful.

I’ll be back with more content following the Christmas holiday.

Until then, I hope that God’s blessings are upon everyone this coming Christmas season.  Merry Christmas!

Category: General | 2 Comments »

OK, I Had to Comment on This

December 11th, 2007 by slowe

Naturally, an article titled “Why would you not choose Microsoft Virtual Server…?” is going to grab my attention.  Microsoft Virtual Server is a decent enough product in the hosted virtualization space, but certainly doesn’t compete with VMware ESX Server or other solutions based on a Type 1 bare metal hypervisor.

Apparently, this image sums up the argument against VMware Infrastructure 3 and for Microsoft Virtual Server.  Go look at the image, then come back and finish reading this article.

OK, done now?  Good.  Let’s take a closer look at the information presented in this article:

  • On the line listed “Migration,” there’s a checkmark in the column for the Microsoft solution.  Since Microsoft’s solution—today in the form of Virtual Server, late next year in the form of Hyper-V—lacks any form of live migration (and you can’t count “Quick Migration”), this table must be referring to cold migrations.  That being the case, then you don’t need the extra charges listed in the first column.
  • If that is not the case and we are referring to live migration functionality, then the checkmark needs to be removed from Microsoft’s column.  And, I would ask you this question: how much is it worth to be able to move a workload from one physical host to another in the middle of the workday with no interruption in service?
  • Microsoft’s solution lacks any form of multi-chassis dynamic resource management, so that line must be referring to in-chassis resource management.  All versions of VMware products, from VMware Server to VMware Workstation to VMware Fusion to VMware Infrastructure 3, have resource management abilities.  On the other hand, if we are referring to the ability to dynamically distribute workloads across multiple physical hosts, then the checkmark in Microsoft’s column must be removed.  Microsoft’s solution doesn’t have that ability.  How do you quantify the ability to have all your workloads distributed evenly across the physical hosts—dynamically?
  • What about the things that aren’t listed on this table, like transparent page sharing?  Or some of the features in VI3 3.5 like Distributed Power Management (DPM)?  How do you quantify the value of these types of features?

Don’t get me wrong, Microsoft Virtual Server is a decent solution at a great price (free, last time I checked).  Of course, VMware Server is also a free hosted virtualization product.  But as I’ve said a million times already, you can’t compare Virtual Server and ESX Server.  They are two different classes of products, with different feature sets and aimed at different markets.  When Hyper-V finally makes its debut late next year, then we can really discuss the merits of Microsoft’s hypervisor-based solution against the other hypervisor-based solutions, including VMware, that are available on the market.

Until then, articles such as this one are, in my humble opinion, useless.

Category: Microsoft, Virtualization | 10 Comments »

More Discussion on VMware HA Failover Capacity

December 7th, 2007 by slowe

A few other bloggers have picked up VMwarewolf’s article about calculating VMware HA failover capacity, which I wrote about a few days ago.

Thomas Bishop over at scalethemind.com (a fellow Planet V12n blogger) has this to say in his blog posting:

As I would expect, HA errors on the side of caution when determining the capacity (uses the host with the least amount of RAM and the guest with the most amount of RAM as the basis of the calculation).

I can certainly see his point; after all, VMware HA is all about planning for unexpected downtime.  How is VMware HA going to know which VM is going to fail, and which hosts (if any) will have capacity to run the failed VM(s)?  From that perspective, VMware HA almost has to take a worst-case approach in order to be prepared for a situation in which the VM with the largest amount of configured or reserved memory must be restarted on the host with the least amount of physical RAM.

Unfortunately, the white paper to which Thomas linked (found here on VMware’s web site) doesn’t do a very good job of providing any additional detail on the calculation of VMware HA failover capacity; in fact, it seems to contradict VMwarewolf’s statements to a certain extent.  For example, take this statement from the tech doc:

When computing required failover capacity, HA first considers the host with the largest capacity to run virtual machines with the highest resource requirements.

Unless I’m reading that statement incorrectly, that flies directly in the face of VMwarewolf’s posting, which states just the opposite.  However, the document goes on to say:

HA might therefore be quite conservative in its estimates if the hosts in your cluster have a wide variance in the individual resources they provide.

In addition, the tech doc recommends the use of more uniform systems in HA clusters, so as to avoid issues such as what we’ve been discussing (where a 32GB host might be treated as a 16GB host for the purposes of calculating VMware HA failover “slots”).  Otherwise, organizations may find themselves in this boat and VMware HA won’t be able to accurately protect them against physical host failure.

I’ll be sure to post more information here as soon as I have anything new to share.  Likewise, if anyone can provide some definitive information to corroborate VMwarewolf’s statements (just to validate them and assure us that we aren’t creating a storm of discussion over nothing), that would be great.

Category: Virtualization | 4 Comments »

New VLAN Article at SearchVMware.com

December 7th, 2007 by slowe

SearchVMware.com has published another article of mine, this one on the various VLAN configurations within VMware Infrastructure 3 (VI3), the differences between each of them, and when each configuration may be appropriate.

Here’s the obligatory teaser excerpt from the article:

When VMware gurus talk about the use of virtual LANs (VLANs) with VMware Infrastructure 3 (VI3), they are usually referring to the use of VLAN trunks. There are, however, three other types of VLAN configurations VI3 uses: virtual switch tagging (VST), external switch tagging (EST) and virtual guest tagging (VGT).
 
This tip is your guide to VST, EST and VGT, covering what they are and when to use them.

Read the full article here.

Between this latest VLAN article, an earlier VLAN article published on SearchVMware.com, a VLAN article published here on my site, and the latest discussion of the use of the native VLAN, I’m trying to make sure everyone has the information they need to understand and use VLANs in their VI3 implementation.  If there are other networking-related articles you’d like to see, please shoot me an e-mail and let me know, or post your ideas/suggestions in the comments below.  Thanks!

Category: Networking, Virtualization | 1 Comment »

Managing LUN Space Requirements with NetApp Storage

December 5th, 2007 by slowe

If you’ve worked with Network Appliance storage before, you’re probably already familiar with the idea of snap reserve (storage space set aside to accommodate Snapshots) and fractional reserve (used with LUNs).  I’m going to hold the in-depth discussion of why you need snap reserve and fractional reserve for a different day, but I did want to pass on these commands that were shared with me by a colleague of mine.  These Data ONTAP commands, available with Data ONTAP 7.2 or later (some commands are available in Data ONTAP 7.1), will help you manage the space requirements for LUNs on a NetApp storage area network (SAN).

I’ll try to explain the commands along the way, but I would recommend you review the documentation available from the NOW site for more complete information.

vol options <volname> fractional_reserve 0

This command sets the fractional reserve to zero percent, down from the default of 100 percent.  Note that fractional reserve only applies to LUNs, not to NAS storage presented via CIFS or NFS.

snap autodelete <volname> trigger snap_reserve

This sets the trigger at which Data ONTAP will begin deleting Snapshots.  In this case, Snapshots will start getting deleted when the snap reserve for the volume gets nearly full.  The current size of the snap reserve can be viewed for a particular volume with the “snap reserve <volname>” command.

snap autodelete <volname> defer_delete none

This command instructs Data ONTAP not to exhibit any preference in the types of Snapshots that are deleted.  Other options for this command include “user_created” (delete user-created Snapshot copies last) or “prefix” (delete Snapshot copies with a specified prefix string last).

snap autodelete <volname> target_free_space 10

With this setting in place, Snapshots will be deleted until there is 10% free space in the volume.

snap autodelete <volname> on

Now that the Snapshot autodelete options have been configured, this command will actually turn the functionality on.

vol options <volname> try_first snap_delete

When a FlexVol runs into an issue with space, this option tells Data ONTAP to first try to delete Snapshots in order to free up space.  This command works in conjunction with the next command:

vol autosize <volname> on

This enables Data ONTAP to automatically grow the size of a FlexVol if the need arises.  This command works hand-in-hand with the previous command; Data ONTAP will first try to delete Snapshots to free up space, then grow the FlexVol according to the autosize configuration options.  Between these two options—Snapshot autodelete and volume autogrow—you can reduce the fractional reserve from the default of 100 and still make sure that you don’t run into problems taking Snapshots of your LUNs.

If you have a NOW login, you can get more information on Snapshot autodelete here; more information on volume autogrow is available here.  Be aware that SnapDrive may require different settings in order to accommodate its functionality, as it moves LUN management out of the storage system and onto the host.  Finally, the values presented here are only examples; be sure to use values that are appropriate for your environment.
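
To make it easier to apply the same settings to more than one volume, here is a small Python sketch that simply prints the commands listed above for a given volume name.  It doesn't talk to a storage system at all; it only formats strings.  The function name, the parameter names, and the volume name “vol_luns01” are made up for illustration, and the default values are just the example values from this post.

```python
def lun_space_commands(volname, fractional_reserve=0, target_free_space=10,
                       trigger="snap_reserve", defer_delete="none"):
    """Return the Data ONTAP commands from this post for one volume.

    The defaults mirror the example values used above; adjust them for
    your own environment before pasting anything into a console.
    """
    return [
        f"vol options {volname} fractional_reserve {fractional_reserve}",
        f"snap autodelete {volname} trigger {trigger}",
        f"snap autodelete {volname} defer_delete {defer_delete}",
        f"snap autodelete {volname} target_free_space {target_free_space}",
        f"snap autodelete {volname} on",
        f"vol options {volname} try_first snap_delete",
        f"vol autosize {volname} on",
    ]


# "vol_luns01" is a hypothetical volume name.
for command in lun_space_commands("vol_luns01"):
    print(command)
```

The order matches the order in which the commands are presented above, so the autodelete options are set before Snapshot autodelete is turned on.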

Credit for compiling this list goes to my colleague Chauncey Willard.  Good work!

Category: Storage | 5 Comments »

Calculating VMware HA Failover Capacity

December 4th, 2007 by slowe

Most readers probably know that VMware High Availability (or VMware HA) is the feature of VMware Infrastructure 3 that allows for virtual machines (VMs) to be rebooted on another available host in the event of an unexpected host failure.  In these types of scenarios, a physical host goes down unexpectedly, typically due to hardware failure, and with it go a bunch of VMs.  With VMware HA, these downed VMs will reboot on a different physical server in the HA cluster, thus minimizing downtime.

I had always assumed that the “failover capacity,” i.e., the number of VMs that could be supported in an HA cluster with a failed host, was calculated by VMware HA in an intelligent fashion similar to that used by VMware Distributed Resource Scheduling (VMware DRS).  In other words, VMware HA would look at the needs of the downed VM, consider what is available across the various hosts, and then place virtual workloads accordingly.  Sadly, that is not the case.

This article, titled HA Failover Capacity, by a VMware technical support engineer—“VMwarewolf”—provides more detailed information on how failover capacity is actually calculated.  What actually happens is that VMware HA calculates a number of “slots” based on the least amount of RAM installed in a server in the cluster divided by the most amount of RAM configured for any VM in the cluster.  In the article, the example is given of a server that has 16GB with at least one VM that is configured for 2GB of memory.  That would create 8 slots (16GB / 2GB = 8 slots) for VMware HA.

That in and of itself is bad enough, since not all VMs will require 2GB, but here’s where it gets worse.  After calculating the number of “slots” available on the smallest server in the cluster, VMware HA extrapolates the total number of slots in the cluster using the number from that smallest server.  So if one server in the HA cluster has 16GB but the remaining three have 64GB, all four servers will be treated as having only 16GB for the purposes of calculating HA “slots”.  So, instead of the three bigger servers coming up with 32 slots each, they’ll show up as having only 8 slots each.  Ouch!
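
To put numbers on that, here is a quick Python sketch of the arithmetic as I understand it from VMwarewolf’s description.  It is only my reading of the article, not VMware’s actual code; the function names and the list of VM memory sizes are made up for illustration.

```python
def ha_slots(host_ram_gb, vm_ram_gb):
    """Slots as described above: the smallest host divided by the largest VM,
    with that per-host number then applied to every host in the cluster."""
    per_host = min(host_ram_gb) // max(vm_ram_gb)
    return per_host, per_host * len(host_ram_gb)


def per_host_slots(host_ram_gb, vm_ram_gb):
    """What I had expected instead: each host sized on its own RAM."""
    largest_vm = max(vm_ram_gb)
    return sum(ram // largest_vm for ram in host_ram_gb)


hosts = [16, 64, 64, 64]   # one 16GB server and three 64GB servers
vms = [1, 2, 1, 2]         # hypothetical VM memory sizes; the largest is 2GB

per_host, cluster_total = ha_slots(hosts, vms)
print(per_host, cluster_total)      # 8 slots per host, 32 for the cluster
print(per_host_slots(hosts, vms))   # 8 + 32 + 32 + 32 = 104
```

By this reading, the mixed cluster is credited with 32 slots instead of the 104 you would get if each host were counted on its own memory.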

Be sure to keep this in mind when creating VMware HA clusters and planning for fault tolerance.

Also, if you aren’t reading VMwarewolf’s stuff, you may want to start.  He (or perhaps she?) is posting some good stuff.

Category: Virtualization | No Comments »