blog.scottlowe.org

The weblog of an IT pro specializing in virtualization, storage, and servers

Archive for September, 2007

Nifty NFS-VMware Trick

September 27th, 2007 by slowe

I can take absolutely zero credit for this idea; it came completely from this article by Nick Triantos.  But the trick is so absolutely cool, so incredibly useful, and yet so obvious (once you read it, you’ll smack yourself in the head and say, “Why didn’t I think of that?”) that I just had to say something about it.

The use of NFS is getting more and more attention (I blogged about it briefly a few days ago) as a primary storage technology for VMware deployments.  Although NFS lacks the raw throughput of Fibre Channel, once you start loading up VMs in a datastore NFS begins to look more and more attractive.  But performance is only part of the allure here, especially when using something like a Network Appliance storage system with its Snapshot functionality.  (Yes, other vendors can do the same kinds of things.  Substitute your favorite vendor or filesystem here, if you so desire.  I would imagine you could do something similar with ZFS.)

The basic gist of the article (I do encourage you to go read it; I’ve already added it to my del.icio.us bookmarks) is to use NetApp Snapshots to gain access to VMware’s VMDK files (even while the VM is running), and Linux with the Linux-NTFS driver to mount virtual machine disk files over NFS for file-level backups of both Windows and Linux guest VMs.  Now that’s something not even VCB can do (VCB file-level backups are limited to Windows guests).  Pretty cool, if you ask me.

Category: Virtualization, Storage | 11 Comments »

Apple and VMware…or Xen?

September 27th, 2007 by slowe

For a company that wants their virtualization technology to be ubiquitous, it would seem to make sense that VMware needs it to run on every major host operating system.  Right now, VMware has Windows and Linux covered.  But what about OS X?  There is a tremendous amount of attention being paid to OS X right now, from many different sides, and Apple seems to be pushing OS X in a number of different directions (such as using OS X as the basis for the iPhone, and rumors of future OS X-based iPods circulating).  In my mind, it seems to make a lot of sense that both Apple and VMware could benefit from a closer relationship.

Think about it:  Extending VMware ACE to include Mac OS X would now mean that VMware could have secure VMs running on pretty much any significant x86-based operating system from any significant manufacturer.  The endpoint becomes irrelevant.  Have a contractor that runs OS X?  Not a problem, we can extend a secured, policy-controlled VM to his/her Mac laptop without any issues.  Pocket ACE in action with all three major x86 host operating systems covered means that you can truly take your computing environment anywhere.  It’s a powerful thought.

Similarly, bringing VMware Player to Mac OS X gives Mac users out there exposure to the same wide range of virtual appliances that Windows and Linux users can currently access.

Not to be left out, I’m sure there are Xserve users out there that would love to have VMware’s mature hosted virtualization technology running on their Mac OS X Server-based systems in the form of an OS X version of VMware Server.  Anyone care to run OS X Server, Windows Server, and Linux all on a single piece of hardware in your datacenter?

“Wait a minute, Scott,” you say. “Apple won’t let us virtualize Mac OS X.”

Who said anything about virtual OS X?  You’re right, of course; Apple has yet to budge on that front.  However, that thought does lead me to my next thought:  what will Apple do if VMware (or Parallels) doesn’t provide the virtualization technology for their platform?

Apple has a history of integrating open source projects into Mac OS X; consider the FreeBSD-based underpinnings, the Apache web server, the Postfix mail server, and so forth.  What’s to stop Apple from integrating the Xen hypervisor?

Sun is integrating Xen as xVM; Microsoft’s Windows Hypervisor (which is now available for public preview, by the way—I plan to have a look at it very soon) bears many architectural similarities to Xen, and of course Citrix will be using Xen in some significant way now that it’s purchased XenSource for $500M.  Why not Apple?  Why not integrate Xen into the Apple code base?  Apple can integrate Xen into their code base, release the open source bits as part of Darwin, and create their own virtualization solution.  Apple controls the hardware base, after all, so it wouldn’t be all that terribly difficult to write Xen-optimized drivers for alternate operating systems running under Mac OS X.  I would imagine it would also be much easier to control the virtualization of Mac OS X if it were occurring on a version of OS X with Xen integrated.

So am I just crazy?  Tell me what you think.

Category: Macintosh, Virtualization | 6 Comments »

NFS for VMware Storage

September 21st, 2007 by slowe

I thought that I had blogged here before about using NFS for VMware storage, but it appears that I have not.  (I guess that’s one of the downfalls of a fairly long-running weblog—you blog about some things too often and not at all about other topics.)  In any case, following some of the VMworld breakout sessions last week, NFS is getting a lot more attention these days as the storage protocol for VMware.

A couple of recent blog entries on this topic caught my attention:

Eisler’s NFS Blog - VMware over NFS?
Storage - VMware over NFS

Network Appliance seems to be talking the most about NFS for VMware, which kind of makes sense given their history in NFS.  I’m using NFS in our lab (which uses NetApp storage systems) and have had nothing but positive experiences thus far.  I have not yet had the opportunity to conduct any performance tests, but I do plan to try to work up some numbers on NFS vs. software-based iSCSI.  I can’t, unfortunately, compare to Fibre Channel as I have no FC infrastructure in the lab (yet).

I’d love to hear feedback from any readers that might be using NFS for VMware storage.  What have your experiences been?

Category: Virtualization, Storage | 10 Comments »

Feature….or Flaw?

September 20th, 2007 by slowe

In the end, I’ll leave it to the readers to decide whether this functionality I uncovered today (which is not entirely secret, per se, just not widely discussed) is “feature” or “flaw”.  I’m kind of split on the issue myself.

Here’s the story.  I’m working on a VDI (virtual desktop infrastructure) project for a customer, and this application the customer uses to help manage their desktops (both physical and virtual) kept popping up a security warning dialog box every time a user logged in.  We needed a way to suppress the warning so that the dialog didn’t keep coming up over and over as users roamed within the pool of available desktops.  What’s stranger is that it only seemed to happen on some desktops, but not others, even when the systems were absolutely identical.  We were baffled.

Finally, this minor comment in Microsoft KB889815 got me to thinking:

This behavior is new in Windows XP SP2 because of the addition of the Attachment Execution Services (AES). Every program that is run by using the ShellExecute() API passes through AES. AES considers the downloaded update file to be from the Internet Zone. Therefore, AES displays the Open File - Security Warning dialog box. AES examines the file to see whether the file has a file stream of the type Zone.Identifier. Then AES determines what zone the file is from and what level of protection to apply when the file is run.

Ah…file streams!  I’d learned long ago (when reading Inside the Windows NT File System, by Helen Custer) about alternate data streams.  The idea is that a single file in NTFS can store different streams of data.  At the time, I advocated the use of data streams for use by Office, such as for storing different versions of the file.  Unfortunately, Microsoft never seemed to capitalize on the idea…until recently.

The situation here is that when you download a file from the Internet using Internet Explorer, IE automatically creates an alternate data stream named “Zone.Identifier”.  The contents of this alternate data stream are this:

[ZoneTransfer]
ZoneID=3

I’m not sure when this behavior started (with which version of Internet Explorer and/or which version of Windows), but I do know—according to the KB article above—that as of Windows XP Service Pack 2, the presence of this alternate data stream helps Windows determine what kind of security policy to apply to the file.  What does this mean?  Let me walk through it with you:

  1. You download an executable file from the Internet.  You are familiar with the site from which you are downloading this file, but the site is in the Internet Zone (Zone 3).
  2. When you download the file, IE automatically attaches the alternate data stream and embeds the zone information in this file.
  3. You copy this executable to a location on your local hard drive and you run it.  You don’t expect to be prompted for anything, since the executable is running from your local hard drive.
  4. Windows checks the alternate data stream, sees the embedded IE security zone, and applies a security policy appropriately—in this case, popping up a security dialog box prompting the user to confirm the action.

Now, forever for the life of this file (at least as long as it remains on an NTFS file system), it will have the IE security zone embedded with the file and Windows will apply security policies based on that information.
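The Zone.Identifier contents shown above are plain INI-style text, so checking which zone a file has been tagged with is easy to script.  What follows is a minimal sketch in Python, not anything from the KB article: the function names are made up for illustration, the zone-name table reflects the usual Internet Explorer zone numbering, and reading the stream via the "filename:Zone.Identifier" path syntax only works on Windows with the file on NTFS.

```python
import configparser

# Usual Internet Explorer security zone numbering.
ZONE_NAMES = {
    0: "Local Machine",
    1: "Local Intranet",
    2: "Trusted Sites",
    3: "Internet",
    4: "Restricted Sites",
}


def parse_zone_identifier(text):
    """Return the integer ZoneID from Zone.Identifier text, or None if absent."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
    except configparser.Error:
        return None  # not INI-style text at all
    if not parser.has_option("ZoneTransfer", "ZoneID"):
        return None
    return parser.getint("ZoneTransfer", "ZoneID")


def read_zone_identifier(path):
    """Read the alternate data stream of an NTFS file (Windows only)."""
    try:
        with open(path + ":Zone.Identifier", "r") as stream:
            return parse_zone_identifier(stream.read())
    except OSError:
        return None  # no such stream, or not on Windows/NTFS


# The stream contents quoted in this post.
sample = "[ZoneTransfer]\nZoneID=3\n"
zone = parse_zone_identifier(sample)
print(zone, ZONE_NAMES.get(zone, "Unknown"))
```

Running read_zone_identifier() against each copy of a distributed executable would be one way to see which copies still carry the tag and which have lost it.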

So, in this particular case, this company had downloaded this executable from the vendor’s site, then distributed it to hundreds of PCs throughout the organization.  Presumably, somewhere along the way some of them lost the alternate data stream (perhaps they were copied to a non-NTFS partition at some point; this organization did have Novell also), and those were the ones that weren’t popping up the security warning.  Those that did have the alternate data stream had the warning that the user had to acknowledge—despite the fact that the executable was running from the user’s local C: drive.

To fix it, we used the Streams utility from Sysinternals (now part of the Borg…er, Microsoft collective).  This tool allowed us to identify and remove the alternate data stream.

So here’s where this discussion turns to the title: is this a feature, or a flaw?  Clearly, Microsoft intended this as a security feature.  I could see tagging the file on the system where the file was downloaded, but having the alternate data stream follow the file throughout the network (for the duration of the file’s life, as long as it remains on NTFS) seems just crazy.  Not to mention that Microsoft provides zero ways to identify, track, or remove alternate data streams.  That’s right: there’s no way to find or track alternate data streams on a single PC, much less find or track alternate data streams across a bunch of PCs on a network.

So again I ask: feature, or flaw?  I’d love to hear your thoughts.

Oh, yes, I almost forgot—for more information on identifying alternate data streams or viewing the contents of alternate data streams, see this site.

UPDATE:  I forgot to mention that Firefox does not add the alternate data stream to files downloaded from the Internet.  This makes sense, of course, given that Firefox doesn’t use the idea of “security zones” and is designed to be a cross-platform browser.  Nevertheless, I thought it important to point this out.

Category: Microsoft | 8 Comments »

One Potential Issue in AD Integration Scenarios

September 17th, 2007 by slowe

Regular readers of this blog know that I like to work on integrating various systems into Active Directory.  I’ve written a couple of articles on the issue:

Linux-AD Integration, Version 4
Solaris 10-AD Integration, Version 3
Active Directory Integration Index

These articles have been pretty successful and from what I understand have helped a fair number of people integrate their non-Windows systems into Active Directory for simplified user management and authentication.  Occasionally, though, we run into the odd issue that isn’t quite so straightforward to resolve.

For example, I recently had a reader (let’s call him Johnny) who was having a difficult time getting the Linux-AD integration to work.  The “ldapsearch” and “kinit” commands worked fine, but “getent passwd” or “getent group” failed with no output.  The users in Active Directory did indeed have UNIX attributes added to their accounts.  There were no firewalls between the non-Windows systems and the Active Directory domain controllers, and there did not appear to be any connectivity issues whatsoever (this further underscored by the fact that “ldapsearch” successfully returned LDAP search results from AD, and “kinit” successfully obtained a Kerberos ticket from AD).  We were stumped.

Johnny and I traded e-mails back and forth a few times, until finally Johnny found his error and notified me about what had been happening.  As I read the description of the problem, I realized that this may be a problem that is affecting a lot of users, and may, in fact, have stumped some of you out there reading right now.  Here are the details.

The method that I suggest using for AD integration uses two parts:

  • First, we use Kerberos to obtain a Kerberos ticket from an Active Directory domain controller (also a Kerberos key distribution center, or KDC).  This handles the authentication side of things and prevents the password from crossing the wire at any point in time.
  • Next, we use LDAP to centrally store account information, such as UID number, GID number, home directory, login shell, etc.  This is the part that typically requires schema extensions (although there is a workaround for that) and using this technique ensures that we don’t have to manage accounts individually on each Linux server.

This approach doesn’t work without both pieces.  The Kerberos authentication takes care of the password, but without account information logins still fail.  So if Kerberos works but LDAP doesn’t, logins will fail.  If Kerberos doesn’t work but LDAP is fine, logins will fail.  So part of troubleshooting this configuration is isolating where the problem lies.  In this particular case, “kinit” worked fine—no error was returned and “klist” showed a valid Kerberos ticket.  So the problem had to be with LDAP.  But where?  The “ldapsearch” command worked fine.

The problem lay with the /etc/ldap.conf file.  See, the nss_ldap libraries (which are responsible for using LDAP, along with any other sources defined in /etc/nsswitch.conf, as the backend database for account information) are controlled by this file, but “ldapsearch” does not use it.  Specifically, the error was with the account that is used to bind (or connect) to Active Directory to perform the searches.
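The isolation logic described above can be written down as a small decision function.  This is only a sketch of the reasoning in this post, in Python; the function name and the three boolean inputs (whether “kinit”, “ldapsearch”, and “getent passwd” each succeeded) are hypothetical and would come from running those commands by hand.

```python
def isolate_failure(kinit_ok, ldapsearch_ok, getent_ok):
    """Suggest where to look first, given which checks succeeded."""
    if not kinit_ok:
        # No ticket means the authentication half is broken.
        return "Kerberos"
    if not ldapsearch_ok:
        # The directory itself is unreachable or the query is wrong.
        return "LDAP connectivity or directory data"
    if not getent_ok:
        # ldapsearch does not read /etc/ldap.conf, but nss_ldap does.
        return "nss_ldap configuration (/etc/ldap.conf, /etc/nsswitch.conf)"
    return "both halves working"


# Johnny's situation: kinit and ldapsearch fine, getent returns nothing.
print(isolate_failure(kinit_ok=True, ldapsearch_ok=True, getent_ok=False))
```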

There are two ways of specifying this account in /etc/ldap.conf.  You can use the full DN, which looks something like “cn=Scott Lowe,cn=Users,dc=example,dc=com” or “cn=John Smith,ou=Marketing,ou=Departments,dc=example,dc=com”.  Alternatively, you can use the user principal name (UPN), which looks something like an e-mail address, such as “slowe@example.com” or “john.smith@example.com”.  In this particular case, Johnny (our reader with the problem) was using the full DN, but he was using the wrong attribute in the DN.  Here’s the information he had:

First Name: John
Last Name: Smith
Full Name: John Smith
Display Name: John Smith
UPN: jsmith@example.com
SAM Account Name (downlevel logon name): jsmith
Object name: jsmith

Which of these do you suppose should be used in the DN?  Full name?  No.  Display name?  No.  It must be the object name, in this case “jsmith”.  You can double-check your object name (or CN) using ADSI Edit or a similar utility.  You could use Active Directory Users and Computers, but that’s typically the confusing part.  In any case, once Johnny fixed the syntax for the bind account then “getent passwd” and “getent group” worked like a champ.

How do we avoid this kind of issue?  Simple: just use the UPN instead of the full DN.  This syntax works just as well and avoids the potential problem of using the wrong name when building the full DN.
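To make the difference concrete, here is a minimal sketch in Python that builds the bind DN from the account details listed above.  The helper name and the dictionary keys are made up for illustration, the container path is borrowed from the earlier example DN, and the helper does not escape special characters in names, so treat it as a demonstration rather than a tool.

```python
def build_bind_dn(object_name, container_rdns, domain):
    """Join the object name (CN), its containers, and the domain into a DN."""
    parts = ["cn=" + object_name]
    parts += list(container_rdns)
    parts += ["dc=" + label for label in domain.split(".")]
    return ",".join(parts)


# Johnny's account details from this post (keys are hypothetical).
account = {
    "full_name": "John Smith",
    "object_name": "jsmith",
    "upn": "jsmith@example.com",
}
containers = ["ou=Marketing", "ou=Departments"]  # assumed location

# Wrong: built from the full name, which is not the object name here.
print(build_bind_dn(account["full_name"], containers, "example.com"))

# Right: built from the object name.
print(build_bind_dn(account["object_name"], containers, "example.com"))

# Simplest: skip the DN entirely and bind with the UPN.
print(account["upn"])
```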

Category: Interoperability | 7 Comments »

Finally Home from VMworld 2007

September 15th, 2007 by slowe

After hitting VMworld 2007 in San Francisco hot and heavy (and trying to liveblog as many sessions as possible, as well as the keynotes), my wife and I took yesterday to do some additional sightseeing beyond what we’d done on Sunday prior.

Armed with a one-day MUNI pass for each of us, we rode the buses and the cable cars all over San Francisco, hitting the major highlights along the way:

  • the Bay Bridge;
  • the Port of San Francisco building and the Ferry building tower;
  • Pier 39;
  • the Golden Gate Bridge;
  • Golden Gate Park (specifically, the Conservatory of Flowers); and
  • riding a cable car downtown.

We wrapped up the evening with a large pizza from Blondie’s Pizza, and then returned to the hotel (the San Francisco Marriott on 4th Street) to pack and get ready for the return trip home.

Today, we traveled home, stopping off in Chicago along the way.  And, for the first time in many, many trips…our return trip was not delayed, canceled, or otherwise troubled in any way!  Someone was praying for a safe trip back for us, and I’d like to thank whoever that was.

It was a great trip, and I have tons of photos from yesterday (I may upload them somewhere; I haven’t decided yet), but in the end I’m glad to be home.  Thanks to everyone who helped make this trip a great one!

Category: General | No Comments »

VMworld 2007 Top Support Issues Session

September 13th, 2007 by slowe

My last session of Day 3, and of the entire conference, was a super-session on the top support issues and how to resolve them.  For someone who wasn’t already familiar with some of the Service Console command-line utilities (such as esxcfg-vswitch, esxcfg-vmknic, esxcfg-vswif, etc.), this was a great session.  For someone already pretty comfortable with these tools, the session was less helpful.

The session centered around six or seven top support issues:

  1. Unable to connect to the Service Console
  2. NICs in a bond are not in the same broadcast domain
  3. Expanding a VMDK when there is an existing snapshot
  4. Corrupt snapshot (.VMSD) file
  5. Corrupt snapshot
  6. Adding extents to VMFS volumes
  7. Recovering a VMFS partition

Possible causes of problem #1 include deleting the vSwitch that houses the vswif.  To fix the problem, you can probably just switch out NICs (and use esxcfg-nics to unassign and reassign NICs to the appropriate vSwitch), adjust VLAN properties (using the esxcfg-vswitch command to modify the port group), or recreate the vswif interface (using the esxcfg-vswif command).  In some cases, it may be necessary to completely recreate the networking configuration.  The process for completely rebuilding the networking configuration looks like this:

  1. Use esxcfg-vswitch to delete all vSwitches
  2. Create a new vSwitch using esxcfg-vswitch
  3. Create a new port group for the Service Console (again, using esxcfg-vswitch)
  4. Link a physical NIC to the vSwitch
  5. Create a vswif interface (using esxcfg-vswif) and configure it with the correct IP address, subnet mask, and default gateway
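The rebuild steps above can be sketched as a short Python script that only prints the commands it would run, so they can be reviewed before anyone types them at the console.  Everything specific here is an assumption: the function name, the vSwitch, port group, and NIC names, the addresses, and the command flags, which are written from memory of ESX 3.x and should be checked against each tool's built-in help before use.  The default gateway is left out because I am not certain it is set through these tools.

```python
import shlex


def rebuild_service_console(existing_vswitches, vswitch, portgroup,
                            pnic, vswif, ip, netmask):
    """Return the command sequence for a full networking rebuild."""
    # Step 1: delete every existing vSwitch.
    commands = [["esxcfg-vswitch", "-d", name] for name in existing_vswitches]
    # Step 2: create a new vSwitch.
    commands.append(["esxcfg-vswitch", "-a", vswitch])
    # Step 3: add a port group for the Service Console.
    commands.append(["esxcfg-vswitch", "-A", portgroup, vswitch])
    # Step 4: link a physical NIC to the vSwitch.
    commands.append(["esxcfg-vswitch", "-L", pnic, vswitch])
    # Step 5: create the vswif interface with its IP address and netmask.
    commands.append(["esxcfg-vswif", "-a", vswif, "-p", portgroup,
                     "-i", ip, "-n", netmask])
    return commands


plan = rebuild_service_console(
    existing_vswitches=["vSwitch0"],   # hypothetical current state
    vswitch="vSwitch0",
    portgroup="Service Console",
    pnic="vmnic0",
    vswif="vswif0",
    ip="192.168.1.10",                 # example address only
    netmask="255.255.255.0",
)
for command in plan:
    print(" ".join(shlex.quote(arg) for arg in command))
```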

Many Service Console issues relate to changing the Service Console configuration.  It’s recommended to create a “backup” Service Console connection before modifying the primary connection, in case something doesn’t work as expected.

Problem #2 I wasn’t so clear on.  I know what a broadcast domain is, but in my experience the “network hints” that are displayed in VirtualCenter and by esxcfg-info have been less than accurate.  I guess the key take-away here is to make sure that NIC members in a bond are on the same VLAN or subnet, and that the network hints might help you determine if they are not.

Problem #3, expanding a .VMDK with an existing snapshot, is a bit more interesting.  In this case, it’s necessary to parse the delta disk (look for the line starting “RW”), take that value, and replace the value in the base disk’s matching “RW” line with the value taken from the delta disk.  It’s then necessary to commit the snapshot using vmware-cmd.
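The descriptor edit described above is simple text surgery, so here is a minimal Python sketch of it.  It follows the direction given in the session as I recorded it (copy the value from the delta disk's “RW” line into the base disk's “RW” line).  The function names and the sample descriptor contents are illustrative assumptions, and anything like this should only ever be tried on copies of the descriptor files.

```python
import re

# An extent line looks like: RW <size> <type> "<file>"
RW_LINE = re.compile(r'^(RW[ \t]+)(\d+)([ \t]+.*)$', re.MULTILINE)


def extent_size(descriptor):
    """Return the numeric value from the first RW line of a descriptor."""
    match = RW_LINE.search(descriptor)
    if match is None:
        raise ValueError("no RW extent line found")
    return int(match.group(2))


def copy_extent_size(delta_descriptor, base_descriptor):
    """Return the base descriptor with its RW value taken from the delta."""
    size = extent_size(delta_descriptor)
    extent_size(base_descriptor)  # raises if the base has no RW line
    return RW_LINE.sub(
        lambda m: m.group(1) + str(size) + m.group(3),
        base_descriptor,
        count=1,
    )


# Illustrative descriptor fragments, not taken from a real VM.
base = '# Extent description\nRW 8388608 VMFS "vm-flat.vmdk"\n'
delta = '# Extent description\nRW 16777216 VMFSSPARSE "vm-000001-delta.vmdk"\n'

print(copy_extent_size(delta, base))
```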

With problem #4, the .VMSD file that tracks snapshots and snapshot relationships has become corrupt.  Fortunately, it’s reasonably easy to correct:  delete the .VMSD, create another snapshot (which then recreates the .VMSD), and then commit all snapshots.  As part of this process, you lose the ability to selectively rollback to a specific snapshot, but there is no data loss.

Problem #5 is similar, but the actual snapshot—the delta disk—is now the one that is corrupt.  This typically occurs when the VMFS partition gets full (remember that a snapshot involves the use of a delta or “differencing” disk that stores all the changes to the base disk, and this can fill up a SAN LUN over time).  In this case, there will be data loss.  To fix the problem, we move the last delta file (or delete it) and edit the .VMX file to point back to the last intact snapshot.  All changes since that last intact snapshot will be lost.  It’s recommended that all snapshots should then be committed using vmware-cmd.

With regard to VMFS volumes and adding extents, it’s very important to perform a rescan (using esxcfg-rescan or VirtualCenter) on all ESX hosts after adding an extent.  This is because when an extent is added, it’s added on that host only, and if you then add an extent to that same VMFS partition from a different host, the VMFS partition will get corrupted.

Finally, the presenter went into some detail on recovering lost VMFS partitions.  If it’s only the partition information that’s missing, that can be recovered using fdisk.  If the partition was formatted….well, that’s a much different story.  Backups are the best response there.

We wrapped up the session with some support “best practices”:

  • Backups!
  • Run vm-support (or gather diagnostics logs via VirtualCenter VI Client) periodically
  • Implement change control in your environment
  • Record the date and time when something happens, as this will help when trying to correlate log data

And with that, the session ended, and so did VMworld 2007.  I was both thankful it was over, and yet disappointed that it had to end so soon.

Category: Virtualization | 2 Comments »

VMworld 2007 IP-Based Storage Sessions

September 13th, 2007 by slowe

I just wrapped up two different sessions on IP-based storage, one on iSCSI configuration and one on performance characteristics and comparisons between iSCSI and NFS.

I couldn’t liveblog the first session because it was too crowded (no room to type on my laptop) and I couldn’t get a wireless signal from the VMworld 2007 network.  There are, however, a couple key points from the session that stick out in my mind:

  • ESX Server does not currently support MCS (multiple connections per session) or jumbo frames, two key optimizations that can really help with iSCSI performance.  There is no word yet on when those shortcomings will be addressed; personally, I’m hoping that VMware fixes them in ESX 3.5.
  • There is, apparently, some way of performing manual load balancing of iSCSI LUNs to help improve performance.  The speakers did not go into any great details, and I was unable to speak with one of the presenters, Jon Hall, after the session.  He did, however, invite me to contact him via e-mail, so I’ll post more information on that once I’ve had some communication with him.

Most of the rest of the information presented in that session was pretty straightforward and was information I’d already seen.  All in all, it was a decent session, but I didn’t get as much information from the session as I had hoped I would.

After lunch, I returned to the Moscone Center for a session titled “NFS and iSCSI - Performance Characterization and Best Practices.”  I was really hoping to get some additional best practices on using NFS and iSCSI and on maximizing performance with these IP-based storage solutions.

The session started with some performance characteristics with ESX Server today vs. ESX Server 3.0.1; basically, it was an update of some performance data presented last year at VMworld 2006.

These updated performance statistics are intended to show the results of optimizations that have been incorporated into ESX Server.  This includes optimizations like improved and more accurate CPU accounting (this improves load balancing across VMs), improved PAE support, minimized NUMA overhead, improved CPU cost per I/O, increased maximum transfer sizes, and the ability to handle more concurrent I/Os.

As a result, software iSCSI sees a range of improvements since ESX Server 3.0.1, as high as 15% for 8K block writes, with reductions in latency across the board and reductions in CPU utilization as well.  Read operations will show the greatest improvements.

Hardware iSCSI sees dramatic improvements in smaller block sizes, but the larger block sizes are essentially unchanged.  The same goes for latency, and the reductions in CPU utilization for hardware iSCSI share the same characteristics as for software iSCSI (but keep in mind that the absolute change, as opposed to the percentage change, will be greater for software iSCSI).  With hardware iSCSI, mixed read-write operations will benefit more than pure read operations.

Differences between VMFS and RDM (raw device mapping) are inconsequential (less than 2.5%); the only significant difference is CPU utilization, where VMFS requires more CPU time than RDM.

Comparisons of hardware iSCSI, software iSCSI, and NFS with regards to throughput show figures that are not entirely unexpected.  NFS is slightly slower than both flavors of iSCSI, and has greater latency than the iSCSI flavors.  However, all of the measured figures were in milliseconds, so it’s not terribly significant.

Moving into performance best practices, the presenter started with the storage array itself, and provided the typical list of items to consider:  total spindle count, number of spindles allocated for use, RAID level and stripe size, storage processor specifications, read/write cache sizes, and caching policies.  This is all pretty standard information that is applicable in sizing a correct storage solution, independent of a virtualization implementation.  (I would use those same counters to size a Microsoft Exchange storage solution or an Oracle storage solution, for example.)

Since we are talking IP-based storage, networking configuration comes into play here, including such things as the network topology, switches, NICs, and flavor of iSCSI (hardware/software).  Similarly, things about the ESX Server host like CPU speed and number of CPU cores, overall system architecture, bus speed, I/O subsystems, and memory configuration all play a part in determining performance of IP-based storage solutions.

Finally, it’s important to understand the characteristics of the workload(s), such as I/O sizes, read/write patterns, and dependence upon aggregate throughput or latency.

<aside>OK, can we move past this stuff now?  This is all basic stuff that isn’t necessarily specific in any way to virtualization.  I want to see best practices for using IP-based storage with VMware!</aside>

Using multiple NFS mount points may improve aggregate throughput, at the cost of slightly higher CPU utilization.  NFS export options can affect performance as well.  (OK, which options?  Telling us that without telling us which options is kind of like leaving us hanging.)

iSCSI digests may or may not have an impact on performance; iSCSI header digests have little or no impact; turning off iSCSI data digests can improve performance.

The presenter went over some additional troubleshooting tips, and the slide briefly mentioned the vsish command.  I hadn’t heard of that command; anyone know of where I can find some additional information on vsish?

Wrapping up the session, the presenter went through a few scenarios involving performance troubleshooting with both iSCSI and NFS.  Overall, I did not find this session to be nearly as helpful as I had hoped it would be.  The presenter was not engaging, and the presentation did not provide the kind of detailed information that I felt should have been included.  (Examples: mentioning that some NFS export options affect performance, but failing to mention which options, or stating that there is a VMware knowledge base article about a topic but failing to provide the KB article number or URL.)

Category: Virtualization, Storage | 6 Comments »

VMworld 2007 VCB Solutions Session

September 13th, 2007 by slowe

This session is titled “Best Practices for Architecting VMware Consolidated Backup Enabled Solutions,” and it’s being led by a couple of different presenters, one of whom is Dan Anderson, who also led the VCB session on Partner Day and leads the VCB labs (and led the VCB labs last year at VMworld 2006).  So far, it looks like this session won’t be a repeat of Monday’s Partner Day session, which is good.

The session started out with a review of VCB architecture, VCB components, and the interplay between the various portions of the VCB solution—the VCB proxy server, the pre-freeze and post-thaw scripts, file-level backups versus full VM backups, etc.  This stuff I already knew and had already been covered in detail in previous sessions.

The discussion then moved into a mention of the various items that affect the design of a VCB solution.  This would include things like the SAN architecture (Fibre Channel vs. iSCSI), software components (VCB version, VC version, ESX version, third-party backup software version), recovery mechanism, backup types, etc.  Dan mentioned again the command-line interface for VMware Converter that was mentioned on Monday; I really need to dig up that information so that I can explore that possibility in greater detail.

It was stated again today that there are bad candidates for VMware snapshots—high I/O, high transaction VMs are bad candidates for snapshots, and therefore are bad candidates for use with VCB.  (Recall that VCB uses snapshots to unlock the base VMDK for use by the VCB proxy, so a VM that is a bad candidate for snapshots is therefore a bad candidate for VCB-enabled backup solutions.)

I ended up leaving the session early because it turned out that a great deal more of the information that John and Dan were presenting was identical to the information that was presented on Monday at the partner session.  This is not a reflection on the presenters, or the session materials; I’d just already seen most, if not all, of the materials before.

Category: Virtualization | No Comments »

VMworld 2007 Day 3 Keynote Liveblog

September 13th, 2007 by slowe

Here we are again back in the Moscone Center for the keynote of Day 3 of VMworld 2007.  Unlike yesterday’s keynote from Cisco’s John Chambers, today’s keynote is expected to be much more exciting and more revealing.  (Make no mistake:  John Chambers is a very skilled speaker, but my personal opinion is that yesterday’s keynote was long on hype and short on reality.)  Taking the stage today is VMware co-founder and Chief Scientist, Mendel Rosenblum.  His keynote last year revealed VMware’s work on capturing the live execution stream, a feature that made its way into VMware Workstation and will, presumably, also find its way into the enterprise platform as well.  Expected this year is news about the next version of VMware Server.  We shall see.

The session started with some video clips from various attendees here at the conference this year.  My favorite clip was the tongue-in-cheek comment about snapshotting the conversation with the spouse, so you could roll back when you say something wrong.  Come on, we’ve all done it—how many times have we said something we wish we could take back?  (Also, I just have to toss my vote in:  Karthik Rau does not look like the guy from Heroes.)

Mendel started the keynote with a very high-level review of virtualization, talking about what virtualization is and how it works.  He then moved from talking about virtualization in abstract terms to talking about virtualization as it relates to VMware’s Virtual Infrastructure.  By adding this level of indirection (as it’s known in computer science), Virtual Infrastructure enables such things as consolidation (running multiple instances of a workload on a single piece of hardware), VMotion (live migration of workloads between physical hosts), and DRS (load balancing of workloads across groups of physical hosts).  It also makes it far easier to add capacity, and the introduction of ESX Server 3i is supposed to further streamline that process.

In embracing these technologies, problems have cropped up.  One of these problems is VMotion compatibility, and now hardware vendors are stepping up to help address this problem.  AMD has Extended Migration and Intel has FlexMigration, which will enable the VMware software to erase VMotion boundaries.  Mendel also officially introduced “Storage VMotion,” which is essentially the reverse of VMotion.  In Storage VMotion, the CPU execution remains on the same physical host but the virtual disks are being moved from one storage area to a different storage area.  Mendel and Kit then proceeded to provide a demo of Storage VMotion.  In the demo, they moved the disks for a running Oracle VM from one VMFS datastore to a different datastore.

Obviously, this functionality opens a number of new doors.  As Mendel mentions, this greatly streamlines migrations for organizations that lease storage, helps us dynamically adjust the SAN utilization, etc.  This is pretty powerful stuff.

The keynote next moved on to the idea of virtual appliances.  One of the key issues with virtual appliances, however, is their distribution, and Mendel discussed the idea of “streaming” virtual appliances down to clients, which they are referring to as “instant on”.  The idea here is that only those blocks that are absolutely necessary for the operation of the VM need to be downloaded before the VM can start.  Using a small executable and some prefetching technology, this allows VMware to have virtual appliances that can be utilized almost immediately, without having to wait for the entire appliance to download.  While this is very valuable for downloadable virtual appliances, one of the most exciting and potentially powerful use cases is the use of this technology in VDI scenarios.  Recall that I predicted, based on last year’s keynotes, that VDI would evolve to include check-out/check-in functionality, and it looks like this is the vehicle whereby this kind of functionality can be delivered.
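
To make the streaming idea a little more concrete, here is a minimal Python sketch of on-demand block fetching with a simple read-ahead.  To be clear, this is a toy model and not VMware’s implementation:  the StreamedDisk class, the prefetch parameter, and the in-memory list standing in for the download source are all my own assumptions for illustration.

```python
class StreamedDisk:
    """Toy model of a disk image whose blocks are fetched on first use.

    Hypothetical illustration only; not based on any VMware API.
    """

    def __init__(self, remote_blocks, prefetch=2):
        self.remote = remote_blocks   # stands in for the download source
        self.prefetch = prefetch      # how many following blocks to pull early
        self.local = {}               # blocks already downloaded
        self.fetch_count = 0          # number of remote fetches performed

    def _fetch(self, index):
        # Download a block only if it exists and is not already local.
        if index not in self.local and 0 <= index < len(self.remote):
            self.local[index] = self.remote[index]
            self.fetch_count += 1

    def read(self, index):
        if not 0 <= index < len(self.remote):
            raise IndexError(index)
        self._fetch(index)
        # Read-ahead: pull the next few blocks before they are requested.
        for ahead in range(index + 1, index + 1 + self.prefetch):
            self._fetch(ahead)
        return self.local[index]


remote = [f"block-{i}".encode() for i in range(100)]
disk = StreamedDisk(remote, prefetch=2)
first = disk.read(10)   # fetches blocks 10, 11, and 12; the other 97 stay remote
```

The point of the sketch is simply that the "VM" can read block 10 after three fetches instead of one hundred; a real implementation would presumably be far smarter about which blocks to prefetch.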

Mendel next moved on to the idea of high availability in the datacenter.  VMware HA allows us to recover VMs when a physical host fails.  In his keynote, he alluded to the idea of detecting failures within the guest (while the host is still running) and restarting the guest on another host.  That’s nice, but what is really powerful is what Mendel discussed next:  the idea of using record/replay as a high availability solution, known as “continuous availability.”  The idea here is that an execution stream for a VM is captured and redirected to a secondary VM, live and in real time.  This creates two VMs in lockstep with each other.  In the event of a hardware failure, the secondary VM instantaneously takes over.  (I suppose I don’t need to tell anyone that this could make a good replication technology as well.)  For the final demo this morning, they demonstrated the continuous availability functionality.

The demo involved a pair of virtualized Exchange servers with some clients running LoadSim to generate a load against the virtual Exchange servers.  The setup created “mirrored VMs”.  As a demonstration of the technology, Mendel pulled the power plug on one of the ESX servers, and the secondary VM and clients barely even noticed the failure.  Very nice technology!
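
For anyone wondering how two VMs can stay in lockstep, the record/replay idea described above can be sketched in a few lines of Python.  This is only a toy model under my own assumptions (the ToyVM class, the helper functions, and the use of random numbers as stand-ins for unpredictable inputs are all hypothetical); it says nothing about how VMware actually implements the feature.

```python
import random


class ToyVM:
    """Deterministic state machine: same events in, same state out."""

    def __init__(self):
        self.state = 0

    def apply(self, event):
        self.state = (self.state * 31 + event) % 1_000_003


def run_primary_step(vm, log, rng):
    """Primary takes one nondeterministic input, records it, then applies it."""
    event = rng.randrange(256)   # stands in for a packet, interrupt, or timer read
    log.append(event)            # record
    vm.apply(event)


def replay(vm, log, position):
    """Secondary applies every logged event it has not yet seen."""
    for event in log[position:]:
        vm.apply(event)
    return len(log)


rng = random.Random()            # unseeded on purpose: inputs are unpredictable
primary, secondary = ToyVM(), ToyVM()
log, seen = [], 0

for _ in range(1000):
    run_primary_step(primary, log, rng)
    seen = replay(secondary, log, seen)

# Simulated host failure: the primary disappears, but the secondary already
# holds identical state and can carry on from here.
in_lockstep = primary.state == secondary.state
```

The key assumption is that the VM is deterministic apart from its inputs, so logging only those inputs is enough to keep the secondary identical to the primary.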

What are the hard IT problems that VMware should address?  Mendel challenged attendees to think about the advantages that the virtualization layer offers, and to think about how the virtualization layer can be used to create “checkbox simple” solutions to these hard IT problems.

In addition to driving the hardware more efficiently, Mendel also wants to be able to utilize power more efficiently, and he alluded to future functionality that might lend itself to more effective power management.

In summary, Mendel believes that the benefits of virtualization are only starting to be realized, and that VMware has only scratched the surface of the valuable functionality that the virtualization layer offers.

No announcements on VMware Server, though, which was expected by more than a few people.

Category: Virtualization | 2 Comments »