blog.scottlowe.org

The weblog of an IT pro specializing in virtualization, storage, and servers

Archive for June, 2007

ESX iSCSI Basic Configuration from the CLI

June 29th, 2007 by slowe

Sometimes, I find it better/faster/easier to perform tasks from the command-line interface (CLI) than to go through a GUI.  So, the other day, when I needed to set up a new VMware ESX Server for iSCSI storage, I thought I’d document the commands I used.

Unfortunately, the commands weren’t able to do all the configuration I needed—I couldn’t find the commands to let me set iSCSI security (CHAP username and password), and I needed iSCSI security for the target to which I was connecting.  However, these commands should work just fine for a basic iSCSI configuration.

Here are the commands I used:

esxcfg-swiscsi -e
Enables the software iSCSI initiator.

esxcfg-firewall -e swISCSIClient
Configures the ESX Service Console firewall (iptables) to allow the software iSCSI traffic.

vmkiscsi-tool -D -a 192.168.100.50 vmhba40
Adds 192.168.100.50 (the iSCSI target’s IP address) as a discovery address for the vmhba40 adapter (the software iSCSI initiator).

esxcfg-rescan vmhba40
Rescans for storage devices on vmhba40.
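If you want to keep the sequence above in one place, here is a minimal sketch that just assembles the four commands in order.  The function names, the dry_run flag, and the use of Python 3 are my own assumptions for illustration (none of this is a VMware-provided API), so treat it as a dry-run aid rather than something to run on the Service Console itself.

```python
# Hypothetical helper: assembles the four commands from this post, in order.
# The function and parameter names are made up for illustration.
import subprocess


def build_iscsi_commands(target_ip, adapter="vmhba40"):
    """Return the basic software iSCSI setup commands as argv lists."""
    return [
        ["esxcfg-swiscsi", "-e"],                           # enable the software initiator
        ["esxcfg-firewall", "-e", "swISCSIClient"],         # open the Service Console firewall
        ["vmkiscsi-tool", "-D", "-a", target_ip, adapter],  # add the target's discovery address
        ["esxcfg-rescan", adapter],                         # rescan the adapter for storage
    ]


def run_iscsi_setup(target_ip, adapter="vmhba40", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in build_iscsi_commands(target_ip, adapter):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_iscsi_setup("192.168.100.50")
```

With dry_run left at its default, the script only prints the commands, which makes it easy to review them before typing them in by hand.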

I’m sure there are more commands available; for more information, you can refer to Mike Laverick’s excellent Guide to ESX 3 Service Console.  In addition, please note that the ESX Server software iSCSI initiator is simply the open source version of an old Cisco iSCSI initiator for Linux, so you can use that command reference as well (I believe this information is also applicable).  Just preface the commands from the CuddleTech article with “vmk” and it should work just as listed.

One oddity:  you may find that some of the command-line tools report a Cisco IQN, but after further configuration (and especially after configuration from VirtualCenter) the initiator switches to a VMware IQN.  This may wreak havoc with iSCSI targets on which LUN presentation is based on the initiator’s IQN, so plan accordingly.

Category: Virtualization, Storage | No Comments »

NFS Help

June 28th, 2007 by slowe

I like to think that I’m a fairly intelligent guy, able to pick up most things reasonably quickly given the opportunity.  After all, I transitioned from a Windows-only SE into an SE with a good reputation for VMware ESX Server, various Linux flavors, Mac OS X, and some Cisco configuration (hey, if you can do GRE tunnels with IPSec encryption, you’re not too shabby with IOS).  But I’m having a real problem with NFS.

I know, it seems silly, but I just can’t wrap my head around how it works.  In particular, the NetApp implementation of NFS and the /etc/exports file that Data ONTAP uses seem to be very different from the way you would configure NFS on Linux or Solaris.  Even when I go through the FilerView GUI to configure an NFS export, it doesn’t seem to work the way I expect.  To be fair, I’m sure this is just a lack of understanding on my part and not necessarily a flaw or drawback in the NetApp implementation.

Take this example.  I recently added another old F840 storage system to my lab at the office, and will begin setting up a demo SnapMirror environment to show to customers (SnapMirror with VMware on NetApp is going to be slick!).  I thought I’d start performing some NFS testing as well; I’m particularly interested in thin provisioning the VMDKs on a thin provisioned FlexVol via NFS.  So I create a new FlexVol and then proceed to configure a new NFS export.  After walking through the NFS export wizard in FilerView and specifying my MacBook Pro’s IP address as having both read/write access and root access, I mount the export and try to copy an ISO image file.  Denied!  Huh?  Checking the properties, I see that I only have read-only permissions.  What’s up with that?

I try several other variations, and all of them provide the same result.  How can that be?  If my host’s IP address is provided read/write access, why do I have read-only access?  Is one option overriding another?  How do the options interact with each other?  I’m sure these are silly/easy questions for those well-versed in NFS, but for whatever reason I’m having a hard time here.

If anyone could share some enlightening information, I’d certainly appreciate it.

Category: Networking, Storage | 22 Comments »

Statistically Secure

June 27th, 2007 by slowe

I’ll start out by saying that I am neither a security expert nor a statistician.  With that disclaimer in hand, I wanted to briefly share my thoughts on the “days of risk” assessment that has recently been used to compare the security of Windows, Linux (Red Hat and SuSE), Mac OS X, and Sun Solaris.  Before continuing, I encourage you to have a look at the report itself, along with a few related articles.

In summary, the Days-of-Risk (DoR) assessment showed that Microsoft patched vulnerabilities in Windows more quickly than Red Hat, Novell, Apple, or Sun patched vulnerabilities in their products.  This is true even when only High Severity issues are taken into consideration, although the gap between Microsoft and the other vendors narrowed in that analysis (with the exception of Sun).

OK, that’s all well and good, but we all know statistics can be made to show just about anything.  I’m not saying that Mr. Jones deliberately limited his data to present a favorable outcome for Microsoft; Microsoft has done a very admirable job of improving their security responsiveness, and in that regard the other vendors would do well to improve their own responsiveness to the disclosure of security vulnerabilities.  No, my thoughts are more centered on the question: Is this data the right data to accurately and objectively represent the security profile of an operating system?

I would contend that, in addition to DoR, information on the following areas would also need to be included in order to more accurately depict an operating system’s security profile:

  • Number and severity of exploits published or otherwise made available for vulnerabilities
  • Number of viruses, trojans, rootkits, or other malware readily available or in active circulation

Now, before you say something like “Well, of course Windows is going to have more viruses and more exploits because it has a larger installed base!”, let me add that these values should be correlated with, and weighted according to, the installed base of the operating system.  This allows the values to account for the fact that Windows is in use by a much larger base of users than Linux, Solaris, or Mac OS X.

Again, I’m not a statistician, but surely there’s a way to correlate this data (including DoR) and start presenting some sort of objective guide, based on measurable facts, regarding the security of an operating system.  Then the vendors (Microsoft, Apple, Novell, Sun, Red Hat, and others) can stand on equal ground and be able to make some sort of reasonable comparison regarding the security of each product.  Isn’t that what we really need anyway?

Category: Security | 2 Comments »

Optimizing iSCSI Traffic with ESX

June 26th, 2007 by slowe

A response in this VMTN forums thread by Paul Lalonde got me thinking about iSCSI traffic, network designs, and the software initiator provided with ESX Server.  The statement was this (in response to questions about how ESX uses network links to communicate with an iSCSI storage array):

In a single server environment, 802.3ad would only offer failover. A single ESX box would only ever use one network path for iSCSI traffic.

In my lab, I’ve set up a Network Appliance storage system with a virtual interface (a “VIF” in NetApp parlance), which is essentially 802.3ad link aggregation (in fact, newer versions of Data ONTAP can use LACP to build link aggregates).  On the ESX side, I’ve created Gigabit EtherChannels and configured the vSwitches to use IP hash load balancing, with the thought that this would help improve network utilization.  But after reading that statement (and following up on some other related threads; see these del.icio.us bookmarks), I started wondering if there was a better way to architect the network for iSCSI traffic from ESX Server.
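To see why a single host talking to a single target might stay on one link even with IP hash load balancing, here is a rough sketch.  The hash below (XOR the source and destination addresses, then take the remainder modulo the number of uplinks) is a simplified model I am assuming for illustration, not necessarily the exact algorithm ESX Server uses, and the addresses and function name are hypothetical.

```python
import ipaddress


def pick_uplink(src_ip, dst_ip, uplink_count):
    """Simplified IP-hash model (an assumption, not ESX's documented hash):
    XOR the two IPv4 addresses and take the result modulo the uplink count."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % uplink_count


# Hypothetical addresses: one VMkernel port, one iSCSI target, two uplinks.
# The same source/destination pair is hashed three times to show that the
# chosen uplink never changes.
for _ in range(3):
    print(pick_uplink("192.168.100.10", "192.168.100.50", 2))
```

Whatever the real hash looks like, any scheme that depends only on the source and destination addresses will keep sending a given pair down the same uplink; in this model, only a different destination address (say, a second target IP) could land on the other link.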

I have some ideas, and have already started working on implementing and testing those ideas in the lab.  As soon as I have more information, I’ll share it here.  In the meantime, any iSCSI gurus out there care to share their network designs for optimizing ESX-iSCSI traffic?

Category: Virtualization, Storage | 10 Comments »

Giving iTerm a Try

June 26th, 2007 by slowe

It invariably occurs that, regardless of whether I’m on a computer running Microsoft Windows Server 2003, some flavor of Linux or UNIX, or Mac OS X, I open a command prompt at some point or another during my session.  I can’t really explain why; it’s just how I work.  I find it faster to type “ipconfig /all” to get the networking configuration on a Windows box than right-clicking on My Network Places, selecting Properties, right-clicking the network connection, selecting Properties…you get the idea.  I suppose it dates from my MS-DOS days, where I didn’t really have any other choice but to use the command line.

And yet, even in this day of highly functional graphical user interfaces, I still find myself using the command line to get stuff done.  Since Mac OS X is my primary OS (regular readers know that I switched about four years ago and now carry a Core 2 Duo-based MacBook Pro), that meant that I spent a fair amount of time using Terminal.app, the built-in terminal program that ships with the Mac OS X operating system.  Within the last couple of days, though, I decided to give iTerm, an open source Mac OS X terminal program, a try to see if it would work as a Terminal.app replacement.

Why?  The easiest answer would be that I am a “new software junkie”; I’m constantly looking for applications that provide newer/better ways of doing things.  (Yet, interestingly enough, I am resistant to change once I have a routine that works:  I didn’t like VirtueDesktops when I first started using it, but then it became a natural part of my workflow; I had a hard time getting used to Quicksilver but now rely heavily on it; and after being forced to lose virtual desktops due to Tiger incompatibilities and instead use Exposé, I now find it hard to go back to virtual desktops.)  I came across a few references to iTerm and its AppleScript support and Quicksilver module and thought, “Why not?”  So I decided to give it a try.

After downloading and installing iTerm, the very first thing I did was turn off the blasted (er, brushed) metal interface.  (Have I ever mentioned I don’t like brushed metal interfaces?)  I then spent a few minutes customizing iTerm to look the way I wanted (which was basically like Terminal.app), and then started using it.  OK, it supports tabs, that’s kind of cool, but I very rarely use tabs (even tabbed Web browsing is very rare), so I’m not really sure that helps much.  Again, having grown accustomed to having lots of windows open (lots of browser windows, Terminal windows, Finder windows, whatever), I’m just not really into the idea of tabbed interfaces, and I generally turn them off.  But tabs are a big selling point for iTerm, and without a need for tabs, why bother to switch at all?

That was a good question, and I didn’t (and still don’t, to some extent) have an answer for it.  iTerm seems just as fast as Terminal, and both can be configured to work and look similar.  So why switch?  The only reason I can find so far is iTerm’s bookmarks, which seem to be similar in function to Terminal.app’s .term files.  The idea is that you can open a bookmark that automatically runs a specific command (like opening an SSH session to a server or initiating an FTP connection).  Each of these bookmarks can run in a new tab (by default) or in a new window, although I haven’t figured out how to make it open in a new window by default.  (Anyone out there have any suggestions?)  OK, that’s kind of handy, and I suppose I could get used to that, but the UI for managing bookmarks is absolutely atrocious.  Here’s hoping that portion of the program gets some attention from the developers soon.

<aside>Please don’t mistake my dislike for certain portions of the iTerm interface for a lack of appreciation for the developers’ hard work in creating iTerm.</aside>

So, aside from the bookmarks feature, I haven’t yet really found that one outstanding reason to replace Terminal.app with iTerm.  I can tell you what would make switching a no-brainer, though, in order of preference:

  1. Ability to set bookmarks to open in a new window by default (can use Option key to reverse the behavior, like now)
  2. Ability to access iTerm bookmarks via Quicksilver (the current Quicksilver module does not appear to have that functionality)
  3. Spotlight support for iTerm bookmarks

Having these features incorporated into iTerm would make switching to iTerm from Terminal.app a foregone conclusion.

What about other “iTerm switchers” out there?  Any suggestions or tips for me, or things I might be missing?  I’d love to hear from you.

Category: Macintosh | 3 Comments »

Link State Tracking in Blade Deployments

June 22nd, 2007 by slowe

It’s common in blade deployments to use multiple Ethernet switches in the blade chassis to provide network redundancy (I’ll refer to these as “chassis switches” moving forward).  For example, in both the IBM BladeCenter H and the HP BladeSystem c-Class, we can provision multiple chassis switches so that half of the NICs on the blades connect to one chassis switch and the other half connect to the other switch.  Within the OS, we load NIC teaming software to provide automatic failover if one of the links goes down.  In this scenario, if one of the chassis switches fails then traffic will automatically fail over to the other switch.

In cases like this, everything works as advertised.  But what about when the chassis switch stays up, but the uplink from that switch to the outside world goes down (perhaps the upstream switch went down or the link was unplugged)?  In that case, the link from the chassis switch to the blade’s NIC is still up, and therefore the NIC teaming software in the OS does not know that a problem has occurred and will not move the traffic to the other link.  In situations like this, we need to implement link state tracking.

<aside>Astute readers will recognize that link state tracking is actually applicable in any server deployment—not just a blade server deployment—where the servers connect to a distribution switch and not the core.  I’m just going to focus on blade server deployments here, but the configuration would be much the same, if not exactly the same, in non-blade server deployments.</aside>

Link state tracking is pretty easy to configure; you define one or more upstream ports and one or more downstream ports.  The upstream port(s) are the ports that uplink to the rest of the network; in a blade server deployment, this would be the ports (or port groups) that connect to the network backbone.  The downstream port(s) are the ports that connect back to the servers.

Here’s an example.  We have a Cisco chassis switch that has a GigabitEtherChannel port group defined as an uplink out to the outside world:

interface Port-Channel1
 description Uplink to network backbone
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 2
 switchport trunk allowed vlan 2-4094
 switchport mode trunk
 link state group 1 upstream

Note the “link state group 1 upstream” command, which marks this port channel as an upstream port.  If all the links in this port channel go down (thus making the port channel itself go down), then the switch will notify downstream ports in the same group to mark themselves as down also.

The member ports of this port channel would not have the “link state” command present:

interface GigabitEthernet0/18
 description Port group member for uplink to network
 switchport trunk encapsulation dot1q
 switchport trunk native vlan 2
 switchport trunk allowed vlan 2-4094
 switchport mode trunk
 channel-group 1 mode on

So for the ports on the same chassis switch that are connecting to the servers in the chassis, we have this configuration:

interface GigabitEthernet0/10
 description Web server NIC
 switchport access vlan 2
 switchport mode access
 link state group 1 downstream
 spanning-tree portfast

Note the “link state group 1 downstream” command, which marks this port as a downstream port from the Port-Channel1 interface.  If Port-Channel1 goes down (because all the member links in Port-Channel1 also went down), then GigabitEthernet0/10 will also go down.  Because GigabitEthernet0/10 went down, the NIC teaming software running in the OS on the blade will fail the traffic over to a different NIC, presumably a NIC that connects to the redundant chassis switch.
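The behavior described above boils down to a very small rule, which I have sketched here as a toy model.  This is not switch code; the function names are hypothetical, and the two-member port channel is just an example I am assuming for illustration.

```python
def port_channel_is_up(member_links_up):
    """A port channel stays up as long as at least one member link is up."""
    return any(member_links_up)


def downstream_ports_enabled(upstream_interfaces_up, tracking_enabled=True):
    """Decide whether the downstream (server-facing) ports stay up.

    upstream_interfaces_up: one boolean per upstream interface in the group.
    With tracking disabled, downstream ports stay up no matter what happens
    upstream, which is exactly the failure case described in this post.
    """
    if not tracking_enabled:
        return True
    return any(upstream_interfaces_up)


# Hypothetical two-member port channel acting as the only upstream interface.
one_member_lost = port_channel_is_up([True, False])
all_members_lost = port_channel_is_up([False, False])

print(downstream_ports_enabled([one_member_lost]))    # server ports stay up
print(downstream_ports_enabled([all_members_lost]))   # server ports go down
print(downstream_ports_enabled([all_members_lost], tracking_enabled=False))
```

The last call shows the problem case: without link state tracking, the server-facing port stays up even though the uplink is gone, so the NIC teaming software never fails over.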

You’ll also need the global “link state track 1” command to enable link state tracking (thanks for the clarification, Matt!).

This sort of configuration is particularly applicable in blade deployments, but it also applies in other situations (as mentioned earlier).  I hope this is useful!

Category: Networking | 19 Comments »

Cliff, I’ve Used VMware in Production

June 21st, 2007 by slowe

“How many people have deployed VMware or Xen or even Microsoft Virtual Server in a real production environment?”  That’s the question Cliff Saran’s IT FUD blog asked yesterday.  It’s an interesting question to ask an engineer such as myself who specializes in VMware deployments for customers.  And while I can’t give out the names of some of my customers, I can tell you that more than a couple of them are using VMware Virtual Infrastructure 3 in production environments right now.

Here are some examples of how these customers are using VMware:

  • One customer has over 20 dual-processor server blades (older HP p-Class blades, by the way—trying to get them to transition to the newer c-Class blades) running ESX Server 3.0.1 in a big DRS cluster hosting over 120 virtual servers.  These virtual servers are application servers, middleware servers, Exchange front-end servers, and Citrix Presentation Servers, to name a few.
  • Another customer, still early in their VI3 deployment, has a three-node DRS/HA cluster running a variety of workloads, such as Microsoft Office Project Server and a couple other application servers, on VMware ESX Server.
  • One very small customer I have is using Virtual Server to host a few workloads, including a middleware server for a web-based application with an SQL backend.

And these are just the examples that I know about.  What about all the other VMware engineers in my company, not to mention all the other VARs with strong VMware practices?  And Cliff says that he’s having a hard time finding references?  That doesn’t make sense to me.

While I love virtualization (and VMware products in particular), I’ll be the first to admit that virtualization is not the “be all/end all” that some make it out to be.  Is it useful in many organizations?  Yes, absolutely.  Will it fit in every situation?  No, it won’t.  There will be some situations where virtualization is not the answer.  However, those situations are fairly limited, and growing more limited by the day.

What about you?  Cliff wants stories of people using VMware in production environments, so let’s give ‘em to him.  Tell us about your production VMware environment below in the comments.

Category: Virtualization | 13 Comments »

More on CPU Masking

June 19th, 2007 by slowe

I just added another server to my VMware ESX Server farm, and since this farm is being built from leftover, donated servers that aren’t being used elsewhere (such is the curse for a non-revenue generating test environment), I don’t have the luxury of ensuring that all the servers in the farm have the same (or compatible) CPU families.  As I’ve discussed in earlier blog postings (Sneaking Around VMotion Limitations and VMotion Compatibility), we can use custom CPU masks to help address the issue of different CPUs in the same ESX server farm.

Because this is a non-production environment, I don’t have to worry too much about the fact that VMware doesn’t support these types of custom CPU masks.  With that in mind, I thought it might be helpful to again walk through the process of identifying the differences in the CPUs and creating custom CPU masks to address those differences.  Of course, I don’t recommend doing this in production environments.

The first step is to gather the information from the CPUs themselves.  Richard Garsthagen’s VMotionInfo utility should provide all the information you need, but I have found it easier to just get the raw data myself using the CPU ID boot CD.  This does require that you reboot the ESX server (thus taking it down), but given that we can use VMotion to move guests to a (known compatible) peer, this is not a huge issue in my mind.

Using the “Verbose” boot option of the CPU ID boot CD, here is the data gathered for the three servers in my farm.  The value in parentheses is the value returned by CPU ID; I’ve added the binary value after that for comparison.

Server1 ID1EAX (0x00000F43)  0000 0000 0000 0000 0000 1111 0100 0011
Server2 ID1EAX (0x00000F25)  0000 0000 0000 0000 0000 1111 0010 0101
Server3 ID1EAX (0x000006F6)  0000 0000 0000 0000 0000 0110 1111 0110
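If you would rather not convert the values by hand, the listing above can be reproduced with a few lines of Python.  This is just a convenience sketch; the function name is my own, and the register values are the ones shown above.

```python
def to_nibbles(value):
    """Format a 32-bit register value as eight space-separated 4-bit groups."""
    bits = format(value, "032b")
    return " ".join(bits[i:i + 4] for i in range(0, 32, 4))


# ID1EAX values reported by the CPU ID boot CD for the three servers.
id1eax = {"Server1": 0x00000F43, "Server2": 0x00000F25, "Server3": 0x000006F6}

for name, value in id1eax.items():
    print(name, to_nibbles(value))

# Bit positions that are not identical on every server (bit 0 is rightmost).
values = list(id1eax.values())
differing = [bit for bit in range(32)
             if len({(v >> bit) & 1 for v in values}) > 1]
print(differing)
```

The list of differing bit positions is the raw material for the custom mask: any of those bits that the standard mask requires to match will need attention.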

As you can see, converting the hexadecimal values (the ones in parentheses returned by the CPU ID boot CD) to binary shows that the differences lie in the last three hex digits (the low 12 bits).  That in and of itself doesn’t really help that much until we combine the information above with the standard CPU mask for that register.  Here’s the same information again, only this time with another line showing the standard CPU mask:

Server1 ID1EAX (0x00000F43)  0000 0000 0000 0000 0000 1111 0100 0011
Server2 ID1EAX (0x00000F25)  0000 0000 0000 0000 0000 1111 0010 0101
Server3 ID1EAX (0x000006F6)  0000 0000 0000 0000 0000 0110 1111 0110
Standard CPU Mask ID1EAX     XXXX HHHH HHHH XXXX XXXX HHHH XXXX XXXX

The CPU mask shows that, of the bits that differ, only those in the third hex digit from the right (bits 8 through 11) are significant.  This is indicated by the “H” in the mask, which (according to the legend in the Virtual Infrastructure Client) means that the guest will see the value of that register and that the value of the register must match for a successful VMotion.

<aside>I have not yet determined if the standard CPU masks are guest OS dependent, but I suspect that they are.  Be sure to check the standard CPU mask against a VM configured for the guest OS that you will actually be running, or it may not work as you expect.</aside>

To fix this, we need to add a custom CPU mask that masks the bits that are different.  Here’s the same information again, this time with a custom CPU mask and an effective CPU mask:

Server1 ID1EAX (0x00000F43)  0000 0000 0000 0000 0000 1111 0100 0011
Server2 ID1EAX (0x00000F25)  0000 0000 0000 0000 0000 1111 0010 0101
Server3 ID1EAX (0x000006F6)  0000 0000 0000 0000 0000 0110 1111 0110
Standard CPU Mask ID1EAX     XXXX HHHH HHHH XXXX XXXX HHHH XXXX XXXX
Custom CPU Mask ID1EAX       ---- ---- ---- ---- ---- 0--0 ---- ----
Effective CPU Mask ID1EAX    XXXX HHHH HHHH XXXX XXXX 0HH0 XXXX XXXX
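The custom and effective masks in the listing above can be derived mechanically.  Here is a minimal sketch, assuming the rule is “put a 0 wherever the standard mask says H and the hosts disagree.”  The function name and the must_match parameter are my own inventions for illustration; other registers may call for masking other flag types as well, so adjust accordingly.

```python
def custom_mask(values, standard_mask, must_match="H"):
    """Build custom and effective mask strings for one 32-bit register.

    values:        the register value reported by each host
    standard_mask: 32 mask characters (spaces allowed), leftmost is bit 31
    must_match:    mask characters treated as "must match for VMotion"
    """
    std = standard_mask.replace(" ", "")
    if len(std) != 32:
        raise ValueError("standard mask must describe exactly 32 bits")
    custom, effective = [], []
    for pos, flag in enumerate(std):
        bit = 31 - pos
        differs = len({(v >> bit) & 1 for v in values}) > 1
        if flag in must_match and differs:
            custom.append("0")       # force the guest to see 0 for this bit
            effective.append("0")
        else:
            custom.append("-")       # leave the standard mask in effect
            effective.append(flag)

    def group(chars):
        return " ".join("".join(chars[i:i + 4]) for i in range(0, 32, 4))

    return group(custom), group(effective)


custom, effective = custom_mask(
    [0x00000F43, 0x00000F25, 0x000006F6],
    "XXXX HHHH HHHH XXXX XXXX HHHH XXXX XXXX",
)
print(custom)
print(effective)
```

For the ID1EAX values in this post, the sketch produces the same custom and effective masks shown in the listing.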

Based on the information in this Intel documentation, this register (ID 1 EAX) is the CPU identification register, and masking these bits will make it difficult (or perhaps even impossible) for ESX Server (or VirtualCenter) to correctly identify the CPUs in your host servers.  This is the underlying reason these kinds of changes aren’t supported:  no one really knows what impact this will have.

This is just the ID1EAX register, though; we must now repeat this process for the other registers:

Server1 ID1ECX (0x0000641D)  0000 0000 0000 0000 0110 0100 0001 1101
Server2 ID1ECX (0x00004400)  0000 0000 0000 0000 0100 0100 0000 0000
Server3 ID1ECX (0x0004E3BD)  0000 0000 0000 0100 1110 0011 1011 1101
Standard CPU Mask ID1ECX     RRRR RRRR RRRR RRR0 00XR R0H0 000H 0RRH
Custom CPU Mask ID1ECX       ---- ---- ---- -0-- ---- --0- ---0 -0-0
Effective CPU Mask ID1ECX    RRRR RRRR RRRR R0R0 00XR R000 0000 00R0

Server1 ID81ECX (0x00000000) 0000 0000 0000 0000 0000 0000 0000 0000
Server2 ID81ECX (0x00000000) 0000 0000 0000 0000 0000 0000 0000 0000
Server3 ID81ECX (0x00000001) 0000 0000 0000 0000 0000 0000 0000 0001
Standard CPU Mask ID81ECX    RRRR RRRR RRRR RRRR RRRR RRRR RRRR RRRX

Server1 ID81EDX (0x20000000) 0010 0000 0000 0000 0000 0000 0000 0000
Server2 ID81EDX (0x00000000) 0000 0000 0000 0000 0000 0000 0000 0000
Server3 ID81EDX (0x20100000) 0010 0000 0001 0000 0000 0000 0000 0000
Standard CPU Mask ID81EDX    RRXR RRRR RRRH RRRR RRRR HRRR RRRR RRRR
Custom CPU Mask ID81EDX      --0- ---- ---0 ---- ---- ---- ---- ----
Effective CPU Mask ID81EDX   RR0R RRRR RRR0 RRRR RRRR HRRR RRRR RRRR

You’ll note that there is no custom mask for ID81ECX; that’s because the default CPU mask hides all the significant bits (which in this case is only bit 0).

With these changes in place, I was able to successfully VMotion several different guest operating systems (including OpenBSD 4.1, Solaris 10 x86 Update 3, and Windows Server 2003 R2) between the servers.  Your mileage may vary, however, and keep in mind that these changes are unsupported.  Use them at your own risk!  (I have to say that because if I didn’t someone would invariably blame me for making one of their guest VMs crash.  With that disclaimer out of the way, allow me to state that I haven’t yet seen any problems as a result of the custom CPU masks.)

As always, please feel free to add any thoughts or corrections in the comments below.

Category: Virtualization | 5 Comments »

Moving ESX Into Firmware

June 15th, 2007 by slowe

A rumor is circulating on the Internet that a slimmed-down version of ESX Server, supposedly called “ESX Lite,” is being developed for placement into a server’s firmware (see here or here).  Most of these reports link back to a story on SearchServerVirtualization which quotes sources close to VMware stating:

According to several sources close to VMware, ESX Lite is real and currently under development. The new lightweight hypervisor would be installed directly on the motherboard, simplifying the deployment of an ESX host and ensuring 100% hardware integration.

Virtualization.info adds that the rumor is apparently confirmed.

This is a smart move.  It’s smart because it derails Microsoft’s attempts to marginalize the hypervisor by bundling it with the operating system (via Windows Server Virtualization, aka “Viridian”).  It’s smart because it expands the hypervisor market in new directions that no one else has yet tapped, helping VMware retain mindshare about its technical leadership and innovation.  It’s smart because it’s the hardware vendors that have the most to lose via virtualization, and by partnering with them you remove potential future opponents.

However, it’s also a very risky move.  What if “ESX Lite” doesn’t (or can’t) perform as well as “full” ESX Server?  What if embedding the hypervisor into a server’s firmware causes VMware to lose visibility?  After all, customers wouldn’t be buying solutions from VMware any longer, because these would be integrated with the hardware.  It would be prudent for VMware to create a kind of “VMware Inside”-type program that maintains their visibility in the overall solution, even though it’s all being purchased through Dell, HP, or IBM.

It will be very interesting to see how this plays out, and which of the hardware vendors are “on board” with the effort.

UPDATE:  In a recent blog posting, Gordon Haff agrees with me regarding this approach by VMware as a move to forestall the “built in” virtualization functionality that exists or will soon exist in most operating systems.

Category: Virtualization | 5 Comments »

Leaving ESX for Virtual Server?

June 15th, 2007 by slowe

According to this article, cost and configuration issues are spurring at least one organization to ditch VMware ESX Server for Microsoft Virtual Server.  Here’s a quote from the article:

Today, Quigley — a senior network engineer — told me that his firm, Total Quality Logistics, LLC, will be migrating over to Microsoft Virtual Server 2005 R2, over the next 45 days. TQL — up to now a VMware shop — will only use ESX in the lab. ESX is too expensive to upgrade and requires more training and resources than TQL can deliver, Quigley said.

So, let me see if I have this straight.  Virtual Server, which is given away for free, is cheaper than ESX Server.  Well, that seems fairly obvious—it’s hard to get any cheaper than free, isn’t it?  So is it really possible to make a viable comparison between two products based on price when one product (Virtual Server) is given away for free because Microsoft makes all its money elsewhere (Windows), but the second product (ESX Server) costs money because it is one of VMware’s core products?  Or do you have to consider other factors as well?

When it comes to comparing Virtual Server and ESX Server, Microsoft wants you to consider the price of Virtual Server (free) against ESX Server (not free).  But when it comes to comparing Linux (free) and Windows (not free), Microsoft’s on the other end of the stick, and they want you to consider other factors—total cost of ownership, training, etc.  So do we consider factors other than just cost, or not?

I think most everyone would agree that with regard to features, ESX Server (as part of a Virtual Infrastructure 3 implementation) clearly outshines Virtual Server.  After all, Virtual Server doesn’t have VMotion (live migration), DRS, HA, resource pools, or any such features.  With regard to cost, Virtual Server is as inexpensive as it comes.  ESX Server wins on features; Virtual Server wins on cost.

In this case, we now need to consider the cost of administering or managing your virtualization solution.  Some would say that Virtual Server is easier to manage than ESX Server.  I guess that depends greatly upon who is managing the environment.  I know of a VMware administrator who single-handedly manages 22 ESX Servers hosting more than 150 virtual machines.  Would this administrator consider Virtual Server “easier to manage” than ESX Server?  Probably not.  On the other hand, a Windows system admin who’s grown his/her career managing Windows boxes won’t find ESX Server as easy to manage as a Windows server.  It’s really all a matter of experience and perspective.

So rather than taking this story and trying to tear it down, let’s recognize it for what it is.  One organization evaluated Virtual Server and ESX Server based on the criteria important to that organization (cost and management), and felt that Virtual Server was the right choice for them.  It won’t be the right choice for every organization, just as ESX Server isn’t the right choice for all organizations.  And it’s not validation that Virtual Server is better than ESX Server, as the definition of “better” will change from organization to organization.  For many organizations, the cost of VMware licenses is outweighed by the savings introduced by higher consolidation ratios and the enterprise-class feature set.

Personally, I think that most, if not all, of the issues that were described in that article could have been resolved with a correct configuration of ESX Server, but that’s just my personal opinion.

Category: Virtualization | 10 Comments »