iSCSI


Continuing the FCoE Discussion

A few weeks ago I examined FCoE in the context of its description as an “I/O virtualization” technology in my discussion of FCoE versus MR-IOV. (Despite protestations otherwise, I’ll continue to maintain that FCoE is not an I/O virtualization technology.)

Since that time, I read a few more posts about FCoE in various spots on the Internet:

Is FCoE a viable option for SMB/Commercial?
Is the FCoE Starting Pistol Aimed at iSCSI?
Reality Check: The FCoE Forecast

Tonight, after reading a blog post by Dave Graham regarding FCoE vs. InfiniBand, I started thinking about FCoE again, and I came up with a question I want to ask. I’m not a storage expert, and I don’t have decades of experience in the storage arena like many others that write about storage. The question I’m about to ask, then, may just be the uneducated ranting of a fool. If so, you’re welcome to enlighten me in the comments.

Here’s the question: how is FCoE any better than iSCSI?

Now, before your head explodes with unbelief at the horror that anyone could ask that question, let me frame that question with more questions. Note that these are mostly rhetorical questions, but if the underlying concepts behind these questions are incorrect you are, again, welcome to enlighten me in the comments. Here are the framing questions that support my primary question above:

  1. FCoE is always mentioned hand-in-hand with 10 Gigabit Ethernet. Can’t iSCSI take advantage of 10 Gigabit Ethernet too?
  2. FCoE is almost always mentioned in the same breath as “low latency” and “lossless operation”. Truth be told, it’s not FCoE that’s providing that functionality, it’s CEE (Converged Enhanced Ethernet). Does that mean that FCoE without CEE would suffer from the same “problems” as iSCSI?
  3. If iSCSI were running on a CEE network, wouldn’t it exhibit predictable latencies and lossless operation like FCoE?

These questions—and the thoughts behind them—are not necessarily mine alone. In October Stephen Foskett wrote:

And iSCSI isn’t done evolving. Folks like Mellor, Chuck Hollis, and Storagebod are lauding FCoE at 10 gigabit speeds, but seem to forget that iSCSI can run at that speed, too. It can also run on the same CNAs and enterprise switches.

If those Converged Network Adapters (CNAs) and enterprise switches are creating the lossless CEE fabric, then iSCSI benefits as much as FCoE. Dante Malagrino agrees on the Data Center Networks blog:

I certainly agree that Data Center Ethernet (if properly implemented) is the real key differentiator and enabler of Unified Fabric, whether we like to build it with iSCSI or FCoE.

Seems to me that all the things that FCoE has going for it—10 Gigabit speeds, lossless operation, low latency operation—are equally applicable to iSCSI as they are functions of CEE and not FCoE itself. So, with that in mind, I bring myself again to the main question: how is FCoE any better than iSCSI?
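To make the layering argument concrete, here is a minimal Python sketch. The stacks are deliberately simplified, and the property table is an assumption that simply restates the claim above: on a CEE-capable 10 Gigabit Ethernet fabric, speed, losslessness, and low latency belong to the Ethernet layer. The names `STACKS`, `LAYER_PROPERTIES`, `properties_for`, and `shared_layers` are illustrative only.

```python
# Simplified protocol stacks, listed from the SCSI command layer down to the wire.
STACKS = {
    "iSCSI": ["SCSI", "iSCSI", "TCP", "IP", "Ethernet"],
    "FCoE": ["SCSI", "FCP", "FCoE", "Ethernet"],
}

# Properties discussed in the post, attributed to the layer that provides them.
# Assumption: the Ethernet layer here is a CEE-capable 10 Gigabit fabric.
LAYER_PROPERTIES = {
    "Ethernet": {"10 Gigabit speed", "lossless operation", "low latency"},
}


def properties_for(protocol):
    """Collect the properties a protocol inherits from the layers in its stack."""
    props = set()
    for layer in STACKS[protocol]:
        props |= LAYER_PROPERTIES.get(layer, set())
    return props


def shared_layers(a, b):
    """Return the layers that appear in both stacks, in stack order of `a`."""
    return [layer for layer in STACKS[a] if layer in STACKS[b]]


if __name__ == "__main__":
    for name in STACKS:
        print(name, sorted(properties_for(name)))
    print("shared:", shared_layers("iSCSI", "FCoE"))
```

In this toy model both protocols end up with the same set of properties, because everything in the table hangs off the one layer they share. It says nothing about TCP/IP overhead or operational differences, which the questions above don’t address either.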

You might read this and say, “Oh, he’s an FCoE hater and an iSCSI lover.” No, not really; it just doesn’t make any sense to me how FCoE is touted as so great and iSCSI is treated like the red-headed stepchild. I have nothing against FCoE—just don’t say that it’s an enabler of the Unified Fabric. (It’s not. CEE is what enables the Unified Fabric.) Don’t say that it’s an I/O virtualization technology. (It’s not. It’s just a new transport option for Fibre Channel Protocol.) Don’t say that it will solve world hunger or bring about world peace. (It won’t, although I wish it would!)

Of course, despite all these facts, it’s looking more and more like FCoE is VHS and iSCSI is Betamax. The “best” technology doesn’t always win…


This session provided information on running Hyper-V with NetApp storage. The first part of the session focused primarily on Hyper-V basics, such as VHD types (dynamically-expanding, fixed-size, passthrough, differencing), partition alignment (which can only be guaranteed with fixed-size VHDs, by the way), SCVMM 2008, Windows Failover Clustering support, and such. If you’re interested in details on those topics, I suggest you have a look at my coverage of Microsoft Tech-Ed 2008 back in the summer.

The second part of the session delved into some NetApp-specific information:

  • NetApp has a PVR-only tool called HyperVIBE that helps to coordinate storage array Snapshots with the hypervisor, providing VSS integration to quiesce the VMs before taking a Snapshot on the NetApp array. This is only supported on Server Core and requires a special release of SnapDrive 6.0. (It’s only available via PVR, so don’t go searching the NetApp web site for a free download.)
  • The various members of the SnapManager family—SnapManager for SQL, SnapManager for Exchange, and SnapManager for SharePoint—are all fully supported on Hyper-V, but only for iSCSI LUNs.
  • NetApp SnapDrive 6.x is supported both on Hyper-V hosts as well as guest VMs. On the parent partition, it can manage both Fibre Channel LUNs and iSCSI LUNs; on a child partition, it can only manage iSCSI LUNs.
  • Version 5.x of the Host Utilities Kit is strongly recommended for use with Hyper-V, and supports Fibre Channel, iSCSI, and mixed connections. It runs on either the parent or child partition, although it seems to me that it would only make sense to run it on the parent partition.
  • Data ONTAP DSM 3.2R1 is the supported and recommended DSM for MPIO support with Hyper-V. On the parent partition, it supports and manages Fibre Channel, iSCSI, and mixed paths, but in a child partition it only supports iSCSI paths. It’s also only supported in child partitions running a server OS (so no Windows XP or Windows Vista support in child partitions).
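
The parent/child differences above are easy to mix up, so here is a small Python lookup table that transcribes them. It is only a restatement of the session notes, not an official NetApp support matrix; the `SUPPORT` dictionary and `is_supported` helper are names made up for this sketch, and only the two components with an explicit parent/child split are included.

```python
# Protocol support by (component, Hyper-V partition), transcribed from the
# session notes above. Not an official NetApp support matrix.
SUPPORT = {
    ("SnapDrive 6.x", "parent"): {"FC", "iSCSI"},
    ("SnapDrive 6.x", "child"): {"iSCSI"},
    ("Data ONTAP DSM 3.2R1", "parent"): {"FC", "iSCSI", "mixed"},
    ("Data ONTAP DSM 3.2R1", "child"): {"iSCSI"},
}


def is_supported(component, partition, protocol):
    """Return True if the notes list `protocol` for this component and partition."""
    return protocol in SUPPORT.get((component, partition), set())


if __name__ == "__main__":
    for (component, partition), protocols in sorted(SUPPORT.items()):
        print(f"{component:22} {partition:7} {', '.join(sorted(protocols))}")
```

The pattern is the same for both tools: anything running in a child partition is limited to iSCSI.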

For more information, readers can refer to TR-3701 and TR-3702. Note that updated versions of TR-3702 are expected to be released in the coming months to address additional product integrations.


I’ve had this link sitting in my “Articles To Read” list for quite some time, but—to be perfectly honest—I’ve been just too busy to really do anything about it. Now that a hectic few weeks has wrapped up and I have a small breather before the next hectic few weeks, I wanted to comment briefly on Doug Gourlay’s discussion of FCoE versus MR-IOV.

First, some background: For those that aren’t familiar, FCoE is Fibre Channel over Ethernet, a T11 standard for running Fibre Channel Protocol over Ethernet, specifically 10 Gigabit Ethernet. More information on FCoE is found here. MR-IOV is Multi-Root I/O Virtualization, a PCI SIG specification for using PCI Express (PCIe) to connect and share multiple devices. More information on MR-IOV can be found here. MR-IOV is a multi-server extension to Single-Root I/O Virtualization, or SR-IOV.

Like Doug, I’ll put in a disclaimer that I haven’t read the report to which he’s referring in his article, either. However, as an individual who has done some research on the topic of I/O virtualization, I will say that anyone who compares FCoE to MR-IOV is comparing apples to oranges. These two technologies, in my mind, are designed to address two different problems.

FCoE provides the ability to use a single physical transport—10 Gigabit Ethernet, in this case—for Fibre Channel Protocol (FCP) as well as TCP/IP, iSCSI, and other Ethernet-borne protocols. This allows for the creation of a unified fabric, a single physical transport that carries all the various kinds of traffic that Ethernet-based Local Area Networks (LANs) and Storage Area Networks (SANs) carry separately today. Via the IEEE Converged Enhanced Ethernet (CEE) standard—adopted by Cisco as Data Center Ethernet™—FCoE will ultimately have the same low, predictable latency and error-free operation that FCP enjoys today. FCoE is not, however, designed or architected to do anything other than allow FCP to run over Ethernet. It’s not intended to be a server interconnect technology. (Unless I’m missing something?)

MR-IOV, on the other hand, is intended to play in the server interconnect field. Its purpose is not to allow FCP to run over Ethernet, or to allow FCoE, iSCSI, and other Ethernet-borne protocols to share the same physical connections. MR-IOV’s purpose is to allow multiple servers to share PCIe-based devices, like an FC Host Bus Adapter (HBA), an iSCSI HBA, a 10 Gigabit Ethernet network interface card (NIC), or a video capture card. MR-IOV is intended to provide I/O virtualization, regardless of what type of I/O that might be. As long as the I/O runs across a PCI Express bus, MR-IOV comes into play.

I’ve heard multiple people refer to FCoE as an I/O virtualization technology, but I just don’t agree. FCoE only applies to FCP over Ethernet. It doesn’t apply to iSCSI. It doesn’t apply to video traffic, or audio traffic, or HTTP traffic. It only applies to FCP over Ethernet. While I might allow that FCoE does allow for a form of virtualization, by virtualizing the physical transport beneath FCP, I would not call it I/O virtualization. Further, FCoE and MR-IOV are complementary. You could use MR-IOV to share a single Converged Network Adapter (CNA), which provides FCoE and 10 Gigabit Ethernet functionality, among multiple servers. In this situation, what’s providing the I/O virtualization: MR-IOV, which is allowing multiple servers to use a single I/O card, or the CNA, which is putting the traffic onto the converged fabric?

I’m probably missing something huge here, some vital piece of information that would explain why FCoE and MR-IOV would be considered competitive standards/specifications. Without that information, though, it just doesn’t make any sense to me to compare these two different yet complementary technologies. Someone want to enlighten me?

UPDATE: I’ve corrected my use of “Data Center Ethernet” to Converged Enhanced Ethernet (CEE) when referring to the IEEE standard. As correctly pointed out in the comments, Data Center Ethernet™ is a Cisco trademarked term referring to their implementation of CEE.


This week’s Short Take is a collection of links and articles that I’ve seen over the last few weeks (or longer ago, in some cases!) that I thought others might find interesting or useful. Enjoy!

  • Alessandro broke the news to the general public about some anticipated new virtualization features that are expected to make their debut in Windows Server 2008 R2, expected sometime in 2010. Microsoft announced live migration for Hyper-V back at the beginning of September, so that part was already known. Now coming from Alessandro’s article is the announcement that Microsoft is developing a cluster file system, similar to VMFS, called Cluster Shared Volumes (CSV). Personally, this wasn’t a big surprise to me as a contact of mine leaked this to me a while ago. Hopefully this won’t hit Sanbolic too hard, whose Melio FS and Kayo FS solutions were intended to fill this gap (as discussed here and here).
  • As fully expected, VMware and Microsoft trade lots of barbs back and forth about VMware ESX vs. Hyper-V and vice versa. Out of the various exchanges, I found the “Too Dry and Crunchy” exchange—now quite old, having been published back at the end of September—the most entertaining. It started here with a barb from VMware about how Hyper-V with Server Core, the recommended configuration from Microsoft for virtualization hosts, is “not the Windows you know.” They compared Hyper-V on Server Core to ESXi and, not surprisingly, found ESXi to be easier and faster to install. What was really surprising though, was the response from James O’Neill in which he essentially agreed: Server Core isn’t “the Windows you know.” While he does love Server Core, James also recognizes that Server Core is not the right fit for every workload, and that management processes and procedures may need to change when using Server Core. Personally, I’m glad to see James recognizing and being honest about the limitations (or caveats) of Server Core. If only all vendors were so honest about their own products…one day, perhaps.
  • Duncan points out a great PDF on the definitions of various memory statistics. Readers may find that useful in understanding the various counters within VirtualCenter.
  • This VMware KB article outlines a potential VMware HA problem with multiple Service Console interfaces.
  • Andy Leonard picked up this VMware KB article that I bookmarked via Delicious.com and discussed how VMware’s recommendations and NetApp’s recommendations seem to run counter to each other. Personally, I’m inclined to follow VMware’s recommendations after the little snafu with NetApp’s NFS file locking suggestion.
  • This is a cool article on the use of ZFS and iSCSI to create clones in storage instead of at the virtualization layer. This is interesting because it’s being done with Solaris and ZFS, but it’s functionally equivalent to FlexClones with NetApp, which I’ve discussed before (see here, here, and here). Accordingly, ZFS clones will suffer from all the same limitations as NetApp FlexClones.
  • And while we’re on the topic of Sun and NetApp, what’s the deal with the recent patent rulings in the ZFS vs. WAFL lawsuit? If I’m reading this update correctly, it looks like some of the core WAFL patents from NetApp are being invalidated. Is Sun going to win this thing?

That does it for now. Thanks for reading!


Readers who have installed and configured VMware ESX in a storage area network (SAN) environment know that all the VMware ESX servers in an environment need to see the same LUNs with the same LUN IDs. This is necessary in order to avoid problems with VMFS resignaturing.

Similarly, readers who are familiar with configuring and managing NetApp storage arrays will know that NetApp igroups (initiator groups) are the mechanism whereby a host—or a group of hosts—is granted access to see a particular LUN on a specific LUN ID.

Because the igroup configuration is core to how LUNs are presented to hosts, and because VMware ESX has specific configuration requirements with regards to LUN presentation, it’s necessary to take a closer look at strategies for how NetApp igroups should be configured and managed in a VMware ESX environment. There are basically two approaches:

  1. Create a single igroup for all the VMware ESX hosts in the environment, then map LUNs to LUN IDs using that single igroup.
  2. Create a single igroup for each VMware ESX host in the environment, and then map LUNs to LUN IDs for each igroup.

Obviously, each approach has its advantages and disadvantages:

Using a Single Initiator Group:

  • Adding a new LUN to the entire group requires only one change: mapping a LUN to a LUN ID for that one initiator group.
  • Similarly, only a single change is required to remove a LUN from the entire group of VMware ESX servers, by removing the one group-LUN ID map.
  • Storage administrators and VMware ESX administrators are assured that all the VMware ESX hosts will see the same LUNs with the same LUN IDs because all the hosts are placed into one group-LUN ID map. There is very little possibility for error.
  • On the downside, there’s no way to prevent a particular host from seeing a particular LUN. All hosts in the initiator group will see the LUN.
  • The storage administrator kind of “loses track” of which hosts see the LUNs. Because all the initiators are thrown into the same group, it’s more difficult to track down which hosts see a particular LUN. This is less true for iSCSI—where the hostname is often embedded in the IQN—but more prevalent for Fibre Channel, as the initiator group only contains World Wide Port Names (WWPNs). Mapping WWPNs to actual servers requires some additional steps.

Using Multiple Initiator Groups:

  • It’s easier for storage administrators to match hosts to initiators, because each host has its own initiator group on the NetApp storage array.
  • The storage administrators and VMware ESX administrators have greater flexibility in determining which hosts see which LUNs, so it’s possible to have a LUN visible to some VMware ESX servers and not others.
  • There’s greater room for error in accidentally mapping a LUN to a different LUN ID for one or more of the hosts, which can lead to an inability to access the VMFS datastore on that LUN.
  • Multiple changes are required to add or remove a LUN from the entire set of VMware ESX servers; the LUN’s mapping has to be modified individually in each initiator group.

One could also mix and match these approaches to a certain extent; for example, an organization could employ a “hybrid” model that has the full set of base LUNs exposed to all servers via one large initiator group, but other LUNs exposed on a more granular basis via smaller initiator groups. Since an initiator can be included in more than one initiator group (as long as the initiator groups use the same OS type), this gives some additional flexibility.
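
The main risk with multiple initiator groups, a LUN mapped at different LUN IDs for different hosts, is easy to check for mechanically. Here is a rough Python sketch of such a check. The igroup names, LUN paths, and the dictionary layout are all hypothetical; in practice the mappings would have to come from the array’s own reporting, which this sketch does not attempt to parse.

```python
from collections import defaultdict

# Hypothetical layout: igroup name -> {LUN path: LUN ID}.
# One igroup per ESX host, with a deliberate mismatch on the second LUN.
igroups = {
    "esx01": {"/vol/vmfs1/lun0": 0, "/vol/vmfs2/lun0": 1},
    "esx02": {"/vol/vmfs1/lun0": 0, "/vol/vmfs2/lun0": 2},
}


def find_lun_id_mismatches(igroups):
    """Return {lun_path: {lun_id: [igroups]}} for LUNs mapped at more than one LUN ID."""
    seen = defaultdict(lambda: defaultdict(list))
    for igroup, mappings in igroups.items():
        for lun_path, lun_id in mappings.items():
            seen[lun_path][lun_id].append(igroup)
    return {
        lun_path: {lun_id: sorted(groups) for lun_id, groups in ids.items()}
        for lun_path, ids in seen.items()
        if len(ids) > 1
    }


if __name__ == "__main__":
    for lun_path, ids in find_lun_id_mismatches(igroups).items():
        print(f"{lun_path} is mapped inconsistently: {ids}")
```

With a single igroup for all hosts there is only one mapping per LUN, so this check can never find a mismatch; that is the “very little possibility for error” advantage in code form.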

I guess the purpose of this post is less to explain the different ways of using initiator groups and more to try to generate some discussion around the various ways they can be used. Hopefully, the initial explanation will be helpful to some readers, but what I’d really like to see is some more advanced and experienced readers sharing their strategies for using initiator groups in larger VMware ESX environments and what “best practices” they may be employing.


HP Buys LeftHand Networks

I just got word this morning from a co-worker that HP has announced it will buy LeftHand Networks for about $360 million. The official HP news release can be found here on the HP web site.

It will be interesting to see how HP integrates the LeftHand offerings into their existing storage product lines—the All-in-One (AiO), Modular Smart Array (MSA), and Enterprise Virtual Array (EVA) product lines. Based on HP’s news release, it looks like they envision the LeftHand products fitting in between the AiO/MSA at the low end and the EVA at the high end.

With the purchase of EqualLogic by Dell and today’s acquisition of LeftHand by HP, it looks like all the small iSCSI-focused startups are getting acquired by system vendors. Does this signal a trend?


NetApp recently published a white paper summarizing some tests they ran to compare storage protocol performance in a VMware Infrastructure environment. The white paper, TR-3697, compares the storage performance of Fibre Channel, software iSCSI, and NFS against a couple of different NetApp storage systems.

I won’t go into all the sordid details here—you can read the white paper yourself—but the end results look something like this:

  • Fibre Channel provided the highest throughput and the lowest processor utilization of all the storage protocols.
  • Software iSCSI provided only slightly lower throughput than Fibre Channel (not more than 9% or 10% less than Fibre Channel depending upon the specific tests being run). However, software iSCSI consistently showed the highest CPU utilization on the ESX hosts.
  • NFS showed throughput on the same levels as software iSCSI (again, not more than about 9% or 10% less than Fibre Channel depending upon the tests being run) and had higher CPU utilization than Fibre Channel. However, the CPU utilization was lower than with software iSCSI.

While overall performance was roughly comparable between all three storage protocols, depending upon the tests being run, the host CPU utilization was a different story entirely. In some cases, software iSCSI’s CPU utilization was as much as 80% higher (that’s right, almost double) than that of Fibre Channel. In no case was software iSCSI’s CPU utilization less than 40% higher than Fibre Channel’s. Keep in mind these numbers are relative to Fibre Channel. So if Fibre Channel used 200MHz of host CPU power and software iSCSI used 360MHz of host CPU power, that’s an 80% relative increase. We don’t know, unfortunately, how this translates into actual host CPU usage; in my mind, that’s a key piece of information that really should have been included. I’m puzzled as to why it’s not included.
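
The relative-versus-absolute point can be shown in a few lines of Python. The 200MHz and 360MHz figures are the illustrative numbers from the paragraph above, not measurements from the white paper; the second pair (20MHz and 36MHz) is an extra made-up example added here to show that the same percentage can hide very different absolute costs.

```python
def relative_increase(baseline, value):
    """Return how much larger `value` is than `baseline`, as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (value - baseline) * 100.0 / baseline


if __name__ == "__main__":
    # Illustrative figures only: (Fibre Channel MHz, software iSCSI MHz).
    for fc_mhz, iscsi_mhz in [(200, 360), (20, 36)]:
        pct = relative_increase(fc_mhz, iscsi_mhz)
        print(f"FC {fc_mhz}MHz vs. iSCSI {iscsi_mhz}MHz: "
              f"{pct:.0f}% higher, {iscsi_mhz - fc_mhz}MHz more in absolute terms")
```

Both pairs work out to an 80% relative increase, but one costs 160MHz of extra host CPU and the other only 16MHz, which is why the missing absolute numbers matter.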

NFS fared better; at its worst, the tests showed NFS running CPU overhead 40% greater than Fibre Channel. At its best, NFS looked like it was only requiring about 15% more CPU overhead than Fibre Channel (keep in mind the comments made above regarding relative utilization). Of course, NetApp loves to push NFS; the document adds the extra sell for NFS:

While NFS does not quite achieve the performance of FC and has a slightly higher CPU utilization, it does have some advantages over FC that should be considered when deciding which protocol to deploy. Running on a standard TCP/IP network, NFS does not require the expensive Fibre Channel switches, host bus adapters, and Fibre Channel cabling that FC requires, making NFS a lower cost alternative of the two protocols. Additionally, operational costs are low with no specialized staffing or training needed in order to maintain the environment. Also, NFS provides further storage efficiencies by allowing on-demand resizing of data stores and increasing storage saving efficiencies gained when using deduplication. Both of these advantages provide additional operational savings as a result of this storage simplification.

I suppose I can’t blame them; NFS is one of their strong points, so they’ll naturally lean that direction.

There are a few key things that I need to say about this document, though:

  1. Benchmark tests can be made to say just about anything. It’s all in the types of tests that you run and the parameters of those tests. I’m not saying that NetApp specifically skewed the tests in any way; what I am saying, though, is that users need to take these types of benchmark tests as a general guideline and not the definitive word.
  2. While NetApp does highlight the “operational savings” of NFS, what they fail to mention is the added complexity of scaling NFS traffic as the environment grows. Fibre Channel multipathing in a VMware environment is very robust, and I expect that the Round Robin pathing policy will move from “experimentally supported” to fully supported rather quickly. This makes it quite easy to scale the FC connection, although to be honest that probably won’t be necessary. However, to scale the NFS connection, you need multiple NFS exports with multiple IP addresses, link aggregation via LACP/802.3ad/EtherChannel and switches that support cross-switch link aggregation, and possibly multiple VMkernel ports on different IP subnets. This is described, by the way, in the latest revision of TR-3428, also from NetApp. (As a side note, I believe that these scaling issues would affect any NFS storage vendor and are not specific to NetApp in any way.)
  3. If you look at VMware’s development, you will see that Fibre Channel gets the goods the earliest. iSCSI and NFS were only added in VMware Infrastructure 3, whereas Fibre Channel support has been around in ESX for much longer. Storage VMotion support went to Fibre Channel first. VCB support went to Fibre Channel first. SRM support went to both iSCSI and Fibre Channel, but not NFS. Fibre Channel multipathing is, as I mentioned already, quite robust; iSCSI multipathing and NFS multipathing aren’t quite so robust. All these things considered, there could be a sound business case to use Fibre Channel in spite of cost savings from iSCSI (especially software iSCSI, given the added CPU overhead) or NFS. That’s something that each individual organization will need to decide for themselves.

By the way, I know the gentleman that wrote this technical report and he’s a straight-up guy. I respect him. So, don’t take any of my comments or thoughts to imply anything beyond the fact that I’m simply presenting my thoughts around the data contained in this document. You should also know that I am a fan of using NFS for VMware, but I don’t necessarily believe that it is the “slam dunk” that it’s often presented to be.

UPDATE: I’ve made some corrections to the interpretations of the CPU utilization numbers in response to some of the comments below.


Here’s Virtualization Short Take #12, a collection of links I’ve gathered over the last week or so and my thoughts on them. Enjoy!

  • For those that missed it in the Release Notes, VMware added support for Storage VMotion and 10Gb Ethernet with iSCSI SANs, as outlined in this VI Team blog entry. I went back and reviewed the Release Notes and didn’t see this listed anywhere, so this is news to me. Of course, I already knew that Storage VMotion worked just fine with iSCSI, but this added formal support for iSCSI.
  • Virtualfuture.info published some good recommendations for running Citrix in a VI3 environment. If you run Citrix Presentation Server…er, XenApp…in a VI3 environment, these tuning tips may prove quite handy.
  • VMware’s Virtual Reality blog posted an entry on some of the architectural advantages of VMware Infrastructure in comparison to the two leading competitors, Xen (any Xen-based solution) and Hyper-V. Many of the things listed as advantages by VMware are severe points of contention with the other vendors, such as the direct vs. indirect I/O model. Ultimately, time will tell which model was the best; I honestly don’t know enough about the deep dark internals to really state which is better. One thing I am glad to see pointed out is the true comparison of hypervisor sizes; Microsoft can say all they want that Hyper-V is only 600K in size and therefore is the “thinnest” hypervisor, but the truth of the matter is that Hyper-V can’t run without Windows Server 2008 in the parent partition. As a result, it doesn’t really matter how “thin” Hyper-V is, does it?
  • Via Mike Laverick, I learned that Microsoft may have brought up the whole 64-bit hypervisor vs. 32-bit hypervisor argument yet again. Mike used a snippet from this Microsoft Virtualization Team Blog entry; in reading it myself, I don’t get quite the same 64-bit vs. 32-bit message that Mike picked up. That’s good, because I didn’t want to have to go there again. Personally, the tone I picked up from the whole article was one of educating people who are far too accustomed to Virtual Server/Virtual PC about how Hyper-V is different.
  • Virtualization analyst Chris Wolf recently posted an entry in which he questioned if Apple would capitalize on the opportunity that virtualization is creating. It’s an interesting scenario, one that is similar to a scenario that I discussed a couple of years ago in a piece titled “Application Agnosticism.” In that article, I suggested that seamless host-guest interactions with virtualization software (now implemented by VMware as Unity and by Parallels as Coherence) would usher in a new wave of computing. I suggested that Mac OS X was ahead of the curve because of its ability to run native OS X applications, UNIX applications, X11 applications, Windows applications via WINE (or the commercial variant CrossOver Office), and applications from any other operating system via virtualization. Sounds like I may have been a bit ahead of my time!
  • Chad continues discussing VMware HA with another post on some additional configuration options for HA. Also check out the comments with links to even more information on HA’s advanced configuration options.
  • This VMware KB article has some good information on getting LUN identification information. The breakdown of the command-line output from esxcfg-mpath is particularly helpful (and for that reason I’ve added it to my del.icio.us bookmarks).
  • Rich of VM /ETC shares with us a “Doh!” moment he had when he saw this simple method for identifying VMs with snapshots. Sometimes it’s the simplest solutions that evade us the longest. Here’s what I want to know: Aaron, what exactly does “/HEADDESK” mean, anyway?
  • This article at SearchNetworking.com brings to light some of the challenges networking professionals face with server virtualization. I do agree with one point made in the article regarding the mapping of applications—what the end users really care about—to the networking infrastructure. VMware’s support for CDP in recent versions of VMware Infrastructure is a step in the right direction, but there is still more work to do for sure. I’m not so sure about the rest of the points in the article, but I may be an exception to the norm; I was a CCNA for a while (on track for CCNP) and have done my fair share of Cisco configurations, so I’m no stranger to the networking world. The use of VLANs to ease configuration in a server virtualization environment seems just second nature to me. Also, I did note that the author indicated that “server administrators sometimes inappropriately configure the switches to create a loop” (referring to vSwitches in ESX). How exactly does that happen? I’ve never seen a way to link two vSwitches together without using a VM.

As always, readers’ thoughts are welcome in the comments!


Like everyone else in the virtualization world (except for perhaps the folks in Palo Alto, CA), I have a lot of Hyper-V stuff crossing in front of me.

This time it’s an article on storage options for Hyper-V, written by Jose Barreto. (You’ll recall that I referenced Jose’s clustering article a few days ago.) Out of the wide variety of blogs coming out of Microsoft, Jose’s is one that I have really, truly found informative and helpful. The home page for his blog is here.

Jose also wrote a follow-up article on Hyper-V’s storage options where he discussed booting from iSCSI.

Great work, Jose! Keep it coming.


What I had hoped to publish today was an article describing how to configure and use ESX’s software iSCSI initiator as a failover path for Fibre Channel, so that if the Fibre Channel fabric completely failed, VM traffic would automatically fail over to software iSCSI. I thought that this would be a great, low-cost way to add another layer of redundancy to your VMware ESX environment.

Unfortunately, I can’t make it work. Here’s the setup I’ve been using for testing:

  • A 200GB LUN visible to ESX over both Fibre Channel (FC) and software iSCSI
  • A VM, stored on this LUN, running Windows Server 2003 R2

Initial tests led me to believe that it would indeed work. I verified that both the FC path as well as the iSCSI path were listed as separate paths for the same LUN. Without placing any load on the VM, I pulled the FC connection from the back of the server. The VM stayed up, and I was able to browse the local hard drive inside the VM. Network connectivity remained active. And the “Manage Paths” dialog box even showed the FC connection as “Dead” and the iSCSI connection as On/Active. Given that information, it seemed like all was good.

Determined to verify that it was working as I expected, I trotted out a copy of IOmeter and tried to repeat the tests. This time around, though, the tests did not go quite so well. IOmeter showed that disk throughput stopped, and the VI Client locked up. I repeated this set of tests a couple of times, and each time—while IOmeter was running—I ran into issues.

Based on these results, I’m inclined to say that one of two things is true. Either:

  1. I did something very, very wrong; or
  2. ESX isn’t quite ready to support automatic failover between FC and software iSCSI.

Has anyone else tried this, or am I the only one? If you have tried it, did it work? If so, what steps did you have to take—if any—to make it work properly?

