What I had hoped to publish today was an article describing how to configure and use ESX’s software iSCSI initiator as a failover path for Fibre Channel, so that if the Fibre Channel fabric failed completely, VM traffic would automatically fail over to software iSCSI. I thought that this would be a great, low-cost way to add another layer of redundancy to your VMware ESX environment.
Unfortunately, I can’t make it work. Here’s the setup I’ve been using for testing:
- A 200GB LUN visible to ESX over both Fibre Channel (FC) and software iSCSI
- A VM, stored on this LUN, running Windows Server 2003 R2
Initial tests led me to believe that it would indeed work. I verified that the FC path and the iSCSI path were listed as separate paths for the same LUN. Without placing any load on the VM, I pulled the FC connection from the back of the server. The VM stayed up, and I was able to browse the local hard drive inside the VM. Network connectivity remained active. And the “Manage Paths” dialog box even showed the FC connection as “Dead” and the iSCSI connection as On/Active. Given that information, it seemed like all was good.
Determined to verify that it was working as I expected, I trotted out a copy of IOmeter and tried to repeat the tests. This time around, though, the tests did not go quite so well. IOmeter showed that disk throughput stopped, and the VI Client locked up. I repeated this set of tests a couple of times, and each time—while IOmeter was running—I ran into issues.
Based on these results, I’m inclined to say that one of two things is true. Either:
- I did something very, very wrong; or
- ESX isn’t quite ready to support automatic failover between FC and software iSCSI.
Has anyone else tried this, or am I the only one? If you have tried it, did it work? If so, what steps did you have to take—if any—to make it work properly?
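For anyone repeating the test, one rough way to put a number on how long disk I/O stalls during the path switch is a small probe run inside the VM while the FC cable is pulled. The Python sketch below is not part of the tests described above, which used IOmeter. The names `probe` and `find_stalls`, the 4 KB block size, and the intervals are all hypothetical choices, and because it runs in the guest it only sees what the guest sees, not what ESX is doing underneath.

```python
import os
import time


def find_stalls(timestamps, threshold):
    """Return (start, gap) for each gap between consecutive
    completion times that is longer than `threshold` seconds."""
    stalls = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap > threshold:
            stalls.append((prev, gap))
    return stalls


def probe(path, duration_s=60.0, interval_s=0.1, block=b"\0" * 4096):
    """Repeatedly overwrite one small block and force it to disk,
    recording the time at which each write completed."""
    stamps = []
    deadline = time.monotonic() + duration_s
    with open(path, "wb") as f:
        while time.monotonic() < deadline:
            f.seek(0)
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push the write through to the virtual disk
            stamps.append(time.monotonic())
            time.sleep(interval_s)
    return stamps
```

Calling `probe` on a file that lives on the LUN under test, pulling the FC connection partway through, and then passing the result to `find_stalls` with a threshold of a second or so should show whether I/O paused briefly or stopped for good.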


8 comments
Monday, April 28, 2008 at 8:47 am
Alphageek
Response at:
http://www.infrageeks.com/groups/infrageeks/weblog/7c424/Fibre_Channel_to_Software_iSCSI_Failover_Failures.html
Monday, April 28, 2008 at 11:16 am
Nick Triantos
Theoretically, this config should work, although it’s not supported by VMware and, by extension, by most if not all storage vendors.
Although it sounds like a good solution, customers tend to be leery of using different stacks. In fact, we’ve supported this type of config for standalone Windows environments and some UNIX environments (e.g., HP-UX) for some time now.
Try setting the Disk TimeOutValue (default = 10 seconds) in the Windows registry to be higher than the FC HBA driver timeout. It’s quite possible the Disk Class driver is timing out the requests before the path switch has completed, especially if you’re queuing a whole bunch of requests…which you are.
Monday, April 28, 2008 at 11:44 am
Duncan
One of our trainers tested this a while back and he told me that it worked, but I doubt he stressed it with IOmeter. I will ask him tomorrow if I see him.
Monday, April 28, 2008 at 3:24 pm
slowe
Duncan,
I thought it worked, too, until I ran the tests with IOmeter. I’m going to keep trying, and I’ll let you know if I get any different results.
Nick,
It seemed to me like it should work, too. I was able to manually fail over the paths as long as IOmeter wasn’t running.
As for the Disk TimeOutValue, I’ll try that, but the error really seems to be at the ESX level. The Windows VM just kept on chugging.
Thanks!
Monday, April 28, 2008 at 6:52 pm
william bishop
Scott, what do the logs tell you from that timeframe?
Tuesday, April 29, 2008 at 6:59 am
Jon
I have seen this behaviour during a failover between two fibre paths. Even if the error looks to be in ESX, you can try changing this registry entry (since the VM is W2003):
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue
BTW, what are the event logs in the W2003 VM saying?
If you can try with a Linux VM, you will probably see the root (/) partition go read-only (for example, Red Hat Linux 4 Update 4 needs this RPM, http://kb.vmware.com/KB/51306, to keep it from going read-only in the event of a failover).
Hope this helps, and congratulations on the blog!
Jon
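The registry change suggested in the comments above could be sketched as a small Python helper that generates a .reg file for the Windows guest. This is only an illustration: the function name `build_reg_file` is hypothetical, and 60 seconds is an assumed example value, since the thread never states the FC HBA driver timeout that the setting needs to exceed.

```python
# Registry key quoted in the comment above.
DISK_KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk"


def build_reg_file(timeout_seconds):
    """Return the text of a .reg file that sets the Disk TimeOutValue
    (a REG_DWORD, in seconds) to `timeout_seconds`."""
    if not 0 < timeout_seconds <= 0xFFFFFFFF:
        raise ValueError("timeout must fit in a positive 32-bit DWORD")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{DISK_KEY}]",
        # .reg files store DWORD values as eight hexadecimal digits
        f'"TimeOutValue"=dword:{timeout_seconds:08x}',
        "",
    ]
    return "\r\n".join(lines)


# 60 is an assumed example, not a value recommended anywhere in this thread.
reg_text = build_reg_file(60)
```

Saving `reg_text` to a file and importing it inside the VM would apply the change; the value should be chosen to exceed the HBA driver timeout actually in use.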
Monday, June 9, 2008 at 12:46 pm
Glenn Dekhayser
Scott-
Just a theoretical here: what about putting in an FC-to-iSCSI bridge, so that you’re not failing over to a new stack? I think that would simplify this situation by making the iSCSI path appear to exist in the FC world. Of course, you’d probably need to build an additional separate fabric, which means an additional FC card (ugh), but it would technically be supported, right? Really expensive, but then again, we’re talking FC SANs!
-Glenn
Monday, June 9, 2008 at 4:23 pm
slowe
Glenn,
Yes, that’s certainly possible, but what I was trying to achieve was having another layer of redundancy with what was already provided “out of the box,” so to speak. I plan to continue testing to see if there is any way to make this work.
I appreciate your comment. Thanks for reading!