During an upgrade of a server running ESX Server 3.0.0 to ESX Server 3.0.2, we also moved the server to a new server room on a new subnet. The upgrade itself was uneventful and took only a few minutes (as I had expected), but what happened afterward caught me a little off-guard, as did the eventual solution.
We needed to change the IP address of the service console, so after the upgrade was complete I simply edited the /etc/sysconfig/network-scripts/ifcfg-vswif0 file to include the new IP address, restarted networking, and went on about my way. Everything seemed fine; the ESX host responded across the network, responded properly within VirtualCenter, powered on the VMs, etc. In hindsight, I probably should have used the esxcfg-vswif command instead of editing the configuration file directly, but as they say, “Hindsight has 20/20 vision.”
It wasn’t until a few minutes later that we realized we were unable to connect to any VM’s console. When we tried to open a VM’s console, we received an error message to the effect that the “host had responded incorrectly”. Strangely enough, this problem only seemed to affect VI client installations; we were able to connect without any problems from the VirtualCenter server itself.
Thinking that perhaps we had run into an ACL on one of the network switches, I tried opening a telnet connection to TCP port 902 on the VirtualCenter server. That worked just fine, so that eliminated the possibility of a router/switch ACL blocking the traffic, and also eliminated the possibility that a host-based firewall on the VirtualCenter server was causing the problem. (A second review a couple minutes later verified again that Windows Firewall was not running and therefore could not be the problem.) It wasn’t DNS name resolution; both the VC server and VI clients were able to resolve the hostnames of all the ESX servers as well as the individual guest VMs.
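For anyone who would rather script that telnet test, here is a minimal Python sketch of the same check. The function name `can_connect` and the three-second timeout are my own choices for illustration, not anything from VMware's tooling. Like telnet, it only confirms that a TCP connection can be opened; it says nothing about whether the service behind the port is answering correctly.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and attempts the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures
        return False
```

Calling something like `can_connect("vc.example.com", 902)` (the hostname is a placeholder) from a VI client machine would reproduce the test described above.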
“Aha!” I thought. “I need to restart mgmt-vmware because I didn’t restart that service after changing the IP address.” Alas, that didn’t work either.
Finally, a Google search turned up this thread and this thread from the VMTN Community Forums, both of which referenced the /root/anaconda-ks.cfg file. An Anaconda kickstart file causing the problem? It didn’t make any sense to me, but just for kicks I made the following changes:
- Edited the /root/anaconda-ks.cfg file to show the correct IP addressing information for the Service Console
- Edited the /etc/sysconfig/network file to have the right gateway IP address (strangely enough, the Service Console seemed to be routing traffic correctly even with an incorrect gateway IP address)
- Restarted networking and the mgmt-vmware services
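Since the old address turned out to be lingering in more than one file, a quick scan for leftovers might have saved some time. The following is only a sketch under a few assumptions: the file list is just the three files mentioned in this post (there may well be others), the function name `find_stale_references` is made up for illustration, and it assumes Python 3, which may not be available on the Service Console itself, so it is probably easiest to run against copies of the files.

```python
import re
from pathlib import Path

# The three files named in this post; other files may also hold the address
CANDIDATE_FILES = [
    "/etc/sysconfig/network-scripts/ifcfg-vswif0",
    "/etc/sysconfig/network",
    "/root/anaconda-ks.cfg",
]


def find_stale_references(old_address, paths=CANDIDATE_FILES):
    """Return (path, line_number, line) for each line mentioning old_address."""
    # Match the address as a whole token so 10.0.0.1 does not match 10.0.0.10
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_address) + r"(?!\.?\d)")
    hits = []
    for path in paths:
        try:
            text = Path(path).read_text(errors="replace")
        except OSError:
            continue  # File missing or unreadable; skip it
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits
```

Running it with the old Service Console IP address, and again with the old gateway address, lists every line that still needs to be corrected before restarting networking and mgmt-vmware.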
Lo and behold, the VM consoles now worked perfectly. I’m still not sure which of the changes actually corrected the problem; I hope to be able to try to recreate this problem in the lab and more closely determine what the exact cause and resolution were. When I have some additional information, I’ll post it here.
Anyone else run into this problem?
Tags: ESX, Networking, Virtualization, VMware


5 comments
Wednesday, August 22, 2007 at 2:35 pm
thomas
Hi Scott,
I agree, anaconda-ks.cfg shouldn’t have anything to do with the issue. This file stores all of the parameters used when installing the host (anaconda is the installation app).
I would be very confident in saying the issue was in your /etc/sysconfig/network file. And yes, your hindsight observation of using ‘esxcfg-vswif’ is dead on (I learned the hard way as well).
Thursday, August 23, 2007 at 11:58 pm
Mark
Hi Scott,
You just saved me from hours of looking around. Mine was definitely just the anaconda-ks.cfg file. Made the correction and the console now works fine. Now to figure out why HA won’t play nice.
Cheers
Mark
Friday, August 24, 2007 at 6:33 pm
slowe
Mark,
Have a look at your DNS settings. I have found that DNS problems will definitely cause problems with HA. In some cases, in fact, I’ve resorted to hard-coding entries in the /etc/hosts file on the ESX servers, so that a DNS outage doesn’t cause problems with HA/DRS.
Monday, August 27, 2007 at 12:54 am
Mark
Hi Scott,
Thanks, got it going. DNS was all fine except that I had the hosts listed in capitals and DNS was all in lower case. Changing the hosts to all lower case seemed to do the trick. Talk about picky.
Cheers
Mark
Sunday, September 9, 2007 at 12:11 pm
Mike
I can confirm that anaconda-ks.cfg also needs to be changed. I moved an existing server to a different IP address (same subnet, so no change to GW) using esxcfg-vswif. After a network restart everything was working (ssh, scp, VI client) except the VM console. I changed the server address in anaconda-ks.cfg and rebooted. The VM console then worked normally.
Mike.