PXE Booting VMware ESX 4.0

I recently had the opportunity to work on a proof of concept (PoC) in which we wanted to help a customer streamline the process of deploying new hosts and reduce the overall deployment time. One of the tools we used in the PoC for this purpose was PXE booting VMware ESX for an automated installation. Here are the details on how we made this work.

Before I get into the details, I’ll provide this disclaimer: there are probably easier ways of making this work. I specifically didn’t use UDA or similar because I wanted to gain the experience of how to do this the “old fashioned” way. I also wanted to be able to walk the customer through the “old fashioned” way and explain all the various components.

With that in mind, here are the components you’ll need to make this work:

  1. You’ll need a DHCP server to pass down the PXE boot information. In this particular instance, I used an existing Windows-based DHCP server. Any DHCP server should work; feel free to use the Linux ISC DHCP server if you prefer.
  2. You’ll need an FTP server to host the kickstart script and VMware ESX 4.0 Update 1 installation files. In this case, I used a third-party FTP server running on the same Windows-based server as DHCP. Again, feel free to use a Linux-based FTP server if you prefer.
  3. You will need a TFTP server to provide the boot files. The third-party FTP server used in the previous step also provided TFTP functionality. Use whatever TFTP server you prefer.

Make sure that each of these components is working as expected before proceeding. Otherwise, you’ll spend time troubleshooting problems that aren’t immediately apparent.

Preparing for the Automated ESX Installation

First, copy the contents of the VMware ESX 4.0 Update 1 DVD (not the actual ISO, but the contents of the ISO) to a directory on the FTP server. Test it to make sure that the files can be accessed by an anonymous FTP user.
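One way to test anonymous access is a few lines of Python using the standard ftplib module. This is only a sketch: the function name anonymous_listing and the ftp_factory parameter are illustrative, and any host or directory you pass in is specific to your own FTP server, not something taken from the PoC.

```python
from ftplib import FTP


def anonymous_listing(host, directory="/", ftp_factory=FTP):
    """Log in anonymously and return the names found in `directory`.

    `ftp_factory` exists only so the function can be exercised without a
    real server; by default it is ftplib.FTP.
    """
    with ftp_factory(host, timeout=10) as ftp:
        ftp.login()  # no arguments means an anonymous login
        return ftp.nlst(directory)
```

If the call raises an error or the listing does not include the installation files and the kickstart script, fix the FTP server before moving on.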

Also go ahead and create a simple kickstart script that automates the installation of VMware ESX. I won’t bother to go into detail on this step here; it’s been quite adequately documented elsewhere. You’ll need to put this kickstart script on the FTP server as well.

At this point, you’re ready to proceed with gathering the PXE boot files.

Gathering the PXE Boot Files

The first task you’ll need to complete is gathering the necessary files for a PXE boot environment.

First, copy the vmlinuz and initrd.img files from the VMware ESX 4.0 Update 1 ISO image; they are in the ISOLINUX folder on the DVD image. Since I use a Mac, for me this was a simple case of mounting the ISO image and copying out the files I needed. Linux users can loop-mount the ISO image in much the same way; Windows users may need a separate tool that can open ISO images.

Next, you’ll need the PXE boot files. Specifically, you’ll need the menu.c32 and pxelinux.0 files. These files are not on the DVD ISO image; you’ll have to download Syslinux from the Syslinux project web site. Once you download Syslinux, extract the files into a temporary directory. You’ll find menu.c32 in the com32/menu folder; you’ll find pxelinux.0 in the core folder. Copy both of these files, along with vmlinuz and initrd.img, into the root directory of the TFTP server. (If you don’t know the root directory of the TFTP server, double-check its configuration.)
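Before moving on, it can help to confirm that all four files really landed in the TFTP root. The following is a minimal sketch in Python; the function name missing_boot_files is illustrative, and the path you pass in is whatever your TFTP server is configured to use.

```python
from pathlib import Path

# The four files the steps above place in the TFTP root.
REQUIRED_FILES = ("vmlinuz", "initrd.img", "menu.c32", "pxelinux.0")


def missing_boot_files(tftp_root):
    """Return the required boot files that are absent from `tftp_root`."""
    root = Path(tftp_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

For example, missing_boot_files("/tftpboot") (a hypothetical path) returns an empty list once everything is in place.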

You’re now ready to configure the PXE boot process.

Configuring the PXE Boot Environment

Once the necessary files have been placed into the root directory of the TFTP server, you’re ready to configure the PXE boot environment. To do this, you’ll need to create a PXE configuration file on the TFTP server.

The file should be placed into a folder named pxelinux.cfg under the root of the TFTP server. The PXE configuration file should be named something like this:

01-<MAC address of network interface on host>

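If the MAC address of the host were 01:02:03:04:05:06, the name of the text file in the pxelinux.cfg folder on the TFTP server would be the following (the leading 01 is the ARP hardware type for Ethernet, and PXELINUX expects the hex digits in lowercase, separated by dashes):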

01-01-02-03-04-05-06
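If you have many hosts, the naming rule is easy to script. Here is a small Python sketch of it; the function name pxelinux_config_name is illustrative, and it assumes an ordinary 48-bit Ethernet MAC address written with colons, dashes, or no separators.

```python
import re


def pxelinux_config_name(mac):
    """Build the pxelinux.cfg filename for an Ethernet MAC address."""
    # Keep only the hex digits so colon, dash, or bare formats all work.
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    octets = [digits[i:i + 2].lower() for i in range(0, 12, 2)]
    # "01" is the ARP hardware type for Ethernet.
    return "-".join(["01"] + octets)


print(pxelinux_config_name("01:02:03:04:05:06"))  # 01-01-02-03-04-05-06
```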

The PoC in which I was engaged involved Cisco UCS, so we knew in advance what the MAC addresses were going to be (the MAC address is assigned in the UCS service profile).

The contents of this file should look something like this (the append line has been wrapped here for readability, and the wrap is marked by a backslash; in the actual file, keep the append line on a single line with no backslash):

default menu.c32
menu title Custom PXE Boot Menu Title
timeout 30
 
label scripted
menu label Scripted installation
kernel vmlinuz
append initrd=initrd.img mem=512M ksdevice=vmnic0 \
  ks=ftp://A.B.C.D/ks.cfg
IPAPPEND 1

You’ll want to replace ftp://A.B.C.D/ks.cfg with the correct IP address and path for the kickstart script on the FTP server.
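If you need one of these files per host, generating them avoids copy-and-paste mistakes. The sketch below renders the same configuration from a template in Python, with the append line kept on a single line. The function name render_pxe_config and its parameters are illustrative; the address used in any call is a placeholder for your own FTP server.

```python
PXE_TEMPLATE = """\
default menu.c32
menu title {title}
timeout 30

label scripted
menu label Scripted installation
kernel vmlinuz
append initrd=initrd.img mem=512M ksdevice=vmnic0 ks={ks_url}
IPAPPEND 1
"""


def render_pxe_config(ftp_host, ks_path="ks.cfg",
                      title="Custom PXE Boot Menu Title"):
    """Return the PXE configuration text for one host."""
    # Build the kickstart URL; a leading slash on ks_path is tolerated.
    ks_url = f"ftp://{ftp_host}/{ks_path.lstrip('/')}"
    return PXE_TEMPLATE.format(title=title, ks_url=ks_url)
```

Write the returned text to a file in the pxelinux.cfg folder, named after the host’s MAC address as described earlier.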

Only one step remains: configuring the DHCP server.

Configuring the DHCP Server for PXE Boot

As I mentioned earlier, I used the Windows DHCP server as a matter of ease and convenience; feel free to use whatever DHCP server best suits your needs. There are only two options that are necessary for PXE boot:

066 Boot Server Host Name (specify the IP address of the TFTP server)
067 Bootfile Name (specify pxelinux.0)

In this particular example, I created reservations for each MAC address. Because the values were the same for all reservations, I used server-wide DHCP options, but you could use reservation-specific DHCP options if you wanted different boot options on a per-MAC address (i.e., per-reservation) basis.
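If you go the ISC DHCP server route instead, a per-host reservation is usually written as a host stanza in dhcpd.conf. PXE setups on ISC dhcpd typically use the next-server and filename statements, which fill the boot server and boot file fields of the DHCP reply rather than options 066 and 067 themselves. I did not use this in the PoC, so treat the following Python sketch as an untested illustration; the function name dhcpd_host_stanza and every host name and address you pass in are placeholders.

```python
def dhcpd_host_stanza(name, mac, ip, tftp_server, bootfile="pxelinux.0"):
    """Return an ISC dhcpd.conf host stanza for one PXE-booted host."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"  next-server {tftp_server};\n"   # address of the TFTP server
        f'  filename "{bootfile}";\n'       # boot file requested via TFTP
        f"}}\n"
    )
```

Generating one stanza per MAC address mirrors the per-reservation approach described above.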

The End Result

Recall that this PoC was using Cisco UCS blades. Thus, in this environment, to prepare for a new host coming online we only had to create a PXE configuration file and a matching DHCP reservation. The MAC address would get assigned via the service profile, and when the blade booted, it would automatically proceed with an unattended installation. Combined with Host Profiles in VMware vCenter, this took the process of bringing new ESX/ESXi hosts online down to mere minutes. A definite win for any customer!


  1. Colin

    Hi Scott

    Great article.

    Which 3rd party FTP server did you use?

    Cheers
    Colin

  2. slowe

    I believe it was 3CDaemon:

    http://support.3com.com/software/utilities_for_windows_32_bit.htm

    Again, any applicable server would work.

  3. Saunders

    Will this procedure work the same on ESXi as well as it works on traditional ESX?

    Thanks for the great article,

    Saunders

  4. solgae

    I used UDA 2.0 when deploying ESX; it is very easy to set up and makes setting up multiple hosts easy as well.

    One thing I’d like to add is that if your service console is not on the native VLAN, and you are connecting your service console NIC to the trunk port, your NIC will need to be set up (usually in BIOS) to apply the VLAN tag during the PXE boot-up stage. Otherwise, you’ll have to make sure the service console VLAN is untagged.

    For me, I was deploying ESX on an HP blade environment with Virtual Connect. I basically followed one of your articles, and when I assign the shared uplink profile to the server profile, I set the service console VLAN as untagged. This allowed the blades to get to the DHCP server and PXE boot as needed.

  5. slowe

    Saunders,

    I haven’t tested ESXi Installable in this sort of setup (yet). Have a look at this PDF from VMware’s site; it provides more information:

    http://www.vmware.com/pdf/vsp_4_pxe_boot_esxi.pdf

    If you do some testing, let me know what results you uncover. I’ll do likewise. Thanks!

  6. Jason Boche

    Good article Scott. I traversed the easy path and use UDA in my lab.

    As a side note, the 2/24 vCalendar entry addressed configuring a MS DHCP server for PXE booting.

    Keep the good stuff coming!
    Jas

  7. slowe

    Solgae,

    You are absolutely correct about the native (untagged) VLAN.

  8. Ceri Davies

    Solgae,

    I don’t think that just configuring the NIC with a VLAN for PXE is good enough. Weasel (the ESX4 installer) doesn’t support VLANs during the early stages (or at all), so when pxelinux.0 hands off to Weasel, you’ll find that it’s unable to retrieve the kickstart file.

    As far as I’m aware, there are only two ways around this:
    a) Use the native VLAN;
    b) Put the kickstart file on an ISO and don’t use PXE.

    I’ve recently had to do b) as, for some reason or another, our network guys couldn’t get a native VLAN to work alongside trunked VLANs (heterogeneous equipment; suspect they could have fixed it given enough time, but there wasn’t any spare) and we had to tag everything. There was no way to tell Weasel to use a trunked VLAN to retrieve the kickstart file.

    I have filed an RFE to have Weasel support VLANs at this early stage, but so far, it does not.

    Rather than hardcode all the network settings on the ISO, I hacked up the kickstart script to prompt for details; will write it up at some point, along with my experiences with the Dell M1000e chassis.

  9. Andy

    Just as an FYI for the network guys reading this. You only have to set the correct VLAN as native (untagged) on the trunk port for the NIC you are going to use for the PXE. Anything “upstream” of that can stay tagged in any way you want it.

    Great article Scott.

  10. russell

    Interesting aside;

    If you find yourself staring at a screen that briefly says “no COS NICs defined by user” followed by a “Press to reboot” you probably have an issue in your kickstart config file.

    Hit Alt+F4 (or some other virtual console) and you can hit Enter for a console. Then you can check out the logs in /var/log to find out what’s wacky. Ran into some issues this morning where the VMware documentation was incorrect (specifically on the syntax of the part command’s --onfirstdisk option; it incorrectly states that you supply =ARGUMENTSGOHERE when there are no arguments to feed it).

  11. Jeramiah Dooley

    As the customer for whom the PoC was being done, I can attest that it worked great. Even better was the integration between the PXE booting and the boot profiles within the UCS Manager, however. By setting every blade to do a dual boot, SAN and then PXE, we had a way to handle brand new blades being deployed as well as existing blades being rebooted. For unconfigured blades, the PXE boot handled the vSphere install onto a presented LUN (make sure it’s LUN 0!), and then after a reboot the blade booted directly to the SAN and brought up the console.

    We took a single blade from newly installed to ready to register with vCenter in 24 minutes, and we installed, configured, presented storage to and provisioned customer networks on 32 hosts, broken into four separate HA/FT capable clusters, in less than 5 hours. We were looking for operational efficiency out of the PoC and we definitely found it! It all started with the PXE booting, so a big thanks to Scott and the whole team for their efforts.

  12. John Kennedy

    Having done this myself a time or two, I’ve fallen back on Tftpd32 (http://tftpd32.jounin.net/). Yes, it’s a Windows app, but it incorporates DHCPD, TFTPD, and SYSLOG support. It’s handy for labs, quick POCs where the customer isn’t particularly interested in the details of PXE but rather wants to see ESX installed quickly, etc. UDA is fantastic for doing installs, very easy and I recommend it highly.

  13. Jose Ruelas

    quick question: should not the post be called PXE Install instead of PXE boot??

    kind regards
    Jose Ruelas

  14. Niels

    It is possible to use VLANs during a PXE deployment. Add vlanid=### to the APPEND line, and off you go.

    Mrepo (http://dag.wieers.com/home-made/mrepo/) can be handy for creating an installation repository (for ESX or other RPM-based Linux distros); it also copies the relevant PXE files and puts them in the right TFTP spot.

    Best regards,
    Niels

  15. paul

    Can you help me troubleshoot the following? FTP auto installation of ESX 4 gets to 97%, then aborts. The script is installing everything else; my last resort was modifying isolinux.cfg.

    95% complete-making the initial ramdisk
    error: unable to regenerate initrd, system in unknown state. please
    check log files for errors and correct before rebooting

    error: failed to update /boot/trouble.
    error: /usr/sbin/esxcfg-boot failed, examine /var/log/vmware/esxcfg-boot

    95% complete-writing GRUB to the master boot record
    97% complete-boot setup
    No.
    installation aborted

    [Errno 2] no such file or directory: ‘/mnt/sysimage/boot/initrd.img’
    see /var/log/esx_install.log

    press to reboot

  16. Yiping

    Does it have to use FTP for ks.cfg? Would HTTP or NFS work ?

  17. slowe

    Yiping,

    In theory, there’s no reason you couldn’t use HTTP or NFS. I tested it with FTP, so that’s what I described.

  18. windows tips

    I am having the same issue as the one reported by Paul: auto installation of ESX 4 gets to 97% and stops right after that. Can anyone please troubleshoot this?

  19. Marcos

    I need help. My situation is as follows.

    I have ESX servers running on Blade Center HP. The connections with my Data Center are all in trunk mode.

    I have a network card (NIC) of NOC and SOC in this where run my Service Console (Mode Trunk).

    I have the ESX virtualized servers, Citrix Provisioning Manager, XenApp to create a VDI solution for my end users.

    My DHCP server is a Cisco network appliance (Catalyst switch), and it sets options 66 and 67 to provide the IP of my TFTP server (Citrix Provisioning) and the boot file (.bin).

    When the station is initialized, the Cisco appliance provides an IP for the machine, but it cannot connect to the TFTP server, which is in another VLAN. I have one VLAN for stations and one VLAN for servers.

    Should I create another VLAN just for the PXE and include this VLAN on the native VLAN in the network settings for the trunk port for the service to work? Should I create another virtual switch in ESX without a VLAN tag?

  20. Frans

    When the auto installation ends at 97%, there must be an error in the %post install section. Can you post the settings from your %post config?

  21. Brian

    Ceri,

    Any updates on your M1000e experience w/PXE?

  22. Heidi Vuolera

    I am also stuck at 97%. Has there been an update or fix for this?
