Setting Up a Remote Rescue Environment with OpenWRT and PXE

A client of mine called recently with a busted computer. After a power outage at his site, one of the machines powered up to the dreaded "No bootable devices found" message right after the Dell splash screen. I suspected hard drive failure but wanted to confirm it. The problem is that I can't easily get on site with this client, so I looked to OpenWRT and a network PXE boot to boot Linux on the system and troubleshoot remotely.

Goal

To be able to access a system remotely to troubleshoot when the installed operating system won't boot. While a graphical interface (via RDP or VNC) would be nice, a command-line interface is the requirement.

Prerequisites

To do a PXE network boot, we need a few things:

  1. DHCP Server - that can send commands to systems to do a PXE boot
  2. PXE Bootloader - some firmware that can download and start a Linux image
  3. TFTP Server - newer UEFI systems can boot over HTTP, but all the old BIOS and early UEFI systems only support TFTP for downloading the bootloader
  4. HTTP Server - while technically not required for PXE, HTTP is easier to use and troubleshoot than TFTP

This solution will use:

  1. OpenWRT - I've tested variations of this on various OpenWRT releases over the years, but this particular config I tested on 18.06.1
  2. DHCP/TFTP - dnsmasq, the built-in DNS/DHCP server in OpenWRT, already has TFTP support as well!
  3. HTTP - lighttpd or uhttpd, the built-in HTTP/web servers in various OpenWRT distributions
  4. iPXE - a very capable PXE firmware that I prefer over syslinux's PXELINUX due to its easier and more versatile configuration
  5. SystemRescueCD - a rescue-focused Linux distribution that comes with lots of diagnostic tools pre-installed, a command-line interface, and boot parameters that trigger a remotely-accessible SSH instance. I tested SystemRescueCD version 6.1.1.

A note about UEFI

I've not tried UEFI network booting yet. I've only used the legacy BIOS PXE methods. iPXE can be built for use in UEFI cases: use snponly.efi (uses EFI's built-in SNP network stack - preferred) or ipxe.efi (has built-in network card drivers - used usually if booting from USB or CD). More description of these options and filenames is on iPXE's website. Also this looks like a good set of iPXE and EFI instructions but I haven't tested them myself yet.

Compatibility and Security Concerns

Be aware that TFTP has NO access-controls and everything in the /tftp directory will be public to everyone who can reach your TFTP server. By default, the firewall on OpenWRT will allow all connections from the LAN network to TFTP and block all from the WAN.

PXE booting (by default) has NO authentication - clients just blindly accept code from the remote source and execute it. There are some workarounds that add a degree of authentication, but a legacy PXE boot is essentially an open door for an attacker with physical access to the network to send code for execution at boot. To mitigate this (slightly), I set up the client systems to only PXE boot on demand (requiring a keypress at boot) so that they aren't asking for remote code all the time.

The remote rescue image's root password is sent plaintext! It's specified in the ipxe.cfg config which is pulled without encryption. Use the image quickly for rescue then reboot.
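One way to limit the exposure is to use a fresh password for each rescue session instead of leaving a fixed one in the file. Here is a minimal sketch, assuming the config lives at /tftp/ipxe.cfg and contains a rootpass= parameter as in the config further down. The helper names gen_rootpass and set_rootpass are my own, not part of any tool, and this does nothing about the password still crossing the LAN unencrypted.

```shell
#!/bin/sh
# Sketch only: gen_rootpass and set_rootpass are hypothetical helper names.

# Print a random 16-character alphanumeric password.
gen_rootpass() {
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 16
}

# Replace the value of rootpass= in an iPXE config file.
# $1 = path to the config, $2 = new password (alphanumeric only)
set_rootpass() {
    sed -i "s/rootpass=[^ ]*/rootpass=$2/" "$1"
}

# Example on the router (commented out so nothing is changed by accident):
# pw=$(gen_rootpass) && set_rootpass /tftp/ipxe.cfg "$pw" && echo "rescue password: $pw"
```

Run it right before asking the client to trigger the PXE boot, and again afterwards so the password you just used is no longer valid.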

On compatibility:

  • I discovered a WiFi-enabled credit card processor that wouldn't boot and network correctly when we sent it the DHCP option for a PXE boot. So our solution will include a way to disable the PXE boot for certain devices.
  • SystemRescueCD is not terribly lightweight... Its squashfs file is 500MB+! This means the target computer really needs at least 1GB of RAM to boot it successfully.

Storage

SystemRescueCD is pretty big! You'll need at least 600MB to store it. I store it on a USB flash drive permanently plugged into my router. Another option would be to host it on a separate system, but my intention was to have it all hosted by my highly-reliable, no-moving-parts Linux router.
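Before copying anything over, it's worth checking that the target filesystem really has the room. A small sketch, where has_free_mb is a hypothetical helper name and /mnt/usb is only an assumed mount point for the flash drive:

```shell
#!/bin/sh
# has_free_mb DIR MB
# Succeeds if the filesystem holding DIR has at least MB megabytes free.
has_free_mb() {
    # -P forces one line per filesystem, -k reports 1K blocks;
    # column 4 of the second line is the available space
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $2 * 1024 )) ]
}

# Example (the /mnt/usb mount point is an assumption):
# has_free_mb /mnt/usb 600 || echo "not enough room for SystemRescueCD"
```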

Enable TFTP in dnsmasq

On your OpenWRT device, create a directory for TFTP and then enable the use of it.

mkdir -p /tftp
# quote each whole argument so the shell leaves the [0] alone and uci gets the bare value
uci set 'dhcp.@dnsmasq[0].enable_tftp=1'
uci set 'dhcp.@dnsmasq[0].tftp_root=/tftp'
uci commit dhcp

Next, edit /etc/dnsmasq.conf and add the following settings. These tell dnsmasq to add a DHCP option naming the boot file the client should download and run. The NOPXE tag tells dnsmasq NOT to send a boot file option to devices we specify. The ENH tag is set when the request carries the iPXE user class, which tells us whether it is the BIOS requesting an IP or the iPXE bootloader requesting one (in which case we want to send the iPXE config file instead of the iPXE code again). More details on this are on the iPXE website.

dhcp-userclass=set:ENH,iPXE
dhcp-boot=tag:!ENH,tag:!NOPXE,undionly.kpxe
dhcp-boot=tag:ENH,ipxe.cfg
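To make the tag logic concrete, here is a toy model of the decision those three lines describe. It is not dnsmasq code, just the same rules written as a shell function (pick_bootfile is a made-up name), and it simplifies the user class check to an exact match:

```shell
#!/bin/sh
# pick_bootfile USERCLASS HOSTTAG
# Prints the boot file the three dnsmasq lines above would hand out.
# Toy model for illustration only, not dnsmasq itself.
pick_bootfile() {
    userclass=$1
    hosttag=$2
    if [ "$userclass" = "iPXE" ]; then
        # dhcp-userclass sets ENH, so dhcp-boot=tag:ENH,ipxe.cfg applies
        echo "ipxe.cfg"
    elif [ "$hosttag" != "NOPXE" ]; then
        # neither ENH nor NOPXE is set: first-stage BIOS PXE request
        echo "undionly.kpxe"
    else
        # NOPXE host without the iPXE user class: no boot file is offered
        echo "(none)"
    fi
}

pick_bootfile ""     ""        # plain BIOS PXE client
pick_bootfile "iPXE" ""        # iPXE asking again after it has loaded
pick_bootfile ""     "NOPXE"   # excluded device
```

The second request is what breaks the loop: without the ENH check, iPXE would be told to download undionly.kpxe again and again.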

If you need to disable PXE boot options for certain devices (think IoT/embedded devices like thermostats, DVRs, payment processors), use the following config template in /etc/config/dhcp and change the IPs and MACs as needed:

config host 'paymentprocessor'
        option ip '192.168.23.10'
        option mac '01:23:45:af:be:cd'
        option tag 'NOPXE'

config host 'nvr'
        option ip '192.168.20.11'
        option mac '01:23:45:af:be:ce'
        option tag 'NOPXE'
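If you have more than a couple of devices to exclude, a small helper can print these stanzas for you. This is only a sketch: nopxe_host is a hypothetical name, and the name, IP, and MAC in the example are placeholders.

```shell
#!/bin/sh
# nopxe_host NAME IP MAC
# Prints a 'config host' stanza tagged NOPXE, in the same shape as the
# template above.
nopxe_host() {
    printf "config host '%s'\n" "$1"
    printf "        option ip '%s'\n" "$2"
    printf "        option mac '%s'\n" "$3"
    printf "        option tag 'NOPXE'\n\n"
}

# Example (placeholder values; review the output before appending it):
# nopxe_host thermostat 192.168.1.50 aa:bb:cc:00:11:22 >> /etc/config/dhcp
```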

To enable all this, restart your dnsmasq server:

/etc/init.d/dnsmasq stop
# wait 10 seconds
/etc/init.d/dnsmasq start

Download iPXE bootloader

Next we need to get the latest iPXE bootloader and put it in /tftp:

cd /tftp
wget http://boot.ipxe.org/undionly.kpxe

ipxe.cfg

Put this ipxe.cfg file in /tftp so that dnsmasq can deliver it per the config above.

#!ipxe

:start
menu
item rescue   Remote Rescue (SystemRescueCD)
item memtest  MemTest86
item shell    Drop to Shell
item reboot   Reboot
item exit     Exit and continue BIOS Boot
choose os && goto ${os}

:shell
echo Type 'exit' to return to this menu
shell
set menu-timeout 0
set submenu-timeout 0
goto start

:reboot
reboot

:exit
exit

:rescue
kernel http://${next-server}/boot/sysresccd/6.1.1/boot/i686/vmlinuz initrd=initrd.img rootpass=rescueme archisobasedir=6.1.1 archiso_http_srv=http://${next-server}/boot/sysresccd/ checksum ip=dhcp nofirewall
initrd http://${next-server}/boot/sysresccd/6.1.1/boot/i686/sysresccd.img
boot

:memtest
imgload http://${next-server}/boot/memdisk iso
module http://${next-server}/boot/memtest86.iso
boot

${next-server} is an iPXE setting that holds the IP address of the boot server as provided by the DHCP server (usually the DHCP server itself when using dnsmasq). Since we host the TFTP and HTTP servers on the same OpenWRT router, this should take us back to the right place!
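At this point the TFTP directory should hold both files that dnsmasq hands out. A quick sanity check can save a confusing failed boot later. The function name check_tftp_root is my own, and the sketch only tests that the bootloader is non-empty and that the config starts with the #!ipxe line iPXE requires of a script:

```shell
#!/bin/sh
# check_tftp_root DIR
# Verifies that the bootloader and the iPXE script are in place.
# Returns 0 if both look fine, 1 otherwise.
check_tftp_root() {
    dir=$1
    rc=0
    if [ ! -s "$dir/undionly.kpxe" ]; then
        echo "missing or empty: $dir/undionly.kpxe"
        rc=1
    fi
    # iPXE only runs a file as a script if its first line is '#!ipxe'
    if [ "$(head -n 1 "$dir/ipxe.cfg" 2>/dev/null)" != "#!ipxe" ]; then
        echo "missing or not an iPXE script: $dir/ipxe.cfg"
        rc=1
    fi
    return $rc
}

# Example:
# check_tftp_root /tftp && echo "TFTP files look OK"
```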

Set Up SystemRescueCD for PXE Booting

Download the SystemRescueCD ISO. I tested the 32-bit (i686) build of version 6.1.1. From the ISO, you'll need to extract the following files and put them in this layout on your HTTP server:

/www/boot/sysresccd/6.1.1/boot/i686/sysresccd.img
/www/boot/sysresccd/6.1.1/boot/i686/vmlinuz
/www/boot/sysresccd/6.1.1/boot/amd_ucode.img
/www/boot/sysresccd/6.1.1/boot/amd_ucode.LICENSE
/www/boot/sysresccd/6.1.1/boot/intel_ucode.LICENSE
/www/boot/sysresccd/6.1.1/boot/intel_ucode.img
/www/boot/sysresccd/6.1.1/i686/airootfs.sha512
/www/boot/sysresccd/6.1.1/i686/airootfs.sfs
/www/boot/sysresccd/6.1.1/pkglist.i686.txt
/www/boot/sysresccd/6.1.1/VERSION
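It is easy to miss a file when copying by hand, so here is a sketch that checks a directory against the list above. check_layout is a hypothetical helper; it only tests that each file exists, not that its contents are intact.

```shell
#!/bin/sh
# check_layout ROOT
# Reports any file from the layout above that is missing under ROOT.
# Returns 0 if everything is present, 1 otherwise.
check_layout() {
    root=$1
    missing=0
    for f in \
        boot/i686/sysresccd.img \
        boot/i686/vmlinuz \
        boot/amd_ucode.img \
        boot/amd_ucode.LICENSE \
        boot/intel_ucode.LICENSE \
        boot/intel_ucode.img \
        i686/airootfs.sha512 \
        i686/airootfs.sfs \
        pkglist.i686.txt \
        VERSION
    do
        if [ ! -f "$root/$f" ]; then
            echo "missing: $root/$f"
            missing=1
        fi
    done
    return $missing
}

# Example:
# check_layout /www/boot/sysresccd/6.1.1 && echo "layout complete"
```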

lighttpd

The stock OpenWRT uses uhttpd which should work out-of-the-box for this setup. I prefer the GL.inet hardware and their stock firmware uses lighttpd by default. To make this work, I had to add the following to /etc/lighttpd.conf so that directory listings were enabled:

$HTTP["url"] =~ "^/boot($|/)" { dir-listing.activate = "enable" }

Test it!

Make sure you've restarted dnsmasq to enable the TFTP server and DHCP boot options... Then reboot a PXE system and trigger the LAN boot. Beware that most systems do not have PXE or LAN booting enabled by default. Look through the BIOS settings.
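Once the rescue image is up, you need its IP address to SSH in. Since dnsmasq handed out the lease, the router already knows it. A minimal sketch, assuming the usual OpenWRT lease file at /tmp/dhcp.leases and that you know the target's MAC address (lease_ip is a made-up helper name, and the MAC in the example is a placeholder):

```shell
#!/bin/sh
# lease_ip LEASEFILE MAC
# Prints the IP address currently leased to MAC.
# dnsmasq lease lines look like: <expiry> <mac> <ip> <hostname> <client-id>
lease_ip() {
    awk -v mac="$2" 'tolower($2) == tolower(mac) { print $3 }' "$1"
}

# Example on the router (placeholder MAC):
# ssh root@"$(lease_ip /tmp/dhcp.leases 00:11:22:33:44:55)"
```

Log in as root with whatever you set in the rootpass= boot parameter.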

Other Useful Things to Boot

  1. Hardware Detection Tool - Helpful syslinux utility that presents a nice menu with a break-out of all the hardware identified on the system. However, it is not packaged quite right (iPXE's public build won't boot COM32 modules directly) so you'll have to repackage as a floppy-disk or ISO image. Or chainload pxelinux to start it.
  2. FreeDOS - go old-school with this barebones open-source operating system
  3. memtest - see below
  4. SeaTools Hard Drive Diagnostics - seems to work for non-Seagate drives as well
  5. Linux installers - most Linux distributions support network booting and installation and this can be a convenient way to start that process. I've used Red Hat kickstart scripts with RHEL, CentOS, and Fedora to automate installations.
  6. Other Live Linux images - like CoreOS.

memtest

memtest is a really useful memory diagnostic tool. It thoroughly tests your RAM in multiple different ways to identify any faulty regions. If you're having bizarre computer behavior (weird lock-ups, random restarts, blue screens of death, etc), then try memtest to see if bad memory might be the culprit.

To set up memtest86+ for network booting, we need its ISO plus the memdisk module from syslinux. Use the following commands to download both and put them in place.

cd /tmp
# first get and extract the memtest ISO image
wget http://memtest.org/download/5.31b/memtest86+-5.31b.iso.gz
gunzip memtest86\+-5.31b.iso.gz
mv memtest86\+-5.31b.iso /www/boot/memtest86.iso
# now extract the memdisk image loader
wget https://mirrors.edge.kernel.org/pub/linux/utils/boot/syslinux/syslinux-6.03.zip
unzip syslinux-6.03.zip bios/memdisk/memdisk
mv bios/memdisk/memdisk /www/boot/memdisk

More References

  1. http://boot.salstar.sk/ - I've found some great iPXE examples here for how to boot and configure most Linux distributions. Plus, you can either copy or just import his configs directly (if you trust them... I usually use them as examples and modify them for my needs)
  2. https://backreference.org/2013/11/24/pxe-server-with-dnsmasq-apache-and-ipxe/ - I was originally inspired by this post.
  3. https://git.archlinux.org/archiso.git/plain/docs/README.bootparams - boot parameters for Arch Linux (the underlying distribution for SystemRescueCD 6.0 and newer)
  4. http://www.system-rescue-cd.org/manual/PXE_network_booting/ - SystemRescueCD's directions for PXE booting. I had to customize them for use with iPXE.