Illegal OpCode Red Screen of Death while booting an HP ProLiant server from a USB SD card

[Update]
As per Jason’s comment, HP has apparently fixed an issue related to booting from SD cards with a new iLO 4 update. Whether this is the same issue is unclear, though, since the original KB article I linked to has not been updated.
[/Update]

Important note: The general symptom of such a Red Screen of Death described here is NOT specific to ESXi or booting from SD cards in general. It can happen with Windows, Linux or any other OS as well as other boot media such as normal disks/RAID arrays, if the server has a problem booting from this device (broken boot sector/partition/boot loader etc).

A couple of weeks ago I was applying ESXi patches via VUM to a few HP ProLiant DL360p Gen8 servers that run ESXi from a local SD card, so business as usual. Almost, because on one of the servers I ran into the following issue:
After rebooting the host, the BIOS POST completed fine and the server should then have booted ESXi from its attached USB SD card, where ESXi was installed. Instead, it displayed this unsightly screen telling me something had gone very, very wrong:

[Image: iloillegalopcode]

I reset the server several times via iLO, but the issue persisted and I had no idea what exactly had gone bonkers here. Then I decided to boot a Linux live image, which worked fine, narrowing down the issue to the OS installation (device) itself. I thought the updates had corrupted the installation, but that wasn’t the case.
When attempting to mount the SD card USB drive from within the live Linux, I noticed it was completely absent from the system. The USB bus was still OK, but lsusb showed no SD card reader device in the system at all!
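The lsusb check described above can also be scripted, which is handy when comparing several hosts. Below is a minimal Python sketch that parses lsusb-style text and looks for a device by name. The function names, the search keyword and the sample output (including the "Example SD Card Reader" entry with ID 0000:0000) are made up for illustration; the name your card reader reports will differ.

```python
import re

# Matches one line of `lsusb` output, for example:
# "Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub"
LSUSB_LINE = re.compile(
    r"^Bus (?P<bus>\d{3}) Device (?P<device>\d{3}): "
    r"ID (?P<vendor>[0-9a-fA-F]{4}):(?P<product>[0-9a-fA-F]{4})\s*(?P<name>.*)$"
)


def parse_lsusb(output):
    """Return one dict per USB device found in lsusb-style text."""
    devices = []
    for line in output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            devices.append(match.groupdict())
    return devices


def find_devices(output, keyword):
    """Return the devices whose description contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [d for d in parse_lsusb(output) if keyword in d["name"].lower()]


# Hypothetical sample output: the card reader line is invented for this example.
SAMPLE_WITH_READER = """
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 003: ID 0000:0000 Example SD Card Reader
"""

# Same host after the reader vanished from the bus (also hypothetical).
SAMPLE_WITHOUT_READER = """
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
"""

for label, text in (("before", SAMPLE_WITH_READER), ("after", SAMPLE_WITHOUT_READER)):
    readers = find_devices(text, "card reader")
    print(label, "->", "reader present" if readers else "reader missing")
```

On a live system the text would come from running lsusb itself, for example via subprocess.run(["lsusb"], capture_output=True, text=True).stdout, instead of the sample strings.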

Just to make sure I wasn’t imagining things, I booted an ESXi installation medium too, and likewise it didn’t detect the local SD card, only the local RAID controller volume.

So the Illegal OpCode Red Screen of Death was probably the result of the server trying to force a boot from the local RAID array volume, which is a pure GPT VMFS5 volume without a proper boot partition.
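To back up a suspicion like this from a live Linux, one option is to look at the first sector of the disk in question. The Python sketch below is a rough heuristic, not something I ran on the server at the time: it checks the 0x55AA boot signature, looks for an active partition entry and a protective GPT entry (type 0xEE) in the classic MBR partition table, and tests whether the boot code area contains anything but zeros. The helper make_sector and the sample sectors are made up for illustration.

```python
SECTOR_SIZE = 512
BOOT_CODE_SIZE = 440           # boot code area at the start of the sector
PARTITION_TABLE_OFFSET = 446   # four 16-byte partition entries start here
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR
ACTIVE_FLAG = 0x80             # status byte of a bootable partition
GPT_PROTECTIVE_TYPE = 0xEE     # partition type of a protective MBR entry


def inspect_first_sector(sector):
    """Summarize what the first 512 bytes of a disk say about bootability."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes")
    entries = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        status, ptype = sector[start], sector[start + 4]
        if ptype != 0:  # type 0 means the slot is unused
            entries.append((status, ptype))
    return {
        "boot_signature": sector[510:512] == BOOT_SIGNATURE,
        "has_boot_code": any(sector[:BOOT_CODE_SIZE]),  # heuristic: non-zero bytes
        "active_partition": any(s == ACTIVE_FLAG for s, _ in entries),
        "protective_gpt": any(t == GPT_PROTECTIVE_TYPE for _, t in entries),
    }


def make_sector(boot_code=b"", entries=()):
    """Build a fake first sector for the examples below (illustration only)."""
    sector = bytearray(SECTOR_SIZE)
    sector[:len(boot_code)] = boot_code
    for index, (status, ptype) in enumerate(entries):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        sector[start] = status
        sector[start + 4] = ptype
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


# A data-only GPT disk: protective entry, no boot code, nothing marked active.
data_only_gpt = make_sector(entries=[(0x00, GPT_PROTECTIVE_TYPE)])
# A classic bootable MBR disk: some boot code and an active Linux partition.
bootable_mbr = make_sector(boot_code=b"\xeb\x63\x90", entries=[(ACTIVE_FLAG, 0x83)])

print(inspect_first_sector(data_only_gpt))
print(inspect_first_sector(bootable_mbr))
```

On a real system the 512 bytes would be read as root from the block device, e.g. open("/dev/sdX", "rb").read(512) with sdX standing in for the actual device. A disk that reports a protective GPT entry but no boot code is a data volume the BIOS has nothing to boot from.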

I first thought the SD card reader or SD card was faulty but after googling around for a while I stumbled upon this article:
HP Advisory: ProLiant DL380p Gen8 Server -Server May Fail to Boot From an SD Card or USB Device After Frequent Reboots While Virtual Media Is Mounted in the HP Integrated Lights-Out 4 (iLO 4) Integrated Remote Console (IRC)

DESCRIPTION
In rare instances, a ProLiant DL380p Gen8 server may fail to boot from an SD card or a USB device after frequent reboots while Virtual Media is mounted in the HP Integrated Lights-Out 4 (iLO 4) Integrated Remote Console (IRC).
This issue can occur if the server is rebooted approximately every five minutes. If this occurs, the following message will be displayed: Non-System disk or disk error-replace and strike any key when ready
SCOPE
Any HP ProLiant DL380p Gen8 server with HP Integrated Lights-Out 4 (iLO 4).
RESOLUTION
If a ProLiant DL380p Gen8 server fails to boot from an SD card or a USB device, cold boot the server to recover from this issue.

The article only mentions DL380p Gen8 servers, but I imagine the same could apply to the DL360p Gen8 or other servers as well. The problem description doesn’t fit my case all that well either, but I tried cold booting the server as instructed, and this did the trick. After I left the server powered off for about 5 minutes and powered it on again, it detected the SD card again and booted the ESXi installation on it just fine.
For good measure I rebooted the server another time, which also went without a hitch.

The key takeaways here:
1. As per the HP advisory mentioned above, the USB SD card device of a ProLiant DL380p/DL360p Gen8 server might randomly disappear during a reboot, so be aware of that and try cold booting the server in that case.
2. When dealing with an Illegal OpCode boot error on an HP ProLiant server like the one shown above, make sure you have a valid boot device and that the BIOS is properly configured to boot from it.
On a physical Linux host, for example, the GRUB boot loader might be corrupted, which can easily be fixed by reinstalling GRUB from a live Linux. I’ve had that happen to me with physical Linux servers before.


9 thoughts on “Illegal OpCode Red Screen of Death while booting an HP ProLiant server from a USB SD card”

  1. I ran into the same issue with my DL360p G8 upon reboot. It appears the boot order was configured to boot from the RAID array first. I changed it to make the SD card (aka USB device) the first boot device. The server now boots normally into the VMware hypervisor without any issues.

  2. I had the exact same issue. I worked with both HP support and VMware support, only to have the VMware support person find this blog. Once I changed the boot order to boot from USB first, BOOM, it was fixed! Thank you SOOO much.

  3. Pingback: HP Red Screen Of Death | |rootzilopochtli.com|

  4. illegal opcodes are when you run a nonexistent/non-implemented processor instruction…i don’t exactly see how that relates to booting from an external SD..?

    • As I described this can happen in any case where the system tries to boot from an invalid medium. The boot SD-card disappearing in my case is just one example:
      The general symptom of such a Red Screen of Death described here is NOT specific to ESXi or booting from SD cards in general. It can happen with Windows, Linux or any other OS as well as other boot media such as normal disks/RAID arrays, if the server has a problem booting from this device (broken boot sector/partition/boot loader etc).

      Try to power-off the system for a few minutes and boot up the server again (maybe even completely unplug the server from power and leave it unplugged for a few minutes).
      If that doesn’t help, update the BIOS and iLO, and if that still doesn’t work, make sure the boot medium (i.e. your RAID array or physical hard disk) is OK and not corrupted.
      You can boot a live-Linux CD to verify the actual disk boot sector/partition/file systems are intact.

    • I don’t know if this is useful, but my boot order was set to RAID, so I changed that to SATA AHCI and put my hard disk first in the boot order, and then I could get into the system normally.
