Renamed VMware Tools components and automatic installation

With the release of vSphere 6.0 and a recent update for the 5.5 VMware Tools on May 8th 2015 (9.4.12 build 2627939), VMware also changed the Windows VMware Tools installer slightly by renaming some vShield-related components:
VMware ESXi 5.5, Patch ESXi550-201505402-BG: Updates tools-light

The vShield Endpoint drivers are renamed as Guest Introspection Drivers and two of these drivers, NSX File Introspection Driver (vsepflt.sys) and NSX Network Introspection Driver (vnetflt.sys), can be installed separately now. This allows you to install the file driver without installing the network driver.

If you’ve been using custom automated installations of the VMware Tools like me then you might have to adjust the installer command for newer tools versions.

Continue reading

Illegal OpCode Red Screen of Death while booting a HP Proliant server from an USB SD card

As per Jason’s comment, with a new ILO4 update HP apparently has fixed an issue related to booting from SD cards. Whether this is the same issue is unclear though since the original KB article I linked to has not been updated.

Important note: The general symptom of such a Red Screen of Death described here is NOT specific to ESXi or booting from SD cards in general. It can happen with Windows, Linux or any other OS as well as other boot media such as normal disks/RAID arrays, if the server has a problem booting from this device (broken boot sector/partition/boot loader etc).

A couple of weeks ago I was updating a few HP Proliant DL360p Gen8 servers running ESXi on a local SD card with ESXi patches via VUM, so business as usual. Almost, because on one of the servers I ran into the following issue:
After rebooting the host, the BIOS POST completed fine and the Proliant DL360p Gen8 server should now boot the ESXi OS from it’s attached USB SD card where ESXi was installed; but instead it displayed this unsightly screen telling  something went very, very wrong:

iloillegalopcodeI reset the server several times via iLO but the issue persisted and I had no idea what exactly went bonkers here. Then I decided to boot a Linux live image, which worked fine, narrowing down the issue to the OS installation (device) itself. I thought the updates corrupted the installation but that actually wasn’t the case.
When attempting to mount the SD card USB drive from within the live Linux I noticed it was actually completely absent from the system. The USB bus was still ok, but lsusb showed no SD card reader device in the system at all!

Continue reading

Storage performance testing: Not all IOPS are created equal

Every once in a while I come across postings of astonishingly awesome IOPS numbers achieved with a relatively moderate setup. Often this is due to the fact that the benchmark used ran a “maximum throughput” IO pattern, which doesn’t say a lot about actual storage performance. This is because these kinds of patterns issue idealized, sequential IO (often with small IO sizes) to measure the maximum, theoretical throughput of the storage subsystem. Unfortunately this kind of IO pattern practically never occurs with real world applications.

An IO stream hitting a storage device:

  • Is Random to a certain degree
  • Has a certain read/write ratio
  • Can have variable IO sizes between 512 Byte and several MiB (though that is hardly real-world relevant, up to 128KiB is normal)
  • Is stuffed into queues at various levels in the storage stack with various queue depths

All these factors have a huge influence on what everyone casually and generalizing calls “IO Operations per Second”, or IOPS. The quite basic rule of thumb is:
The smaller the IO size and the more sequential the workload, the higher the IOPS number will be.
It should be noted that randomness of IO does not affect flash-based storage like SSDs, but impacts traditional spinning disks to a very large degree. Also, taken a single (spinning) disk, write IOs are not necessarily more expensive than read IOs, considering the local cache on every disk has is a pure read-cache (and don’t you dare to enable write-caching on the disks themselves). Of course this is where RAID-penalties come in, but I’m trying to focus on IO itself regardless of physical storage properties in this article.

When vendors claim their system or disk delivers (up to, hehe) [insert arbitrary number here] IOPS, they usually refer to a workload that is completely random, mainly consists of writes and has a 4-32KiB IO block size.
If you conduct your own storage IO testing, you should use similar IO patterns if you want to compare real-world storage
performance. Otherwise you may end up with unrealistic huge numbers like in the maximum throughput cases.

A great resource for comparing storage IO performance is this thread on the VMware community forums, where hundreds of results using a common IOmeter configuration are posted.
You can get the IOmeter config file here and also paste your resulting csv files there for summarizing numbers.

There is also the VMware IO analyzer virtual appliance, providing various pre-defined IO pattern configs for applications like Exchange or SQL server, but last time I tested it behaved a bit “funky”. (It runs Windows IOmeter in Linux via wine – urks).

Last but not least, a tip if you want to analyze just what kind of IO workload your specific VM/application is issuing: vscsistats is your friend and provides tremendous in-depth information on this.

Further recommended reading:

Configuring and securing local ESXi users for hardware monitoring via WBEM

Besides good ol’ SNMP, the open Common Information Model (CIM) interface on an ESXi host provides a useful way of remotely monitoring the hardware health of your hosts via the Web-Based Enterprise Management (WBEM) protocol. Pretty much every major hardware management solution and agent today supports using WBEM to monitor hosts of various OSes.
Unlike SNMP (except for the painful to implement version 3), it builds on a standard HTTP(S) API, allowing secure SSL/TLS protected authentication and communication between the host and the management stations. Of course you can also use SNMP and WBEM independently at the same time too.
On ESXi, the CIM interface to is implemented through the open Small Footprint CIM Broker (SFCB) service.

sim1 sim2

Seems great, right? To manage your hosts via CIM/WBEM with for example the HP Systems Insight Management (SIM) pictured above, you just need to provide a local user on the ESXi host which SIM can use to authenticate against the host.
You can use the standard root user for example, but is that a good idea? I certainly disagree about that, even more so in environments of administrative disparity where you still have strict separation of virtualization admins and hardware admins (I agree this separation makes no sense in this day and age and causes all sorts of problems besides just this one, but this is the daily reality I’m facing).

Continue reading

[Script] PowerCLI – find out the host on which VMs are running when vCenter is down

Many moons ago, when I started playing with the wonderfulness that is PowerCLI, one of the first things I wrote with a particular problem in mind was a small script to quickly locate the host running our vCenter server in case anything went wrong and I lost access to vCenter directly.
So instead of trying to connect to every possible host of the cluster manually with the vSphere Client, why not just connect to all of them via PowerCLI and query them quickly?

Where’s Waldo?

This resulted in the small, simple script posted below. For this script, you can provide either a list of hosts to connect to, an alias for a cluster which member hosts you pre-populated in the script, along with one or more search strings. This search is matched against the VM names and outputs the list of found VMs with their current power state and most importantly, the host running the VM. This way you can get a VM-Host mapping of not only your vCenter VM, but other VMs as well.

I remembered this script while reading a cool article on about various other ways to keep track on which host your vCenter VM is running.
I “polished” the old, simple code a bit but yeah, I’m still pretty horrible when it comes to scripting. Anyways, here it is in case anyone finds it useful:

Continue reading

ESXi 5.1 Update 1 and vCenter 5.1 Update 1(a) released – grab your fixes

[Update 23.05.2013]
Due to a bug in the authentication module in multi-domain environments VMware released another update of vCenter: vCenter 5.1 Update 1a:

[Update 05.08.2013]
VMware releases another update for vCenter, 5.1 Update 1b:

The long anticipated first major Update bundle of ESXi and vCenter 5.1 has finally been released. Download them at the usual place.
The insane list of fixes confirms my gut feeling again that unfortunately, many VMware products only start getting usable after the first (or sometimes even the second) Update bundle. (Remember vCenter 5.1a and 5.1b or the loads of support alerts?)

ESXi 5.1 Update 1

Go and check the release notes. Seriously.
No real new features or enhancements have been added apart from a few new supported Guest OSes.
But huge loads of important issues and bugs have been fixed, a few of which were anticipated since a long time. Here are some excerpts to highlight some of the important or interesting fixes:

Continue reading

January (or Febuary?) HP ESXi updates

Attention: [Update 16.01.2013]
HP actually pulled the updates (which were titled “February” updates) from their VIBs Depot site and purged the references from the depot metadata indexes as well. I’m not sure what’s going on but you won’t be able to apply these updates (via Update Manager) unless you downloaded them already. But even if you did, you should refrain from using these bundles at this time. Unfortunately there seems to be no way of properly removing them from Update Manager if it pulled the metadata already.

[Update 21.02.2013]
HP re-released the VIBs available at

[Update 23.02.2013]
(Thanks to milanod for the hint in the comments)
HP actually removed the re-released updates from the vibsdepot yet again?!
The updated bundles are still listed on the software/support/drivers lists for Proliant Servers though:
I’m speechless in the face of this unprecedented fail.

[Update 25.02.2013]
Uh-oh, the updates SEEM to be back at File dates are from Jan 4th and the bundles md5sums match the ones from the initial release mid-January (which this post was about) exactly. So if there really was a bug with the release, it must still be there.
Taking bets on how long it’ll take HP to offline them again.

[Update 22.04.2013]
(Thanks to Wu in the comments)
The issue with the SmartArray warning which this bundle brought us has been fixed in a recent update.

After some very minor updates back in October that did not come with release notes it’s time for another round of updates to the ESXi HP extensions and other stuff. Unfortunately, we don’t seem to be getting release notes or general infos now either.
But these updates are publicly available on already and your VMware Update Manager should have already picked them up if you set it up to use the HP VIB depot.

Since HP is so kind to not provide release notes, we can only guess about actual fixes or improvements, but we can at least check which of the VIBs contained in the offline bundles really do provide updates (spoiler: not that much).
Continue reading

Invoke-VMScript issue: “The guest operations agent could not be contacted”

[This issue has been fixed with the VMware Tools releases of ESXi 5.0 Update 2 and ESXi 5.1 Update 1.
Note that you can run more recent versions of VMware Tools to fix this on your VMs even if you haven’t updated your hosts, this is fully supported]

The Invoke-VMScript PowerCLI function provides a cool way to execute scriptcode inside guests through the VMware tools, on Linux and Windows VMs (as long as a couple of prerequisites are met). While playing with it a while ago, I noticed that on many VMs, it wouldn’t run properly, instead just spilling the error “The guest operations agent could not be contacted”. To fix this, the VMware Tools service inside the guest had to be restarted. Not thinking too much of the oddness, I left it at that since I didn’t have a real use case for the Invoke-VMScript function.
This changed later on, when I ran into the same issue again that made me finally dig a bit deeper.

Quite a few other people reported this issue on the VMTN forums since some time, but there was no solution posted. A reference to Backup Exec initiated backup operations made me curious though – had I not fought with another problem induced by it already?

I then was able to track it down to the Reload Virtual Machine operation, which Backup Exec triggers each time before removing the snapshot of a backup too. But not only that, unlike my previous issue, this problem is fully reproducible manually without any involvement of Backup Exec: Continue reading

Re-Attaching a previously detached storage device

A while ago we were getting rid of some old, now unused datastores on our FC-storage, so we proceeded to unmount and remove the VMFS datastore, and then detach the LUN device through the vSphere client as described in Unpresenting a LUN in ESXi 5.x. On that note, hooray for no more claimrule mongering since ESXi 5.x!
After the whole process, I noticed that on some hosts the pseudo-device LUN 0 of an EVA storage system was also unmounted. There is no detach-option for these controllers, so I have no idea how it happened. Rescans were no good and it didn’t cause any issues, but I’d rather have it accessible again. The only problem was: There is no attach option available for these devices.

So in the end, I knew I had to resort to esxcli to re-attach the device properly. Continue reading

vSphere 5.1 Generally Available

After the announcement made at VMworld about 3 weeks ago, vSphere 5.1 is finally generally available for download and your installation pleasure. Since there are enough posts with aggregated link collections to downloads already and I’m lazy, check this post on for example. The HP custom ESXi ISO for 5.1 containing the latest HP extensions released recently is available for download too.

However, leave any laziness behind and take your time to check the vSphere 5.1 release notes, as these document a huge amount of known issues (with some workarounds):
I hope VMware didn’t rush this release and properly implemented all the nice improvements.

According to the release notes, 5.1 is set to be the last major release to formally support legacy old GuestOS’s such as Windows 3.1/NT/95/98/, RHEL 2.1, OS/2, Novell Netware or Ubuntu 8.*/9.*, which long since ran out of OS support too.
A few of these old OSes are already listed as deprecated for 5.1 in the Compatibility List according to the GuestOS support policy.

The HCL was updated to contain 5.1 info, while the compatibility matrix has yet to be updated with any 5.1 products (Update: info now available). As usual, you will have to update vCenter first and then the hosts or other depending applications.
I also suppose the new 5.1 VMware Tools are supported even running on 5.0 only clusters too, as it was in previous versions. This might save you an unnecessary reboot later.

It’s time to start up the printer and get really familiar with esxcli, as this will be the primary tool for cli management, especially in case of emergencies where you don’t have access to Powercli or GUI tools. esxcli in combination with the vMA provided vifastpass authentication comes in especially handy.

If you already run Windows Server 2012 VMs (Tech Preview support) on your 5.0 U1 infrastructure, be aware of this significant issue when you upgrade to 5.1:
Snapshots and checkpoints of virtual machines with Windows 8 or Windows Server 2012 do not boot
>Snapshots and checkpoints of virtual machines with Windows 8 or Windows Server 2012 that are taken on a host running ESXi 5.0 Update 1 or ESXi 5.0 P03 will not resume on a host running later versions of ESXi ( ESXi 5.0 Update 2, ESXi 5.1, etc.) and the reverse
>VMotion is prevented between hosts running ESXi 5.0 Update 1 or ESXi 5.0 P03 to and from hosts running later versions of ESXi .