[Script] Poor man’s vSphere network health check for standard vSwitches

Among many other new features, the vSphere 5.1 distributed vSwitch brought us the Network Health Check feature. The main purpose of this feature is to ensure that all ESXi hosts attached to a particular distributed vSwitch can access all VLANs of the configured port groups, and with the same MTU. This is really useful in situations where you're dealing with many VLANs and the roles of virtualization admin and network admin are strongly separated.

Unfortunately, like pretty much all newer networking features, the health check is only included in the distributed vSwitch, which requires vSphere Enterprise Plus licensing.

There are a couple of other (more cumbersome) options, though:
If you're on ESXi 5.5 you can use the pktcap-uw utility in the ESXi shell to check whether your host receives frames for a specific VLAN on an uplink port:
The following example command will capture receive-side frames tagged with VLAN 100 on uplink vmnic3:
# pktcap-uw --uplink vmnic3 --dir 0 --capture UplinkRcv --vlan 100
If systems are active in this VLAN you should see a few broadcasts or multicasts right away, meaning the host is able to receive frames for this VLAN on this NIC. Repeat this for every physical vmnic uplink and every VLAN.
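Since checking every combination by hand gets old quickly, a small loop in the ESXi shell can run the capture for each uplink/VLAN pair. This is just a sketch; the vmnic and VLAN lists are placeholders for your environment:

for NIC in vmnic0 vmnic1 vmnic2 vmnic3; do
  for VLAN in 100 200 300; do
    echo "=== ${NIC} / VLAN ${VLAN} ==="
    # receive direction only (--dir 0); -c 5 stops the capture after five frames
    pktcap-uw --uplink ${NIC} --dir 0 --capture UplinkRcv --vlan ${VLAN} -c 5
  done
done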

Another way to check connectivity is to create a vmkernel interface for testing on the VLAN in question and use vmkping. Since manually configuring a vmkernel interface for every VLAN is a huge PITA, I came up with a short ESXi shell script to automate that task.
Check out the script below. It uses a CSV-style list to configure vmkernel interfaces with certain VLAN and IP settings, pinging a specified IP on each network with a given payload size to account for the MTU configuration. This should at least catch initial network-side configuration errors when building a new infrastructure or adding new hosts.
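To give you an idea of what the script does for each line of that list, here is a minimal sketch of the equivalent manual steps. The port group name, vmk interface number, and all addresses are made up; a CSV line of the form vlan,ip,netmask,target,size (e.g. 100,192.168.100.10,255.255.255.0,192.168.100.1,8972) would map to:

# temporary port group on the standard vSwitch, tagged with the VLAN under test
esxcli network vswitch standard portgroup add -p healthcheck -v vSwitch0
esxcli network vswitch standard portgroup set -p healthcheck --vlan-id 100

# vmkernel test interface with an IP on that network
# (add -m 9000 to test jumbo frames; the vSwitch MTU must match)
esxcli network ip interface add -i vmk9 -p healthcheck
esxcli network ip interface ipv4 set -i vmk9 -t static -I 192.168.100.10 -N 255.255.255.0

# ping with the don't-fragment bit and an MTU-sized payload
# (8972 bytes payload + 28 bytes IP/ICMP headers = 9000 bytes MTU)
vmkping -I vmk9 -d -s 8972 192.168.100.1

# clean up
esxcli network ip interface remove -i vmk9
esxcli network vswitch standard portgroup remove -p healthcheck -v vSwitch0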
This script was tested successfully on ESXi 5.5 and should work on ESXi 5.1 as well. I'm not entirely sure about 5.0, but that should be OK too. (Please leave a comment if you can confirm or refute that.)

Introducing the ghetto-vSwitchHealthCheck.sh:

Update: I have moved my scripts to GitHub and updated some of them a bit. You can find the current version of this particular script here.



Squid and chunked transfer encoding shenanigans

A while ago I was troubleshooting an issue with a certain piece of client software (I don't even know its name) on an internal network that was uploading images via HTTP to a remote service on the internet. The simple HTTP POST upload requests failed with an HTTP/1.0 501 Not Implemented error from our Squid proxy server, prompting me to analyze further.
Fortunately, I had a network trace from the local admins, so I could easily reproduce and test the requests with curl from my Linux client.

The issue proved to be caused by the chunked Transfer-Encoding header of the POSTs and the fact that our Squid 3.1 proxy server is unable to process requests with this Transfer-Encoding in certain cases.
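To illustrate, a request along these lines reproduces the failure (the proxy and upload host names are made up); curl switches to a chunked upload as soon as you set the header yourself:

# force a chunked request body through the proxy
curl -v -x http://squid.internal:3128 \
     -H "Transfer-Encoding: chunked" \
     --data-binary @image.jpg \
     http://upload.example.com/images

With the header in place, the proxy answers HTTP/1.0 501 Not Implemented; drop it and curl sends an ordinary Content-Length POST, which passes through fine.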

I've seen a few mentions of issues with chunked Transfer-Encoding and Squid on the internets, but many were older, contradictory, or referred to server-side responses rather than client-side requests like in this case.

In a nutshell, Squid 3.1 is just not fully HTTP/1.1 compliant yet and only partially supports some features like chunked encoding, as stated on the project's website:

Squid-3.2 claims HTTP/1.1 support. Squid v3.1 claims HTTP/1.1 support but only in sent requests (from Squid to servers). Earlier Squid versions do not claim HTTP/1.1 support by default because they cannot fully handle Expect:100-continue, 1xx responses, and/or chunked messages.
Both Squid-3 and Squid-2 contain at least response chunked decoding. The chunked encoding portion is available from Squid-3.2 on all traffic except CONNECT requests.
