Squid Expect: 100-continue header issues

As I mentioned in my other post about tuning Squid, this article is about the ignore_expect_100 setting and why it’s set on our Squid proxies.

We had an issue where requests for a certain site failed with an HTTP 417 error generated by Squid. You would find entries like these in the Squid access.log for the site that wasn’t accessible:
1329478677.344      0 NONE/417 4480 POST http://somewebsite.com/webservice/data.asmx – NONE/- text/html
(Note: It’s actually an application communicating via HTTP and not manual browser-based access by human beings.)
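If you want to pull such entries out of a larger access.log, a small filter along these lines might help. It is only a sketch: the regular expression anchors on the result/status token (e.g. NONE/417) and the fields right after it, because the sample line above appears to have the client address removed. The function names are made up for illustration.

```python
import re

# Matches "<RESULT>/<status> <bytes> <method> <url>" anywhere in a log line.
# Assumption: Squid's native log layout, as in the sample entry above.
ENTRY = re.compile(
    r"(?P<result>[A-Z_]+)/(?P<status>\d{3})\s+"
    r"(?P<size>\d+)\s+(?P<method>\S+)\s+(?P<url>\S+)"
)


def parse_entry(line):
    """Return the interesting fields of one log line, or None if it does not match."""
    match = ENTRY.search(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = int(entry["size"])
    return entry


def expectation_failures(lines):
    """Collect all entries that Squid answered with HTTP 417."""
    hits = []
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and entry["status"] == 417:
            hits.append(entry)
    return hits


SAMPLE = (
    "1329478677.344      0 NONE/417 4480 POST "
    "http://somewebsite.com/webservice/data.asmx - NONE/- text/html"
)

if __name__ == "__main__":
    for hit in expectation_failures([SAMPLE]):
        print(hit["method"], hit["url"])
```

To run it against a real log, pass an open file object instead of the sample list, for example expectation_failures(open("access.log")).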

It turned out that this application sends its requests with an “Expect: 100-continue” HTTP header, and Squid doesn’t have a proper implementation of the HTTP 1.1 Expect mechanism. RFC 2616 (section 8.2.3) describes the mechanism as follows:

The purpose of the 100 (Continue) status (see section 10.1.1) is to allow a client that is sending a request message with a request body to determine if the origin server is willing to accept the request (based on the request headers) before the client sends the request body. In some cases, it might either be inappropriate or highly inefficient for the client to send the body if the server will reject the message without looking at the body.

In layman’s terms: a client that wants to send a request with content to a webserver, e.g. upload a file with an HTTP POST, can first ask “Hey webserver, I’m about to POST you a file, here are the HTTP headers of my request. If that’s all right with you, send me a 100 Continue response and I’ll transmit the file.” (This is somewhat similar to an HTTP HEAD, which is just an exchange of headers too.) Now why is this useful? For example, it allows a webserver to check the Content-Type, Content-Length or any other header of the request before the client attempts to send any actual data. If the webserver doesn’t accept this MIME type or decides the file is too large, it can respond with a final error status instead (such as 415 or 413, or 417 Expectation Failed if it cannot honor the expectation at all), and no resources are wasted sending a file the web application wouldn’t process anyway.
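To make the exchange a bit more tangible, here is a minimal sketch of both sides talking over a loopback socket. Everything in it is an assumption for illustration: the toy server, its 1024-byte limit, the /upload path and the function names are made up, and a real client would also need a timeout fallback for servers that never answer the Expect header.

```python
import socket
import threading

MAX_BODY = 1024  # hypothetical size limit enforced by the toy server


def read_head(sock):
    """Read until the blank line that ends an HTTP header block."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk
    return data


def toy_server(listener):
    """Accept one connection and decide based on the request headers alone."""
    conn, _ = listener.accept()
    with conn:
        head = read_head(conn).decode("latin-1")
        length = 0
        for line in head.split("\r\n")[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        if length > MAX_BODY:
            # Reject before the client has sent a single byte of the body.
            conn.sendall(b"HTTP/1.1 413 Request Entity Too Large\r\n"
                         b"Connection: close\r\n\r\n")
            return
        conn.sendall(b"HTTP/1.1 100 Continue\r\n\r\n")
        remaining = length
        while remaining > 0:
            chunk = conn.recv(4096)
            if not chunk:
                break
            remaining -= len(chunk)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")


def post_with_expect(port, body):
    """Send the headers first; transmit the body only after a 100 Continue."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
        head = (
            "POST /upload HTTP/1.1\r\n"
            "Host: 127.0.0.1\r\n"
            "Content-Type: application/octet-stream\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Expect: 100-continue\r\n"
            "\r\n"
        )
        sock.sendall(head.encode("ascii"))
        status = int(read_head(sock).split(b" ", 2)[1])
        if status != 100:
            return status  # final status; the body is never sent
        sock.sendall(body)
        return int(read_head(sock).split(b" ", 2)[1])


def demo(body):
    """Run one request against a fresh toy server and return the final status."""
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=toy_server, args=(listener,))
    worker.start()
    try:
        return post_with_expect(port, body)
    finally:
        worker.join()
        listener.close()


if __name__ == "__main__":
    print(demo(b"x" * 10))    # small upload: server says 100, then 200
    print(demo(b"x" * 5000))  # too large: rejected on the headers alone
```

The sketch assumes a non-empty body; with an empty one, the 100 and the 200 response could arrive in the same read and the simple parsing above would not cope with that.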


Squid Proxy Optimizations

Geso. The tweaks described here are based on a Squid 3.1.x implementation, but they should also be valid on newer versions (3.2 is currently in beta) as well as on Squid 2.7 and 2.6. Just check the respective documentation on the Squid website.

To give you a little background on the environment involved: the Squid proxies I am referring to here are used in a simple proxy-sandwich configuration. Downstream of the Squids are our “main proxies”, which provide load balancing, high availability and caching logic. Those direct all traffic to our friends the Squids, which are responsible for content filtering. Behind the Squids is another set of upstream proxies that provides AV scanning of web traffic. (Now don’t you dare ask why this is 3-layered like this.)
On an average day (90% of the traffic is generated between around 07:00 and 15:00), we’re pushing around 200-250 GB of “ordinary” HTTP(S) and FTP traffic, consisting of 7-8 million requests, through this configuration.


New ESXi security patch VMSA-2012-0011 released on June 14th

Today VMware released a new security update for ESX(i), from version 3.5 to 5.0, as well as other hosted virtualization platforms like Workstation/Player, and updated several older security advisories.
If you’re not signed up on the VMware security mailing list, you should do so at http://lists.vmware.com/mailman/listinfo/security-announce in order to get all the latest information on updates and advisories.

The new advisory is available here. The new patch VMware ESXi 5.0, Patch ESXi500-201206401-SG: Updates esx-base fixes two critical security issues:

VMware Host Checkpoint File Memory Corruption
Certain input data is not properly validated when loading checkpoint files. This might allow an attacker with the ability to load a specially crafted checkpoint file to execute arbitrary code on the host.
The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the name CVE-2012-3288 to this issue.
The following workarounds and mitigating controls might be available to remove the potential for exploiting the issue and to reduce the exposure that the issue poses.

Workaround: None identified.

Mitigation: Do not import virtual machines from untrusted sources.

VMware Virtual Machine Remote Device Denial of Service
A device (for example CD-ROM or keyboard) that is available to a virtual machine while physically connected to a system that does not run the virtual machine is referred to as a remote device. Traffic coming from remote virtual devices is incorrectly handled. This might allow an attacker who is capable of manipulating the traffic from a remote virtual device to crash the virtual machine.
The Common Vulnerabilities and Exposures project (cve.mitre.org) has assigned the name CVE-2012-3289 to this issue.
The following workarounds and mitigating controls might be available to remove the potential for exploiting the issue and to reduce the exposure that the issue poses.

Workaround: None identified.

Mitigation:
– Users need administrative privileges on the virtual machine in order to attach remote devices.
– Do not attach untrusted remote devices to a virtual machine.

This is already the 2nd critical security-related patch after VMware ESXi 5.0, Patch ESXi500-201205401-SG which was released a month ago following a leak of VMware source code which raised some public attention. I really hope we’re done with this soon.

Here are the updated advisories based on older patches:
– http://www.vmware.com/security/advisories/VMSA-2012-0005.html

– http://www.vmware.com/security/advisories/VMSA-2012-0006.html

– http://www.vmware.com/security/advisories/VMSA-2012-0007.html

– http://www.vmware.com/security/advisories/VMSA-2012-0009.html

The actual changes of these advisories can be found in section 6. Change log. There doesn’t seem to be any really important information though.

And last but not least, if you’re running ESX on HP, you might as well update your HP extensions while you’re installing this.

ESXi HP Updates

HP just released a batch of new firmware for their servers and blades, Virtual Connect modules as well as updated ESXi extensions. Here’s my take on the new stuff.

HP ESXi VIBs and handling Update Manager

Updated HP Extensions and notable excerpts from the release notes:

  • The ESXi offline bundle (CIM providers) has been updated to 1.2
    Added additional support for AC Lost detection for power supplies.
    Supporting some more Gen8 servers
  • The Agentless Management Service (AMS) Offline Bundle for Gen8 servers has been updated to 9.1.0
    Added network and SAS driver information reporting.
    Added performance data reporting.
    Supporting some more Gen8 servers
  • The ESXi utilities bundle has been updated to 1.2
    Supporting some more Gen8 servers
  • The NMI Sourcing driver has been updated for ESXi 4.1 only, not for ESXi 5.

If you run ESXi on HP ProLiant systems, you should add the HP vibsdepot to your vCenter Update Manager repositories if you haven’t done so already. But even if you did so in the past, you’ll need to add another repository for the new bundles, since HP changed the way they provide bundles from their vibsdepot. Instead of just adding “http://vibsdepot.hp.com/index.xml” as a custom download source in Update Manager, which would yield the most up-to-date bundles, HP now distributes multiple repositories based on release cycles:

The following points define how to use vibsdepot under several customer scenarios:

– VUM – connect VUM to “http://vibsdepot.hp.com/hpq/<release date>/index.xml” to download complete update patches as well as individual patches.
– ESXCLI – use command “esxcli software vib install -d http://vibsdepot.hp.com/hpq/<release date>/index.xml”.

So in a nutshell, to make use of the updated bundles in VMware Update Manager, you’ll have to add “http://vibsdepot.hp.com/hpq/jun2012/index.xml” in UM. You can also remove or deactivate the old vibsdepot URL.
And don’t forget to update the URL once HP releases updated extensions (or once HP changes this procedure yet again)!
[Update: You actually do not need to do that anymore if you just use http://vibsdepot.hp.com/index.xml. This links all release versions now.]
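If you script your host updates, the URL scheme described above could be captured in a small helper like the following. This is just a sketch: the function names are hypothetical, the release label “jun2012” is the one mentioned above, and the esxcli command line is simply the one quoted from HP.

```python
DEPOT_ROOT = "http://vibsdepot.hp.com"


def depot_index_url(release=None):
    """Build the vibsdepot index URL.

    With no release label, this is the top-level index; with a label such as
    "jun2012", it is the per-release index described above.
    """
    if release is None:
        return f"{DEPOT_ROOT}/index.xml"
    return f"{DEPOT_ROOT}/hpq/{release}/index.xml"


def esxcli_install_command(release=None):
    """Return the esxcli command line quoted above for the given depot."""
    return f"esxcli software vib install -d {depot_index_url(release)}"


if __name__ == "__main__":
    print(depot_index_url("jun2012"))
    print(esxcli_install_command("jun2012"))
```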


Snapshot removal issues with BackupExec and locked files

<Status update Nov 22nd 2012>
Check out the comments after reading this article, some interesting points there. The case at Symantec never got resolved to this very day. We’re also still using Backup Exec 2010 R3.
</Status update>

<Update Sept 12th 2013>
The issue is now finally resolved since our backup admins upgraded to Backup Exec 2010 R3 Service Pack 3.

We are currently facing an issue with a small number of VMs, where snapshots created by our backup software, Symantec BackupExec, can’t be removed properly because of locked files (neither through BackupExec nor manually).
In vCenter, the warning “Virtual machine disks consolidation failed.” is being logged as a simple event (I might create an alarm for this now that I think about it).

The problem

You will not see these snapshots in the snapshot manager (fix this for good, VMware!), but only on the filesystem.
Unfortunately, unlike with vSphere 4, there is no obvious, specific error. The remove snapshot task completes successfully and you’ll only notice on the VM summary page that it needs disk consolidation.

Pre-vSphere 5, the task would fail with an error about how it couldn’t consolidate the snapshot due to a locked file. This info is now only found in the vmware.log file of the VM in its datastore (and presumably in the vmkernel log files as well):

# grep -i lock vmware.log
2012-06-04T08:06:21.069Z| vcpu-0| AIOGNRC: Failed to open '/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002-flat.vmdk' : Failed to lock the file (400000003) (0x2013).
2012-06-04T08:06:21.069Z| vcpu-0| AIOMGR: AIOMgr_OpenWithRetry: Descriptor file '/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002-flat.vmdk' locked (try 0)
2012-06-04T08:06:22.580Z| vcpu-0| DISKLIB-VMFS : "/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002-flat.vmdk" : failed to open (Failed to lock the file): AIOMgr_Open failed. Type 3
2012-06-04T08:06:22.580Z| vcpu-0| DISKLIB-LINK : "/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002.vmdk" : failed to open (Failed to lock the file).
2012-06-04T08:06:22.580Z| vcpu-0| DISKLIB-CHAIN : "/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002.vmdk" : failed to open (Failed to lock the file).
2012-06-04T08:06:22.580Z| vcpu-0| DISKLIB-LIB : Failed to open '/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002.vmdk' with flags 0x20a Failed to lock the file (16392).
2012-06-04T08:06:22.580Z| vcpu-0| SNAPSHOT:Failed to open disk /vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002.vmdk : Failed to lock the file (16392)
2012-06-04T08:06:22.601Z| vcpu-0| DISK: Failed to open disk for consolidate '/vmfs/volumes/4fb20a9a-1b7f7c20-0363-002481e443c1/SomeVM002/SomeVM002-000004.vmdk' : Failed to lock the file (16392) 5345
2012-06-04T08:06:22.657Z| vcpu-0| Vix: [675925 vigorCommands.c:577]: VigorSnapshotManagerConsolidateCallback: snapshotErr = Failed to lock the file (5:4008)

Nice, isn’t it? Creating a new snapshot and selecting delete all snapshots will not work because it’s still locked. It will only increase the number of delta files for your VM.
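When several VMs are affected, it can be handy to boil such a log down to the list of disk files that could not be locked. The following is a rough sketch that just counts the .vmdk paths on lines mentioning the lock failure. The function name is made up, and the path pattern assumes datastore paths without spaces, as in the output above.

```python
import re
from collections import Counter

# Assumption: datastore paths contain no whitespace, as in the log above.
VMDK_PATH = re.compile(r"/vmfs/volumes/\S+?\.vmdk")


def locked_disks(log_lines):
    """Count how often each .vmdk path shows up in a 'failed to lock' message."""
    counts = Counter()
    for line in log_lines:
        if "failed to lock the file" not in line.lower():
            continue
        for path in VMDK_PATH.findall(line):
            counts[path] += 1
    return counts


if __name__ == "__main__":
    with open("vmware.log", encoding="utf-8", errors="replace") as log:
        for path, hits in locked_disks(log).most_common():
            print(hits, path)
```

Run it from the VM’s directory on a copy of vmware.log; the files listed are the ones to check for stale locks.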

Digging down to the root of the issue
