Re: [squid-users] Squid 2.6.STABLE9 and caching of 302 redirects

From: John Line <jml4@dont-contact.us>
Date: Thu, 8 Feb 2007 12:29:47 +0000 (GMT)

On Tue, 6 Feb 2007, Henrik Nordstrom wrote:

> On Tue 2007-02-06 at 01:00 +0000, John Line wrote:
>
>> Investigation showed that the problem was that the new Squid version was
>> caching the temporary redirects (HTTP status 302) sent by origin servers
>> to direct unauthenticated requests to our authentication server. When the
>> authentication server subsequently redirected the (now authenticated)
>> requests back to the originally-requested URLs, Squid served the
>> corresponding cached redirects instead of passing the requests through to
>> the origin servers.
>
> Odd. Should not happen.
> [snip]

I haven't yet filed a bug report, as further investigation showed the
situation was more complex than it seemed at first. I'd made the mistake
of taking at face value the response header "X-Cache: HIT from
omicron.wwwcache.cam.ac.uk"; Squid *was* attempting to re-validate the
cached redirect, but mistakenly concluding that it was still valid.

I'm not sure whether that is Squid's fault, or a problem with the
behaviour of the origin server and its authentication component. I'll try
describing the significant details of what I believe is actually
happening, in the hope that someone with a better understanding of the
HTTP RFC's rules about caching and re-validation will be able to say with
more certainty where the problem lies.

[This is based on Squid logging, the Firefox "Live HTTP Headers" extension
reporting the browser's view of events, and ethereal watching the HTTP
dialogue between Squid and the origin server.]

The significant events are:

  * browser requests a URL for which authentication is needed (and it
    doesn't have a cookie identifying it as already authenticated for that
    origin server); Squid passes the request to the origin server, which
    sends a 302 redirect, directing the browser to our authentication
    server.
  * Squid caches that redirect, as it has an Expires: header, though it is
    "pre-expired" (Expires: timestamp identical to Date: timestamp), so by
    the rules described in the HTTP RFC revalidation is needed before the
    cached redirect can be served in response to subsequent requests.
  * after interaction with the authentication server, the user's browser
    ends up requesting the same URL that it originally requested, but it
    is now authenticated (via a cookie).
  * Squid apparently sees that the cached 302 redirect response for that
    URL is expired (it always has been; the Expires: header was meant
    to ensure even HTTP 1.0 caches wouldn't cache the redirect...).
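
To make the "pre-expired" point concrete, here is a minimal sketch of the
freshness arithmetic, assuming freshness lifetime is simply Expires minus
Date. The function names and the timestamps are made up for illustration;
this is not Squid's code.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Seconds for which a response counts as fresh: Expires minus Date."""
    date = parsedate_to_datetime(headers["Date"])
    expires = parsedate_to_datetime(headers["Expires"])
    return (expires - date).total_seconds()


def is_fresh(headers, age_seconds):
    """A response is fresh only while its age is below its freshness lifetime."""
    return age_seconds < freshness_lifetime(headers)


# Hypothetical redirect headers: Expires is identical to Date.
redirect_headers = {
    "Date": "Thu, 08 Feb 2007 12:00:00 GMT",
    "Expires": "Thu, 08 Feb 2007 12:00:00 GMT",
}

print(freshness_lifetime(redirect_headers))
print(is_fresh(redirect_headers, age_seconds=0))
```

With identical Date: and Expires: values the lifetime is zero, so the entry
is stale from the moment it is stored and can never be served without
revalidation.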

It's the next bit of the interaction that leaves me uncertain where the
fault lies.

  * Squid sends a request for the URL to the origin server, passing through
    the authentication cookie and adding an "If-Modified-Since" header
    quoting the timestamp from the Date: or Expires: header (can't tell
    which, they are identical) of the cached redirect. Is it allowed to do
    that? The redirect does not have a Last-Modified: header, so it must be
    using one of the others. The request from the browser did NOT have an
    If-Modified-Since header, so Squid must have added it.

  * Because the user is now authenticated, the origin server does NOT send
    a redirect. Instead, it sends a 304 Not Modified response because the
    requested document is a static HTML page that was last modified a long
    time (days, months, or years) before the timestamp quoted in the
    If-Modified-Since header.

  * Squid sends the cached redirect to the browser and logs the request as
    TCP_REFRESH_HIT/302. The user is stuck, since re-confirming the
    authentication just results in being sent back to Squid/the origin
    server and being sent another copy of the redirect.

  * Note that using squidclient to purge the cached redirect fixes the
    problem temporarily, allowing the already-authenticated browser to
    access the requested document and any other that's allowed by the user's
    credentials, but only until the authentication cookie is invalidated
    (e.g. by restarting the browser), after which the problem recurs (for
    URLs that were being browsed without difficulty until re-authentication
    was required).
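
The sequence above can be reproduced with a toy model. This is only a
sketch of the behaviour as observed from the outside: ToyCache, origin(),
the timestamps and the auth.example URL are all invented for illustration,
and the choice of the Date: value for If-Modified-Since is an assumption
(the trace cannot distinguish it from Expires:).

```python
from email.utils import parsedate_to_datetime

# Illustrative values only; none of these come from the real servers.
NOW = "Thu, 08 Feb 2007 12:00:00 GMT"
PAGE_LAST_MODIFIED = "Mon, 02 Jan 2006 09:00:00 GMT"
AUTH_URL = "https://auth.example/login"


def origin(request_headers):
    """Toy origin: redirect if no cookie, otherwise serve a static page."""
    if "Cookie" not in request_headers:
        return 302, {"Date": NOW, "Expires": NOW, "Location": AUTH_URL}
    ims = request_headers.get("If-Modified-Since")
    if ims is not None and (
        parsedate_to_datetime(PAGE_LAST_MODIFIED) <= parsedate_to_datetime(ims)
    ):
        # The page is older than the quoted timestamp, so "not modified".
        return 304, {}
    return 200, {"Date": NOW, "Last-Modified": PAGE_LAST_MODIFIED}


class ToyCache:
    """Single-URL cache that mimics the observed revalidation behaviour."""

    def __init__(self):
        self.entry = None  # (status, headers) of the stored response

    def fetch(self, request_headers):
        if self.entry is None:
            status, headers = origin(request_headers)
            if "Expires" in headers:  # stored because it carries Expires:
                self.entry = (status, headers)
            return status, "MISS"
        cached_status, cached_headers = self.entry
        # Entry is pre-expired, so revalidate. Assumption: the timestamp
        # is taken from Date: because there is no Last-Modified:.
        conditional = dict(request_headers)
        conditional["If-Modified-Since"] = cached_headers["Date"]
        status, headers = origin(conditional)
        if status == 304:
            return cached_status, "REFRESH_HIT"  # stale redirect is served
        self.entry = (status, headers)
        return status, "REFRESH_MISS"


cache = ToyCache()
print(cache.fetch({}))                         # unauthenticated request
print(cache.fetch({"Cookie": "session=abc"}))  # authenticated request
```

The second call returns the cached 302 even though the origin would have
served the page. Setting cache.entry back to None, which stands in for a
purge, lets the authenticated request through with a 200.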

That leaves me uncertain about where the blame lies.

It seems odd - and maybe wrong - that Squid is using If-Modified-Since
with a timestamp derived from the Date: or Expires: header - surely it
should only use a timestamp from a Last-Modified: header? Since the cached
redirect does NOT have a Last-Modified: header, that should mean that
Squid cannot attempt revalidation and should discard the cached redirect
and simply send a normal request to the origin server. That would
certainly have the right outcome (getting a copy of the requested
document, as long as the user was authenticated).
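
As a sketch of the alternative suggested here, the helper below builds a
conditional request only when the cached response carries Last-Modified:,
and otherwise signals that the entry should be discarded and fetched
unconditionally. conditional_headers() is a hypothetical name, not a Squid
function, and entity tags (ETag / If-None-Match) are left out for brevity.

```python
def conditional_headers(request_headers, cached_headers):
    """Return headers for a conditional request, or None if the cached
    response has no Last-Modified: validator to revalidate against."""
    last_modified = cached_headers.get("Last-Modified")
    if last_modified is None:
        # No validator: the caller should drop the cached entry and
        # forward the client's request unchanged.
        return None
    headers = dict(request_headers)
    headers["If-Modified-Since"] = last_modified
    return headers


# Hypothetical cached redirect with Date: and Expires: but no Last-Modified:
cached_redirect = {
    "Date": "Thu, 08 Feb 2007 12:00:00 GMT",
    "Expires": "Thu, 08 Feb 2007 12:00:00 GMT",
}
print(conditional_headers({"Cookie": "session=abc"}, cached_redirect))
```

For the cached redirect this yields None, so the authenticated request
would reach the origin server without an If-Modified-Since header.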

RFC 2616 says (in section "13.3 Validation Model"):

       Note: a response that lacks a validator may still be cached, and
       served from cache until it expires, unless this is explicitly
       prohibited by a cache-control directive. However, a cache cannot
       do a conditional retrieval if it does not have a validator for the
       entity, which means it will not be refreshable after it expires.

That reads like it is agreeing with me, unless it is legitimate for Squid
to use the timestamp from another header (must be Date: or Expires:) with
If-Modified-Since.

Section 13.3.5 of the RFC says (in part) "Thus, comparisons of any other
headers (except Last-Modified, for compatibility with HTTP/1.0) are never
used for purposes of validating a cache entry.", which appears to confirm
that Last-Modified: is the only header which can be used in such
comparisons.

Is the problem actually Squid's fault? Or is it the origin server's fault
for responding with details about the HTML document when Squid was asking
about the redirect?

I don't see how the origin server could avoid doing that, though, since
the request from Squid does not and cannot distinguish those two cases
(i.e. it cannot ask "is this redirect what you would currently send for
this request?"). Although some requests include an authentication cookie
and others do not, and different outcomes are expected, Squid cannot be
expected to know the significance of the cookie to the origin server.

I'll file a bug report if responses seem to favour it being Squid's fault!

                                 John

-- 
John Line - web & news development, University of Cambridge Computing Service
Received on Thu Feb 08 2007 - 05:30:06 MST
