Re: [PATCH] URL handling bugs

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 18 May 2011 09:50:48 -0600

On 05/18/2011 09:36 AM, Alex Rousskov wrote:
> On 05/18/2011 08:52 AM, Amos Jeffries wrote:
>> This patch includes two bug fixes in URL handling which were uncovered
>> during testing of the URL logging update:
>>
>> * URL re-write handling was not correctly creating its adapted request
>> copy. The code here is much reduced by using the clone() method. Still
>> not completely satisfactory (marked with XXX) since on errors the
>> cycles spent cloning are wasted.
>
> Wasting a few cycles on errors would be OK, I guess, but there is also
> at least a possibility of a memory leak: HttpRequest::clone() always
> produces a new request, but AFAICT urlParse() sometimes returns NULL
> instead of the request, without deleting the request parameter.

Even worse, if the new URL is a URN, urlParse() returns a brand new request:

> } else if (!strncmp(url, "urn:", 4)) {
>     return urnParse(method, url);

which means that our attempt to use a cloned request failed _and_ we now
have two requests: the one created by urnParse(), which lacks the details
copied by clone(), and the one created by clone(), which lacks the new URL.

I do not know whether anybody still cares about URNs these days, but
this code path is not going to work well.
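
Spelled out, the problem pattern looks roughly like this (a sketch, not
the exact patch code):

  HttpRequest *new_request = old_request->clone();
  // for "urn:" URLs, urlParse() ignores the request it was given and
  // returns a fresh request built by urnParse() instead
  HttpRequest *parsed = urlParse(new_request->method, url, new_request);
  // at this point, new_request has the cloned details but not the new
  // URL, parsed has the new URL but none of the cloned details, and
  // whichever one the caller stops tracking is leaked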

The core problem here may be in trying to combine two different
interfaces into one:

  1. Trying to give an old request a new method/URL. This should be done
inside HttpRequest::resetUrl() or similar. This will not create new
requests or delete old ones.

  2. Trying to create a new request using a method/URL pair. This can be
done as a stand-alone function or a static HttpRequest method that
returns a new request. This function should not have a request as a
parameter.
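
Roughly, and with illustrative signatures only (resetUrl() and
CreateFromUrl() do not exist yet; the names are just placeholders):

  class HttpRequest
  {
  public:
      // interface 1: give this existing request a new method/URL;
      // never allocates or deletes requests, only reports parse failure
      bool resetUrl(const HttpRequestMethod &method, const char *url);

      // interface 2: build a brand new request from a method/URL pair;
      // returns NULL on parse failure and takes no request parameter
      static HttpRequest *CreateFromUrl(const HttpRequestMethod &method,
                                        const char *url);
  };

The URL-rewriting code would then clone the old request first and call
resetUrl() on the clone, so there is never a second request to leak.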

HTH,

Alex.

> Thus, we
> must not do the following:
>
> urlParse(..., old_request->clone())
>
> but have to do something like
>
> new_request = old_request->clone();
> if (!urlParse(new_request->method, result, new_request)) {
>     debugs(...);
>     delete new_request;
>     new_request = NULL;
> }
>
> which still looks terrible but at least avoids memory leaks.
>
> Eventually, this would become an HttpRequest::resetUrl() or reparseUrl()
> method, I guess:
>
> new_request = old_request->clone();
> if (!new_request->resetUrl(result)) {
>     ...
> }
>
> (assuming there are good reasons for urlParse() to require access to
> request details), but such changes are probably outside your patch scope.
>
>
>> + // XXX: why does this not get copied by clone()?
>> + new_request->content_length = old_request->content_length;
>
> Because it is set in HttpHdr::hdrCacheInit() called by
> HttpRequest::clone(). You should be able to remove the above two lines.
>
> It may be more efficient to add HttpHdr::clone() instead of initializing
> its cache from scratch, of course, but that optimization is probably
> outside your patch scope.
>
>
>> - copy->canonical = canonical ? xstrdup(canonical) : NULL;
>> + urlCanonical(copy); // does a re-build from the above copied details.
>
> This change seems like a step backwards to me. We should be able to just
> clone the canonical field when cloning the request. Why do we need to
> rebuild it from scratch? The clone() method has the right to correctly
> clone low-level bits. We should treat it as a copy constructor,
> essentially (in fact, it should probably be eventually implemented using
> a private copy constructor).
>
> If the canonical field of the being-cloned request is not valid (and
> hence should not be cloned), the bug is elsewhere.
>
> The urlCanonical(request) global should be converted into a constant
> HttpRequest method, but that may be outside your patch scope. In any
> case, urlCanonical() is not designed for cloning and, unlike the HttpHdr
> cache discussed above, we already have a simple xstrdup() cloning
> solution here that we should use.
>
>
>> * URL parsing needs to set the canonical field to unset whenever the
>> URI is re-parsed into a request. This field is an optimization for later
>> display speed-ups. This has been causing bad URL display throughout
>> re-written request handling
>
> Can you rephrase using a word other than "display"? I think I agree with
> the above, but it is not clear what you mean by "display". Perhaps you
> can say that stale canonical field values were used throughout Squid
> after a request was rewritten?
>
>
>> and, when the clone() was corrected to use
>> urlCanonical(), caused asserts in the server-side.
>
> I disagree that it was a correction. Please see above for specifics.
>
>
>> This canonical change has been discussed previously. There is one other
>> incorrect setting of HttpRequest::canonical in icap/ModXact.cc, but that
>> must handle a const input which the urlCanonical() function does not yet
>> handle.
>
> To fix that, urlCanonical(request) should become a constant HttpRequest
> method. Is there an XXX about this in icap/ModXact.cc? If not, please add.
>
>
> Thank you,
>
> Alex.
>