Re: [PREVIEW] Retry requests that failed due to a pconn race, take 1

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 14 Feb 2012 16:10:18 -0700

On 02/14/2012 02:45 PM, Amos Jeffries wrote:
>> 1. Do not set wasIdle in pinConnection(). Yes, the pinned connection
>> does become idle, but we cannot retry pinned pconn races so there is
>> little point in detecting and then doing extra work to ignore them.
>>
>> 2. Set wasIdle in pinConnection() (already done in the attached patch).
>> In forward.cc, add code that will avoid retrying pinned pconn races.
>> This avoids the lie in pinConnection() but makes the forward.cc code a
>> little more complex.
>>
>> 3. Set wasIdle in pinConnection() (already done in the attached patch).
>> Leave forward.cc as is because my assertion analysis above is wrong and
>> current code can retry pinned pconn races correctly.
>>
>> I am not an expert on current pinned connection use. Which option is the
>> best?
>
>
> I think (2). Turning that assert into:
> serverConn = (pinned_connection ?
> pinned_connection->validatePinnedConnection(request,
> serverDestinations[0]->getPeer()) : NULL);

Done in "take 2" of the patch (attached). I could not use the ?:
operator because of the type mismatch, but the code is essentially the same.

I also improved the "retry the same destination" condition so that we
preserve the old behavior if serverConn is nil and, hence, we cannot
detect a pconn race.

Do you agree with FTP/Gopher changes? Should forward.cc make the retry
decision itself instead, as in the sketch below?

 if (!serverConn || !serverConn->wasIdle || protoHasNoPconnRaces())
     serverDestinations.shift(); // last one failed. try another.
 else
     debugs(17, 4, HERE << "retrying the same destination");

Finally, I am considering moving wasIdle flag from Connection to
FwdState after all. While it is true that FwdState cannot know whether
the pinned connection was an idle persistent connection, it just
occurred to me that ALL pinned connections were idle and persistent when
they enter FwdState!

Thus, FwdState can maintain the wasIdle flag on its own. Doing so will
require calls from Server kids to FwdState to clear the flag when we
read something from the server, but it may be simpler (more localized)
than the current Connection::wasIdle solution. What do you think?

>> As discussed earlier, we cannot bypass the pop() call because we want to
>> close an old idle connection if we have to create a new one.

> In this case an existing one has just closed/failed. So killing one
> will be a net negative on the connection count. Seems to just be a
> waste of one of those N other connections (without even checking its
> usability).

> I'm not arguing for/against this, just pointing out the case here is
> not as bad as the expanding connection count we argued out earlier.

Agreed. I will probably remove that extra pop() condition added in take1
if nobody else finds a reason to add it.

However, we will then need to add code to require a new connection when
retrying after a pconn race because we do not want a single user to keep
testing all those "N other connections" for validity. We will not close
any of them, but we will not use them either.

Thank you,

Alex.

Received on Tue Feb 14 2012 - 23:10:45 MST

This archive was generated by hypermail 2.2.0 : Wed Feb 15 2012 - 12:00:08 MST