Re: [PATCH] protocol_t upgrade

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 01 Mar 2011 12:44:39 +1300

 On Mon, 28 Feb 2011 14:55:59 -0700, Alex Rousskov wrote:
> On 02/27/2011 06:12 PM, Amos Jeffries wrote:
>> This begins the libanyp.la SourceLayout changes by moving the
>> protocol_t
>> type code to stand-alone files inside its namespace.
>>
>> On the most part there are no behaviour changes. The automatic
>> generation of the enum->text string produces an upper-case strings
>> array, so explicit down-casing is added for the obvious places such
>> as
>> scheme:// syntax.
>
>> + static char out[BUFSIZ];
>> + *out = '\0';
>> + if (in != NULL) {
>> + int p = 0;
>> + for (; p < BUFSIZ && in[p] != '\0'; ++p)
>> + out[p] = xtolower(in[p]);
>> + out[p] = '\0';
>
> The last line will write outside the "out" buffer boundary if "in" is
> longer than BUFSIZ. The first assignment would be a waste of cycles
> if
> "in" has characters in it which is probably the common case. You
> probably want to reshuffle to have just one (and correct) termination
> assignment at position p, where p is always correct and smaller than
> BUFSIZ.

 Thanks. Fixed.

>
> I could not find where this function is used in the patch. Is there
> some
> code missing from the patch?

 Um, yes. This is shuffled to URLScheme method which is its only user.

>
>> +/// Display the Protocol Type (in upper case).
>> +inline std::ostream &
>> +operator <<(std::ostream &os, ProtocolType const &p)
>> +{
>> + if (p < PROTO_UNKNOWN)
>> + os << ProtocolType_str[p];
>> + return os;
>> +}
>
> Should we print something when p is PROTO_UNKNOWN or worse? Where do
> we
> use this operator?

 Any debugs dumping objects using the type. Potentially any streamed
 output of request/reply status lines.

 If the protocol is unset or unknown the ProtocolType value is not
 useful. The caller must do the output of any replacement data it has.
 Added documentation.

>
>> - if (request->protocol == PROTO_HTTP)
>> + else if (request->protocol == AnyP::PROTO_HTTP)
>> return httpCachable(method);
>>
>> - if (request->protocol == PROTO_GOPHER)
>> + else if (request->protocol == AnyP::PROTO_GOPHER)
>> return gopherCachable(request);
>>
>> - if (request->protocol == PROTO_CACHEOBJ)
>> + else if (request->protocol == AnyP::PROTO_CACHEOBJ)
>
> Your call, but I would not change "if" to "else if" here. The change
> makes understanding the code more difficult and is inconsistent with
> the
> rest of that function.

 Okay. Removed the else's.

>
>> - CbDataList<protocol_t> *q = new CbDataList<protocol_t>
>> (urlParseProtocol(t));
>> - *(Tail) = q;
>> - Tail = &q->next;
>> + for (int p = AnyP::PROTO_NONE; p < AnyP::PROTO_UNKNOWN;
>> ++p) {
>> + if (strcasecmp(t, AnyP::ProtocolType_str[p]) != 0) {
>> + CbDataList<AnyP::ProtocolType> *q = new
>> CbDataList<AnyP::ProtocolType>(static_cast<AnyP::ProtocolType>(p));
>
> The above changes the behavior from creating a PROTO_NONE-based
> CbDataList to silently skipping some of the unknown protocols. Do we
> check for the unknown protocols in the caller? If not, will this code
> change cause any user-visible changes?

 We never have checked for unknown protocols here. Just converted them
 to matchable entries of NONE.

 Adding a config warning to mention skipped protocols now.

 Two possibly visible changes:
  * the "FILE" protocol no longer accepted and silently converted to FTP
  * unknown protocols no longer silently accepted and converted to NONE,
    thus any potential for mis-matching unknowns and comment text is
 removed.

>
> Also, urlParseProtocol() does not recognize some of the protocols for
> which the strcasecmp test will succeed, right? Should
> urlParseProtocol()
> be extended to be consistent with the new enumeration logic above?
>

 Yes. This is done as part of the next step moving URL parsing into
 class URL. It is moved to use the same for-loop pattern to fill
 URLSchemes.

 Meanwhile we will have protocols accepted by ACL proto which are not
 yet matchable. This does not seem to be a problem since other parts of
 the parser will reject unknown protocols at present.

>
> When a similar change was done for request methods, we ended up
> creating
> an HttpRequestMethod class with a few useful methods. IIRC, the
> primary
> motivation was correct handling of unknown methods -- we needed a
> place
> to store the unknown method image/string. Do we need any need to
> handle
> unknown protocols by storing their image/string?
>

 Good point. At present URLScheme class fills that purpose but only does
 the registered protocols right now. The URL updates plan to change it to
 handle unknowns as string images.

 In the final URL and protocol design we have AnyP::ProtocolType which
 enumerates the registered protocols accepted in URLs (lower case) and
 the request protocol field (upper case). Then URLScheme which expands
 that to allow other protocol names accepted in URLs.

 Thank you for checking this.

 Amos
Received on Mon Feb 28 2011 - 23:44:44 MST

This archive was generated by hypermail 2.2.0 : Tue Mar 01 2011 - 12:00:14 MST