Re: [PATCH] String API re-implementation using SBuf

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 21 Nov 2013 12:20:10 -0700

On 11/18/2013 05:13 AM, Amos Jeffries wrote:
> On 16/10/2013 7:12 a.m., Alex Rousskov wrote:
>> On 10/11/2013 11:03 PM, Amos Jeffries wrote:
>>> On 10/10/2013 10:12 p.m., Amos Jeffries wrote:
>>>> Replace String class internals with an SBuf.
>> FWIW, you made the review process more time consuming and less accurate
>> [for me] by moving implementation from .cc and .cci into .h. As you
>> know, bzr does not track such changes well. I only found a bug or two,
>> but I am not confident the move did not hide more problems from me.
>
> Okay, undone that moving now. The attached patch is more easily comparible.

Thank you. I am using the patch from your recent posting on the same thread.

> - String(char const *);
> + explicit String(char const *);

Most of the non-String.* changes in the patch are due to this change.
Most of those changes are not in a performance-sensitive code. And many
are due to some _other_ code using char* types. Furthermore:

> String &operator =(char const *);

The above operator is inconsistent with an explicit char* constructor.
If you do not allow implicit construction, you should not allow
assignment either because it is just one more form of an implicit char*
conversion.

Please consider postponing adding "explicit" to the old constructor
(which is unrelated to the patch scope AFAICT) until more stuff is
converted to SBuf.

The above comment does not apply to the explicit SBuf constructor. I am
not sure "explicit" is warranted there, but I do not have enough
ammunition to object to it: My overall motivation here is to make the
eventual transition to SBuf smoother. Explicit constructors for an
essentially temporary String class hurt that goal: We are forced to add
String() wrappers first and then will be forced to remove them when the
String class is gone...

>> * Manipulation of defined_ in the assignment operator is questionable
>> because it differs from similar manipulation in the constructor. If the
>> code is correct, a comment explaining the difference is warranted.
>
> Done. Both use size()>0 now.

There is still a difference with:

> +String::String(char const *aString) :
> + buf_(aString),
> + defined_(aString != NULL)

Above, defined_ for "" is true. In the new SBuf constructor, defined_
for "" is false. If the code is correct, a comment explaining the
difference is warranted.

> + if (str) {
> + append(str);
> + // XXX: empty string "" sent to append means no change,

The comment is misleading a bit: Appending an empty string sets defined_
to true in the patched code because patched String::append() sets
defined_ to true unconditionally.

> char const *
> String::termedBuf() const
> {
> - return buf_;
> + if (!defined()) return NULL;

Since an empty String could have been defined() in the unpatched code, I
suggest relaxing this to always return c_str(), even for undefined
Strings. I think it will be safer this way, but you should double check
that I did not miss any cases where it would break something.

>>> + int cmp(char const *s, size_type count) const {
>>> + // XXX: SquidString ignores the terminator. SBuf treats it as a byte.
>>> + if (s!=NULL && strlen(s) < count)
>>> + count = strlen(s);
>>> + return buf_.cmp(SBuf(s,count), count);
>>> + }
>> s may not be null-terminated in this context. We should not use
>> strlen(s), should we?
>>
>> Why change the implementation here to use SBuf::cmp() and introduce bugs
>> and XXXs? Can we continue to use strncmp() instead? The goal is not to
>> produce a "better" SquidString but "functionally the same" SquidString,
>> right?
>
> Yes you are right. I was hoping to get all the functionality offloaded
> to SBuf calls. But this one will have to revert.

The updated code is still using a different comparison function and
creating SBufs when no allocation was performed by the old code:

> int
> String::cmp(char const *aString) const
> {
> - int result = 0;
> - if (nilCmp(!size(), (!aString || !*aString), result))
> - return result;
> -
> - return strcmp(termedBuf(), aString);
> + return buf_.cmp(SBuf(aString));
> }

Is there anything wrong with keeping String::cmp() mostly intact?

> void
> String::cut(String::size_type newLength)
> {
> // size_type is size_t, unsigned. No need to check for newLength <0
> if (newLength > buf_.length()) return;
>
> buf_ = buf_.substr(0, newLength);
>
> // buf_ may be NULL on zero-length strings.
> if (buf_.length() == 0 && !defined()) return;
>
> buf_.setAt(newLength, '\0'); // terminate String.
> }

Several problems here:

* The setAt() call makes String size() different from newLength. That
does not smell good for the caller. For example, when newLength is zero,
the code will keep String.size() positive. I am worried that might even
create infinite loops if a caller like clientFollowXForwardedForCheck()
iterates until the size of the cut string becomes zero.

I suggest something like this instead:

    ...
    buf_ = buf_.substr(0, newLength);
    // terminate for historical reasons
    *buf_.rawSpace(1) = '\0';
    ...

* The NULL comment is stale.

* The old (buf_.length() == 0 && !defined()) line was there to avoid
0-terminating a NULL buffer. Since we now cannot maintain that NULL
buffer condition reliably, I suggest removing that line and always
terminating, to be safe when dealing with empty buffers.

> + String theServiceId(serviceId);

> + String theName(fname);

Please avoid "the" prefix for local variables. It is meant for class
data members (in part, to distinguish the former from the latter).

Thank you,

Alex.
Received on Thu Nov 21 2013 - 19:20:18 MST

This archive was generated by hypermail 2.2.0 : Fri Nov 22 2013 - 12:00:11 MST