This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix tokenize function and test.


On 06/17/2009 11:39 AM, Przemysław Pawełczyk wrote:
> 2009/6/17 Josh Stone <jistone@redhat.com>:
>> If you really want to be concerned with optimizing this, then you should
>> be using memcpy instead, since the position of the NUL is precisely known.
> 
> I'm not so concerned about optimization, but I don't like doing
> superfluous things. You're totally right with memcpy, will you accept
> it if I use it in my patch?

Yes, that's fine.

>>> Maybe you're talking about putting it into another patch? Yes, it
>>> works without this change, but I would say, that is also a fix, but
>>> for badly written code. Of course I can send it as a second patch if
>>> you wish.
>>
>> I don't understand the hate for strlcpy, but I don't care so much to
>> argue over it.  Technically, we could even be using strcpy here, since
>> the input string is already known to be small enough.
> 
> I have no feelings towards strlcpy. Why do you think so? It's useful
> function and I don't negate this fact.

I misunderstood your comment about "badly written code."  There are some
notable people who are against strlcpy (e.g. Drepper won't add it to
glibc).  In a case like ours with fixed buffers, I think it makes
complete sense.

> Simply there is no reason for doing implicit strlen call every
> tokenize("", delim), because (as you pointed already) position of the
> NUL is precisely known.
> Maybe in a few years MAXSTRINGLEN will be hundred times greater than
> today, so then we will be changing this function once again?
> IMHO assuming that input string is known to be small enough is not the
> right way.

What is there to change?  The input and output buffers are both sized
MAXSTRINGLEN.  The tokens are subsets of the input string, thus they
must also be less than MAXSTRINGLEN.  But anyway, that was an off-handed
comment that doesn't need to be addressed in your patch.

>> I care that the code is clear and concise.  It wasn't immediately
>> obvious to me that (str_start ? str_start : str_end + 1) means "the end
>> of this token," and my question shows that I also didn't see why that
>> was a win.  I get it now, but I'm bound to forget the next time I look
>> at it. :)
> 
> "The end of this token" semantic is consequence of how strsep works.
> Nevertheless I can add some comments if you think it will make the
> code more clear.

Right, I was just distracted because a variable with "start" in the name
is also used as an end pointer.  A comment would be nice, or even just
self-documenting with a "token_end" temporary.

Josh


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]