This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFA] C++-ify parse_format_string


On 11/24/2017 03:17 AM, Simon Marchi wrote:
> On 2017-11-23 05:40 PM, Pedro Alves wrote:
>> On 11/23/2017 09:13 PM, Simon Marchi wrote:
>>
>>> I have a patch in some branch that does essentially the same thing, so
>>> I was able to compare our approaches.  In my version, I removed the
>>> big allocation that is shared among pieces, and made each piece have
>>> its own std::string.  Unless we want to keep the current allocation
>>> scheme for performance/memory usage reasons, I think that using
>>> std::strings simplifies things in the parse_format_string function.
>>> The format_pieces structure is replaced with an std::vector of
>>> format_piece.
>>
>> Sounds like a step backwards to me.  If it simplifies things, then
>> it sounds like it might be because we're missing some utility,
>> like string_view or something like that.
> 
> I am not sure I understand, can you expand a little bit about what you
> have in mind?

I call it a step backwards because simplification is done at the
cost of efficiency (individual heap allocations, changing the
ownership model), when the original code avoided that.  It's well known that
standard C++ is missing basic string manipulation utilities, but that's
no reason to force ourselves to consider std::string the ultimate tool, IMO.
There's C++ the library, and then there's the C++ the language.  We can't
do anything about the latter, but we can extend the former.  Later revisions
of the standard are adding more facilities, like std::string_view in C++17, and
now things like std::array_view / std::span (from gsl::span) have a very
good chance of making their way into C++20.  Many C++ projects end up
reinventing something like those types, hence the attempts at
standardization, making those types lingua franca for non-owning use cases.

> 
> What I meant is that is changes things like this:
> 
> 	    strncpy (current_substring, percent_loc, length_before_ll);
> 	    strcpy (current_substring + length_before_ll, "I64");
> 	    current_substring[length_before_ll + 3] =
> 	      percent_loc[length_before_ll + lcount];
> 	    current_substring += length_before_ll + 4;
> 
> into this
> 
> 	    piece_string.assign (percent_loc, length_before_ll);
> 	    piece_string += "I64";
> 	    piece_string += percent_loc[length_before_ll + lcount];
> 
> Less magical number, less playing with offsets, etc.

I meant that you don't need for each piece_string to own its
own deep copy of the string to get rid of the magical numbers.
You can get that even with a couple C-like free-functions, like:

 // memcpy, null-terminate, and return advanced pointer.
 char *append (char *p, const char *str);
 char *append (char *p, const char *str, size_t len);

and then:

 char *p = current_substring;

 p = append (p, percent_loc, length_before_ll);
 p = append (p, "I64");
 p = append (p, percent_loc[length_before_ll + lcount]);

 current_substring = p;

You could wrap that in a class, like:

 /* Helper to build an null-terminated C string.  */
 struct zstring_builder
 {
    /* POS is a pointer to the position in buffer to work on.  */
    explicit zstring_builder (char *pos)
      : m_pos (pos)
    {}
 
    // memcpy, advance m_pos, null-terminate.
    zstring_builder &append (const char *s, size_type count);
    zstring_builder &append (const char* s);
 
    char *pos () { return m_pos; }
 
 private:
    char *m_pos;
 };

Or:

 /* Helper to build a null-terminated C string on top
    of a vector.  */
 struct zstring_builder
 {
    /* POS is a pointer to the position in buffer to work on.  */
    explicit zstring_builder (std::vector<char> &buf, size_t pos);
 
    // memcpy, advance m_pos, null-terminate.
    zstring_builder &append (const char *s, size_type count);
    zstring_builder &append (const char* s);
 
    char *pos () { return &m_vector[m_pos]; }
 
 private:
    std::vector &m_buf;
    size_t m_pos;
 };

or something else along those lines, if you now make piece_string
a zstring_builder, you can get similar, and just as readable code:

            zstring_builder piece_string (current_substring);

            piece_string.append (percent_loc, length_before_ll);
 	    piece_string.append ("I64");
 	    piece_string.append (percent_loc[length_before_ll + lcount]);

            current_substring = piece_string.pos ();

Alternatively, we could have something like a zstring_ref
class [1], as a non-owning mutable view of a C string (that never
reallocates), guaranteed to be null-terminated, and then the code
you shown would be just exactly the same.  Open choices could be whether
to save the string size inside zstring_ref (like string_view), or
not saving a size, making it a really-thin wrapper around a
pointer instead. 

std::string is great as an _owning_ SSO-enabled (which is sometimes a
curse when embedded in structures...) container of characters.  But
when we don't want an owning container, the right tool / refactor can
still make the code just as readable.

[1] - I've thought several times because that something like a zstring_view
(like string_view, but guarantees null-termination) would be handy in
a lot of places.  string_view isn't mutable, though, hence the
alternative name instead.  could be mut_string_view too, for example.

Thanks,
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]