This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [RFA] C++-ify parse_format_string
On 11/24/2017 03:17 AM, Simon Marchi wrote:
> On 2017-11-23 05:40 PM, Pedro Alves wrote:
>> On 11/23/2017 09:13 PM, Simon Marchi wrote:
>>
>>> I have a patch in some branch that does essentially the same thing, so
>>> I was able to compare our approaches. In my version, I removed the
>>> big allocation that is shared among pieces, and made each piece have
>>> its own std::string. Unless we want to keep the current allocation
>>> scheme for performance/memory usage reasons, I think that using
>>> std::strings simplifies things in the parse_format_string function.
>>> The format_pieces structure is replaced with an std::vector of
>>> format_piece.
>>
>> Sounds like a step backwards to me. If it simplifies things, then
>> it sounds like it might be because we're missing some utility,
>> like string_view or something like that.
>
> I am not sure I understand, can you expand a little bit about what you
> have in mind?
I call it a step backwards because simplification is done at the
cost of efficiency (individual heap allocations, changing the
ownership model), when the original code avoided that. It's well known that
standard C++ is missing basic string manipulation utilities, but that's
no reason to force ourselves to consider std::string the ultimate tool, IMO.
There's C++ the library, and then there's the C++ the language. We can't
do anything about the latter, but we can extend the former. Later revisions
of the standard are adding more facilities, like std::string_view in C++17, and
now things like std::array_view / std::span (from gsl::span) have a very
good chance of making their way into C++20. Many C++ projects end up
reinventing something like those types, hence the attempts at
standardization, making those types lingua franca for non-owning use cases.
>
> What I meant is that is changes things like this:
>
> strncpy (current_substring, percent_loc, length_before_ll);
> strcpy (current_substring + length_before_ll, "I64");
> current_substring[length_before_ll + 3] =
> percent_loc[length_before_ll + lcount];
> current_substring += length_before_ll + 4;
>
> into this
>
> piece_string.assign (percent_loc, length_before_ll);
> piece_string += "I64";
> piece_string += percent_loc[length_before_ll + lcount];
>
> Less magical number, less playing with offsets, etc.
I meant that you don't need for each piece_string to own its
own deep copy of the string to get rid of the magical numbers.
You can get that even with a couple C-like free-functions, like:
// memcpy, null-terminate, and return advanced pointer.
char *append (char *p, const char *str);
char *append (char *p, const char *str, size_t len);
and then:
char *p = current_substring;
p = append (p, percent_loc, length_before_ll);
p = append (p, "I64");
p = append (p, percent_loc[length_before_ll + lcount]);
current_substring = p;
You could wrap that in a class, like:
/* Helper to build an null-terminated C string. */
struct zstring_builder
{
/* POS is a pointer to the position in buffer to work on. */
explicit zstring_builder (char *pos)
: m_pos (pos)
{}
// memcpy, advance m_pos, null-terminate.
zstring_builder &append (const char *s, size_type count);
zstring_builder &append (const char* s);
char *pos () { return m_pos; }
private:
char *m_pos;
};
Or:
/* Helper to build a null-terminated C string on top
of a vector. */
struct zstring_builder
{
/* POS is a pointer to the position in buffer to work on. */
explicit zstring_builder (std::vector<char> &buf, size_t pos);
// memcpy, advance m_pos, null-terminate.
zstring_builder &append (const char *s, size_type count);
zstring_builder &append (const char* s);
char *pos () { return &m_vector[m_pos]; }
private:
std::vector &m_buf;
size_t m_pos;
};
or something else along those lines, if you now make piece_string
a zstring_builder, you can get similar, and just as readable code:
zstring_builder piece_string (current_substring);
piece_string.append (percent_loc, length_before_ll);
piece_string.append ("I64");
piece_string.append (percent_loc[length_before_ll + lcount]);
current_substring = piece_string.pos ();
Alternatively, we could have something like a zstring_ref
class [1], as a non-owning mutable view of a C string (that never
reallocates), guaranteed to be null-terminated, and then the code
you shown would be just exactly the same. Open choices could be whether
to save the string size inside zstring_ref (like string_view), or
not saving a size, making it a really-thin wrapper around a
pointer instead.
std::string is great as an _owning_ SSO-enabled (which is sometimes a
curse when embedded in structures...) container of characters. But
when we don't want an owning container, the right tool / refactor can
still make the code just as readable.
[1] - I've thought several times because that something like a zstring_view
(like string_view, but guarantees null-termination) would be handy in
a lot of places. string_view isn't mutable, though, hence the
alternative name instead. could be mut_string_view too, for example.
Thanks,
Pedro Alves