This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [PATCH] Introduce gdb::byte_vector, add allocator that default-initializes


On 2017-06-14 01:04, Pedro Alves wrote:
On 06/12/2017 08:52 PM, Simon Marchi wrote:
On 2017-06-12 19:07, Pedro Alves wrote:

One nice thing is that with this allocator, for changes like these:

  -std::unique_ptr<gdb_byte[]> buf (new gdb_byte[some_size]);
  +gdb::byte_vector buf (some_size);
   fill_with_data (buf.data (), buf.size ());

the generated code is the exact same as before.  I.e., the compiler
de-structures the vector and gets rid of the unused "capacity vs size"
related fields.

How come?

I mean that even though std::vector is a struct/aggregate, the
compiler does "scalar replacement of aggregates", constant
propagation, etc., and then all that's left in the generated code is a
pointer to the buffer that the call to "operator new()" returns.  Even
though sizeof(buf) is 24 on libstdc++, there are no traces of a real
"buf" object in memory.

E.g., with:

extern void fill_with_data (gdb_byte *buf, size_t len);

void foo_uniq ()
{
  std::unique_ptr<gdb_byte[]> buf (new gdb_byte [100]);

  fill_with_data (buf.get (), 100);
}

void foo_def_vec ()
{
  gdb::byte_vector buf (100);

  fill_with_data (buf.data (), 100);
}

We get, with gcc 7, -O2:

Dump of assembler code for function foo_uniq():
   0x00000000005ac320 <+0>:	push   %rbp
   0x00000000005ac321 <+1>:	push   %rbx
   0x00000000005ac322 <+2>:	mov    $0x64,%edi
   0x00000000005ac327 <+7>:	sub    $0x8,%rsp
   0x00000000005ac32b <+11>:	callq  0x6218c0 <operator new[](unsigned long)>
   0x00000000005ac330 <+16>:	mov    $0x64,%esi
   0x00000000005ac335 <+21>:	mov    %rax,%rdi
   0x00000000005ac338 <+24>:	mov    %rax,%rbx
   0x00000000005ac33b <+27>:	callq  0x600820 <fill_with_data(unsigned char*, unsigned long)>
   0x00000000005ac340 <+32>:	add    $0x8,%rsp
   0x00000000005ac344 <+36>:	mov    %rbx,%rdi
   0x00000000005ac347 <+39>:	pop    %rbx
   0x00000000005ac348 <+40>:	pop    %rbp
   0x00000000005ac349 <+41>:	jmpq   0x80904a <operator delete[](void*)>
   0x00000000005ac34e <+46>:	mov    %rax,%rbp
   0x00000000005ac351 <+49>:	mov    %rbx,%rdi
   0x00000000005ac354 <+52>:	callq  0x80904a <operator delete[](void*)>
   0x00000000005ac359 <+57>:	mov    %rbp,%rdi
   0x00000000005ac35c <+60>:	callq  0x828dbb <_Unwind_Resume>
End of assembler dump.
Dump of assembler code for function foo_def_vec():
   0x00000000005ac370 <+0>:	push   %rbp
   0x00000000005ac371 <+1>:	push   %rbx
   0x00000000005ac372 <+2>:	mov    $0x64,%edi
   0x00000000005ac377 <+7>:	sub    $0x8,%rsp
   0x00000000005ac37b <+11>:	callq  0x6217b0 <operator new(unsigned long)>
   0x00000000005ac380 <+16>:	mov    $0x64,%esi
   0x00000000005ac385 <+21>:	mov    %rax,%rdi
   0x00000000005ac388 <+24>:	mov    %rax,%rbx
   0x00000000005ac38b <+27>:	callq  0x600820 <fill_with_data(unsigned char*, unsigned long)>
   0x00000000005ac390 <+32>:	add    $0x8,%rsp
   0x00000000005ac394 <+36>:	mov    %rbx,%rdi
   0x00000000005ac397 <+39>:	pop    %rbx
   0x00000000005ac398 <+40>:	pop    %rbp
   0x00000000005ac399 <+41>:	jmpq   0x80938a <operator delete(void*)>
   0x00000000005ac39e <+46>:	mov    %rax,%rbp
   0x00000000005ac3a1 <+49>:	mov    %rbx,%rdi
   0x00000000005ac3a4 <+52>:	callq  0x80938a <operator delete(void*)>
   0x00000000005ac3a9 <+57>:	mov    %rbp,%rdi
   0x00000000005ac3ac <+60>:	callq  0x828dbb <_Unwind_Resume>
End of assembler dump.

The only difference is that foo_uniq calls operator new[] (and
operator delete[]), while foo_def_vec calls operator new (and
operator delete).

I don't really understand default-init-alloc.h, but the rest of the
patch looks good to me.

The important thing to understand is that std containers never
call operator new/delete directly, and don't call ctors/dtors directly
either.  Instead there's a level of (usually compile-time) indirection
via an "allocator".

  http://en.cppreference.com/w/cpp/memory/allocator

(be sure to set "standard revision to C++11" to hide c++03 cruft, btw.)

There's another level of indirection here that I'll ignore for
simplicity - std::allocator_traits.

So when std::vector is constructed with a size or resized, the
allocator's allocate method is called to allocate a raw block of
contiguous memory for all the vector's elements.
gdb::default_init_allocator doesn't override (or provide) an
"allocate" method, so we end up in the default memory allocation via
"operator new (size_t)".  Note this "new" call returns a raw memory
block, not constructed elements.  We still need to run ctors to give
life to the elements.
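
To make the "raw block first, ctors second" split concrete, here is a
rough sketch that does by hand, through std::allocator_traits, roughly
what a container does internally.  The function name is made up for
illustration and the real std::vector code is more involved.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical helper, for illustration only.
int
sum_after_two_phase_init ()
{
  using alloc_t = std::allocator<int>;
  using traits = std::allocator_traits<alloc_t>;

  alloc_t alloc;
  const std::size_t n = 4;

  // Phase 1: a raw block of memory, no elements constructed yet.
  int *p = traits::allocate (alloc, n);

  // Phase 2: give life to each element via the allocator.
  for (std::size_t i = 0; i < n; ++i)
    traits::construct (alloc, p + i, static_cast<int> (i) * 10);

  int sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += p[i];

  // Tear down in the opposite order: destroy, then release the block.
  for (std::size_t i = 0; i < n; ++i)
    traits::destroy (alloc, p + i);
  traits::deallocate (alloc, p, n);

  return sum;
}
```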

To run the ctor of each of the elements, std::vector calls
the allocator's "construct" method.  You can imagine it as having
this signature:

 template< class U, class... Args >
 void construct( U* p, Args&&... args );

Note that's a variadic template method.  When a vector is
created with an initial size, or is resized, and you don't specify
the value the new elements should have, i.e., overload (3) at:

 http://en.cppreference.com/w/cpp/container/vector/vector

the new elements must be default constructed.  I.e.,
their default constructor (i.e., ctor with no arguments) must
be called.  std::vector does that by calling the allocator's
construct method above passing it no "args" after "p".  Since
gdb::default_init_allocator has a "construct" overload like this:

 template< class U >
 void construct( U* p );

that's picked as the right overload to call, because it's
considered a better match than a variadic template.
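
If you want to convince yourself of that overload-resolution rule in
isolation, a toy like the following should do.  These are hypothetical
free functions standing in for the two "construct" signatures, not
anything from the patch; they only report which one got picked.

```cpp
// Stand-in for the "no arguments after p" overload.
template <typename U>
int
which_construct (U *)
{
  return 1;
}

// Stand-in for the generic variadic overload.
template <typename U, typename... Args>
int
which_construct (U *, Args &&...)
{
  return 2;
}

int
pick_without_args ()
{
  unsigned char byte = 0;
  // Both templates are viable here; the non-variadic one is more
  // specialized, so it wins.
  return which_construct (&byte);
}

int
pick_with_arg ()
{
  unsigned char byte = 0;
  // Only the variadic template accepts a second argument.
  return which_construct (&byte, 0xff);
}
```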

Here's that gdb::default_init_allocator method in full:

+  /* .. and provide an override/overload for the case of default
+     construction (i.e., no arguments).  This is where we construct
+     with default-init.  */
+  template <typename U>
+  void construct (U *ptr)
+    noexcept (std::is_nothrow_default_constructible<U>::value)
+  {
+    ::new ((void *) ptr) U; /* default-init */
+  }

Ignore the "noexcept" bit.
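
For experimenting outside of GDB, here is a minimal stand-alone sketch
of the same idea.  It is not the class from the patch: the name
sketch_default_init_allocator and the global call counter are
assumptions added purely so that the effect can be observed, and it
leaves every other "construct" case to std::allocator_traits' built-in
fallback instead of inheriting from another allocator.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical counter, only here to observe the overload being used.
static std::size_t default_init_calls = 0;

template <typename T>
struct sketch_default_init_allocator
{
  using value_type = T;

  sketch_default_init_allocator () = default;

  template <typename U>
  sketch_default_init_allocator
    (const sketch_default_init_allocator<U> &) noexcept
  {}

  T *allocate (std::size_t n)
  { return static_cast<T *> (::operator new (n * sizeof (T))); }

  void deallocate (T *p, std::size_t) noexcept
  { ::operator delete (p); }

  // The no-argument case: construct with default-init.
  template <typename U>
  void construct (U *ptr)
  {
    ++default_init_calls;
    ::new ((void *) ptr) U; /* default-init */
  }
};

template <typename T, typename U>
bool operator== (const sketch_default_init_allocator<T> &,
                 const sketch_default_init_allocator<U> &) noexcept
{ return true; }

template <typename T, typename U>
bool operator!= (const sketch_default_init_allocator<T> &,
                 const sketch_default_init_allocator<U> &) noexcept
{ return false; }

using sketch_byte_vector
  = std::vector<unsigned char,
                sketch_default_init_allocator<unsigned char>>;

// Number of times the default-init "construct" ran for a vector
// created with an initial size of 100.
std::size_t
default_inits_for_sized_vector ()
{
  default_init_calls = 0;
  sketch_byte_vector buf (100);
  return default_init_calls;
}
```

The elements of such a vector have indeterminate values until something
(fill_with_data in the examples above) writes to them, so don't read
them before that.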

That "new" call with an argument passed before the type
is called a "placement-new".  That's how you run the ctor
of U on pre-existing memory.  In this case "*ptr".

That new expression does default-initialization because it doesn't
have "()" after the U.  If it had, like this:

   ::new ((void *) ptr) U ();

then that'd value-initialize *ptr.  For non-trivial types, it's
the same thing.  But for scalar types, default-initialization
does nothing, and value-initialization memsets to 0.
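
A small sketch of the two spellings side by side, using int and a
local buffer.  The function names are made up for illustration.  Note
that the default-init variant stores a value before reading, because
reading the indeterminate value would not be meaningful.

```cpp
#include <cstring>
#include <new>

int
value_init_over_scribbled_storage ()
{
  alignas (int) unsigned char storage[sizeof (int)];

  // Scribble over the storage so a zero can only come from the
  // new expression below.
  std::memset (storage, 0xab, sizeof storage);

  // With "()": value-initialization, so the int is zeroed.
  int *p = ::new ((void *) storage) int ();
  return *p;
}

int
default_init_then_assign ()
{
  alignas (int) unsigned char storage[sizeof (int)];

  // Without "()": default-initialization, which for a scalar runs no
  // code at all.  The value is indeterminate until we store one.
  int *p = ::new ((void *) storage) int;
  *p = 42;
  return *p;
}
```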

Value-initialization would be what the:

 template< class U, class... Args >
 void construct( U* p, Args&&... args );

"overload" would do, if we'd let it.  (Again, "quotes" because
I'm ignoring allocator_traits for simplicity.)

So what happens if you resize the vector with an explicit value, like

  buf.resize(new_size, 0xff);

?

In that case you're calling overload (2) at:
  http://en.cppreference.com/w/cpp/container/vector/vector

and then the default_init_allocator's "construct" method that
takes no arguments beyond the object's pointer is not picked
by overload resolution, so the generic one is picked:

 template< class U, class... Args >
 void construct( U* p, Args&&... args );

and that one constructs *p with

   ::new ((void *) ptr) U (args...);

which expanding "args..." ends up being:

   ::new ((void *) ptr) U (0xff);

You can see here why, by default, you end up with
value-initialization.  If "args..." expanded to nothing,
you'd get:

   ::new ((void *) ptr) U ();
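
Here is the pack expansion as a stand-alone sketch.  The name
generic_construct_sketch is hypothetical; it only mimics the shape of
the generic "construct", it is not the library's code.

```cpp
#include <new>
#include <utility>

template <typename U, typename... Args>
void
generic_construct_sketch (U *ptr, Args &&... args)
{
  // With one argument this is "U (0xff)"; with an empty pack it
  // collapses to "U ()", i.e., value-initialization.
  ::new ((void *) ptr) U (std::forward<Args> (args)...);
}

unsigned char
construct_with_0xff ()
{
  unsigned char byte = 0;
  generic_construct_sketch (&byte, 0xff);
  return byte;
}

unsigned char
construct_with_nothing ()
{
  unsigned char byte = 0xab;
  generic_construct_sketch (&byte);
  return byte;
}
```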

See more here:

 http://en.cppreference.com/w/cpp/concept/Allocator
 http://en.cppreference.com/w/cpp/memory/allocator_traits
 http://en.cppreference.com/w/cpp/memory/allocator
 http://en.cppreference.com/w/cpp/memory/allocator_traits/construct
 http://en.cppreference.com/w/cpp/memory/allocator/construct
 http://en.cppreference.com/w/cpp/language/new

I hope that helps.  I think the best way to get a good grasp
on this is to just step through the std::vector code all the
way to the allocator.

Thanks,
Pedro Alves

Thanks for the detailed explanation. It's all very logical, but it's also full of small details essential to really understand what's happening.

Simon

