This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v3 2/2] Implement pahole-like 'ptype /o' option


On 12/12/2017 06:40 PM, Sergio Durigan Junior wrote:
> On Tuesday, December 12 2017, Pedro Alves wrote:
> 
>> On 12/12/2017 06:19 AM, Sergio Durigan Junior wrote:
>>
>>> Apparently the patch can't handle bitfields very well.  I've found a few
>>> cases where the bitfield handling gets confused, printing wrong
>>> offsets/sizes/holes.  Bitfields can be extremely complex to deal with
>>> when it comes to offsets...
>>>
>>> I spent hours trying to improve the patch, managed to make some
>>> progress, but there are still corner cases to fix.  For example, the
>>> patch doesn't deal well with this case:
>>>
>>> struct aa {                                    
>>> /*    0      |     1 */    char aa;            
>>> /*    1: 1   |     1 */    unsigned char a : 7;                                                
>>> /*    1:15   |     4 */    int b : 10;         
>>> } /* total size:    4 bytes */                 
>>>
>>> In this case, the bitfield "b" would be combined with the previous
>>> bitfield "a", like pahole reports:
>>>
>>> struct aa {                                    
>>>         char                       aa;                   /*     0     1 */                     
>>>         unsigned char              a:7;                  /*     1: 1  1 */                     
>>>
>>>         /* Bitfield combined with previous fields */                                           
>>>
>>>         int                        b:10;                 /*     0: 7  4 */                     
>>> }
>>>
>>> Confusing...  I'm not sure why pahole reports b's offset to be 0.
>>
>> 0 seems right to me.  The bitfield's type is int, with size 4,
>> and it lives at byte offset 0:
>>
>>  <2><53>: Abbrev Number: 5 (DW_TAG_member)
>>     <54>   DW_AT_name        : (indirect string, offset: 0x4ae): b
>>     <58>   DW_AT_decl_file   : 1
>>     <59>   DW_AT_decl_line   : 7
>>     <5a>   DW_AT_type        : <0x71>
>>     <5e>   DW_AT_byte_size   : 4
>>     <5f>   DW_AT_bit_size    : 10
>>     <60>   DW_AT_bit_offset  : 7
>>     <61>   DW_AT_data_member_location: 0     <<<
>>
>> Replacing the 0 with 1, like:
>>
>>     int                        b:10;                 /*     1: 7  4 */
>>
>> would look incorrect to me, because that'd make '(char*)&aa + 1 + 4'
>> (address of containing object + byte offset + byte size) overshoot the
>> size of the containing object.
> 
> OK.  The GDB patch is printing "1:15" because the offset is calculated
> as:
> 
>   (bitpos + offset_bitpos) / TARGET_CHAR_BIT
> 
> In this particular case, offset_bitpos is zero because we're not inside
> an inner struct, so we don't care about it.  bitpos is TYPE_FIELD_BITPOS
> (type, field_idx), which is 15 in this case, because sizeof (aa.aa) +
> bitsof (aa.a) = 15.  

Well, actually it's 15 because that's the bit number of the LSB of that
field, and x86 is a !gdbarch_bits_big_endian machine.  Because we have:

 <2><53>: Abbrev Number: 4 (DW_TAG_member)
    <54>   DW_AT_name        : b
    <56>   DW_AT_decl_file   : 1
    <57>   DW_AT_decl_line   : 7
    <58>   DW_AT_type        : <0x6f>
    <5c>   DW_AT_byte_size   : 4
    <5d>   DW_AT_bit_size    : 10
    <5e>   DW_AT_bit_offset  : 7
    <5f>   DW_AT_data_member_location: 0

and with that, we reach here:

static void
dwarf2_add_field (struct field_info *fip, struct die_info *die,
		  struct dwarf2_cu *cu)
{
...
      attr = dwarf2_attr (die, DW_AT_bit_offset, cu);
      if (attr)
	{
	  if (gdbarch_bits_big_endian (gdbarch))
	    {
...
	    }
	  else
	    {
...
	      attr = dwarf2_attr (die, DW_AT_byte_size, cu);
	      if (attr)
		{
		  /* The size of the anonymous object containing
		     the bit field is explicit, so use the
		     indicated size (in bytes).  */
		  anonymous_size = DW_UNSND (attr);
		}
	      else
		{
...
		}
	      SET_FIELD_BITPOS (*fp,
				(FIELD_BITPOS (*fp)
				 + anonymous_size * bits_per_byte
				 - bit_offset - FIELD_BITSIZE (*fp)));
	    }
	}

which gives us:

  bitpos = DW_AT_byte_size * 8 - DW_AT_bit_offset - DW_AT_bit_size
         = 4 * 8 - 7 - 10
         = 15

and since x86 is a !gdbarch_bits_big_endian machine, 15 goes here:

 byte:  0     1      2      3
 bits:  7-0   15-8   23-16   31-24
              ^

which is what we see if we write a "1" to aa.b:

 (gdb) p aa.b = 1
 $1 = 1
 (gdb) x /4xb &aa
 0x60103c <aa>:  0x00    0x80    0x00    0x00
                         ^^^^

> When we divide it by TARGET_CHAR_BIT, we get 1.  It
> seems using TYPE_FIELD_BITPOS is not reliable when dealing with
> bitfields.

I noticed:

 (gdb) p /x & aa.aa
 $1 = 0x601040
 (gdb) p /x & aa.a
 $2 = 0x601041
 (gdb) p /x & aa.b
 $3 = 0x601040
 (gdb) 

Ignoring the fact that taking the address of bitfields is
not valid C/C++ [ :-) ], it seems like the addresses above
match what we'd expect the byte offset to be.  At least for this
example.

So maybe what you need is to factor out or duplicate the logic
used by the related value printing code.  Here in
value_primitive_field:

  if (TYPE_FIELD_BITSIZE (arg_type, fieldno))
    {
      /* Handle packed fields.

...
      LONGEST bitpos = TYPE_FIELD_BITPOS (arg_type, fieldno);
      LONGEST container_bitsize = TYPE_LENGTH (type) * 8;

      v = allocate_value_lazy (type);
      v->bitsize = TYPE_FIELD_BITSIZE (arg_type, fieldno);
      if ((bitpos % container_bitsize) + v->bitsize <= container_bitsize
	  && TYPE_LENGTH (type) <= (int) sizeof (LONGEST))
	v->bitpos = bitpos % container_bitsize;
      else
	v->bitpos = bitpos % 8;
      v->offset = (value_embedded_offset (arg1)
		   + offset
		   + (bitpos - v->bitpos) / 8);
    }

Above "v->offset" may be the byte offset that you're after.

> 
> So there is something I'm not accounting here.
> 

>> What is that number after ":"  in bitfields supposed to mean in
>> pahole's output (and I assume that that's what you're trying
>> to emulate)?  We're missing documentation for that.
> 
> Yeah.  I tried to find documentation on the output, but couldn't.
> Yesterday I was reading pahole's code.  From what I understand, it is
> the number of bits left in the object.
> 
>> It seems like it's supposed to mean the number of bits left in the
>> containing anonymous object (i.e., in the 4 bytes of the declared
>> int)?  Then "0:7" seems right, given:
>>
>> sizeof (int) * 8 - bitsof(aa.a) - bitsof(aa.a) - bits(aa.b)
>>    => sizeof (int) * 8 - 7 - 10 
>>    => 7
> 
> I guess you meant:
> 
>   sizeof (int) * 8 - bitsof (aa.a) - bitsof(aa.b)

No, because that's be

     => 32 - 7 - 10
     => 15

I meant:

 sizeof (int) * 8 - bitsof(aa.aa) - bitsof(aa.a) - bits(aa.b)
    => sizeof (int) * 8 - 8 - 7 - 10 
    => 7

which gives us the 7 unused bits that pahole reports.

> 
> Right?
> 
> But yeah, it makes sense when you consider that the first bitfield got
> "promoted" from "unsigned char" to "int".  I haven't figured out a way
> to keep track of these type changes in order to calculate the right
> offsets, remaining space and holes.

Do we really need to track type changes, or do we simply need to
detect that the byte offset moved in the negative direction?
I.e., here:

struct S {                                    
        char                       aa;                   /*     0     1 */                     
        unsigned char              a:7;                  /*     1: 1  1 */                     

        /* Bitfield combined with previous fields */                                           

        int                        b:10;                 /*     0: 7  4 */                     
};

The byte offset went "0 -> 1 -> 0", and the "1 -> 0" would mean that the
bitfield was "combined"?

Thanks,
Pedro Alves


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]