This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: GDB internal error in pc_in_thread_step_range


On 2018-12-22 03:44, Eli Zaretskii wrote:
Date: Thu, 20 Dec 2018 18:03:33 -0500
From: Simon Marchi <simon.marchi@polymtl.ca>
Cc: gdb-patches@sourceware.org

>> Ok.  Well this is already strange.  Why is there an mst_text (code)
>> symbol with a value of 0?
>
> Its address is zero because it's an unresolved symbol:
>
>   d:\usr\eli>nm -A hello0.exe | fgrep " U "
>   hello0.exe:         U ___deregister_frame_info
>   hello0.exe:         U ___register_frame_info
>   hello0.exe:         U __Jv_RegisterClasses

Huh, interesting.  I looked at elfread, and similar undefined symbols
are skipped:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/elfread.c;h=71e6fcca6ec62ec57f93f06d8a9913612be6f9e2;hb=HEAD#l270

So maybe GDB should skip them as well?

Yes.  Can you please give it a try?

> In any case, if we do call the "Cannot find bounds of current
> function" error, that will throw to the command loop, which I think is
> undesirable in this case.  We want GDB to step out of this code, not
> to error out.

When we have no line information for the current PC and the user asks us
to step, we fall back to "single step until out of the current
function".  But if the minimal symbol information doesn't let us know
the bounds of the current function, then we can't "single step until out
of the current function", because we don't know where it starts or ends.

In your binary, the lowest .text function symbol is
__mingw32_init_mainargs at 0x000002a0 (0x4012a0 once relocated).  Your
pc is 0x40126d (according to an earlier message, but reading lower I
realize this may not be valid anymore), which is lower. So there's just
no minimal symbol for GDB to find.  In that case, it sounds right to
error out and say "I can't step, I don't have enough information". The
user can still use stepi.
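
As a rough illustration of why no symbol is found here, the lookup can be thought of as a search for the last symbol whose address is at or below the pc. The sketch below is a toy model, not GDB's code: struct toy_msym and toy_lookup_by_pc are hypothetical names, and the table is assumed to be sorted by address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a minimal symbol.  */
struct toy_msym
{
  uint64_t addr;
  const char *name;
};

/* Return the index of the last symbol whose address is <= PC, or -1
   if PC lies below every symbol.  TAB must be sorted by address.  */
long
toy_lookup_by_pc (const struct toy_msym *tab, size_t n, uint64_t pc)
{
  size_t lo = 0, hi = n;

  assert (tab != NULL || n == 0);

  /* Binary search for the first entry with addr > PC.  */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (tab[mid].addr <= pc)
	lo = mid + 1;
      else
	hi = mid;
    }

  /* The entry just before that one, if any, precedes PC.  */
  return (long) lo - 1;
}
```

With a table whose lowest entry is __mingw32_init_mainargs at 0x4012a0, a lookup for pc 0x40126d returns -1 in this model: there is no preceding symbol, so there are no function bounds to step within.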

But this use case is somewhat special, IMO: stepping outside of 'main'
can happen unintentionally, and should not cause an error.  It should
let the inferior run to completion without any errors.  Raising an
error in this case is confusing.

Just a clarification: it's not stepping out of main that causes the error, it's trying to step again:

<in main>
(gdb) step
<out of main>
(gdb) step
Cannot find bounds of current function

Perhaps the error message could be improved, but I think this is the right thing to do. It is often reported as a bug when "step" lets the program run free and acts as "continue". If you find yourself in that situation again, why not just use "continue" to let the program exit?

This case would work just fine if your binary had a matching symbol for this location, so I would start by looking at the toolchain, to see if it can provide that.

Side-question, are there some debug symbols in the binary that could
describe this location?

How do I know that?

I would normally use "readelf --debug-dump" to look at the DWARF info, but since this is not an ELF file, I don't know what the equivalent would be.

> --- gdb/minsyms.c~0	2018-07-04 18:41:59.000000000 +0300
> +++ gdb/minsyms.c	2018-12-20 08:06:11.516834500 +0200
> @@ -1514,7 +1514,8 @@ minimal_symbol_upper_bound (struct bound
>      {
>        if ((MSYMBOL_VALUE_RAW_ADDRESS (msymbol + i)
>  	   != MSYMBOL_VALUE_RAW_ADDRESS (msymbol))
> -	  && MSYMBOL_SECTION (msymbol + i) == section)
> +	  && MSYMBOL_SECTION (msymbol + i) == section
> +	  && MSYMBOL_TYPE (msymbol + i) != mst_abs)
>  	break;
>      }

Note that if we implement the solution of rejecting the symbols with
section == -1, those mst_abs symbols won't be there anymore.

Fine by me.  Should we push such a change?

Based on what we saw, I would be for it. But you'll need to make the change and test it for regressions, as I don't have the necessary setup (and knowledge) to do that on Windows.

> I think if we want to avoid showing __register_frame_info, we need
> further changes in lookup_minimal_symbol_by_pc_section.  But I don't
> see how this will help us, unless we also allow displaying the above
> message for functions whose names we don't know, perhaps saying
> something like
>
>   Single stepping until exit from function <unknown>

The problem is not only that we are missing the name, but most
importantly that we are missing the bounds of the current function.
With what you've implemented here, GDB thinks there is a function that
occupies the range [0,401000[ (something like that), so it single steps
until it gets out of that range, but the process exits before.

Which IMO is just fine for this specific use case.

Yes, but it's by chance. If the process didn't exit, it wouldn't be "Single stepping until exit from function X", it would be "Single stepping until god-knows-when". So it might work fine here, but if we adopted this behavior generally, it could just be more confusing in other cases.

So it kind of works for your use case, but it's not right, IMO. If the
process did not exit as it does here, the behavior would be erratic.

I don't think it would be erratic, we will just see the same

    0x00401nnn in __register_frame_info ()

for several steps.  Is that so bad?

Well, first thing, I think it's wrong that we show that it's in __register_frame_info. If this was an actual resolved .text symbol, it wouldn't be so bad, but here it's not even a function in the program, it doesn't make sense.

Also, the user wouldn't do several steps. They would do one step and GDB would keep single stepping until execution gets out of the stepping range [0,401000[. This is very unpredictable, from the point of view of the user.
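
The effect of an overly wide stepping range can be shown with a small toy model. This is only a sketch under stated assumptions, not GDB's stepping logic: struct toy_range, toy_pc_in_range and toy_steps_until_out are hypothetical names, and the "trace" is simply the list of pc values observed after each single step.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open stepping range [start, end).  */
struct toy_range
{
  uint64_t start;
  uint64_t end;
};

/* Return nonzero if PC lies inside R.  */
int
toy_pc_in_range (const struct toy_range *r, uint64_t pc)
{
  return pc >= r->start && pc < r->end;
}

/* TRACE holds the pc observed after each single step.  Return how
   many single steps one "step" command takes before the pc leaves R,
   or -1 if the trace ends (the process exits) while still inside.  */
long
toy_steps_until_out (const struct toy_range *r,
		     const uint64_t *trace, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!toy_pc_in_range (r, trace[i]))
      return (long) (i + 1);

  return -1;
}
```

With a tight range such as [0x1000,0x1010[ and a trace of 0x1004, 0x1008, 0x2000, the model stops after the third single step. With a bogus range such as [0,0x401000[ the same trace never leaves the range, so a single "step" only ends when the process does.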

Ok, well I think it shows the problem quite clearly, some symbol is
missing for GDB to work properly in that context.  I think that we
should improve GDB to handle it better and error out clearly (instead of
hitting a failed assert), but I don't think it can do much more.

Can you suggest a patch?  I'm not sure I understand the behavior that
will be the result.

I could try later, using a mingw64 executable (though I can't run it, just read the symbols in GDB). But I think you are in a better position than me to do that, since you are more familiar with the platform.

I would probably be looking into adding some "if" in coff_symtab_read, to filter out the unwanted symbols, as discussed previously. You can now use "set debug symtab-create 2" to confirm that symbols like __register_frame_info no longer lead to the creation of a minimal symbol.
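
The kind of filter meant here could look roughly like the sketch below. It is a standalone toy, not a patch against coff_symtab_read: struct toy_sym, toy_wants_minsym and toy_filter_syms are hypothetical, and the convention that section == -1 marks a symbol with no section is an assumption taken from the discussion earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol record as a reader might see it.  */
struct toy_sym
{
  const char *name;
  uint64_t value;
  int section;		/* -1 is assumed to mean "no section".  */
};

/* Return nonzero if SYM should become a minimal symbol.  */
int
toy_wants_minsym (const struct toy_sym *sym)
{
  return sym->section != -1;
}

/* Compact SYMS in place, keeping only the wanted symbols and
   preserving their order.  Return the number of symbols kept.  */
size_t
toy_filter_syms (struct toy_sym *syms, size_t n)
{
  size_t kept = 0;

  assert (syms != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    if (toy_wants_minsym (&syms[i]))
      syms[kept++] = syms[i];

  return kept;
}
```

In this model an unresolved entry such as ___register_frame_info (value 0, no section) is dropped before any minimal symbol is created, so it can no longer be picked as the function enclosing a low pc.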

I guess that having debug info for the file containing
__mingw_CRTStartup would help, if you really needed to step past main?

I don't need to step past main, it just happens in many cases, when I
type one "next" too many.  I would like to avoid any errors in that
case.

As I mentioned earlier, I think it's fine if step fails with "I don't have enough info to do my work", and suggest you use continue instead.

Simon

