This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.
Re: Single step vs. "tail recursion" optimization
Donn Terry wrote:
>
> (I'm sorry to have to be the messenger on this one...)
>
> Here's a mini testcase. I've also attached the resulting .s files for
> -O2 and -O3.
You do know, don't you, that there are lots of optimizations that GDB
fails to debug at -O2 and -O3? If something bad happens at those
levels, our practice is to say "turn down the optimization and try again".
> Shudder. Andrew's speculation about s not working because there were no
> symbols is correct. S-ing works until the call to getpid().
>
> I haven't actually tried to figure out why gdb isn't doing it right in
> that case because there's actually something potentially even uglier
> going on in the -O3 case. This is something that the "management" of gdb
> and the "management" of gcc are going to have to take on and resolve as
> either "no, you can't sanely debug -O3" or "we need some help from the
> compiler to sort this one out". (And if the latter, then the same help
> may be useful with the -O2 case!) (I haven't seen this addressed, but I
> could easily have missed it.)
>
> Note that in the case of -O3, foo() and bar() are NEVER actually called
> from main, but rather getpid() is called directly. (Note also the
> reordering of the functions.) (Seeing that this sort of optimization is
> pretty compellingly needed for C++ code, "don't do that" seems an
> unlikely outcome.)
>
> Donn
> P.S. This may explain some instances of "stack unwind missed a frame"
> bugs.
>
> bar() {
> getpid();
> }
>
> foo() {
> bar();
> }
>
> main()
> {
> foo();
> }
>
> ------------------ -O2 -------------------
> .file "bat.c"
> .global __fltused
> .text
> .p2align 4,,15
> .globl _bar
> .def _bar; .scl 2; .type 32; .endef
> _bar:
> pushl %ebp
> movl %esp, %ebp
> popl %ebp
> jmp _getpid
> .p2align 4,,15
> .globl _foo
> .def _foo; .scl 2; .type 32; .endef
> _foo:
> pushl %ebp
> movl %esp, %ebp
> popl %ebp
> jmp _bar
> .def ___main; .scl 2; .type 32; .endef
> .p2align 4,,15
> .globl _main
> .def _main; .scl 2; .type 32; .endef
> _main:
> pushl %ebp
> movl %esp, %ebp
> pushl %eax
> pushl %eax
> xorl %eax, %eax
> andl $-16, %esp
> call __alloca
> call ___main
> call _foo
> movl %ebp, %esp
> popl %ebp
> ret
> .def _getpid; .scl 2; .type 32; .endef
> ------------------------ -O3 ---------------------------------
> .file "bat.c"
> .global __fltused
> .def ___main; .scl 2; .type 32; .endef
> .text
> .p2align 4,,15
> .globl _main
> .def _main; .scl 2; .type 32; .endef
> _main:
> pushl %ebp
> movl %esp, %ebp
> pushl %eax
> pushl %eax
> xorl %eax, %eax
> andl $-16, %esp
> call __alloca
> call ___main
> call _getpid <<< NO CALL TO foo()
> movl %ebp, %esp
> popl %ebp
> ret
> .p2align 4,,15
> .globl _bar
> .def _bar; .scl 2; .type 32; .endef
> _bar:
> pushl %ebp
> movl %esp, %ebp
> popl %ebp
> jmp _getpid
> .p2align 4,,15
> .globl _foo
> .def _foo; .scl 2; .type 32; .endef
> _foo:
> pushl %ebp
> movl %esp, %ebp
> popl %ebp
> jmp _getpid <<< NOTE THAT foo() doesn't call bar() either!
> .def _getpid; .scl 2; .type 32; .endef
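For anyone who wants to check the tail-call shape in the listings above with their own compiler, here is a variant of the testcase with explicit return types. It is only a sketch under a few assumptions: fake_getpid is a hypothetical stand-in for getpid() so the result is fixed, the NOINLINE macro is not in the original testcase and relies on a GCC extension, and whether any of these calls actually becomes a jmp is up to the compiler and the optimization level.

```c
#include <assert.h>

/* GCC/Clang extension, assumed here; other compilers get an empty macro. */
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

/* Hypothetical stand-in for getpid() with a fixed, checkable result. */
NOINLINE int fake_getpid(void)
{
    return 1234;
}

/* The call is the last thing bar() does, so an optimizing compiler may
   emit "jmp fake_getpid" instead of "call fake_getpid" followed by "ret". */
NOINLINE int bar(void)
{
    return fake_getpid();
}

/* Same shape one level up: foo() may end in a jump to bar(). */
NOINLINE int foo(void)
{
    return bar();
}

/* Not a tail call: the result is inspected after foo() returns, so this
   function has to stay on the stack across the call. */
int run_chain(void)
{
    int pid = foo();
    assert(pid == 1234);
    return pid;
}
```

Compiling this with -S at different optimization levels and looking at how foo and bar end is a quick way to see whether a given compiler performs the transformation. The observable result of run_chain() is the same either way; only the set of frames a debugger sees on the stack differs.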
>
> -----Original Message-----
> From: Michael Snyder [mailto:msnyder@redhat.com]
> Sent: Friday, November 08, 2002 11:43 AM
> To: Donn Terry
> Cc: gdb-patches@sources.redhat.com
> Subject: Re: Single step vs. "tail recursion" optimization
>
> Donn Terry wrote:
> >
> > While debugging gdb, I ran across a really nasty little issue: the gcc
> > guys (for the "bleeding edge", at least) have generated an
> > optimization such that if the last thing in function x is a function
> > call to y, it will short-circuit the return from x, and set things up
> > so it returns directly from y. (A special case of tail recursion
> > optimizations.)
> >
> > If you try to n (or s) over that, the debugged program runs away
> > because gdb doesn't know about that magic. The real example is
> > regcache_raw_read, which ends in a memcpy. Instead of jsr-ing to the
> > memcpy and then returning, it fiddles with the stack and jmps to
> > memcpy. Is this a known issue, and is it being worked on, or have I just
> > run across something new to worry about?
> >
> > (This is on Interix (x86, obviously from the code below) with a gcc
> > that's less than a week old. I have no idea how long it might
> > actually have been this way. I doubt the problem is actually unique
> > to the x86 as this is a very general optimization.)
> >
> > Donn
>
> Tail-recursion isn't a new optimization, but I have almost no (only the
> vaguest) recollection of ever having run up against it before. Could be
> there's a change with the way GCC is implementing it. Could be we never
> handled it before.
>
> This sounds like a good argument for parsing the epilogue... ;-(
>
> Michael
>
> >
> > Here's the code:
> >
> > 0x466e37 <regcache_raw_read+151>: mov 0x1c(%eax),%ecx
> > 0x466e3a <regcache_raw_read+154>: mov 0x18(%eax),%eax
> > 0x466e3d <regcache_raw_read+157>: mov (%eax,%esi,4),%edx
> > 0x466e40 <regcache_raw_read+160>: mov 0x4(%ebx),%eax
> > 0x466e43 <regcache_raw_read+163>: add %eax,%edx
> > 0x466e45 <regcache_raw_read+165>: mov (%ecx,%esi,4),%eax
> > 0x466e48 <regcache_raw_read+168>: mov %eax,0x10(%ebp)
> > 0x466e4b <regcache_raw_read+171>: mov %edx,0xc(%ebp)
> > 0x466e4e <regcache_raw_read+174>: mov %edi,0x8(%ebp)
> > 0x466e51 <regcache_raw_read+177>: lea 0xfffffff4(%ebp),%esp
> > 0x466e54 <regcache_raw_read+180>: pop %ebx
> > 0x466e55 <regcache_raw_read+181>: pop %esi
> > 0x466e56 <regcache_raw_read+182>: pop %edi
> > 0x466e57 <regcache_raw_read+183>: pop %ebp
> > 0x466e58 <regcache_raw_read+184>: jmp 0x77d91e60 <memcpy>
> > 0x466e5d <regcache_raw_read+189>: lea 0x0(%esi),%esi
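
The epilogue above pops the saved registers and then jumps to memcpy, so when memcpy is entered the word at the top of the stack is the return address into regcache_raw_read's caller, not into regcache_raw_read itself. With an ordinary call it would point just past the call instruction inside regcache_raw_read. A debugger could in principle use that difference to tell a tail jump from a call after a single step. The following is only a sketch of that idea and makes no claim about how GDB does or should handle it: struct func_range, enum step_kind and classify_step are hypothetical names, and the caller is assumed to supply the function bounds and the word read from the top of the stack.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one function's address range. */
struct func_range {
    uint32_t start; /* address of the first instruction */
    uint32_t end;   /* one past the last byte of the function */
};

enum step_kind {
    STEP_SAME_FUNCTION, /* still inside the function we were stepping */
    STEP_CALL,          /* entered a callee through an ordinary call */
    STEP_TAIL_JUMP,     /* entered a callee through a jump; our frame is gone */
    STEP_OTHER          /* anything else, e.g. a return into the caller */
};

static bool in_range(const struct func_range *f, uint32_t addr)
{
    return addr >= f->start && addr < f->end;
}

/* Classify where a single step landed.
   from        - range of the function being stepped
   new_pc      - program counter after the step
   new_entry   - entry address of the function containing new_pc
   return_addr - word at the top of the stack after the step
   A function that tail-calls itself is reported as STEP_SAME_FUNCTION;
   this sketch does not try to detect that case. */
enum step_kind classify_step(const struct func_range *from,
                             uint32_t new_pc,
                             uint32_t new_entry,
                             uint32_t return_addr)
{
    if (in_range(from, new_pc))
        return STEP_SAME_FUNCTION;
    if (new_pc != new_entry)
        return STEP_OTHER;
    if (in_range(from, return_addr))
        return STEP_CALL;
    return STEP_TAIL_JUMP;
}
```

Using the addresses from the listing, regcache_raw_read starts at 0x466da0 (0x466e37 minus 151) and memcpy is at 0x77d91e60. Taking 0x466e60 as an assumed end of the function and 0x401234 as a made-up return address in some caller, classify_step reports STEP_TAIL_JUMP for the jmp at regcache_raw_read+184, whereas a return address of 0x466e5d would be reported as STEP_CALL. Once a tail jump is recognized, the place to stop for "n" is that return address in the caller, since the stepped function no longer has a frame to return to.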