This is the mail archive of the gdb@sourceware.org mailing list for the GDB project.



GDB loops forever until it crashes when it runs out of memory


Hi

I had to debug an embedded target (ARMv7, extended-remote) running an RTOS with roughly 30 threads. Due to a programming error, the stack of one of the threads was completely corrupted, and a 'bt' on that thread terminated GDB after a few seconds with the following error:

---
Recursive internal problem.

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
---

The crash occurred because GDB looped forever in 'int value_fetch_lazy (struct value *val)', within 'while (VALUE_LVAL (new_val) == lval_register && value_lazy (new_val))'. It eventually ran out of memory because one of the called functions allocates heap memory on every call.

The issue was that, due to the corrupted thread stack, 'new_val = get_frame_register_value (frame, regnum);' returned the same data on every call (new_val was a different allocation each time, but contained the same data as the previously returned value). As you know, if you have to debug a crashed system, you're happy to see anything at all rather than a crashing debugger. The point is that with a graphical debugger such as Eclipse/CDT you can't really avoid the 'bt' on that thread, so the only thing you notice is that GDB crashes without giving you a chance to investigate anything.

I don't know about the philosophy of GDB and whether it is supposed to handle such a situation. However, for me, the following additional code avoided the GDB crash, which gave me a chance to inspect the rest of the system with Eclipse/CDT:

      new_val = get_frame_register_value (frame, regnum);
      /* Stop if the lookup points back at the same register in the same
         frame; otherwise the enclosing loop never makes progress.  */
      if (regnum == VALUE_REGNUM (new_val)
          && frame == frame_find_by_id (VALUE_FRAME_ID (new_val)))
        {
          set_value_lazy (val, 0);
          mark_value_bytes_unavailable (val,
                                        value_embedded_offset (val),
                                        TYPE_LENGTH (type));
          return 0;
        }

As I'm unfamiliar with GDB internals, I don't know whether I compared the right properties of new_val, or whether the implementation is "ok" like this. But at least this code made GDB properly abort the stack unwind once it had received the same information twice.
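
To illustrate the idea outside of GDB, here is a minimal standalone sketch of the same check. It is not GDB code: toy_value, toy_get_frame_register_value, toy_fetch_lazy and the saved_in table are hypothetical names made up for this example, and frames are plain integers instead of frame IDs. The sketch assumes that a lazy register value only records "read this register from that frame", which is how I understand the loop above.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a lazy register value.  Hypothetical, not GDB's
   struct value.  */
struct toy_value
{
  int frame;    /* frame level the register has to be read from */
  int regnum;   /* register number to read there */
  bool lazy;    /* true while the contents still have to be fetched */
};

/* Models get_frame_register_value.  saved_in[f] is the frame level that
   holds the register for frame f, or -1 if it can be read directly.
   Every call returns a fresh allocation, like the real function.  */
static struct toy_value *
toy_get_frame_register_value (const int *saved_in, int frame, int regnum)
{
  struct toy_value *v = malloc (sizeof *v);

  if (v == NULL)
    abort ();
  v->regnum = regnum;
  v->lazy = saved_in[frame] >= 0;
  v->frame = v->lazy ? saved_in[frame] : frame;
  return v;
}

/* Follows the chain of lazy values starting at (frame, regnum).
   Returns the number of lookups needed to reach a readable value,
   -1 if use_guard is set and a lookup points back at the same register
   in the same frame, or max_steps if that limit is hit first.  */
static int
toy_fetch_lazy (const int *saved_in, int frame, int regnum,
                bool use_guard, int max_steps)
{
  struct toy_value *new_val
    = toy_get_frame_register_value (saved_in, frame, regnum);
  int steps = 1;

  while (new_val->lazy)
    {
      /* Same comparison as in the patch: no progress was made.  */
      if (use_guard && new_val->regnum == regnum && new_val->frame == frame)
        {
          free (new_val);
          return -1;
        }
      /* Safety net so that the unguarded variant terminates here.  */
      if (steps >= max_steps)
        {
          free (new_val);
          return steps;
        }
      frame = new_val->frame;
      regnum = new_val->regnum;
      free (new_val);
      new_val = toy_get_frame_register_value (saved_in, frame, regnum);
      steps++;
    }

  free (new_val);
  return steps;
}
```

With a healthy table { -1, 0, 1 }, starting at frame 2 takes three lookups. With { -1, 0, 2 }, where frame 2 refers back to itself, the guarded call returns -1 after the first lookup, while the unguarded call only stops because of max_steps. The real loop has no such limit, which matches the endless repetition in the trace below.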

---

Last lines of the GDB trace ('set debug frame 1') without the patch, after executing a 'bt' on the corrupted thread:

{ frame_id_p (l={stack=0x316800,code=0x1c15c0,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316800,code=0x1c15c0,!special},r={stack=0x316800,code=0x1c15c0,!special}) -> 1 } { frame_unwind_register_value (frame=1,regnum=98(s7),...) { frame_unwind_register_value (frame=0,regnum=98(s7),...) { frame_id_p (l={stack=0x3167d0,code=0x199850,!special}) -> 1 }
-> register=98 lazy }
-> register=98 lazy }
{ frame_id_p (l={stack=0x3167d0,code=0x199850,!special}) -> 1 }
{ frame_id_eq (l={stack=0x3167d0,code=0x199850,!special},r={stack=0x3167d0,code=0x199850,!special}) -> 1 } { frame_unwind_register_value (frame=-1,regnum=98(s7),...) -> register=98 bytes=[00000000] }
{ frame_id_p (l={stack=0x316938,code=0x20a18,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x20a18,!special},r={stack=0x316938,code=0x20a18,!special}) -> 1 } { value_fetch_lazy (frame=8,regnum=98(s7),...) -> register=98 bytes=[00000000] } #9 0x00050f2c in CINOSBaseRamp::Pull (this=0x10da0 <CHcsController::Register_AllAutoComt(unsigned short)+36>, arS=16.000000000000057, arV=0, arA=0, arJ=0) at ../../inos/os/inos/src/cinosbaseramp.cpp:4844 { get_prev_frame_1 (this_frame=9) -> {level=10,type=<unknown>,unwind=<unknown>,pc=0x50f2c,id=<unknown>,func=<unknown>} // cached
{ get_frame_func (this_frame=10) -> 0x50f28 }
{ frame_unwind_register_value (frame=9,regnum=13(sp),...) -> computed bytes=[38693100] }
{ frame_unwind_arch (next_frame=10) -> arm }
{ frame_unwind_register_value (frame=10,regnum=15(pc),...) { frame_unwind_register_value (frame=10,regnum=14(lr),...) { get_frame_id (fi=10) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> {stack=0x316938,code=0x50f28,!special} }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=14 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } { frame_unwind_register_value (frame=9,regnum=14(lr),...) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=14 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } { frame_unwind_register_value (frame=9,regnum=14(lr),...) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=14 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } { frame_unwind_register_value (frame=9,regnum=14(lr),...) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=14 lazy }
... (this repeats endlessly)


With the patch, it ends like this:

-> register=97 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } { frame_unwind_register_value (frame=9,regnum=97(s6),...) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=97 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } { frame_unwind_register_value (frame=9,regnum=98(s7),...) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=98 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } { frame_unwind_register_value (frame=9,regnum=98(s7),...) { frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
-> register=98 lazy }
{ frame_id_p (l={stack=0x316938,code=0x50f28,!special}) -> 1 }
{ frame_id_eq (l={stack=0x316938,code=0x50f28,!special},r={stack=0x316938,code=0x50f28,!special}) -> 1 } #10 0x00050f2c in CINOSBaseRamp::Pull (this=<unavailable>, arS=<unavailable>, arV=<unavailable>, arA=<unavailable>, arJ=<unavailable>) at ../../inos/os/inos/src/cinosbaseramp.cpp:4844
{ get_prev_frame_1 (this_frame=10) -> <NULL frame> // cached
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

Raphael

