This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Re: [PATCH] Increase timeout on gdb.base/exitsignal.exp


On 08/25/2015 06:42 AM, Sergio Durigan Junior wrote:
> I have noticed that BuildBot is showing random failures of
> gdb.base/exitsignal.exp, specifically when testing on the
> Fedora-ppc64be-native-gdbserver-m64 builder.  Since I wrote this test
> a while ago, I decided to investigate this further.
> 
> This is what you see when you examine gdb.log:
> 
>   Breakpoint 1, main (argc=1, argv=0x3fffffffe3c8) at ../../../binutils-gdb/gdb/testsuite/gdb.base/segv.c:26
>   26	     raise (SIGSEGV);
>   (gdb) print $_exitsignal
>   $1 = void
>   (gdb) PASS: gdb.base/exitsignal.exp: $_exitsignal is void before running
>   print $_exitcode
>   $2 = void
>   (gdb) PASS: gdb.base/exitsignal.exp: $_exitcode is void before running
>   continue
>   Continuing.
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   0x00003fffb7cbf808 in .raise () from target:/lib64/libc.so.6
>   (gdb) PASS: gdb.base/exitsignal.exp: trigger SIGSEGV
>   continue
>   Continuing.
>   FAIL: gdb.base/exitsignal.exp: program terminated with SIGSEGV (timeout)
>   print $_exitsignal
>   FAIL: gdb.base/exitsignal.exp: $_exitsignal is 11 (SIGSEGV) after SIGSEGV. (timeout)
>   print $_exitcode
>   FAIL: gdb.base/exitsignal.exp: $_exitcode is still void after SIGSEGV (timeout)
>   kill
> 
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   The program no longer exists.
>   (gdb) print $_exitsignal
>   $3 = 11
>   (gdb) print $_exitcode
>   $4 = void
> 
> Clearly a timeout issue: one can see that even though the tests failed
> because the program was still running, both 'print' commands actually
> succeeded later.
>

I recently bumped timeouts for a few reverse/record tests, but in that
case it's justified: recording requires single-stepping all
instructions, so it naturally takes a while.  In this case, I don't see
what could reasonably be causing the delay.  It shouldn't ever take 60
seconds just to deliver a signal and have the kernel report the process
exit back.  What could cause this delay?  I'm not sure whether the
process's signalled exit status is reported to the parent before or after
the kernel finishes writing the core dump.  It occurred to me that if
it's after, then writing a big core dump could explain a delay.  So I
would suggest switching to a signal that doesn't cause a core dump by
default, e.g., SIGKILL or SIGTERM.  Though in this case the core dump
generated should be small, so I'm mystified.  Increasing the timeout
could be papering over some latent problem...

> I could not reproduce this timeout here, but I decided to propose this
> timeout increase anyway.  I have chosen to increase it by a factor of
> 10; that should give GDB/gdbserver plenty of time to reach the SEGV
> point.
> 
> For clarity, I am also attaching the output of 'git diff -w' here; it
> makes things much easier to visualize.
> 
> OK to apply?


>  gdb_continue_to_end
>  
> -# Checking $_exitcode.  It should be 0.
> -gdb_test "print \$_exitcode" " = 0" \
> -    "\$_exitcode is zero after normal inferior is executed"
> +with_timeout_factor 10 {
> +    # Checking $_exitcode.  It should be 0.
> +    gdb_test "print \$_exitcode" " = 0" \
> +	"\$_exitcode is zero after normal inferior is executed"
>  
> -# Checking $_exitsignal.  It should still be void, since the inferior
> -# has not received any signal.
> -gdb_test "print \$_exitsignal" " = void" \
> -    "\$_exitsignal is still void after normal inferior is executed"
> +    # Checking $_exitsignal.  It should still be void, since the inferior
> +    # has not received any signal.
> +    gdb_test "print \$_exitsignal" " = void" \
> +	"\$_exitsignal is still void after normal inferior is executed"
> +}
> 

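This change (repeated in many places in the patch) doesn't make sense to
me, and I don't think it would fix anything.  The 'print' commands only
timed out because GDB was still busy with the preceding 'continue'.  It
seems to me the bumped timeout, if any, should be around the continue
that caused the first timeout: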

# Continue until the end.
gdb_test "continue" "Program terminated with signal SIGSEGV.*" \
    "program terminated with SIGSEGV"

Thanks,
Pedro Alves

