This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



Re: Segmentation fault / Illegal instruction - udevd - ntpd


Hi all,

The problem:
I'm running a continuous reboot test on a MIPS-based board running
linux-2.6.18/glibc-2.3.5/gcc-3.3.6. After several hours of rebooting (say
after 80 reboots, though the count is purely random) I've observed
Segmentation fault or Illegal instruction errors while starting the udevd
and ntpd programs during startup. The error appears pretty much at random;
it doesn't usually take more than an hour or two to catch an instance of the
segfault.

Steps I have tried to resolve the problem:
1. Used the environment variable LD_DEBUG=all for debugging ntpd and udevd.
After a few hours of reboot testing I could observe a segmentation fault.
The last line printed on the console at the time of the segmentation fault
was "calling init". From the log file, the segmentation fault occurs after
the "init" function call of libnss_dns.so (in the case of ntpd) and
libnss_compat.so/libnss_files.so (in the case of udevd). But the function
"init" isn't present in either library; I think a default "init" would be
used if none is present.


2. Added printf statements in the call_init function in
glibc-2.3.5/elf/dl-init.c. Since putting those printf statements into the
call_init function I have not seen a single segfault or illegal instruction
in over 2400 reboots!


3. Ran the applications udevd and ntpd under gdb during startup. Couldn't
observe a segfault/illegal instruction error over long hours of rebooting.


4. Registered a signal handler in ntpd and udevd using sigaction to get a
backtrace at the time of the segmentation fault, but got only a single
address (NOT a full stack trace). After that, tried to map this address to
the application using the 'nm' utility, but couldn't locate it.


5. Generated a core dump at the time of the segfault. Analyzed the core dump
using gdb and observed that the segfault occurs in the init function of
libnss_compat.so (in the case of udevd) and libnss_dns.so (in the case of
ntpd).

6. As per my understanding, the default init that would be used is located
in sysdeps/generic/initfini.c. Compared this file between glibc versions
2.3.5 and 2.5, but found no differences.


7. Compiled ntpd and udevd statically and performed the reboot test. A
segfault was still observed after 34 reboots.


8. Added dummy _init/_fini functions to both NSS libraries (libnss_dns.so
and libnss_compat.so) to override the default init functions. A segfault
was still observed after 207 reboots.


9. Added constructor/destructor routines to libnss_dns.so and
libnss_compat.so. A segmentation fault still occurred after several hours of
rebooting.


10. Used the LD_BIND_NOW option (enabled just before starting ntpd/udevd);
with this option all symbols are resolved at load time. Even with it
enabled, I didn't observe any difference, i.e. a segfault was observed after
265 reboots.

11. Changed the order of invocation of the programs udevd and ntpd.
Previously, in the init scripts, udevd was started first and ntpd was
started somewhere near the last stage of the init scripts. Now ntpd is
started immediately after udevd. Surprisingly, the segfaults became more
frequent, i.e. previously it would take nearly 100 reboots to observe a
segfault, but now it takes nearly 10 reboots to observe one!
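
A handler of the kind described in step 4 can be sketched as follows. This
is only a minimal illustration under assumptions, not the code that was run
on the board: the names crash_handler and install_crash_handler are
hypothetical, and it relies on glibc's backtrace() from <execinfo.h>. How
many frames backtrace() recovers depends on the architecture and on how the
code was compiled.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Hypothetical handler: report the signal number and the faulting address,
   then dump whatever frames backtrace() is able to recover. */
static void crash_handler(int sig, siginfo_t *info, void *ucontext)
{
    void *frames[MAX_FRAMES];
    char msg[96];
    int len;
    int nframes;
    ssize_t rc;

    (void)ucontext;

    /* snprintf() is not formally async-signal-safe; this is tolerated here
       only because the process is about to die anyway. */
    len = snprintf(msg, sizeof msg, "caught signal %d, fault address %p\n",
                   sig, info->si_addr);
    if (len > 0) {
        size_t n = (size_t)len < sizeof msg ? (size_t)len : sizeof msg - 1;
        rc = write(STDERR_FILENO, msg, n);
        (void)rc;
    }

    nframes = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, nframes, STDERR_FILENO);

    /* SA_RESETHAND has already restored the default action, so returning
       re-executes the faulting instruction and the process then terminates
       with the default disposition (and a core dump, if enabled). */
}

/* Install the handler for SIGSEGV and SIGILL. Returns 0 on success, -1 on
   failure. */
int install_crash_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = crash_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;

    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGILL, &sa, NULL) != 0)
        return -1;
    return 0;
}
```

Calling install_crash_handler() early in main() is enough to arm it. The
address printed from si_addr is the faulting address, which for a bad data
access is not necessarily a code address that 'nm' could map to a symbol.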

Hope this gives you enough input to comment on the problem. Waiting for your
valuable replies.

Pradeepkumar S 




Pradeepkumar S wrote:
> 
> Hi all,
> 
> I have been doing continuous reboot test on a MIPS based board running
> linux-2.6.18/glibc-2.3.5. After several hours of rebooting (say after 80
> reboots), I've observed Segmentation fault or Illegal instruction errors
> while starting the udevd and ntpd programs during startup. I've tried
> debugging techniques such as gdb, core dump, signal handler, printf'ing
> etc. When I tried debugging with printf and gdb there is no
> segfault/illegal instruction error. Upon observing the core dump, I found
> that the error was from the _init() function of libnss_compat-2.3.5.so (in
> the case of udevd) and libnss_dns-2.3.5.so (in the case of ntpd). Seems
> that there isn't any function _init in both the libraries. As per my
> understanding a default init routine will be called and is located in
> sysdeps/generic/initfini.c. I have made a workaround by adding a dummy
> _init/_fini functions in nss_compat and nss_dns libraries. After that I
> haven't observed any Segmentation fault/Illegal instruction errors during
> reboot test for over 800 times. Since I knew overriding _init/_fini is
> dangerous, I have added constructor/destructor functions using gcc
> function attributes replacing the dummy _init/_fini function. This time
> again the Segmentation fault/Illegal instruction error starts appearing.
> Could anyone please help me on this?
> 
> Any help will be appreciated,
> Pradeep
> 
> 
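
The constructor/destructor approach mentioned in the quoted message, using
GCC function attributes, can be sketched roughly as below. It is an
illustration only, not the code that was added to the NSS libraries: the
names nss_debug_ctor, nss_debug_dtor and nss_debug_ctor_calls are
hypothetical. The point is that the routines are registered with the
startup code without replacing _init/_fini themselves.

```c
#include <stdio.h>

/* Counts how often the constructor ran (hypothetical debugging aid). */
static int ctor_calls;

/* Runs when the containing object is loaded: before main() for objects
   linked into the program, or before dlopen() returns for a library
   loaded at run time. */
__attribute__((constructor))
static void nss_debug_ctor(void)
{
    ctor_calls++;
    fputs("nss_debug_ctor: object loaded\n", stderr);
}

/* Runs when the containing object is unloaded or the process exits
   normally. */
__attribute__((destructor))
static void nss_debug_dtor(void)
{
    fputs("nss_debug_dtor: object unloading\n", stderr);
}

/* Accessor so that other code can check whether the constructor ran. */
int nss_debug_ctor_calls(void)
{
    return ctor_calls;
}
```

Built into a shared library, the message from nss_debug_ctor appears as soon
as the library is loaded, which gives a marker to compare against the
"calling init" line in the LD_DEBUG output.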

-- 
View this message in context: http://www.nabble.com/Segmentation-fault---Illegal-instruction---udevd---ntpd-tf3982796.html#a11429892
Sent from the Sourceware - glibc-bugs mailing list archive at Nabble.com.

