This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
[Bug runtime/10651] very rare BUG_ON kernel/timer.c:619 due to runtime/time.c
- From: "jistone at redhat dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sources dot redhat dot com
- Date: 18 Sep 2009 20:06:36 -0000
- Subject: [Bug runtime/10651] very rare BUG_ON kernel/timer.c:619 due to runtime/time.c
- References: <20090917022834.10651.fche@redhat.com>
- Reply-to: sourceware-bugzilla at sourceware dot org
------- Additional Comments From jistone at redhat dot com 2009-09-18 20:06 -------
(In reply to comment #0)
> Something is calling mod_timer with a timer->function==NULL.
This seems to indicate either memory corruption or that the memory was freed
and then cleared by its next owner. Either way the timing would have to be
particularly unlucky: the function pointer was valid enough to enter the
handler, yet invalid by the end of the handler.
> It appears as if the _stp_kill_time function is needlessly racy
> (amongst the stp_timer_reregister flag, which should probably be
> an atomic_t), and the del_timer_sync()'s. It wouldn't hurt to
> plop a synchronize_sched() in there too before the free_percpu
> goo.
Can you get a crash dump? I'd like to confirm that _stp_kill_time was actually
attempted, possibly by looking at the backtraces on other cpus and checking if
stp_time==NULL.
When del_timer_sync returns, it promises that the handler is not active and
that the timer is not queued. I think this actually makes the reregister flag
superfluous. Freeing the memory should then be perfectly safe, unless the
for_each_online_cpu loop somehow missed one of the timers...
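
To make that last caveat concrete, here is a minimal userspace model of it. It
is a sketch only, not SystemTap runtime or kernel code: every model_* name and
MODEL_NR_CPUS are hypothetical, and the idea that a CPU could be skipped while
its timer is still queued is an assumption for illustration, not something
confirmed for this bug. It compares a teardown loop over online CPUs with one
over all possible CPUs.

```c
/* Userspace model only: not SystemTap runtime or kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4  /* hypothetical CPU count for the model */

/* Stand-in for a per-cpu timer: "pending" means it is still queued. */
struct model_timer {
    bool pending;
};

struct model_system {
    bool online[MODEL_NR_CPUS];               /* CPUs counted as online */
    struct model_timer timer[MODEL_NR_CPUS];  /* one timer per possible CPU */
};

/* Arm one timer on every possible CPU and mark all CPUs online. */
static void model_init(struct model_system *sys)
{
    for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        sys->online[cpu] = true;
        sys->timer[cpu].pending = true;
    }
}

/* Models the promise discussed above: on return the timer is not queued.
 * The "handler not active" half is implicit, since this model is
 * single-threaded and no handler can be running concurrently. */
static void model_del_timer_sync(struct model_timer *t)
{
    assert(t != NULL);
    t->pending = false;
}

/* Teardown that visits only CPUs currently marked online. */
static void model_kill_online(struct model_system *sys)
{
    for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (sys->online[cpu])
            model_del_timer_sync(&sys->timer[cpu]);
}

/* Teardown that visits every possible CPU, online or not. */
static void model_kill_possible(struct model_system *sys)
{
    for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_del_timer_sync(&sys->timer[cpu]);
}

/* Number of timers still queued; freeing is only safe when this is 0. */
static size_t model_pending_count(const struct model_system *sys)
{
    size_t n = 0;
    for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (sys->timer[cpu].pending)
            n++;
    return n;
}
```

In this model, marking one CPU offline after model_init and then calling
model_kill_online leaves one timer queued, so freeing the per-cpu memory at
that point would leave a live timer pointing at freed storage.
model_kill_possible leaves none queued.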
--
http://sourceware.org/bugzilla/show_bug.cgi?id=10651