This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Re: systemtap 2.2.1 installcheck => kernel BUG at .. kprobes.c:707


Hi,

I've found the root cause of this bug at last.

In the normal path, an optprobe on a module's init function is
unregistered when the module goes live:

unregister_kprobe(kp)
 -> __unregister_kprobe_top
   ->__disable_kprobe
     ->disarm_kprobe(ap == op)
       ->__disarm_kprobe
        ->unoptimize_kprobe : the op is queued
                              on unoptimizing_list
After that, __unregister_kprobe_bottom does nothing.

After a while (usually a delay of 5 jiffies), kprobe_optimizer runs
to unoptimize and free the optprobe:

kprobe_optimizer
 ->do_unoptimize_kprobes
   ->arch_unoptimize_kprobes : moved to free_list
 ->do_free_cleaned_kprobes
   ->hlist_del: the op is removed
   ->free_aggr_kprobe
     ->arch_remove_optimized_kprobe
     ->arch_remove_kprobe
     ->kfree: the op is freed

However, if kprobes_module_callback is called and picks up the probe
whose unoptimization is still pending BEFORE kprobe_optimizer runs,
the following happens:

kprobes_module_callback
 ->kill_kprobe
   ->kill_optimized_kprobe : dequeued from unoptimizing_list <=!!!
     ->arch_remove_optimized_kprobe
   ->arch_remove_kprobe
   (but the op is not freed, and is still on the kprobe hash table)
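
To make the two orderings above concrete, here is a toy userspace model
in C. It is only a sketch and not kernel code: struct toy_optprobe, its
fields, and all toy_* functions are hypothetical names invented for
illustration, and each function only mirrors the bookkeeping steps named
in the call chains above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an optimized probe. Field names are illustrative only. */
struct toy_optprobe {
    bool on_hash_table;        /* reachable through the kprobe hash table */
    bool unregistered;         /* unregister_kprobe() has been called */
    bool on_unoptimizing_list; /* queued for delayed unoptimization */
    bool on_free_list;         /* waiting for do_free_cleaned_kprobes */
    bool arch_released;        /* arch-specific parts were removed */
    bool freed;                /* kfree() has happened */
};

/* Start with a registered probe that is only on the hash table. */
void toy_register(struct toy_optprobe *op)
{
    assert(op != NULL);
    *op = (struct toy_optprobe){ .on_hash_table = true };
}

/* Models unregister_kprobe(): the op is only queued, nothing is freed. */
void toy_unregister(struct toy_optprobe *op)
{
    assert(op != NULL && op->on_hash_table);
    op->unregistered = true;
    op->on_unoptimizing_list = true;
}

/* Models kprobe_optimizer(): unoptimize queued ops, then free them. */
void toy_optimizer(struct toy_optprobe *op)
{
    assert(op != NULL);
    if (op->on_unoptimizing_list) {      /* do_unoptimize_kprobes */
        op->on_unoptimizing_list = false;
        op->on_free_list = true;
    }
    if (op->on_free_list) {              /* do_free_cleaned_kprobes */
        op->on_free_list = false;
        op->on_hash_table = false;       /* hlist_del */
        op->arch_released = true;
        op->freed = true;                /* kfree */
    }
}

/* Models kprobes_module_callback() -> kill_kprobe(). */
void toy_module_callback(struct toy_optprobe *op)
{
    assert(op != NULL);
    op->on_unoptimizing_list = false;    /* dequeued by kill_optimized_kprobe */
    op->arch_released = true;
    /* The op is neither freed nor removed from the hash table here. */
}

/* An unregistered op that is still hashed but no longer queued anywhere. */
bool toy_is_stray(const struct toy_optprobe *op)
{
    assert(op != NULL);
    return op->unregistered && op->on_hash_table &&
           !op->on_unoptimizing_list && !op->on_free_list;
}
```

In this model, toy_register, toy_unregister, toy_optimizer leaves the op
freed and off the hash table. Calling toy_module_callback between
toy_unregister and toy_optimizer leaves the optimizer with nothing to do,
so toy_is_stray returns true: the op stays on the hash table and is never
freed.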

This race does not happen if the probe is unregistered AFTER
kprobes_module_callback is called (because by then the op is
already gone), which is what kprobe-tracer does.

So ftrace never hits this bug, but systemtap does.

Of course, this is a real bug, and I'll fix it. The stap side can
also avoid it by unregistering probes after kprobes_module_callback
is called.

Thank you,

(2013/05/17 21:21), Frank Ch. Eigler wrote:
> 
> timo.lindfors@iki.fi wrote:
> 
>> [35567.567939] stap_ea692c11a3d17766a0d577ba42aeeaaa_14143: systemtap: 2.2.1/0.153, base: ffffffffa05b4000, memory: 24data/28text/12ctx/2058net/33alloc kb, probes: 13
>> [35567.567946] Warning: found a stray unused aggrprobe@ffffffffa01b6000
>> [35567.567963] ------------[ cut here ]------------
>> [35567.567967] kernel BUG at /build/buildd-linux_3.8.12-1-amd64-RaG_7r/linux-3.8.12/kernel/kprobes.c:707!
>> [35567.567972] invalid opcode: 0000 [#1] SMP 
>> [35567.567976] Modules linked in: stap_ea692c11a3d17766a0d577ba42aeeaaa_14143 systemtap_test_module1(O) systemtap_test_module2(O) zlib_deflate mtd binfmt_misc fuse nfsv4 nfsd auth_rpcgss nfs_acl nfs lockd dns_resolver fscache sunrpc loop evdev snd_pcm_oss snd_mixer_oss snd_pcm acpi_cpufreq snd_page_alloc snd_timer mperf snd processor soundcore thermal_sys pcspkr ext3 mbcache jbd virtio_rng rng_core virtio_net virtio_blk virtio_balloon virtio_pci virtio_ring virtio xen_netfront xen_blkfront [last unloaded: stap_4ed9fdb2cd6708e2efa0c51aeb4b64f_13962]
>> [35567.568137] Call Trace:
>> [35567.568142]  [<ffffffffa01b6000>] ? 0xffffffffa01b5fff
>> [35567.568148]  [<ffffffffa05b93a3>] ? _stp_ctl_write_cmd+0x3d8/0x7f9 [stap_ea692c11a3d17766a0d577ba42aeeaaa_14143]
>> [...]
>> [35567.568185] Code: 75 12 48 8b 75 28 48 c7 c7 76 17 4f 81 31 c0 e8 ab 5f ff ff 83 65 78 fd 48 81 7d 40 39 69 38 81 75 09 83 bd a0 00 00 00 00 75 02 <0f> 0b 48 89 ef e8 48 f6 ff ff f6 45 78 01 74 27 48 89 ef e8 e6 
>> [35567.568233] RIP  [<ffffffff81387fb6>] register_kprobe+0x1c8/0x418
>> [35567.568238]  RSP <ffff88003d791e48>
>> [35567.568261] ---[ end trace 66e9937400424719 ]---
> 
> 
> This seems like a recurrence of an old kernel bug related to
> optimized-kprobes, see <http://sourceware.org/bugzilla/show_bug.cgi?id=13193>.
> It could also be something related to ftrace/kprobes perhaps.
> 
> 
> - FChE
> 


-- 
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com


