This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [RFC -mm][PATCH 5/6] prepare kprobes code for x86 unification
- From: Masami Hiramatsu <mhiramat at redhat dot com>
- To: Srinivasa Ds <srinivasa at in dot ibm dot com>
- Cc: ananth at in dot ibm dot com, Jim Keniston <jkenisto at us dot ibm dot com>, Roland McGrath <roland at redhat dot com>, Arjan van de Ven <arjan at infradead dot org>, prasanna at in dot ibm dot com, anil dot s dot keshavamurthy at intel dot com, davem at davemloft dot net, systemtap-ml <systemtap at sources dot redhat dot com>
- Date: Tue, 11 Dec 2007 12:57:16 -0500
- Subject: Re: [RFC -mm][PATCH 5/6] prepare kprobes code for x86 unification
- References: <475DC362.9000707@redhat.com> <475E952F.90403@in.ibm.com>
Hi Srinivasa,
Thank you for reporting.
Srinivasa Ds wrote:
> Hi Masami
>
> I was testing your patch on x86_64 by executing systemtap tests. I got this oops message.
I ran systemtap testsuite on x86-64. But I could not reproduce it yet.
Would you apply all of these patches or just first 5 patches?
> Unable to handle kernel paging request at ffffffff8086ccb3 RIP:
> [<ffffffff804739c5>] arch_prepare_kprobe+0x22/0x217
> PGD 203067 PUD 207063 PMD 7e086163 PTE 86c000
> Oops: 0000 [1] SMP
> last sysfs file: /sys/module/stap_60ea9007c2ab1a78963339fffdc0a88e_356908/sections/.bss
> CPU 1
> Modules linked in: stap_60ea9007c2ab1a78963339fffdc0a88e_356908 systemtap_test_module1 systemtap_test_module2 ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_multipath video output sbs sbshc battery acpi_memhotplug ac power_supply lp sg tg3 button i2c_i801 ide_cd parport_pc shpchp parport serio_raw cdrom i2c_core e752x_edac floppy edac_core pcspkr dm_snapshot dm_zero dm_mirror dm_mod ata_piix libata aic79xx scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
> Pid: 3171, comm: stapio Tainted: GF 2.6.24-rc4-mm1 #2
> RIP: 0010:[<ffffffff804739c5>] [<ffffffff804739c5>] arch_prepare_kprobe+0x22/0x217
> RSP: 0018:ffff81003b6d7e48 EFLAGS: 00010282
> RAX: ffffffff8086ccb3 RBX: ffffffff884371d0 RCX: ffffffff88224f20
> RDX: 0000000000000f20 RSI: 6600000000000000 RDI: ffffffff884371d0
> RBP: ffffffff884371d0 R08: ffff810035daf000 R09: ffff81007f834000
> R10: ffffffff8024bf9c R11: 0000000000000000 R12: 00000000000036b0
> R13: 0000000000000000 R14: ffffffff8840e3b2 R15: 0000000000000000
> FS: 00002b2e14748b00(0000) GS:ffff81007fbac840(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: ffffffff8086ccb3 CR3: 000000007c5a3000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
> Process stapio (pid: 3171, threadinfo ffff81003b6d6000, task ffff8100187bc340)
> Stack: 0000000000000000 ffffffff80474c5c 0000000000000000 ffffffff884371d0
> 0000000000000000 00000000000036b0 00000000000001d2 ffffffff8840485b
> 00000000000036b0 00000000000a7fc4 ffff81003b6d7ee8 0000000000000008
> Call Trace:
> [<ffffffff80474c5c>] __register_kprobe+0x1f0/0x2e8
> [<ffffffff8840485b>] :stap_60ea9007c2ab1a78963339fffdc0a88e_356908:systemtap_module_init+0x202/0x45f
> [<ffffffff88404ac1>] :stap_60ea9007c2ab1a78963339fffdc0a88e_356908:probe_start+0x9/0x12
> [<ffffffff88404aeb>] :stap_60ea9007c2ab1a78963339fffdc0a88e_356908:_stp_handle_start+0x21/0x7c
> [<ffffffff88404bb8>] :stap_60ea9007c2ab1a78963339fffdc0a88e_356908:_stp_ctl_write_cmd+0x72/0xc3
> [<ffffffff80265748>] audit_syscall_entry+0x141/0x174
> [<ffffffff80296349>] vfs_write+0xc6/0x14f
> [<ffffffff8029689f>] sys_write+0x45/0x6e
> [<ffffffff8020c0dc>] tracesys+0xdc/0xe1
>
>
> Code: 48 8b 10 48 89 11 48 8b 40 08 48 89 41 08 48 8b 53 70 8a 02
> RIP [<ffffffff804739c5>] arch_prepare_kprobe+0x22/0x217
> RSP <ffff81003b6d7e48>
> CR2: ffffffff8086ccb3
>
>
> On debugging it further I found that, current.stp is causing oops(easily reproducible too)
> Current.stp probes all functions in kernel/sched.c and some module functions.
Unfortunately, I could not reproduce it by executing current.stp too.
>
> 743 arch_copy_kprobe():
> 744 /root/linux-2.6.24-rc4/arch/x86/kernel/kprobes.c:312
> 745 6bf: 48 8b 43 30 mov 0x30(%rbx),%rax
> 746 6c3: 48 8b 10 mov (%rax),%rdx <<<<<<<<IP is here>>>>>
> 747 6c6: 48 89 11 mov %rdx,(%rcx)
> 748 6c9: 48 8b 40 08 mov 0x8(%rax),%rax
> 749 6cd: 48 89 41 08 mov %rax,0x8(%rcx)
> ==============================================
> 310 static void __kprobes arch_copy_kprobe(struct kprobe *p)
> 311 {
> 312 memcpy(p->ainsn.insn, p->addr, MAX_INSN_SIZE * sizeof(kprobe_opcode_t)); <<IP>>
> 313 fix_riprel(p);
> 314 if (can_boost(p->addr)) {
>
> That means on accessing rax(p->addr), kernel has crashed.
> RAX=p->addr= ffffffff8086ccb3
>
> cat /proc/kallsyms | grep ffffffff8086ccb3
> ffffffff8086ccb3 t init_sched_debug_procfs
>
> Since "init_sched_debug_procfs" is a __init function and deallocation of memory
> of __init function may be causing the problem.I tried probing this function directly,
> but didn't see any oops.
Sure, I also tested it (stap -e 'probe kernel.function("init_sched_debug_procfs"){}')
but it could not cause oops.
By the way, as far as I can see, the current.stp does not probe "init_sched_debug_procfs".
So it could be caused by incorrect debuginfo...
Best Regards,
>
>
> Thanks
> Srinivasa DS
--
Masami Hiramatsu
Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division
e-mail: mhiramat@redhat.com, masami.hiramatsu.pt@hitachi.com