This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Error removing module: Device or resource busy


Hi,

Thanks for looking at this...

On Fri, Dec 23, 2011 at 12:11:53PM -0800, Josh Stone wrote:
> On 12/22/2011 09:00 PM, Chris Dunlop wrote:
>> Linux-3.0.13, systemtap 1.6 and now HEAD(b6a3da4)
> 
> Your dump looks like x86_64 - is this on any particular distro?  Are you
> building this kernel yourself?

Yes, x86_64, on debian wheezy/sid, self-built kernel and systemtap.

Note: just updated to linux-3.0.14.

>> Whenever I run stap[2], the output ends with:
>> 
>>   Error removing module 'stap_fcac0085842418e34d8094455dc203e8_1_21605': Device or resource busy.
>> 
>> (obviously the module name changes) and the module is still loaded:
>> 
>>   # lsmod | grep stap
>>   stap_fcac0085842418e34d8094455dc203e8_1_21605  2896285 1027571582 [permanent]
> 
> This is odd.  Reading kernel/module.c, the "[permanent]" should only
> come about if the module has no ->exit callback.  And 1027571582 is the
> field for the module refcount, which doesn't look plausible.

That refcount is consistently odd, e.g. after a few runs:

# lsmod | grep stap
stap_b3ee5e5f7f7df4f11fdf95b215e43f6_7050    26018  4294967295 [permanent]
stap_b3ee5e5f7f7df4f11fdf95b215e43f6_6574    26018  4294967295 [permanent]
stap_6bbd9bcdc91b5b9122793a314d03458_5786    26018  4294967295 [permanent]
stap_8969cc5adcc470f954f5b37c4134b9a_5609    26034  4294967295 [permanent]

...actually that's equal to 0xFFFFFFFF, i.e. -1 as a signed 32-bit value.
But the previously seen 1027571582 is 0x3D3F7F7E, which doesn't mean
anything obvious to me.
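
For anyone wanting to repeat the arithmetic, here is a minimal Python sketch
that reinterprets the refcount column as a signed 32-bit integer. The helper
name as_signed32 is made up for illustration, and it assumes the counter is
32 bits wide, which the 0xFFFFFFFF value suggests but does not prove.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a two's-complement signed int.

    Assumption: the refcount shown by lsmod is a 32-bit quantity.
    """
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# The two refcounts seen in the lsmod output above.
for refcount in (4294967295, 1027571582):
    print(f"{refcount} = {refcount:#010x} -> signed {as_signed32(refcount)}")
```

The first value comes out as -1, consistent with a counter decremented once
too often; the second stays a large positive number with no obvious pattern.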

> You must have CONFIG_MODULE_UNLOAD=y, or else the kernel just prints a
> dummy " - -" in place of the refcount.  Though I'm not certain how lsmod
> translates that, so you might check /proc/modules directly.

# grep stap /proc/modules
stap_b3ee5e5f7f7df4f11fdf95b215e43f6_7050 26018 4294967295 [permanent], Live 0xffffffffa037e000
stap_b3ee5e5f7f7df4f11fdf95b215e43f6_6574 26018 4294967295 [permanent], Live 0xffffffffa0372000
stap_6bbd9bcdc91b5b9122793a314d03458_5786 26018 4294967295 [permanent], Live 0xffffffffa0366000
stap_8969cc5adcc470f954f5b37c4134b9a_5609 26034 4294967295 [permanent], Live 0xffffffffa035a000
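
To check many modules at once, the /proc/modules lines can be picked apart
with a short script. This is only a sketch: the function name
parse_proc_modules_line is hypothetical, and the field layout (name, size,
refcount, users/flags, state, address) is assumed from the output above
rather than taken from kernel documentation.

```python
def parse_proc_modules_line(line):
    """Split one /proc/modules line into its leading fields.

    Assumed layout: name, size, refcount, users/flags, state, address.
    """
    fields = line.split()
    name, size, refcount, users = fields[0], int(fields[1]), fields[2], fields[3]
    return {
        "name": name,
        "size": size,
        # Without CONFIG_MODULE_UNLOAD the kernel prints "-" in this column.
        "refcount": None if refcount == "-" else int(refcount),
        # The flag appears with a trailing comma, e.g. "[permanent],".
        "permanent": "[permanent]" in users,
    }


sample = ("stap_b3ee5e5f7f7df4f11fdf95b215e43f6_7050 26018 4294967295 "
          "[permanent], Live 0xffffffffa037e000")
info = parse_proc_modules_line(sample)
print(info["name"], info["refcount"], info["permanent"])
```

On a live system the same function could be applied to each line read from
/proc/modules to list every module marked permanent.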

> But it seems like your stap modules are being built without
> this machinery, so corruption ensues.

By "this machinery", do you mean the ->exit callback? If so, is this the
callback in question?

# grep module_exit /root/.systemtap/cache/6b/stap_6bbd9bcdc91b5b9122793a314d034588_795.c
static void systemtap_module_exit (void) {

> Are you sure that /lib/modules/`uname -r`/build matches the running kernel?

Yup, and, as above, I've just updated to 3.0.14 to be sure:

# uname -a
Linux b5 3.0.14-otn-00018-g2c7c13d #1 SMP Mon Dec 26 07:11:13 EST 2011 x86_64 GNU/Linux
# ls -l /lib/modules/`uname -r`/build
lrwxrwxrwx 1 root root 53 2011-12-26 07:22 /lib/modules/3.0.14-otn-00018-g2c7c13d/build -> /home/chris/git/linux-build/3.0.14-otn-00018-g2c7c13d

> If you can boot without any of the zfs stuff, then I'd experiment with
> that first, just to make sure stap is lined up with all the correct
> kernel build infrastructure.  Try something simple, like one of the
> syscall examples.

Without any of the zfs modules loaded...

# stap -v -e 'probe begin {printf("foo\n"); exit()}'
Pass 1: parsed user script and 78 library script(s) using 77348virt/21892res/2620shr kb, in 110usr/20sys/170real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) using 77876virt/22684res/2852shr kb, in 0usr/0sys/3real ms.
Pass 3: translated to C into "/tmp/stapSBQNZp/stap_6bbd9bcdc91b5b9122793a314d034588_795_src.c" using 77876virt/22768res/2924shr kb, in 0usr/0sys/1real ms.
Pass 4: compiled C into "stap_6bbd9bcdc91b5b9122793a314d034588_795.ko" in 1120usr/160sys/1889real ms.
Pass 5: starting run.
foo
Error removing module 'stap_6bbd9bcdc91b5b9122793a314d03458_5786': Device or resource busy.
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run completed in 20usr/0sys/423real ms.
Pass 5: run failed.  Try again with another '--vp 00001' option.

Cheers,

Chris.

