This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: futex and soft lockup


Thanks for the reply.
I understand that, in theory, user space should never be able to lock up the whole system, but futex is different in that it requires user space (for all practical purposes, glibc) to do the right thing too. If you search the history of futex from when it was first introduced, locking up the system wasn't unheard of. However, now that it has been in use for a while, I didn't expect to see this.
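
To make the "user space has to do the right thing too" point concrete, here is a minimal sketch of a futex-backed lock in C. It is not glibc's implementation. The names demo_lock, futex_op, demo_lock_acquire and demo_lock_release are made up for illustration, and the three-state scheme (0 = free, 1 = locked, 2 = locked with possible waiters) is an assumption chosen to keep the example small. The only kernel interface used is the futex system call with FUTEX_WAIT_PRIVATE and FUTEX_WAKE_PRIVATE.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock type for illustration only:
 * 0 = free, 1 = locked, 2 = locked and a waiter may be sleeping. */
typedef struct { atomic_int state; } demo_lock;

/* Thin wrapper around the raw futex system call (no timeout, no second word). */
static long futex_op(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

void demo_lock_acquire(demo_lock *l)
{
    int c = 0;

    /* Fast path: an uncontended acquire never enters the kernel. */
    if (atomic_compare_exchange_strong(&l->state, &c, 1))
        return;

    /* Slow path: mark the lock as contended before sleeping. */
    if (c != 2)
        c = atomic_exchange(&l->state, 2);

    while (c != 0) {
        /* The kernel puts us to sleep only if the word is still 2,
         * so a release that happened in between is not missed. */
        futex_op(&l->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&l->state, 2);
    }
}

void demo_lock_release(demo_lock *l)
{
    /* Enter the kernel only if some thread may be sleeping on the word. */
    if (atomic_exchange(&l->state, 0) == 2)
        futex_op(&l->state, FUTEX_WAKE_PRIVATE, 1);
}
```

The part user space is responsible for is the protocol around the shared word: setting it to 2 before waiting, re-checking it after every wakeup, and issuing the wake on release. The kernel only compares the word against the expected value and sleeps or wakes accordingly.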

-gong




----- Original Message ----
From: Américo Wang <xiyou.wangcong@gmail.com>
To: Gong Cheng <chengg11@yahoo.com>
Cc: libc-help@sourceware.org
Sent: Tuesday, October 20, 2009 2:35:16 AM
Subject: Re: futex and soft lockup

On Tue, Oct 20, 2009 at 4:55 AM, Gong Cheng <chengg11@yahoo.com> wrote:
> Hi,
>    I am running glibc-2.5-34.x86_64.rpm (for CentOS) on top of a 2.6.31 (tried 2.6.30 too) kernel, and I am consistently seeing system soft lockups like the following:
>
> BUG: soft lockup - CPU#0 stuck for 61s! [<my program>:3068]
> <snip>
> Call Trace:
>  [<ffffffff8130e8d6>] ? _spin_lock+0x16/0x40
>  [<ffffffff8105fe85>] ? futex_wait_setup+0x75/0x100
>  [<ffffffff81060109>] ? futex_wait+0xf9/0x270
>  [<ffffffff8108c80b>] ? zone_statistics+0x5b/0x90
>  [<ffffffff810619fb>] ? do_futex+0xbb/0xcb0
>  [<ffffffff81082f98>] ? ____pagevec_lru_add+0x138/0x150
>  [<ffffffff810317ac>] ? update_curr+0x6c/0xc0
>  [<ffffffff810831b1>] ? __lru_cache_add+0x71/0xb0
>  [<ffffffff81083204>] ? lru_cache_add_lru+0x14/0x30
>  [<ffffffff8130eda1>] ? _spin_unlock+0x11/0x40
>  [<ffffffff8108f0de>] ? do_wp_page+0x28e/0x7b0
>  [<ffffffff81090e3a>] ? handle_mm_fault+0x59a/0x7c0
>  [<ffffffff8130ea12>] ? _spin_lock_irqsave+0x22/0x50
>  [<ffffffff8130ee63>] ? _spin_unlock_irqrestore+0x13/0x40
>  [<ffffffff81062680>] ? sys_futex+0x90/0x150
>  [<ffffffff81029417>] ? do_page_fault+0x187/0x2d0
>  [<ffffffff8100bceb>] ? system_call_fastpath+0x16/0x1b
>
> Previously, when running glibc-2.5.18, I didn't have this problem. In fact, if I switch back to 2.5.18 while keeping everything else the same, the problem immediately stops.
>
> My program uses pthread and futex extensively. If I run the program in single-threaded mode, then I don't have the issue.
>
> I am aware I am not providing a lot of information here, but just want to quickly check if this issue is known to anyone here?
> Also in general, is it a bad idea to combine 2.5-34 glibc with the latest kernel?
>
> I'd appreciate any tips on this issue!

This looks more like a kernel problem. :)

The kernel is not supposed to hit a 'soft lockup' no matter how you use futex in
user space. Would you mind trying the latest git kernel with glibc-2.5-34?

Thanks.


