This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH] Relay CPU Hotplug support
- From: Oleg Nesterov <oleg at tv-sign dot ru>
- To: Andrew Morton <akpm at osdl dot org>
- Cc: ego at in dot ibm dot com, Mathieu Desnoyers <mathieu dot desnoyers at polymtl dot ca>, linux-kernel at vger dot kernel dot org, David Wilder <dwilder at us dot ibm dot com>, Tom Zanussi <zanussi at us dot ibm dot com>, Ingo Molnar <mingo at redhat dot com>, Greg Kroah-Hartman <gregkh at suse dot de>, Christoph Hellwig <hch at infradead dot org>, ltt-dev at shafik dot org, systemtap at sources dot redhat dot com, Douglas Niehaus <niehaus at eecs dot ku dot edu>, "Martin J. Bligh" <mbligh at mbligh dot org>, Thomas Gleixner <tglx at linutronix dot de>, kiran at scalex86 dot org, venkatesh dot pallipadi at intel dot com, dipankar at in dot ibm dot com, vatsa at in dot ibm dot com, torvalds at osdl dot org, davej at redhat dot com
- Date: Fri, 22 Dec 2006 19:20:44 +0300
- Subject: Re: [PATCH] Relay CPU Hotplug support
- References: <20061221003101.GA28643@Krystal> <20061220232350.eb4b6a46.akpm@osdl.org> <20061222103724.GA29348@in.ibm.com> <20061222024458.322adffd.akpm@osdl.org>
On 12/22, Andrew Morton wrote:
>
> On Fri, 22 Dec 2006 16:07:24 +0530
> Gautham R Shenoy <ego@in.ibm.com> wrote:
>
> > While we are at this per-subsystem cpuhotplug "locking", here's a
> > proposal that might put an end to the workqueue deadlock woes.
>
> Oleg is working on some patches which will permit us to cancel or wait upon
> a particular work_struct, rather than upon all pending work_structs.
I hope there are completed. I am waiting for the next -mm release to
send a "final" patch, I need too look at set_wq_data/set_wq_data when
workqueue.c will be in sync with Linus's changes.
> This will fix the problem where we accidentlly wait upon some unrelated
> work_struct which takes a lock which is related to one which we already
> hold.
>
> I hope. It'll be a bit tricky to implement: if some foreign work_struct is
> running right now, we cannot wait upon it - we must non-blockingly dequeue
> the work_struct which we want to kill before it gets to run.
The previous patch I sent
[PATCH, RFC rc1-mm1] implement flush_work()
http://marc.theaimsgroup.com/?l=linux-kernel&m=116647310413104
has a race.
+static void wait_on_work(struct cpu_workqueue_struct *cwq,
+ struct work_struct *work)
+{
+ struct wq_barrier barr;
+ int running = 0;
+
+ spin_lock_irq(&cwq->lock);
+ if (get_wq_data(work) == cwq) {
+ list_del_init(&work->entry);
+ work_release(work);
+ }
If that work is pending on CPU 1 it, and this CPU goes down, it may be
moved to CPU 0 after flush_work() already checked CPU 0.
I think we can do this:
static void wait_on_work(struct cpu_workqueue_struct *cwq,
struct work_struct *work)
{
struct wq_barrier barr;
int running = 0;
spin_lock_irq(&cwq->lock);
if (unlikely(cwq->current_work == work)) {
init_wq_barrier(&barr);
insert_work(cwq, &barr.work, 0);
running = 1;
}
spin_unlock_irq(&cwq->lock);
if (unlikely(running)) {
mutex_unlock(&workqueue_mutex);
wait_for_completion(&barr.done);
mutex_lock(&workqueue_mutex);
}
}
void flush_work(struct workqueue_struct *wq, struct work_struct *work)
{
struct cpu_workqueue_struct *cwq;
cwq = get_wq_data(work);
if (!cwq)
return;
spin_lock_irq(&cwq->lock);
list_del_init(&work->entry);
work_release(work);
spin_unlock_irq(&cwq->lock);
mutex_lock(&workqueue_mutex);
if (is_single_threaded(wq)) {
/* Always use first cpu's area. */
wait_on_work(per_cpu_ptr(wq->cpu_wq, singlethread_cpu), work);
} else {
int cpu;
for_each_online_cpu(cpu)
wait_on_work(per_cpu_ptr(wq->cpu_wq, cpu), work);
}
mutex_unlock(&workqueue_mutex);
}
Do you see any problems? When wait_on_work() unlocks workqueue_mutex (or
whatever we choose to protect against CPU hotplug), CPU may go away. But
in that case take_over_work() will move a barrier we queued to another
CPU, it will be fired sometime, and wait_on_work() will be woken.
Actually, we are doing cleanup_workqueue_thread()->kthread_stop() before
take_over_work(), so cwq->thread should complete its ->worklist (and thus
the barrier), because currently we don't check kthread_should_stop() in
run_workqueue(). But even if we did, everything looks safe to me.
Oleg.