This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug libc/14829] sched_* functions wrongly alter thread scheduling, rather than process


http://sourceware.org/bugzilla/show_bug.cgi?id=14829

--- Comment #5 from Carlos O'Donell <carlos at redhat dot com> ---
(In reply to Rich Felker from comment #4)
> On Mon, Jun 17, 2013 at 07:34:51PM +0000, carlos at redhat dot com wrote:
> > The only uses which would break are AFAICT:
> > 
> > * tgkill
> > - Which could be enhanced to accept a tid, find the tgid, and then kill the
> > associated group.
> > 
> > * rt_tgsigqueueinfo
> > - Same problem.
> > 
> > They can be solved because the expected semantics is that you call these for a
> > thread group, and if you instead pass a tid, it is a one-way translation from
> > thread to thread group. Even with cgroups I don't think this invariant is
> > broken.
> > 
> > Did I miss anything?
> 
> Yes. getpid() uses the cached pid from the TCB. So if set_tid_address
> returned the tid, then getpid() would return the wrong pid. 

It's not the wrong pid, it's the identifier of the schedulable object which is
the kernel task associated with the process. It's as correct as you can get.

The kernel has always treated pid and tid as the same kind of identifier for
schedulable objects, so yes, calling any syscall that accepts pid will accept
tid. The Linux kernel makes no difference between the two.

The exception is tgkill and rt_tgsigqueueinfo, both which assert that the id of
the thread group leader matches the id passed into the function as the tgid.
This is why these two functions need slightly looser semantics.

And this specific code:
kernel/signal.c:
~~~
static int
do_send_specific(pid_t tgid, pid_t pid, int sig, struct siginfo *info)
{
        struct task_struct *p;
        int error = -ESRCH;

        rcu_read_lock();
        p = find_task_by_vpid(pid);
        if (p && (tgid <= 0 || task_tgid_vnr(p) == tgid)) {
                error = check_kill_permission(sig, info, p);
~~~
Should use:
~~~
static int
do_send_specific(pid_t tgid, pid_t pid, int sig, struct siginfo *info)
{
        struct task_struct *p, *pt;
        int error = -ESRCH;

        rcu_read_lock();
        p = find_task_by_vpid(pid);
        pt = find_task_by_vpid(tgid);
        if (p && (tgid <= 0 || task_tgid_vnr(p) == task_tgid_vnr(pt))) {
                error = check_kill_permission(sig, info, p);
~~~
Thus allowing any schedulable object to be passed in as the tgid,
and that object be used to find the thread group leader for doing
the work.

> On the other hand, if set_tid_address returned the pid, and all syscalls that
> accept tids would also accept a pid and treat it as the tid of the
> initial thread in that process, then things might not be so bad, but I
> think there might be risks involved in threads having two distinct
> identifiers, especially when the tid is used to mark the owner of a
> synchronization object. 

All kernel functions already do accept pid and tid interchangeably.

You can already have several different identifiers for the same task.

e.g.

include/linux/pid.h
~~~
enum pid_type
{
        PIDTYPE_PID,
        PIDTYPE_PGID,
        PIDTYPE_SID,
        PIDTYPE_MAX
};
~~~

> I'm not sure that there would be problems with
> this approach, but I still think it sounds risky.

AFAIK it would work, modulo the fix.

The return of getpid() would be the tid, and it would work.

However, the more I write out the design the uglier it gets.

It's probably better to just add another PIDTYPE_* value to support process
scheduling and a new syscall.

-- 
You are receiving this mail because:
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]