This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.

Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.



Re: new syscall stub support for ia64 libc


Executive Summary:

 Please apply this patch.  It's Good.

Long version:

The patch below adds new syscall support for ia64 Linux.  Compared to
the earlier versions, it has a new autoconf test which ensures that
USE_DL_SYSINFO only gets defined if the compiler uses an unwinder that
is based on libunwind.  As explained earlier, the built-in unwinder
for GCC is hopeless and so there is no point trying to support it.
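
For readers who want the gist before reading the diff, the gating done by the nptl dl-sysdep.h can be sketched roughly as follows. This is only an illustration under stated assumptions: HAVE_CC_WITH_LIBUNWIND, NEED_DL_SYSINFO and USE_DL_SYSINFO are the names used in the patch, while use_gate_stub() is a hypothetical helper that exists only in this sketch.

```c
/* In a real build HAVE_CC_WITH_LIBUNWIND would come from config.h,
   set by the new configure test.  It is left undefined here, which
   models a compiler that uses GCC's built-in unwinder.  */

#define NEED_DL_SYSINFO 1
#ifdef HAVE_CC_WITH_LIBUNWIND
# define USE_DL_SYSINFO 1
#else
# undef USE_DL_SYSINFO
#endif

/* Hypothetical helper (not part of the patch): returns 1 if the new
   syscall stub would be used, 0 if the code falls back to the
   traditional break-based stub.  */
static int
use_gate_stub (void)
{
#ifdef USE_DL_SYSINFO
  return 1;
#else
  return 0;
#endif
}
```

Compiling the sketch with -DHAVE_CC_WITH_LIBUNWIND flips the result, which mirrors what the configure test decides once per build.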

I tested these 4 configurations:

 (1) NPTL with a libunwind-enabled GCC
	(gcc 3.3.2 with libunwind fixes, kernel v2.6.0-test11, libunwind v0.95)
 (2) linuxthreads with a libunwind-enabled GCC
	(gcc 3.3.2 with libunwind fixes, kernel v2.6.0-test11, libunwind v0.95)
 (3) NPTL without a libunwind-enabled GCC
	(gcc 3.3.2 from Debian/unstable, kernel v2.6.0-test11)
 (4) linuxthreads without a libunwind-enabled GCC
	(gcc 3.3.2 from Debian/unstable, kernel v2.6.0-test11)

The linuxthreads versions (configs 2 and 4) have no failures (apart
from the normal tst-numeric failure, which is due to some locale files
that aren't installed on my machine).

NPTL with libunwind-enabled GCC (config 1) likewise has no failures.

NPTL *without* libunwind-enabled GCC (config 3) has several failures
(cancel{6,17}, cancelx{4,5,6,16,17,18}, oncex4), but the behavior is
identical to that of stock CVS libc (as of today), provided the
kernel does _not_ pass the gate DSO address via AT_SYSINFO_EHDR.  If
the kernel _does_ pass this address, then there are a couple of
additional failures (e.g., cancel2 and cancel3 would also fail).
These are due to the fact that the GCC built-in unwinder can only
unwind across the signal trampoline if there is _no_ unwind info for
the trampoline.  Since AT_SYSINFO_EHDR registers the unwind-info for
the signal trampoline and the built-in unwinder is not capable of
properly handling this info (e.g., it ignores the UNWABI directive),
this causes the additional failures.  If somebody _really_ cared, this
particular issue could be fixed relatively easily (e.g., if the Linux
sigtramp UNWABI directive is found, MD_FALLBACK_FRAME_STATE_FOR()
could be applied to step over the signal trampoline).  However, given
all the other problems with the built-in unwinder, I suspect it's just
not worth bothering with it.
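
To check whether a given kernel passes the gate DSO address, something along these lines may help. It is a sketch, not part of the patch, and it assumes a Linux system with a glibc that provides getauxval() in <sys/auxv.h> (an interface added to glibc well after this patch was written). The helper names sysinfo_ehdr_address() and looks_like_elf() are made up for this illustration.

```c
#include <elf.h>	/* AT_SYSINFO_EHDR, ELFMAG, SELFMAG */
#include <string.h>	/* memcmp */
#include <sys/auxv.h>	/* getauxval */

/* Return the address the kernel passed via AT_SYSINFO_EHDR, or 0 if
   the auxiliary vector has no such entry.  */
static unsigned long
sysinfo_ehdr_address (void)
{
  return getauxval (AT_SYSINFO_EHDR);
}

/* Return 1 if ADDR is nonzero and the memory there starts with the
   ELF magic bytes, as expected for a kernel-supplied DSO.  */
static int
looks_like_elf (unsigned long addr)
{
  return addr != 0 && memcmp ((const void *) addr, ELFMAG, SELFMAG) == 0;
}
```

A zero result from sysinfo_ehdr_address() corresponds to the case above where the kernel does not pass the address.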

	--david

-------------------------------------------------------------------
ChangeLog

2003-12-02  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/ia64/elf/initfini.c: Add missing unwind directives.

	* sysdeps/ia64/dl-machine.h (elf_machine_matches_host): Mark with
	attribute "unused".
	(elf_machine_dynamic): Mark with attributes "unused" and "const".
	(elf_machine_runtime_setup): Likewise.

	* sysdeps/generic/dl-fptr.c (make_fptr_table): Mark with
	attribute "always_inline".
	* sysdeps/ia64/dl-machine.h (__ia64_init_bootstrap_fdesc_table):
	Likewise.

	* configure.in: Check whether compiler has libunwind support.

	* config.make.in (have-cc-with-libunwind): New variable.

	* config.h.in (HAVE_CC_WITH_LIBUNWIND): New macro.

	* Makeconfig (gnulib): If have-cc-with-libunwind is "yes", also
	mention -lunwind.

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/vfork.S: Use DO_CALL_VIA_BREAK()
	instead of DO_CALL().

	* sysdeps/unix/sysv/linux/ia64/brk.S (__curbrk): Restructure it
	to take advantage of DO_CALL() macro.
	* sysdeps/unix/sysv/linux/ia64/setcontext.S: Ditto.
	* sysdeps/unix/sysv/linux/ia64/getcontext.S: Ditto.

	* elf/rtld.c (dl_main): Restrict dl_sysinfo_dso check to first
	PT_LOAD program header.  On ia64, the check failed previously
	because there are two PT_LOAD program headers.
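
The relaxed elf/rtld.c check can be pictured with the following standalone sketch. It is not the rtld code itself: first_load_matches_base() is a hypothetical helper, and it works on a plain array of Elf64_Phdr entries rather than on the link map. The idea is simply that only the first PT_LOAD entry has to start at the DSO base address.

```c
#include <elf.h>	/* Elf64_Phdr, PT_LOAD */
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the first PT_LOAD entry in PHDR[0..PHNUM) has a p_vaddr
   equal to BASE.  Later PT_LOAD entries are not examined.  If there is
   no PT_LOAD entry at all, nothing is checked and 1 is returned.  */
static int
first_load_matches_base (const Elf64_Phdr *phdr, size_t phnum, uintptr_t base)
{
  for (size_t i = 0; i < phnum; ++i)
    if (phdr[i].p_type == PT_LOAD)
      return (uintptr_t) phdr[i].p_vaddr == base;
  return 1;
}
```

With two PT_LOAD entries, as on ia64, the second one necessarily starts at a different address, which is why a check applied to every PT_LOAD entry cannot hold.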

-------------------------------------------------------------------
linuxthreads/ChangeLog

2003-11-19  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (INIT_NEW_WAY): New
	macro.
	(INIT_OLD_WAY): Likewise.  Define these macros depending on
	whether or not HAVE_INITFINI_ARRAY is defined.  If it is, use
	.init_array to invoke __pthread_initialize_minimal.  Also, add
	proper unwind-directives for _init and _fini.
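
For anyone unfamiliar with .init_array, here is a rough C-level sketch of the same idea. The patch does this in assembly with .xdata8 and @fptr; the sketch below instead uses GCC's section attribute on an ELF target, and early_init, early_init_entry and initialized_early are names invented for the illustration.

```c
/* Set by the initializer so that later code can tell that it ran.  */
static int initialized_early;

/* Stand-in for a routine such as __pthread_initialize_minimal.  */
static void
early_init (void)
{
  initialized_early = 1;
}

/* Put a pointer to early_init into the .init_array section.  On ELF
   systems with init-array support, the startup code calls every entry
   in that section before main runs.  The "used" attribute keeps the
   otherwise unreferenced pointer from being discarded.  */
static void (*early_init_entry) (void)
  __attribute__ ((section (".init_array"), used)) = early_init;
```

The advantage over the old scheme is that no call has to be spliced into the _init prologue, so _init can carry ordinary unwind directives.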

-------------------------------------------------------------------
nptl/ChangeLog

2003-12-02  David Mosberger  <davidm@hpl.hp.com>

	* Makefile (link-libc-static): Remove -lgcc_eh---it's already mentioned
	in $(gnulib).  Also, remove stale comment.

2003-11-19  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/pt-initfini.c (INIT_NEW_WAY): New
	macro.
	(INIT_OLD_WAY): Likewise.  Define these macros depending on
	whether or not HAVE_INITFINI_ARRAY is defined.  If it is, use
	.init_array to invoke __pthread_initialize_minimal_internal.
	Also, add proper unwind-directives for _init and _fini.

2003-11-12  David Mosberger  <davidm@hpl.hp.com>

	* sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h (PSEUDO): Take
	advantage of new syscall stub and optimize accordingly.

	* sysdeps/unix/sysv/linux/ia64/lowlevellock.h (__NR_futex): Rename
	from SYS_futex, to match expectations of
	sysdep.h:DO_INLINE_SYSCALL.
	(lll_futex_clobbers): Remove.
	(lll_futex_timed_wait): Rewrite in terms of DO_INLINE_SYSCALL.
	(lll_futex_wake): Ditto.
	(lll_futex_requeue): Ditto.
	(__lll_mutex_trylock): Rewrite to a macro, so we can include this
	file before DO_INLINE_SYSCALL is defined (proposed by Jakub
	Jelinek).
	(__lll_mutex_lock): Ditto.
	(__lll_mutex_cond_lock): Ditto.
	(__lll_mutex_timedlock): Ditto.
	(__lll_mutex_unlock): Ditto.
	(__lll_mutex_unlock_force): Ditto.

	* sysdeps/pthread/createthread.c (create_thread): Use
	THREAD_SELF_SYSINFO and THREAD_SYSINFO instead of open code.

	* sysdeps/ia64/tls.h: Move declaration of __thread_self up so it
	comes before the include of <sysdep.h>.
	(THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.
	(INIT_SYSINFO): New macro.
	(TLS_INIT_TP): Call INIT_SYSINFO.

	* sysdeps/ia64/tcb-offsets.sym: Add SYSINFO_OFFSET.

	* allocatestack.c (allocate_stack): Use THREAD_SYSINFO and
	THREAD_SELF_SYSINFO instead of open code.

	* sysdeps/unix/sysv/linux/ia64/dl-sysdep.h: New file.

	* sysdeps/i386/tls.h (THREAD_SELF_SYSINFO): New macro.
	(THREAD_SYSINFO): Ditto.

-------------------------------------------------------------------
Index: Makeconfig
--- Makeconfig
+++ Makeconfig
@@ -511,7 +511,11 @@
 link-extra-libs-bounded = $(foreach lib,$(LDLIBS-$(@F:%-bp=%)),$(common-objpfx)$(lib)_b.a)
 
 ifndef gnulib
-gnulib := -lgcc -lgcc_eh
+ifneq ($(have-cc-with-libunwind),yes)
+ gnulib := -lgcc -lgcc_eh
+else
+ gnulib := -lgcc -lgcc_eh -lunwind
+endif
 endif
 ifeq ($(elf),yes)
 +preinit = $(addprefix $(csu-objpfx),crti.o)
Index: config.h.in
--- config.h.in
+++ config.h.in
@@ -153,6 +153,9 @@
    sections.  */
 #undef	HAVE_INITFINI_ARRAY
 
+/* Define if the compiler's exception support is based on libunwind.  */
+#undef	HAVE_CC_WITH_LIBUNWIND
+
 /* Define if the access to static and hidden variables is position independent
    and does not need relocations.  */
 #undef	PI_STATIC_AND_HIDDEN
Index: config.make.in
--- config.make.in
+++ config.make.in
@@ -55,6 +55,7 @@
 enable-check-abi = @enable_check_abi@
 have-forced-unwind = @libc_cv_forced_unwind@
 have-fpie = @libc_cv_fpie@
+have-cc-with-libunwind = @libc_cv_cc_with_libunwind@
 fno-unit-at-a-time = @fno_unit_at_a_time@
 
 static-libgcc = @libc_cv_gcc_static_libgcc@
Index: configure.in
--- configure.in
+++ configure.in
@@ -1219,6 +1219,19 @@
     AC_DEFINE(HAVE_INITFINI_ARRAY)
   fi
 
+  AC_CACHE_CHECK(for libunwind-support in compiler,
+		 libc_cv_cc_with_libunwind, [dnl
+    AC_TRY_LINK([#include <libunwind.h>], [
+      unw_context_t uc;
+      unw_cursor_t c;
+      unw_getcontext (&uc);
+      unw_init_local (&c, &uc)],
+        libc_cv_cc_with_libunwind=yes, libc_cv_cc_with_libunwind=no)])
+  AC_SUBST(libc_cv_cc_with_libunwind)
+  if test $libc_cv_cc_with_libunwind = yes; then
+    AC_DEFINE(HAVE_CC_WITH_LIBUNWIND)
+  fi
+
   AC_CACHE_CHECK(for -z nodelete option,
 		 libc_cv_z_nodelete, [dnl
   cat > conftest.c <<EOF
Index: elf/rtld.c
--- elf/rtld.c
+++ elf/rtld.c
@@ -1163,6 +1163,9 @@
       if (__builtin_expect (l != NULL, 1))
 	{
 	  static ElfW(Dyn) dyn_temp[DL_RO_DYN_TEMP_CNT];
+#ifndef NDEBUG
+	  uint_fast16_t pt_load_num = 0;
+#endif
 
 	  l->l_phdr = ((const void *) GL(dl_sysinfo_dso)
 		       + GL(dl_sysinfo_dso)->e_phoff);
@@ -1176,8 +1179,14 @@
 		  l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
 		  break;
 		}
+#ifndef NDEBUG
 	      if (ph->p_type == PT_LOAD)
-		assert ((void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		{
+		  assert (pt_load_num
+			  || (void *) ph->p_vaddr == GL(dl_sysinfo_dso));
+		  pt_load_num++;
+		}
+#endif
 	    }
 	  elf_get_dynamic_info (l, dyn_temp);
 	  _dl_setup_hash (l);
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,45 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+#define NEED_DL_SYSINFO	1
+#undef USE_DL_SYSINFO
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
--- linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
+++ linuxthreads/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
@@ -1,5 +1,5 @@
 /* Special .init and .fini section support for ia64. LinuxThreads version.
-   Copyright (C) 2000, 2001, 2002 Free Software Foundation, Inc.
+   Copyright (C) 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it
@@ -36,34 +36,51 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+# define INIT_NEW_WAY \
+    ".xdata8 \".init_array\", @fptr(__pthread_initialize_minimal)\n"
+# define INIT_OLD_WAY ""
+#else
+# define INIT_NEW_WAY ""
+# define INIT_OLD_WAY \
+	"\n\
+	st8 [r12] = gp, -16\n\
+	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
+	;;\n\
+	adds r12 = 16, r12\n\
+	;;\n\
+	ld8 gp = [r12]\n\
+	;;\n"
+#endif
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
 \n\
 /*@HEADER_ENDS*/\n\
 \n\
-/*@_init_PROLOG_BEGINS*/\n\
-	.section .init\n\
+/*@_init_PROLOG_BEGINS*/\n"
+	INIT_NEW_WAY
+	".section .init\n\
 	.align 16\n\
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
-	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
-	st8 [r12] = gp, -16\n\
-	br.call.sptk.many b0 = __pthread_initialize_minimal# ;;\n\
-	;;\n\
-	adds r12 = 16, r12\n\
-	;;\n\
-	ld8 gp = [r12]\n\
-	;;\n\
-	.align 16\n\
-	.endp _init#\n\
+	;;\n"
+	INIT_OLD_WAY
+	".endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
@@ -83,12 +100,16 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
Index: nptl/Makefile
--- nptl/Makefile
+++ nptl/Makefile
@@ -319,8 +319,7 @@
 CFLAGS-ftrylockfile.c = -D_IO_MTSAFE_IO
 CFLAGS-funlockfile.c = -D_IO_MTSAFE_IO
 
-# Ugly, ugly.  We have to link with libgcc_eh but how?
-link-libc-static := $(common-objpfx)libc.a $(gnulib) -lgcc_eh $(common-objpfx)libc.a
+link-libc-static := $(common-objpfx)libc.a $(gnulib) $(common-objpfx)libc.a
 
 ifeq ($(build-static),yes)
 tests-static += tst-locale1 tst-locale2
Index: nptl/allocatestack.c
--- nptl/allocatestack.c
+++ nptl/allocatestack.c
@@ -352,7 +352,7 @@
 
 #ifdef NEED_DL_SYSINFO
       /* Copy the sysinfo value from the parent.  */
-      pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+      THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
       /* The process ID is also the same as that of the caller.  */
@@ -488,7 +488,7 @@
 
 #ifdef NEED_DL_SYSINFO
 	  /* Copy the sysinfo value from the parent.  */
-	  pd->header.sysinfo = THREAD_GETMEM (THREAD_SELF, header.sysinfo);
+	  THREAD_SYSINFO(pd) = THREAD_SELF_SYSINFO;
 #endif
 
 	  /* The process ID is also the same as that of the caller.  */
Index: nptl/sysdeps/i386/tls.h
--- nptl/sysdeps/i386/tls.h
+++ nptl/sysdeps/i386/tls.h
@@ -128,6 +128,8 @@
 # define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	THREAD_GETMEM (THREAD_SELF, header.sysinfo)
+#define THREAD_SYSINFO(pd)	((pd)->header.sysinfo)
 
 /* Macros to load from and store into segment registers.  */
 # ifndef TLS_GET_GS
Index: nptl/sysdeps/ia64/tcb-offsets.sym
--- nptl/sysdeps/ia64/tcb-offsets.sym
+++ nptl/sysdeps/ia64/tcb-offsets.sym
@@ -2,3 +2,4 @@
 #include <tls.h>
 
 MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads) - sizeof (struct pthread)
+SYSINFO_OFFSET		offsetof (tcbhead_t, private)
Index: nptl/sysdeps/ia64/tls.h
--- nptl/sysdeps/ia64/tls.h
+++ nptl/sysdeps/ia64/tls.h
@@ -42,6 +42,8 @@
   void *private;
 } tcbhead_t;
 
+register struct pthread *__thread_self __asm__("r13");
+
 # define TLS_MULTIPLE_THREADS_IN_TCB 1
 
 #else /* __ASSEMBLER__ */
@@ -64,8 +66,6 @@
 /* Get system call information.  */
 # include <sysdep.h>
 
-register struct pthread *__thread_self __asm__("r13");
-
 /* This is the size of the initial TCB.  */
 # define TLS_INIT_TCB_SIZE sizeof (tcbhead_t)
 
@@ -100,11 +100,20 @@
 #  define GET_DTV(descr) \
   (((tcbhead_t *) (descr))->dtv)
 
+#define THREAD_SELF_SYSINFO	(((tcbhead_t *) __thread_self)->private)
+#define THREAD_SYSINFO(pd)	(((tcbhead_t *) ((pd) + 1))->private)
+
+#if defined NEED_DL_SYSINFO
+# define INIT_SYSINFO   THREAD_SELF_SYSINFO = (void *) GL(dl_sysinfo)
+#else
+# define INIT_SYSINFO   NULL
+#endif
+
 /* Code to initially initialize the thread pointer.  This might need
    special attention since 'errno' is not yet available and if the
    operation can cause a failure 'errno' must not be touched.  */
 # define TLS_INIT_TP(thrdescr, secondcall) \
-  (__thread_self = (thrdescr), NULL)
+  (__thread_self = (thrdescr), INIT_SYSINFO, NULL)
 
 /* Return the address of the dtv for the current thread.  */
 #  define THREAD_DTV() \
Index: nptl/sysdeps/pthread/createthread.c
--- nptl/sysdeps/pthread/createthread.c
+++ nptl/sysdeps/pthread/createthread.c
@@ -226,7 +226,7 @@
     }
 
 #ifdef NEED_DL_SYSINFO
-  assert (THREAD_GETMEM (THREAD_SELF, header.sysinfo) == pd->header.sysinfo);
+  assert (THREAD_SELF_SYSINFO == THREAD_SYSINFO(pd));
 #endif
 
   /* Actually create the thread.  */
Index: nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
--- /dev/null
+++ nptl/sysdeps/unix/sysv/linux/ia64/dl-sysdep.h
@@ -0,0 +1,70 @@
+/* System-specific settings for dynamic linker code.  IA-64 version.
+   Copyright (C) 2003 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _DL_SYSDEP_H
+#define _DL_SYSDEP_H	1
+
+/* This macro must be defined to either 0 or 1.
+
+   If 1, then an errno global variable hidden in ld.so will work right with
+   all the errno-using libc code compiled for ld.so, and there is never a
+   need to share the errno location with libc.  This is appropriate only if
+   all the libc functions that ld.so uses are called without PLT and always
+   get the versions linked into ld.so rather than the libc ones.  */
+
+#ifdef IS_IN_rtld
+# define RTLD_PRIVATE_ERRNO 1
+#else
+# define RTLD_PRIVATE_ERRNO 0
+#endif
+
+/* Traditionally system calls have been made using break 0x100000.  A
+   second method was introduced which, if possible, will use the EPC
+   instruction.  To signal the presence and where to find the code the
+   kernel passes an AT_SYSINFO_EHDR pointer in the auxiliary vector to
+   the application.  */
+#define NEED_DL_SYSINFO	1
+#ifdef HAVE_CC_WITH_LIBUNWIND
+# define USE_DL_SYSINFO	1
+#else
+  /* GCC's built-in unwinder is too broken for the new syscall stubs
+     to work properly.  */
+# undef USE_DL_SYSINFO
+#endif
+
+#if defined NEED_DL_SYSINFO && !defined __ASSEMBLER__
+/* Don't declare this as a function---we want its entry-point, not
+   its function descriptor... */
+extern int _dl_sysinfo_break attribute_hidden;
+# define DL_SYSINFO_DEFAULT ((uintptr_t) &_dl_sysinfo_break)
+# define DL_SYSINFO_IMPLEMENTATION		\
+  asm (".text\n\t"				\
+       ".hidden _dl_sysinfo_break\n\t"		\
+       ".proc _dl_sysinfo_break\n\t"		\
+       "_dl_sysinfo_break:\n\t"			\
+       ".prologue\n\t"				\
+       ".altrp b6\n\t"				\
+       ".body\n\t"				\
+       "break 0x100000;\n\t"			\
+       "br.ret.sptk.many b6;\n\t"		\
+       ".endp _dl_sysinfo_break\n\t"		\
+       ".previous");
+#endif
+
+#endif	/* dl-sysdep.h */
Index: nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
--- nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/lowlevellock.h
@@ -26,7 +26,7 @@
 #include <ia64intrin.h>
 #include <atomic.h>
 
-#define SYS_futex		1230
+#define __NR_futex		1230
 #define FUTEX_WAIT		0
 #define FUTEX_WAKE		1
 #define FUTEX_REQUEUE		3
@@ -34,112 +34,52 @@
 /* Initializer for compatibility lock.	*/
 #define LLL_MUTEX_LOCK_INITIALIZER (0)
 
-#define lll_futex_clobbers \
-  "out5", "out6", "out7",						      \
-  /* Non-stacked integer registers, minus r8, r10, r15.  */		      \
-  "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	      \
-  "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	      \
-  "r28", "r29", "r30", "r31",						      \
-  /* Predicate registers.  */						      \
-  "p6", "p7", "p8", "p9", "p10", "p11", "p12", "p13", "p14", "p15",	      \
-  /* Non-rotating fp registers.  */					      \
-  "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	      \
-  /* Branch registers.  */						      \
-  "b6", "b7",								      \
-  "memory"
-
 #define lll_futex_wait(futex, val) lll_futex_timed_wait (futex, val, 0)
 
-#define lll_futex_timed_wait(futex, val, timespec) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAIT;			      \
-     register int __o2 asm ("out2") = (int) (val);			      \
-     register long int __o3 asm ("out3") = (long int) (timespec);	      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %7;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3)   \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2), "6" (__o3)				      \
-		       : "out4", lll_futex_clobbers);			      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-#define lll_futex_wake(futex, nr) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_WAKE;			      \
-     register int __o2 asm ("out2") = (int) (nr);			      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %6;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2)		      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-		 	 "5" (__o2)					      \
-		       : "out3", "out4", lll_futex_clobbers);		      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
+#define lll_futex_timed_wait(ftx, val, timespec)			\
+({									\
+   DO_INLINE_SYSCALL(futex, 4, (long) (ftx), FUTEX_WAIT, (int) (val),	\
+		     (long) (timespec));				\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_wake(ftx, nr)						\
+({									\
+   DO_INLINE_SYSCALL(futex, 3, (long) (ftx), FUTEX_WAKE, (int) (nr));	\
+   _r10 == -1 ? -_retval : _retval;					\
+})
+
+#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex)			     \
+({									     \
+   DO_INLINE_SYSCALL(futex, 5, (long) (ftx), FUTEX_REQUEUE, (int) (nr_wake), \
+		     (int) (nr_move), (long) (mutex));			     \
+   _r10 == -1 ? -_retval : _retval;					     \
+})
 
 
-#define lll_futex_requeue(futex, nr_wake, nr_move, mutex) \
-  ({									      \
-     register long int __o0 asm ("out0") = (long int) (futex);		      \
-     register long int __o1 asm ("out1") = FUTEX_REQUEUE;		      \
-     register int __o2 asm ("out2") = (int) (nr_wake);			      \
-     register int __o3 asm ("out3") = (int) (nr_move);			      \
-     register long int __o4 asm ("out4") = (long int) (mutex);		      \
-     register long int __r8 asm ("r8");					      \
-     register long int __r10 asm ("r10");				      \
-     register long int __r15 asm ("r15") = SYS_futex;			      \
-									      \
-     __asm __volatile ("break %8;;"					      \
-		       : "=r" (__r8), "=r" (__r10), "=r" (__r15),	      \
-			 "=r" (__o0), "=r" (__o1), "=r" (__o2), "=r" (__o3),  \
-			 "=r" (__o4)					      \
-		       : "i" (0x100000), "2" (__r15), "3" (__o0), "4" (__o1), \
-			 "5" (__o2), "6" (__o3), "7" (__o4)		      \
-		       : lll_futex_clobbers);				      \
-     __r10 == -1 ? -__r8 : __r8;					      \
-  })
-
-
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_trylock (int *futex)
-{
-  return atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0;
-}
+#define __lll_mutex_trylock(futex) \
+  (atomic_compare_and_exchange_val_acq (futex, 1, 0) != 0)
 #define lll_mutex_trylock(futex) __lll_mutex_trylock (&(futex))
 
 
 extern void __lll_lock_wait (int *futex) attribute_hidden;
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_lock(futex)						\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_lock(futex) __lll_mutex_lock (&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_cond_lock (int *futex)
-{
-  if (atomic_compare_and_exchange_bool_acq (futex, 2, 0) != 0)
-    __lll_lock_wait (futex);
-}
+#define __lll_mutex_cond_lock(futex)					\
+  ((void) ({								\
+    int *__futex = (futex);						\
+    if (atomic_compare_and_exchange_bool_acq (__futex, 2, 0) != 0)	\
+      __lll_lock_wait (__futex);					\
+  }))
 #define lll_mutex_cond_lock(futex) __lll_mutex_cond_lock (&(futex))
 
 
@@ -147,41 +87,37 @@
      attribute_hidden;
 
 
-static inline int
-__attribute__ ((always_inline))
-__lll_mutex_timedlock (int *futex, const struct timespec *abstime)
-{
-  int result = 0;
-
-  if (atomic_compare_and_exchange_bool_acq (futex, 1, 0) != 0)
-    result = __lll_timedlock_wait (futex, abstime);
-
-  return result;
-}
+#define __lll_mutex_timedlock(futex, abstime)				\
+  ({									\
+     int *__futex = (futex);						\
+     int __val = 0;							\
+									\
+     if (atomic_compare_and_exchange_bool_acq (__futex, 1, 0) != 0)	\
+       __val = __lll_timedlock_wait (__futex, abstime);			\
+     __val;								\
+  })
 #define lll_mutex_timedlock(futex, abstime) \
   __lll_mutex_timedlock (&(futex), abstime)
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock (int *futex)
-{
-  int val = atomic_exchange_rel (futex, 0);
-
-  if (__builtin_expect (val > 1, 0))
-    lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock(futex)			\
+  ((void) ({						\
+    int *__futex = (futex);				\
+    int __val = atomic_exchange_rel (__futex, 0);	\
+							\
+    if (__builtin_expect (__val > 1, 0))		\
+      lll_futex_wake (__futex, 1);			\
+  }))
 #define lll_mutex_unlock(futex) \
   __lll_mutex_unlock(&(futex))
 
 
-static inline void
-__attribute__ ((always_inline))
-__lll_mutex_unlock_force (int *futex)
-{
-  (void) atomic_exchange_rel (futex, 0);
-  lll_futex_wake (futex, 1);
-}
+#define __lll_mutex_unlock_force(futex)		\
+  ((void) ({					\
+    int *__futex = (futex);			\
+    (void) atomic_exchange_rel (__futex, 0);	\
+    lll_futex_wake (__futex, 1);		\
+  }))
 #define lll_mutex_unlock_force(futex) \
   __lll_mutex_unlock_force(&(futex))
 
Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-initfini.c
@@ -36,6 +36,22 @@
    * crtn.s puts the corresponding function epilogues
    in the .init and .fini sections. */
 
+#include <stddef.h>
+
+#ifdef HAVE_INITFINI_ARRAY
+
+__asm__ ("\n\
+#include \"defs.h\"\n\
+\n\
+/*@HEADER_ENDS*/\n\
+\n\
+/*@_init_PROLOG_BEGINS*/\n\
+	.xdata8 \".init_array\",@fptr(__pthread_initialize_minimal_internal)\n\
+/*@_init_PROLOG_ENDS*/\n\
+");
+
+#else
+
 __asm__ ("\n\
 \n\
 #include \"defs.h\"\n\
@@ -48,13 +64,16 @@
 	.global _init#\n\
 	.proc _init#\n\
 _init:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-/* we could use r35 to save gp, but we use the stack since that's what\n\
- * all the other init routines will do --davidm 00/04/05 */\n\
 	st8 [r12] = gp, -16\n\
 	br.call.sptk.many b0 = __pthread_initialize_minimal_internal# ;;\n\
 	;;\n\
@@ -62,13 +81,18 @@
 	;;\n\
 	ld8 gp = [r12]\n\
 	;;\n\
-	.align 16\n\
 	.endp _init#\n\
 \n\
 /*@_init_PROLOG_ENDS*/\n\
 \n\
 /*@_init_EPILOG_BEGINS*/\n\
 	.section .init\n\
+	.proc _init#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	.regstk 0,2,0,0\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
@@ -83,18 +107,28 @@
 	.global _fini#\n\
 	.proc _fini#\n\
 _fini:\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
 	alloc r34 = ar.pfs, 0, 3, 0, 0\n\
+	.vframe r32\n\
 	mov r32 = r12\n\
+	.save rp, r33\n\
 	mov r33 = b0\n\
+	.body\n\
 	adds r12 = -16, r12\n\
 	;;\n\
-	.align 16\n\
 	.endp _fini#\n\
 \n\
 /*@_fini_PROLOG_ENDS*/\n\
 \n\
 /*@_fini_EPILOG_BEGINS*/\n\
 	.section .fini\n\
+	.proc _fini#\n\
+	.prologue\n\
+	.save ar.pfs, r34\n\
+	.vframe r32\n\
+	.save rp, r33\n\
+	.body\n\
 	mov r12 = r32\n\
 	mov ar.pfs = r34\n\
 	mov b0 = r33\n\
@@ -106,3 +140,5 @@
 /*@TRAILER_BEGINS*/\n\
 	.weak	__gmon_start__#\n\
 ");
+
+#endif
Index: nptl/sysdeps/unix/sysv/linux/ia64/pt-vfork.S
--- nptl/sysdeps/unix/sysv/linux/ia64/pt-vfork.S
+++ nptl/sysdeps/unix/sysv/linux/ia64/pt-vfork.S
@@ -30,6 +30,8 @@
 /* Implemented as __clone_syscall(CLONE_VFORK | CLONE_VM | SIGCHLD, 0)	*/
 
 ENTRY(__vfork)
+	.prologue	// work around a GAS bug which triggers if
+	.body		// first .prologue is not at the beginning of proc.
 	alloc r2=ar.pfs,0,0,2,0
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
Index: nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
--- nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
+++ nptl/sysdeps/unix/sysv/linux/ia64/sysdep-cancel.h
@@ -26,6 +26,9 @@
 #if !defined NOT_IN_libc || defined IS_IN_libpthread || defined IS_IN_librt
 
 # undef PSEUDO
+
+#ifndef USE_DL_SYSINFO
+
 # define PSEUDO(name, syscall_name, args)				      \
 .text;									      \
 ENTRY (name)								      \
@@ -88,6 +91,83 @@
      mov r8 = -1;							      \
      mov ar.pfs = loc0
 
+#else /* USE_DL_SYSINFO */
+
+# define PSEUDO(name, syscall_name, args)				      \
+.text;									      \
+ENTRY (name)								      \
+     .prologue;								      \
+     adds r2 = SYSINFO_OFFSET, r13;					      \
+     adds r14 = MULTIPLE_THREADS_OFFSET, r13;				      \
+     .save ar.pfs, r11;							      \
+     mov r11 = ar.pfs;;							      \
+     .body;								      \
+     ld4 r14 = [r14];							      \
+     ld8 r2 = [r2];							      \
+     mov r15 = SYS_ify(syscall_name);;					      \
+     cmp4.ne p6, p7 = 0, r14;						      \
+     mov b7 = r2;							      \
+(p6) br.cond.spnt .Lpseudo_cancel;					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov ar.pfs = r11;							      \
+     cmp.eq p6,p0 = -1, r10;						      \
+(p6) br.cond.spnt.few __syscall_error;					      \
+     ret;;								      \
+     .endp name;							      \
+     .proc __GC_##name;							      \
+     .globl __GC_##name;						      \
+     .hidden __GC_##name;						      \
+__GC_##name:								      \
+.Lpseudo_cancel:							      \
+     .prologue;								      \
+     .regstk args, 5, args, 0;						      \
+     .save ar.pfs, loc0;						      \
+     alloc loc0 = ar.pfs, args, 5, args, 0;				      \
+     adds loc4 = SYSINFO_OFFSET, r13;					      \
+     .save rp, loc1;							      \
+     mov loc1 = rp;;							      \
+     .body;								      \
+     ld8 loc4 = [loc4];							      \
+     CENABLE;;								      \
+     mov loc2 = r8;							      \
+     mov b7 = loc4;							      \
+     COPY_ARGS_##args							      \
+     mov r15 = SYS_ify(syscall_name);					      \
+     br.call.sptk.many b6 = b7;;					      \
+     mov loc3 = r8;							      \
+     mov loc4 = r10;							      \
+     mov out0 = loc2;							      \
+     CDISABLE;;								      \
+     cmp.eq p6,p0=-1,loc4;						      \
+(p6) br.cond.spnt.few __syscall_error_##args;				      \
+     mov r8 = loc3;							      \
+     mov rp = loc1;							      \
+     mov ar.pfs = loc0;							      \
+.Lpseudo_end:								      \
+     ret;								      \
+     .endp __GC_##name;							      \
+.section .gnu.linkonce.t.__syscall_error_##args, "ax";			      \
+     .align 32;								      \
+     .proc __syscall_error_##args;					      \
+     .global __syscall_error_##args;					      \
+     .hidden __syscall_error_##args;					      \
+     .size __syscall_error_##args, 64;					      \
+__syscall_error_##args:							      \
+     .prologue;								      \
+     .regstk args, 5, args, 0;						      \
+     .save ar.pfs, loc0;						      \
+     .save rp, loc1;							      \
+     .body;								      \
+     mov loc4 = r1;;							      \
+     br.call.sptk.many b0 = __errno_location;;				      \
+     st4 [r8] = loc3;							      \
+     mov r1 = loc4;							      \
+     mov rp = loc1;							      \
+     mov r8 = -1;							      \
+     mov ar.pfs = loc0
+
+#endif /* USE_DL_SYSINFO */
+
 #undef PSEUDO_END
 #define PSEUDO_END(name) .endp
 
Index: sysdeps/generic/dl-fptr.c
--- sysdeps/generic/dl-fptr.c
+++ sysdeps/generic/dl-fptr.c
@@ -163,7 +163,7 @@
 }
 
 
-static inline ElfW(Addr) *
+static inline ElfW(Addr) * __attribute__ ((always_inline))
 make_fptr_table (struct link_map *map)
 {
   const ElfW(Sym) *symtab
Index: sysdeps/ia64/dl-machine.h
--- sysdeps/ia64/dl-machine.h
+++ sysdeps/ia64/dl-machine.h
@@ -33,7 +33,7 @@
    in l_info array.  */
 #define DT_IA_64(x) (DT_IA_64_##x - DT_LOPROC + DT_NUM)
 
-static inline void
+static inline void __attribute__ ((always_inline))
 __ia64_init_bootstrap_fdesc_table (struct link_map *map)
 {
   Elf64_Addr *boot_table;
@@ -49,7 +49,7 @@
 	__ia64_init_bootstrap_fdesc_table (&bootstrap_map);
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
-static inline int
+static inline int __attribute__ ((unused))
 elf_machine_matches_host (const Elf64_Ehdr *ehdr)
 {
   return ehdr->e_machine == EM_IA_64;
@@ -57,7 +57,7 @@
 
 
 /* Return the link-time address of _DYNAMIC.  */
-static inline Elf64_Addr
+static inline Elf64_Addr __attribute__ ((unused, const))
 elf_machine_dynamic (void)
 {
   Elf64_Addr *p;
@@ -77,7 +77,7 @@
 
 
 /* Return the run-time load address of the shared object.  */
-static inline Elf64_Addr
+static inline Elf64_Addr __attribute__ ((unused))
 elf_machine_load_address (void)
 {
   Elf64_Addr ip;
@@ -98,7 +98,7 @@
 /* Set up the loaded object described by L so its unrelocated PLT
    entries will jump to the on-demand fixup code in dl-runtime.c.  */
 
-static inline int __attribute__ ((always_inline))
+static inline int __attribute__ ((unused, always_inline))
 elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
 {
   extern void _dl_runtime_resolve (void);
Index: sysdeps/ia64/elf/initfini.c
--- sysdeps/ia64/elf/initfini.c
+++ sysdeps/ia64/elf/initfini.c
@@ -61,16 +61,20 @@
 #endif
 
 __asm__ (".section .init\n"
-"	.align 16\n"
 "	.global _init#\n"
 "	.proc _init#\n"
 "_init:\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
 "	alloc r34 = ar.pfs, 0, 3, 0, 0\n"
+"	.vframe r32\n"
 "	mov r32 = r12\n"
+"	.save rp, r33\n"
 "	mov r33 = b0\n"
+"	.body\n"
 "	adds r12 = -16, r12\n"
 #ifdef HAVE_INITFINI_ARRAY
- "	;;\n"		/* see gmon_initializer() below */
+"	;;\n"		/* see gmon_initializer() above */
 #else
 "	.weak	__gmon_start__#\n"
 "	addl r14 = @ltoff(@fptr(__gmon_start__#)), gp\n"
@@ -90,12 +94,17 @@
 "	;;\n"
 ".L5:\n"
 #endif
-"	.align 16\n"
 "	.endp _init#\n"
 "\n"
 "/*@_init_PROLOG_ENDS*/\n"
 "\n"
 "/*@_init_EPILOG_BEGINS*/\n"
+"	.proc _init#\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
+"	.vframe r32\n"
+"	.save rp, r33\n"
+"	.body\n"
 "	.section .init\n"
 "	.regstk 0,2,0,0\n"
 "	mov r12 = r32\n"
@@ -107,16 +116,19 @@
 "\n"
 "/*@_fini_PROLOG_BEGINS*/\n"
 "	.section .fini\n"
-"	.align 16\n"
 "	.global _fini#\n"
 "	.proc _fini#\n"
 "_fini:\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
 "	alloc r34 = ar.pfs, 0, 3, 0, 0\n"
+"	.vframe r32\n"
 "	mov r32 = r12\n"
+"	.save rp, r33\n"
 "	mov r33 = b0\n"
+"	.body\n"
 "	adds r12 = -16, r12\n"
 "	;;\n"
-"	.align 16\n"
 "	.endp _fini#\n"
 "\n"
 "/*@_fini_PROLOG_ENDS*/\n"
@@ -125,6 +137,12 @@
 "\n"
 "/*@_fini_EPILOG_BEGINS*/\n"
 "	.section .fini\n"
+"	.proc _fini#\n"
+"	.prologue\n"
+"	.save ar.pfs, r34\n"
+"	.vframe r32\n"
+"	.save rp, r33\n"
+"	.body\n"
 "	mov r12 = r32\n"
 "	mov ar.pfs = r34\n"
 "	mov b0 = r33\n"
Index: sysdeps/unix/sysv/linux/ia64/brk.S
--- sysdeps/unix/sysv/linux/ia64/brk.S
+++ sysdeps/unix/sysv/linux/ia64/brk.S
@@ -35,19 +35,17 @@
 weak_alias (__curbrk, ___brk_addr)
 
 LEAF(__brk)
-	mov	r15=__NR_brk
-	break.i	__BREAK_SYSCALL
+	.regstk 1, 0, 0, 0
+	DO_CALL(__NR_brk)
+	cmp.ltu	p6, p0 = ret0, in0
+	addl r9 = @ltoff(__curbrk), gp
 	;;
-	cmp.ltu	p6,p0=ret0,r32	/* r32 is the input register, even though we
-				   haven't allocated a frame */
-	addl	r9=@ltoff(__curbrk),gp
-	;;
-	ld8	r9=[r9]
-(p6) 	mov	ret0=ENOMEM
+	ld8 r9 = [r9]
+(p6) 	mov ret0 = ENOMEM
 (p6)	br.cond.spnt.few __syscall_error
 	;;
-	st8	[r9]=ret0
-	mov 	ret0=0
+	st8 [r9] = ret0
+	mov ret0 = 0
 	ret
 END(__brk)
 
Index: sysdeps/unix/sysv/linux/ia64/clone2.S
--- sysdeps/unix/sysv/linux/ia64/clone2.S
+++ sysdeps/unix/sysv/linux/ia64/clone2.S
@@ -25,49 +25,56 @@
 /* 	         size_t child_stack_size, int flags, void *arg,		*/
 /*	         pid_t *parent_tid, void *tls, pid_t *child_tid)	*/
 
+#define CHILD	p8
+#define PARENT	p9
+
 ENTRY(__clone2)
-	alloc r2=ar.pfs,8,2,6,0
+	.prologue
+	alloc r2=ar.pfs,8,0,6,0
 	cmp.eq p6,p0=0,in0
 	mov r8=EINVAL
-(p6)	br.cond.spnt.few __syscall_error
-	;;
-	flushrs			/* This is necessary, since the child	*/
-				/* will be running with the same 	*/
-				/* register backing store for a few 	*/
-				/* instructions.  We need to ensure	*/
-				/* that it will not read or write the	*/
-				/* backing store.			*/
-	mov loc0=in0		/* save fn	*/
-	mov loc1=in4		/* save arg	*/
 	mov out0=in3		/* Flags are first syscall argument.	*/
 	mov out1=in1		/* Stack address.			*/
+(p6)	br.cond.spnt.many __syscall_error
+	;;
 	mov out2=in2		/* Stack size.				*/
 	mov out3=in5		/* Parent TID Pointer			*/
 	mov out4=in7		/* Child TID Pointer			*/
  	mov out5=in6		/* TLS pointer				*/
-        DO_CALL (SYS_ify (clone2))
+	/*
+	 * clone2() is special: the child cannot execute br.ret right
+	 * after the system call returns, because it starts out
+	 * executing on an empty stack.  Because of this, we can't use
+	 * the new (lightweight) syscall convention here.  Instead, we
+	 * just fall back on always using "break".
+	 *
+	 * Furthermore, since the child starts with an empty stack, we
+	 * need to avoid unwinding past invalid memory.  To that end,
+	 * we'll pretend now that __clone2() is the end of the
+	 * call-chain.  This is wrong for the parent, but only until
+	 * it returns from clone2(), and it's better than the
+	 * alternative.
+	 */
+	mov r15=SYS_ify (clone2)
+	.save rp, r0
+	break __BREAK_SYSCALL
+	.body
         cmp.eq p6,p0=-1,r10
+	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?   */
+(p6)	br.cond.spnt.many __syscall_error
 	;;
-(p6)	br.cond.spnt.few __syscall_error
-
-#	define CHILD p6
-#	define PARENT p7
-	cmp.eq CHILD,PARENT=0,r8 /* Are we the child?	*/
-	;;
-(CHILD)	ld8 out1=[loc0],8	/* Retrieve code pointer.	*/
-(CHILD)	mov out0=loc1		/* Pass proper argument	to fn */
+(CHILD)	ld8 out1=[in0],8	/* Retrieve code pointer.	*/
+(CHILD)	mov out0=in4		/* Pass proper argument	to fn */
 (PARENT) ret
 	;;
-	ld8 gp=[loc0]		/* Load function gp.		*/
+	ld8 gp=[in0]		/* Load function gp.		*/
 	mov b6=out1
-	;;
-	br.call.dptk.few rp=b6	/* Call fn(arg) in the child 	*/
+	br.call.dptk.many rp=b6	/* Call fn(arg) in the child 	*/
 	;;
 	mov out0=r8		/* Argument to _exit		*/
 	.globl _exit
-	br.call.dpnt.few rp=_exit /* call _exit with result from fn.	*/
+	br.call.dpnt.many rp=_exit /* call _exit with result from fn.	*/
 	ret			/* Not reached.		*/
-
 PSEUDO_END(__clone2)
 
 /* For now we leave __clone undefined.  This is unlikely to be a	*/
Index: sysdeps/unix/sysv/linux/ia64/getcontext.S
--- sysdeps/unix/sysv/linux/ia64/getcontext.S
+++ sysdeps/unix/sysv/linux/ia64/getcontext.S
@@ -35,26 +35,27 @@
 
 ENTRY(__getcontext)
 	.prologue
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_BLOCK, NULL, &sc->sc_mask):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_BLOCK
-	mov out1 = 0
-	add out2 = r2, in0
-	mov out3 = 8	// sizeof kernel sigset_t
 
-	break __BREAK_SYSCALL
 	flushrs					// save dirty partition on rbs
+	mov out1 = 0
+	add out2 = r3, in0
+
+	mov out3 = 8	// sizeof kernel sigset_t
+	DO_CALL(__NR_rt_sigprocmask)
 
 	mov.m rFPSR = ar.fpsr
 	mov.m rRSC = ar.rsc
 	add r2 = SC_GR+1*8, r32
 	;;
 	mov.m rBSP = ar.bsp
+	.prologue
 	.save ar.unat, rUNAT
 	mov.m rUNAT = ar.unat
 	.body
@@ -63,7 +64,7 @@
 
 .mem.offset 0,0; st8.spill [r2] = r1, (5*8 - 1*8)
 .mem.offset 8,0; st8.spill [r3] = r4, 16
-	mov.i rPFS = ar.pfs
+	mov rPFS = r11
 	;;
 .mem.offset 0,0; st8.spill [r2] = r5, 16
 .mem.offset 8,0; st8.spill [r3] = r6, 48
Index: sysdeps/unix/sysv/linux/ia64/setcontext.S
--- sysdeps/unix/sysv/linux/ia64/setcontext.S
+++ sysdeps/unix/sysv/linux/ia64/setcontext.S
@@ -32,20 +32,21 @@
   other than the PRESERVED state.  */
 
 ENTRY(__setcontext)
-	alloc r16 = ar.pfs, 1, 0, 4, 0
+	.prologue
+	.body
+	alloc r11 = ar.pfs, 1, 0, 4, 0
 
 	// sigprocmask (SIG_SETMASK, &sc->sc_mask, NULL):
 
-	mov r2 = SC_MASK
-	mov r15 = __NR_rt_sigprocmask
-	;;
+	mov r3 = SC_MASK
 	mov out0 = SIG_SETMASK
-	add out1 = r2, in0
+	;;
+	add out1 = r3, in0
 	mov out2 = 0
 	mov out3 = 8	// sizeof kernel sigset_t
 
 	invala
-	break __BREAK_SYSCALL
+	DO_CALL(__NR_rt_sigprocmask)
 	add r2 = SC_NAT, r32
 
 	add r3 = SC_RNAT, r32			// r3 <- &sc_ar_rnat
Index: sysdeps/unix/sysv/linux/ia64/sysdep.h
--- sysdeps/unix/sysv/linux/ia64/sysdep.h
+++ sysdeps/unix/sysv/linux/ia64/sysdep.h
@@ -23,6 +23,8 @@
 
 #include <sysdeps/unix/sysdep.h>
 #include <sysdeps/ia64/sysdep.h>
+#include <dl-sysdep.h>
+#include <tls.h>
 
 /* As of GAS v2.4.90.0.7, including a ".align" directive inside a
    function will cause bad unwind info to be emitted (GAS doesn't know
@@ -58,6 +60,14 @@
 # define __NR_semtimedop 1247
 #endif
 
+#if defined USE_DL_SYSINFO \
+	&& (!defined NOT_IN_libc \
+	    || defined IS_IN_libpthread || defined IS_IN_librt)
+# define IA64_USE_NEW_STUB
+#else
+# undef IA64_USE_NEW_STUB
+#endif
+
 #ifdef __ASSEMBLER__
 
 #undef CALL_MCOUNT
@@ -102,9 +112,45 @@
 	cmp.eq p6,p0=-1,r10;			\
 (p6)	br.cond.spnt.few __syscall_error;
 
-#define DO_CALL(num)				\
+#define DO_CALL_VIA_BREAK(num)			\
 	mov r15=num;				\
-	break __BREAK_SYSCALL;
+	break __BREAK_SYSCALL
+
+#ifdef IA64_USE_NEW_STUB
+# ifdef SHARED
+#  define DO_CALL(num)				\
+	.prologue;				\
+	adds r2 = SYSINFO_OFFSET, r13;;		\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov r15 = num;				\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11;			\
+	.prologue;				\
+	.body
+# else /* !SHARED */
+#  define DO_CALL(num)				\
+	.prologue;				\
+	mov r15 = num;				\
+	movl r2 = _dl_sysinfo;;			\
+	ld8 r2 = [r2];				\
+	.save ar.pfs, r11;			\
+	mov r11 = ar.pfs;;			\
+	.body;					\
+	mov b7 = r2;				\
+	br.call.sptk.many b6 = b7;;		\
+	.restore sp;				\
+	mov ar.pfs = r11;			\
+	.prologue;				\
+	.body
+# endif
+#else
+# define DO_CALL(num)				DO_CALL_VIA_BREAK(num)
+#endif
 
 #undef PSEUDO_END
 #define PSEUDO_END(name)	.endp C_SYMBOL_NAME(name);
@@ -150,45 +196,64 @@
    from a syscall.  r10 is set to -1 on error, whilst r8 contains the
    (non-negative) errno on error or the return value on success.
  */
-#undef INLINE_SYSCALL
-#define INLINE_SYSCALL(name, nr, args...)			\
-  ({								\
+
+#ifdef IA64_USE_NEW_STUB
+
+#define DO_INLINE_SYSCALL(name, nr, args...)					\
+    register long _r8 __asm ("r8");						\
+    register long _r10 __asm ("r10");						\
+    register long _r15 __asm ("r15") = __NR_##name;				\
+    register void *_b7 __asm ("b7") = ((tcbhead_t *) __thread_self)->private;	\
+    long _retval;								\
+    LOAD_ARGS_##nr (args);							\
+    /*										\
+     * Don't specify any unwind info here.  We mark ar.pfs as			\
+     * clobbered.  This will force the compiler to save ar.pfs			\
+     * somewhere and emit appropriate unwind info for that save.		\
+     */										\
+    __asm __volatile ("br.call.sptk.many b6=%0;;\n"				\
+		      : "=b"(_b7), "=r" (_r8), "=r" (_r10), "=r" (_r15)		\
+			ASM_OUTARGS_##nr					\
+		      : "0" (_b7), "3" (_r15) ASM_ARGS_##nr			\
+		      : "memory", "ar.pfs" ASM_CLOBBERS_##nr);			\
+    _retval = _r8;
+
+#else /* !IA64_USE_NEW_STUB */
+
+#define DO_INLINE_SYSCALL(name, nr, args...)			\
     register long _r8 asm ("r8");				\
     register long _r10 asm ("r10");				\
     register long _r15 asm ("r15") = __NR_##name;		\
     long _retval;						\
     LOAD_ARGS_##nr (args);					\
     __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
+		      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
 			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    if (_r10 == -1)						\
-      {								\
-        __set_errno (_retval);					\
-        _retval = -1;						\
-      }								\
+		      : "2" (_r15) ASM_ARGS_##nr		\
+		      : "memory" ASM_CLOBBERS_##nr);		\
+    _retval = _r8;
+
+#endif /* !IA64_USE_NEW_STUB */
+
+#undef INLINE_SYSCALL
+#define INLINE_SYSCALL(name, nr, args...)	\
+  ({						\
+    DO_INLINE_SYSCALL(name, nr, args)		\
+    if (_r10 == -1)				\
+      {						\
+	__set_errno (_retval);			\
+	_retval = -1;				\
+      }						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_DECL
 #define INTERNAL_SYSCALL_DECL(err) long int err
 
 #undef INTERNAL_SYSCALL
-#define INTERNAL_SYSCALL(name, err, nr, args...)		\
-  ({								\
-    register long _r8 asm ("r8");				\
-    register long _r10 asm ("r10");				\
-    register long _r15 asm ("r15") = __NR_##name;		\
-    long _retval;						\
-    LOAD_ARGS_##nr (args);					\
-    __asm __volatile (BREAK_INSN (__BREAK_SYSCALL)		\
-                      : "=r" (_r8), "=r" (_r10), "=r" (_r15)	\
-			ASM_OUTARGS_##nr			\
-                      : "2" (_r15) ASM_ARGS_##nr		\
-                      : "memory" ASM_CLOBBERS_##nr);		\
-    _retval = _r8;						\
-    err = _r10;							\
+#define INTERNAL_SYSCALL(name, err, nr, args...)	\
+  ({							\
+    DO_INLINE_SYSCALL(name, nr, args)			\
+    err = _r10;						\
     _retval; })
 
 #undef INTERNAL_SYSCALL_ERROR_P
@@ -225,6 +290,15 @@
 #define ASM_OUTARGS_5	ASM_OUTARGS_4, "=r" (_out4)
 #define ASM_OUTARGS_6	ASM_OUTARGS_5, "=r" (_out5)
 
+#ifdef IA64_USE_NEW_STUB
+#define ASM_ARGS_0
+#define ASM_ARGS_1	ASM_ARGS_0, "4" (_out0)
+#define ASM_ARGS_2	ASM_ARGS_1, "5" (_out1)
+#define ASM_ARGS_3	ASM_ARGS_2, "6" (_out2)
+#define ASM_ARGS_4	ASM_ARGS_3, "7" (_out3)
+#define ASM_ARGS_5	ASM_ARGS_4, "8" (_out4)
+#define ASM_ARGS_6	ASM_ARGS_5, "9" (_out5)
+#else
 #define ASM_ARGS_0
 #define ASM_ARGS_1	ASM_ARGS_0, "3" (_out0)
 #define ASM_ARGS_2	ASM_ARGS_1, "4" (_out1)
@@ -232,6 +306,7 @@
 #define ASM_ARGS_4	ASM_ARGS_3, "6" (_out3)
 #define ASM_ARGS_5	ASM_ARGS_4, "7" (_out4)
 #define ASM_ARGS_6	ASM_ARGS_5, "8" (_out5)
+#endif
 
 #define ASM_CLOBBERS_0	ASM_CLOBBERS_1, "out0"
 #define ASM_CLOBBERS_1	ASM_CLOBBERS_2, "out1"
@@ -239,7 +314,7 @@
 #define ASM_CLOBBERS_3	ASM_CLOBBERS_4, "out3"
 #define ASM_CLOBBERS_4	ASM_CLOBBERS_5, "out4"
 #define ASM_CLOBBERS_5	ASM_CLOBBERS_6, "out5"
-#define ASM_CLOBBERS_6	, "out6", "out7",				\
+#define ASM_CLOBBERS_6_COMMON	, "out6", "out7",			\
   /* Non-stacked integer registers, minus r8, r10, r15.  */		\
   "r2", "r3", "r9", "r11", "r12", "r13", "r14", "r16", "r17", "r18",	\
   "r19", "r20", "r21", "r22", "r23", "r24", "r25", "r26", "r27",	\
@@ -249,7 +324,13 @@
   /* Non-rotating fp registers.  */					\
   "f6", "f7", "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15",	\
   /* Branch registers.  */						\
-  "b6", "b7"
+  "b6"
+
+#ifdef IA64_USE_NEW_STUB
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON
+#else
+# define ASM_CLOBBERS_6	ASM_CLOBBERS_6_COMMON , "b7"
+#endif
 
 #endif /* not __ASSEMBLER__ */
 
Index: sysdeps/unix/sysv/linux/ia64/vfork.S
--- sysdeps/unix/sysv/linux/ia64/vfork.S
+++ sysdeps/unix/sysv/linux/ia64/vfork.S
@@ -34,9 +34,8 @@
 	mov out0=CLONE_VM+CLONE_VFORK+SIGCHLD
 	mov out1=0		/* Standard sp value.			*/
 	;;
-	DO_CALL (SYS_ify (clone))
+	DO_CALL_VIA_BREAK (SYS_ify (clone))
 	cmp.eq p6,p0=-1,r10
-	;;
 (p6)	br.cond.spnt.few __syscall_error
 	ret
 PSEUDO_END(__vfork)
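
For readers of the archive: the sysdep.h comment in the patch says that r10 is set to -1 on error, while r8 holds the (non-negative) errno on error or the return value on success. The portable C sketch below only illustrates how the tail of INLINE_SYSCALL folds that register pair into the usual "-1 and errno" result. The struct ia64_sysret and the function fold_syscall_result are hypothetical names invented for this illustration; they are not part of the patch or of glibc, and no real system call is made.

```c
#include <errno.h>

/* Hypothetical stand-in for the two registers the kernel sets when a
   system call returns (names invented for this sketch).  */
struct ia64_sysret
{
  long r8;	/* return value on success, errno value on error */
  long r10;	/* -1 on error, per the sysdep.h comment */
};

/* Mirrors the error handling at the end of INLINE_SYSCALL: if r10 is
   -1, store r8 into errno and return -1; otherwise return r8 as is.  */
static long
fold_syscall_result (struct ia64_sysret ret)
{
  long retval = ret.r8;

  if (ret.r10 == -1)
    {
      errno = (int) retval;
      retval = -1;
    }
  return retval;
}
```

INTERNAL_SYSCALL, by contrast, skips this folding step: it hands r10 back through its err argument and leaves errno untouched, so the caller decides what to do with the error.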

