This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: GDBserver fast tracepoints support


On Thursday 06 May 2010 04:30:28, Pedro Alves wrote:
> This patch adds support for fast dynamic jump-based tracepoints
> to 32-bit and 64-bit x86 linux GDBserver.

I've checked this in, as below.
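
For readers new to jump-based tracepoints, here is a minimal sketch
of how a 5-byte x86 relative jump can be encoded.  It is not code
from the patch: encode_jmp_rel32 is a hypothetical helper, and the
sketch assumes a little-endian host and a jump target within 32-bit
displacement range.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from the patch: encode a 5-byte
   x86 `jmp rel32' that lives at address FROM and lands on TO.  */
static void
encode_jmp_rel32 (unsigned char buf[5], uint64_t from, uint64_t to)
{
  /* The displacement is relative to the end of the jump, which is
     5 bytes long (1 opcode byte plus a 4-byte displacement).  */
  int64_t delta = (int64_t) (to - (from + 5));
  int32_t rel;

  /* The target must be reachable with a signed 32-bit offset.  */
  assert (delta >= INT32_MIN && delta <= INT32_MAX);
  rel = (int32_t) delta;

  buf[0] = 0xe9;			/* Opcode of jmp rel32.  */
  memcpy (buf + 1, &rel, sizeof (rel));	/* Little-endian host assumed.  */
}
```

The 5-byte length is why an instruction replaced this way has to
be at least that long, or be relocated together with its neighbours.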

> [NEWS and docs patch bits included.  Are those okay?]

These were split into a separate thread, and Eli approved
them (with minor tweaks, which have since been addressed).

> Instead of teaching GDBserver about that, the patch adds a new packet
> to the remote protocol --- qRelocInsn, that GDBserver and other stubs
> can use to ask GDB to do the work instead.

In the meantime, this was split out as an independent patch, which has
already been checked in.
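
To illustrate why relocation needs any work at all, here is a rough
sketch of the arithmetic for one simple case.  It assumes the work in
question is adjusting a PC-relative instruction after it has been
copied to a different address; relocate_rel32 is a hypothetical name
and this is not how GDB implements the packet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: REL is the 32-bit displacement of an
   instruction whose last byte ends at OLD_INSN_END.  Return the
   displacement that reaches the same target once the instruction
   has been copied so that it ends at NEW_INSN_END.  */
static int32_t
relocate_rel32 (int32_t rel, uint64_t old_insn_end, uint64_t new_insn_end)
{
  /* Absolute address the original instruction refers to.  */
  uint64_t target = old_insn_end + (int64_t) rel;
  int64_t new_rel = (int64_t) (target - new_insn_end);

  /* The copy must still be able to reach the target.  */
  assert (new_rel >= INT32_MIN && new_rel <= INT32_MAX);
  return (int32_t) new_rel;
}
```

Instructions that are not PC-relative can be copied verbatim, so only
a subset of the instruction set needs this kind of adjustment.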

-- 
Pedro Alves

gdb/gdbserver/
2010-06-01  Pedro Alves  <pedro@codesourcery.com>
	    Stan Shebs  <stan@codesourcery.com>

	* Makefile.in (IPA_DEPFILES, extra_libraries): New.
	(all): Depend on $(extra_libraries).
	(install-only): Install the IPA.
	(IPA_OBJS, IPA_LIB): New.
	(clean): Remove the IPA lib.
	(IPAGENT_CFLAGS): New.
	(tracepoint-ipa.o, utils-ipa.o, remote-utils-ipa.o)
	(regcache-ipa.o, i386-linux-ipa.o, linux-i386-ipa.o)
	(linux-amd64-ipa.o, amd64-linux-ipa.o): New rules.
	* linux-amd64-ipa.c, linux-i386-ipa.c: New files.
	* configure.ac: Check for atomic builtins support in the compiler.
	(IPA_DEPFILES, extra_libraries): Define.
	* configure.srv (ipa_obj): Add description.
	(ipa_i386_linux_regobj, ipa_amd64_linux_regobj): Define.
	(i[34567]86-*-linux*): Set ipa_obj.
	(x86_64-*-linux*): Set ipa_obj.
	* linux-low.c (stabilizing_threads): New.
	(supports_fast_tracepoints): New.
	(linux_detach): Stabilize threads before detaching.
	(handle_tracepoints): Handle internal tracing breakpoints.  Assert
	the lwp is either not stabilizing, or is moving out of a jump pad.
	(linux_fast_tracepoint_collecting): New.
	(maybe_move_out_of_jump_pad): New.
	(enqueue_one_deferred_signal): New.
	(dequeue_one_deferred_signal): New.
	(linux_wait_for_event_1): If moving out of a jump pad, defer
	pending signals to later.
	(linux_stabilize_threads): New.
	(linux_wait_1): Check if threads need moving out of jump pads, and
	do it if so.
	(stuck_in_jump_pad_callback): New.
	(move_out_of_jump_pad_callback): New.
	(lwp_running): New.
	(linux_resume_one_lwp): Handle moving out of jump pads.
	(linux_set_resume_request): Dequeue deferred signals.
	(need_step_over_p): Also step over fast tracepoint jumps.
	(start_step_over): Also uninsert fast tracepoint jumps.
	(finish_step_over): Also reinsert fast tracepoint jumps.
	(linux_install_fast_tracepoint_jump_pad): New.
	(linux_target_ops): Install linux_stabilize_threads and
	linux_install_fast_tracepoint_jump_pad.
	* linux-low.h (linux_target_ops) <get_thread_area,
	install_fast_tracepoint_jump_pad>: New fields.
	(struct lwp_info) <collecting_fast_tracepoint,
	pending_signals_to_report, exit_jump_pad_bkpt>: New fields.
	(linux_get_thread_area): Declare.
	* linux-x86-low.c (jump_insn): New.
	(x86_get_thread_area): New.
	(append_insns): New.
	(push_opcode): New.
	(amd64_install_fast_tracepoint_jump_pad): New.
	(i386_install_fast_tracepoint_jump_pad): New.
	(x86_install_fast_tracepoint_jump_pad): New.
	(the_low_target): Install x86_get_thread_area and
	x86_install_fast_tracepoint_jump_pad.
	* mem-break.c (set_raw_breakpoint_at): Use read_inferior_memory.
	(struct fast_tracepoint_jump): New.
	(fast_tracepoint_jump_insn): New.
	(fast_tracepoint_jump_shadow): New.
	(find_fast_tracepoint_jump_at): New.
	(fast_tracepoint_jump_here): New.
	(delete_fast_tracepoint_jump): New.
	(set_fast_tracepoint_jump): New.
	(uninsert_fast_tracepoint_jumps_at): New.
	(reinsert_fast_tracepoint_jumps_at): New.
	(set_breakpoint_at): Use write_inferior_memory.
	(uninsert_raw_breakpoint): Use write_inferior_memory.
	(check_mem_read): Mask out fast tracepoint jumps.
	(check_mem_write): Mask out fast tracepoint jumps.
	* mem-break.h (struct fast_tracepoint_jump): Forward declare.
	(set_fast_tracepoint_jump): Declare.
	(delete_fast_tracepoint_jump)
	(fast_tracepoint_jump_here, uninsert_fast_tracepoint_jumps_at)
	(reinsert_fast_tracepoint_jumps_at): Declare.
	* regcache.c: Don't compile many functions when building the
	in-process agent library.
	(init_register_cache) [IN_PROCESS_AGENT]: Don't allow allocating
	the register buffer in the heap.
	(free_register_cache): If the register buffer isn't owned by the
	regcache, don't free it.
	(set_register_cache) [IN_PROCESS_AGENT]: Don't re-allocate
	pre-existing register caches.
	* remote-utils.c (convert_int_to_ascii): Constify `from' parameter
	type.
	(convert_ascii_to_int): Constify `from' parameter type.
	(decode_M_packet, decode_X_packet): Replace the `to' parameter by
	a `to_p' pointer to pointer parameter.  If TO_P is NULL, malloc
	the needed buffer in-place.
	(relocate_instruction): New.
	* server.c (handle_query) <qSymbols>: If the target supports
	tracepoints, give it a chance of looking up symbols.  Report
	support for fast tracepoints.
	(handle_status): Stabilize threads.
	(process_serial_event): Adjust.
	* server.h (struct fast_tracepoint_jump): Forward declare.
	(struct process_info) <fast_tracepoint_jumps>: New field.
	(convert_ascii_to_int, convert_int_to_ascii): Adjust.
	(decode_X_packet, decode_M_packet): Adjust.
	(relocate_instruction): Declare.
	(in_process_agent_loaded): Declare.
	(tracepoint_look_up_symbols): Declare.
	(struct fast_tpoint_collect_status): Declare.
	(fast_tracepoint_collecting): Declare.
	(force_unlock_trace_buffer): Declare.
	(handle_tracepoint_bkpts): Declare.
	(initialize_low_tracepoint)
	(supply_fast_tracepoint_registers) [IN_PROCESS_AGENT]: Declare.
	* target.h (struct target_ops) <stabilize_threads,
	install_fast_tracepoint_jump_pad>: New fields.
	(stabilize_threads, install_fast_tracepoint_jump_pad): New.
	* tracepoint.c [HAVE_MALLOC_H]: Include malloc.h.
	[HAVE_STDINT_H]: Include stdint.h.
	(trace_debug_1): Rename to ...
	(trace_vdebug): ... this.
	(trace_debug): Rename to ...
	(trace_debug_1): ... this.  Add `level' parameter.
	(trace_debug): New.
	(ATTR_USED, ATTR_NOINLINE): New.
	(IP_AGENT_EXPORT): New.
	(gdb_tp_heap_buffer, gdb_jump_pad_buffer, gdb_jump_pad_buffer_end)
	(collecting, gdb_collect, stop_tracing, flush_trace_buffer)
	(about_to_request_buffer_space, trace_buffer_is_full)
	(stopping_tracepoint, expr_eval_result, error_tracepoint)
	(tracepoints, tracing, trace_buffer_ctrl, trace_buffer_ctrl_curr)
	(trace_buffer_lo, trace_buffer_hi, traceframe_read_count)
	(traceframe_write_count, traceframes_created)
	(trace_state_variables)
	New renaming defines.
	(struct ipa_sym_addresses): New.
	(STRINGIZE_1, STRINGIZE, IPA_SYM): New.
	(symbol_list): New.
	(ipa_sym_addrs): New.
	(all_tracepoint_symbols_looked_up): New.
	(in_process_agent_loaded): New.
	(write_e_ipa_not_loaded): New.
	(maybe_write_ipa_not_loaded): New.
	(tracepoint_look_up_symbols): New.
	(debug_threads) [IN_PROCESS_AGENT]: New.
	(read_inferior_memory) [IN_PROCESS_AGENT]: New.
	(UNKNOWN_SIDE_EFFECTS): New.
	(stop_tracing): New.
	(flush_trace_buffer): New.
	(stop_tracing_bkpt): New.
	(flush_trace_buffer_bkpt): New.
	(read_inferior_integer): New.
	(read_inferior_uinteger): New.
	(read_inferior_data_pointer): New.
	(write_inferior_data_pointer): New.
	(write_inferior_integer): New.
	(write_inferior_uinteger): New.
	(struct collect_static_trace_data_action): Delete.
	(enum tracepoint_type): New.
	(struct tracepoint) <type>: New field `type'.
	<actions_str, step_actions, step_actions_str>: Only include in GDBserver.
	<orig_size, obj_addr_on_target, adjusted_insn_addr>
	<adjusted_insn_addr_end, jump_pad, jump_pad_end>: New fields.
	(tracepoints): Use IP_AGENT_EXPORT.
	(last_tracepoint): Don't include in the IPA.
	(stopping_tracepoint): Use IP_AGENT_EXPORT.
	(trace_buffer_is_full): Use IP_AGENT_EXPORT.
	(alloced_trace_state_variables): New.
	(trace_state_variables): Use IP_AGENT_EXPORT.
	(traceframe_t): Delete unused variable.
	(circular_trace_buffer): Don't include in the IPA.
	(trace_buffer_start): Delete.
	(struct trace_buffer_control): New.
	(trace_buffer_free): Delete.
	(struct ipa_trace_buffer_control): New.
	(GDBSERVER_FLUSH_COUNT_MASK, GDBSERVER_FLUSH_COUNT_MASK_PREV)
	(GDBSERVER_FLUSH_COUNT_MASK_CURR, GDBSERVER_UPDATED_FLUSH_COUNT_BIT):
	New.
	(trace_buffer_ctrl): New.
	(TRACE_BUFFER_CTRL_CURR): New.
	(trace_buffer_start, trace_buffer_free, trace_buffer_end_free):
	Reimplement as macros.
	(trace_buffer_wrap): Delete.
	(traceframe_write_count, traceframe_read_count)
	(traceframes_created, tracing): Use IP_AGENT_EXPORT.
	(struct tracepoint_hit_ctx) <type>: New field.
	(struct fast_tracepoint_ctx): New.
	(memory_barrier): New.
	(cmpxchg): New.
	(record_tracepoint_error): Update atomically in the IPA.
	(clear_inferior_trace_buffer): New.
	(about_to_request_buffer_space): New.
	(trace_buffer_alloc): Handle GDBserver and the inferior
	simultaneously updating the same buffer.
	(add_tracepoint): Default the tracepoint's type to trap
	tracepoint, and orig_size to -1.
	(get_trace_state_variable) [IN_PROCESS_AGENT]: Handle allocated
	internal variables.
	(create_trace_state_variable): New parameter `gdb'.  Handle it.
	(clear_installed_tracepoints): Clear fast tracepoint jumps.
	(cmd_qtdp): Handle fast tracepoints.
	(cmd_qtdv): Adjust.
	(max_jump_pad_size): New.
	(gdb_jump_pad_head): New.
	(get_jump_space_head): New.
	(claim_jump_space): New.
	(sort_tracepoints): New.
	(MAX_JUMP_SIZE): New.
	(cmd_qtstart): Handle fast tracepoints.  Sync tracepoints with the
	IPA.
	(stop_tracing) [IN_PROCESS_AGENT]: Don't include the tdisconnected
	support.  Upload fast traceframes, and delete internal IPA
	breakpoints.
	(stop_tracing_handler): New.
	(flush_trace_buffer_handler): New.
	(cmd_qtstop): Upload fast tracepoints.
	(response_tracepoint): Handle fast tracepoints.
	(tracepoint_finished_step): Upload fast traceframes.  Set the
	tracepoint hit context's tracepoint type.
	(handle_tracepoint_bkpts): New.
	(tracepoint_was_hit): Set the tracepoint hit context's tracepoint
	type.  Add comment about fast tracepoints.
	(collect_data_at_tracepoint) [IN_PROCESS_AGENT]: Don't access the
	non-existing action_str field.
	(get_context_regcache): Handle fast tracepoints.
	(do_action_at_tracepoint) [!IN_PROCESS_AGENT]: Don't write the PC
	to the regcache.
	(fast_tracepoint_from_jump_pad_address): New.
	(fast_tracepoint_from_ipa_tpoint_address): New.
	(collecting_t): New.
	(force_unlock_trace_buffer): New.
	(fast_tracepoint_collecting): New.
	(collecting): New.
	(gdb_collect): New.
	(write_inferior_data_ptr): New.
	(target_tp_heap): New.
	(target_malloc): New.
	(download_agent_expr): New.
	(UALIGN): New.
	(download_tracepoints): New.
	(download_trace_state_variables): New.
	(upload_fast_traceframes): New.
	(IPA_FIRST_TRACEFRAME): New.
	(IPA_NEXT_TRACEFRAME_1): New.
	(IPA_NEXT_TRACEFRAME): New.
	[IN_PROCESS_AGENT]: Include sys/mman.h and fcntl.h.
	[IN_PROCESS_AGENT] (gdb_tp_heap_buffer, gdb_jump_pad_buffer)
	(gdb_jump_pad_buffer_end): New.
	[IN_PROCESS_AGENT] (initialize_tracepoint_ftlib): New.
	(initialize_tracepoint): Adjust.
	[IN_PROCESS_AGENT]: Allocate the IPA heap, and jump pad scratch
	buffer.  Initialize the low module.
	* utils.c (PREFIX, TOOLNAME): New.
	(malloc_failure): Use PREFIX.
	(error): In the IPA, an error causes an exit.
	(fatal, warning): Use PREFIX.
	(internal_error): Use TOOLNAME.
	(NUMCELLS): Increase to 10.
	* configure, config.in: Regenerate.

gdb/
2010-06-01  Pedro Alves  <pedro@codesourcery.com>

	* NEWS: Mention gdbserver fast tracepoints support.

gdb/doc/
2010-06-01  Pedro Alves  <pedro@codesourcery.com>

	* gdb.texinfo (Set Tracepoints): Mention tracepoints support in
	gdbserver, and add cross reference.
	(Tracepoints support in gdbserver): New subsection.
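
The deferred signal handling listed for linux-low.c above
(enqueue_one_deferred_signal, dequeue_one_deferred_signal) keeps
signals on a list linked through a `prev' pointer: new entries are
pushed at the head and the oldest entry is taken from the tail, so
signals are reported in arrival order.  A stripped-down sketch of
that queue discipline, with hypothetical sketch_* names and without
the siginfo and ptrace handling, could look like this:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the pending signal list.  */
struct sketch_pending
{
  struct sketch_pending *prev;
  int signal;
};

/* Push SIG at the head of the list.  */
static void
sketch_enqueue (struct sketch_pending **head, int sig)
{
  struct sketch_pending *p = malloc (sizeof (*p));

  assert (p != NULL);
  p->prev = *head;
  p->signal = sig;
  *head = p;
}

/* Remove the oldest entry (the tail) and store its signal in *SIG.
   Return 1 if a signal was dequeued, 0 if the list was empty.  */
static int
sketch_dequeue (struct sketch_pending **head, int *sig)
{
  struct sketch_pending **pp = head;

  if (*pp == NULL)
    return 0;

  /* Walk to the link that points at the oldest entry.  */
  while ((*pp)->prev != NULL)
    pp = &(*pp)->prev;

  *sig = (*pp)->signal;
  free (*pp);
  *pp = NULL;
  return 1;
}
```

Dequeueing is linear in the list length, which seems a reasonable
trade-off given that few signals are expected to be deferred at once.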

---
 gdb/NEWS                        |    6 
 gdb/doc/gdb.texinfo             |   79 +
 gdb/gdbserver/Makefile.in       |   43 
 gdb/gdbserver/config.in         |    3 
 gdb/gdbserver/configure         |   69 +
 gdb/gdbserver/configure.ac      |   32 
 gdb/gdbserver/configure.srv     |    7 
 gdb/gdbserver/linux-amd64-ipa.c |   77 +
 gdb/gdbserver/linux-i386-ipa.c  |  106 +
 gdb/gdbserver/linux-low.c       |  659 +++++++++++
 gdb/gdbserver/linux-low.h       |   33 
 gdb/gdbserver/linux-x86-low.c   |  433 +++++++
 gdb/gdbserver/mem-break.c       |  390 ++++++
 gdb/gdbserver/mem-break.h       |   27 
 gdb/gdbserver/regcache.c        |   35 
 gdb/gdbserver/remote-utils.c    |  113 +
 gdb/gdbserver/server.c          |   16 
 gdb/gdbserver/server.h          |   49 
 gdb/gdbserver/target.h          |   50 
 gdb/gdbserver/tracepoint.c      | 2319 +++++++++++++++++++++++++++++++++++++---
 gdb/gdbserver/utils.c           |   28 
 21 files changed, 4406 insertions(+), 168 deletions(-)

Index: src/gdb/gdbserver/Makefile.in
===================================================================
--- src.orig/gdb/gdbserver/Makefile.in	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/Makefile.in	2010-06-01 13:57:18.000000000 +0100
@@ -141,12 +141,15 @@ XML_DIR = $(srcdir)/../features
 XML_FILES = @srv_xmlfiles@
 XML_BUILTIN = @srv_xmlbuiltin@
 
+IPA_DEPFILES = @IPA_DEPFILES@
+extra_libraries = @extra_libraries@
+
 # Prevent Sun make from putting in the machine type.  Setting
 # TARGET_ARCH to nothing works for SunOS 3, 4.0, but not for 4.1.
 .c.o:
 	${CC} -c ${INTERNAL_CFLAGS} $<
 
-all: gdbserver$(EXEEXT) gdbreplay$(EXEEXT)
+all: gdbserver$(EXEEXT) gdbreplay$(EXEEXT) $(extra_libraries)
 
 # Traditionally "install" depends on "all".  But it may be useful
 # not to; for example, if the user has made some trivial change to a
@@ -157,6 +160,10 @@ install: all install-only
 install-only:
 	n=`echo gdbserver | sed '$(program_transform_name)'`; \
 	if [ x$$n = x ]; then n=gdbserver; else true; fi; \
+	if [ x"$(IPA_DEPFILES)" != x ]; then \
+		$(SHELL) $(srcdir)/../../mkinstalldirs $(DESTDIR)$(libdir); \
+		$(INSTALL_PROGRAM) $(IPA_LIB) $(DESTDIR)$(libdir)/$(IPA_LIB); \
+	fi; \
 	$(SHELL) $(srcdir)/../../mkinstalldirs $(DESTDIR)$(bindir); \
 	$(INSTALL_PROGRAM) gdbserver$(EXEEXT) $(DESTDIR)$(bindir)/$$n$(EXEEXT); \
 	$(SHELL) $(srcdir)/../../mkinstalldirs $(DESTDIR)$(man1dir); \
@@ -186,6 +193,15 @@ gdbreplay$(EXEEXT): $(GDBREPLAY_OBS)
 	${CC-LD} $(INTERNAL_CFLAGS) $(INTERNAL_LDFLAGS) -o gdbreplay$(EXEEXT) $(GDBREPLAY_OBS) \
 	  $(XM_CLIBS)
 
+IPA_OBJS=tracepoint-ipa.o utils-ipa.o regcache-ipa.o ${IPA_DEPFILES}
+
+IPA_LIB=libinproctrace.so
+
+$(IPA_LIB): $(IPA_OBJS) ${ADD_DEPS} ${CDEPS}
+	rm -f $(IPA_LIB)
+	${CC-LD} -shared -fPIC -Wl,--no-undefined $(INTERNAL_CFLAGS) \
+	$(INTERNAL_LDFLAGS) -o $(IPA_LIB) ${IPA_OBJS}
+
 # Put the proper machine-specific files first, so M-. on a machine
 # specific routine gets the one for the correct machine.
 # The xyzzy stuff below deals with empty DEPFILES
@@ -205,6 +221,7 @@ clean:
 	rm -f *.o ${ADD_FILES} *~
 	rm -f version.c
 	rm -f gdbserver$(EXEEXT) gdbreplay$(EXEEXT) core make.log
+	rm -f $(IPA_LIB)
 	rm -f reg-arm.c i386.c reg-ia64.c reg-m32r.c reg-m68k.c
 	rm -f reg-sh.c reg-sparc.c reg-spu.c amd64.c i386-linux.c
 	rm -f reg-cris.c reg-crisv32.c amd64-linux.c reg-xtensa.c
@@ -278,6 +295,30 @@ linux_low_h = $(srcdir)/linux-low.h
 
 nto_low_h = $(srcdir)/nto-low.h
 
+# Note, we only build the IPA if -fvisibility=hidden is supported in
+# the first place.
+IPAGENT_CFLAGS = $(CPPFLAGS) $(INTERNAL_CFLAGS) \
+	-fPIC -DGDBSERVER -DIN_PROCESS_AGENT \
+	-fvisibility=hidden
+
+# In-process agent object rules
+tracepoint-ipa.o: tracepoint.c $(server_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o tracepoint-ipa.o
+utils-ipa.o: utils.c $(server_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o utils-ipa.o
+remote-utils-ipa.o: remote-utils.c $(server_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o remote-utils-ipa.o
+regcache-ipa.o: regcache.c $(server_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o regcache-ipa.o
+i386-linux-ipa.o : i386-linux.c $(regdef_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o i386-linux-ipa.o
+linux-i386-ipa.o: linux-i386-ipa.c $(server_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o linux-i386-ipa.o
+linux-amd64-ipa.o: linux-amd64-ipa.c $(server_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o linux-amd64-ipa.o
+amd64-linux-ipa.o : amd64-linux.c $(regdef_h)
+	$(CC) -c $(IPAGENT_CFLAGS) $< -o amd64-linux-ipa.o
+
 event-loop.o: event-loop.c $(server_h)
 hostio.o: hostio.c $(server_h)
 hostio-errno.o: hostio-errno.c $(server_h)
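
Since the IPA objects are built with -fvisibility=hidden, any symbol
that must stay visible from outside the library needs to be marked
explicitly.  The sketch below shows the general GCC technique; the
SKETCH_EXPORT macro and the sketch_* names are hypothetical and only
modelled on the IP_AGENT_EXPORT entry in the ChangeLog, whose actual
definition is not shown in this part of the patch.

```c
#include <assert.h>

/* Hypothetical export macro: override the hidden default so the
   symbol remains in the shared library's dynamic symbol table.  */
#define SKETCH_EXPORT __attribute__ ((visibility ("default")))

/* Exported: an external tool could look this up by name.  */
SKETCH_EXPORT int sketch_tracing;

/* Not exported: file-local helper.  */
static int
sketch_normalize (int on)
{
  return on != 0;
}

/* Exported entry point that updates the exported flag.  */
SKETCH_EXPORT int
sketch_set_tracing (int on)
{
  assert (on >= 0);
  sketch_tracing = sketch_normalize (on);
  return sketch_tracing;
}
```

Compiled with -fPIC -fvisibility=hidden -shared, only sketch_tracing
and sketch_set_tracing would be expected to stay dynamically visible.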
Index: src/gdb/gdbserver/linux-amd64-ipa.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ src/gdb/gdbserver/linux-amd64-ipa.c	2010-06-01 13:57:18.000000000 +0100
@@ -0,0 +1,77 @@
+/* GNU/Linux/x86-64 specific low level interface, for the in-process
+   agent library for GDB.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "server.h"
+
+/* Defined in auto-generated file amd64-linux.c.  */
+void init_registers_amd64_linux (void);
+
+/* fast tracepoints collect registers.  */
+
+#define FT_CR_RIP 0
+#define FT_CR_EFLAGS 1
+#define FT_CR_R8 2
+#define FT_CR_R9 3
+#define FT_CR_R10 4
+#define FT_CR_R11 5
+#define FT_CR_R12 6
+#define FT_CR_R13 7
+#define FT_CR_R14 8
+#define FT_CR_R15 9
+#define FT_CR_RAX 10
+#define FT_CR_RBX 11
+#define FT_CR_RCX 12
+#define FT_CR_RDX 13
+#define FT_CR_RSI 14
+#define FT_CR_RDI 15
+#define FT_CR_RBP 16
+#define FT_CR_RSP 17
+
+static const int x86_64_ft_collect_regmap[] = {
+  FT_CR_RAX * 8, FT_CR_RBX * 8, FT_CR_RCX * 8, FT_CR_RDX * 8,
+  FT_CR_RSI * 8, FT_CR_RDI * 8, FT_CR_RBP * 8, FT_CR_RSP * 8,
+  FT_CR_R8 * 8,  FT_CR_R9 * 8,  FT_CR_R10 * 8, FT_CR_R11 * 8,
+  FT_CR_R12 * 8, FT_CR_R13 * 8, FT_CR_R14 * 8, FT_CR_R15 * 8,
+  FT_CR_RIP * 8, FT_CR_EFLAGS * 8
+};
+
+#define X86_64_NUM_FT_COLLECT_GREGS \
+  (sizeof (x86_64_ft_collect_regmap) / sizeof(x86_64_ft_collect_regmap[0]))
+
+void
+supply_fast_tracepoint_registers (struct regcache *regcache,
+				  const unsigned char *buf)
+{
+  int i;
+
+  for (i = 0; i < X86_64_NUM_FT_COLLECT_GREGS; i++)
+    supply_register (regcache, i,
+		     ((char *) buf) + x86_64_ft_collect_regmap[i]);
+}
+
+/* This is only needed because reg-i386-linux-lib.o references it.  We
+   may use it proper at some point.  */
+const char *gdbserver_xmltarget;
+
+void
+initialize_low_tracepoint (void)
+{
+  init_registers_amd64_linux ();
+}
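
The table-driven approach in linux-amd64-ipa.c can be shown on a
smaller scale.  The sketch below uses a made-up three-register
layout and hypothetical sketch_* names; it copies into a plain array
where the patch calls supply_register, but the idea is the same: the
table is indexed by the debugger's register number and yields the
byte offset of that register in the collected block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Slot order of a hypothetical collected register block.  */
enum { BLK_PC = 0, BLK_FLAGS = 1, BLK_ACC = 2 };

/* Indexed by the debugger's register number (acc, pc, flags);
   each entry is a byte offset into the collected block.  */
static const int sketch_regmap[] =
{
  BLK_ACC * 8, BLK_PC * 8, BLK_FLAGS * 8
};

#define SKETCH_NUM_REGS \
  (sizeof (sketch_regmap) / sizeof (sketch_regmap[0]))

/* Fill REGS, in debugger order, from the collected BLOCK.  */
static void
sketch_supply_registers (uint64_t *regs, size_t nregs,
			 const unsigned char *block)
{
  size_t i;

  assert (nregs >= SKETCH_NUM_REGS);
  for (i = 0; i < SKETCH_NUM_REGS; i++)
    memcpy (&regs[i], block + sketch_regmap[i], sizeof (regs[i]));
}
```

Keeping the mapping in one table means the collection order can
change without touching the loop that supplies the registers.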
Index: src/gdb/gdbserver/linux-i386-ipa.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ src/gdb/gdbserver/linux-i386-ipa.c	2010-06-01 13:57:18.000000000 +0100
@@ -0,0 +1,106 @@
+/* GNU/Linux/x86 specific low level interface, for the in-process
+   agent library for GDB.
+
+   Copyright (C) 2010 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "server.h"
+
+/* GDB register numbers.  */
+
+enum i386_gdb_regnum
+{
+  I386_EAX_REGNUM,		/* %eax */
+  I386_ECX_REGNUM,		/* %ecx */
+  I386_EDX_REGNUM,		/* %edx */
+  I386_EBX_REGNUM,		/* %ebx */
+  I386_ESP_REGNUM,		/* %esp */
+  I386_EBP_REGNUM,		/* %ebp */
+  I386_ESI_REGNUM,		/* %esi */
+  I386_EDI_REGNUM,		/* %edi */
+  I386_EIP_REGNUM,		/* %eip */
+  I386_EFLAGS_REGNUM,		/* %eflags */
+  I386_CS_REGNUM,		/* %cs */
+  I386_SS_REGNUM,		/* %ss */
+  I386_DS_REGNUM,		/* %ds */
+  I386_ES_REGNUM,		/* %es */
+  I386_FS_REGNUM,		/* %fs */
+  I386_GS_REGNUM,		/* %gs */
+  I386_ST0_REGNUM		/* %st(0) */
+};
+
+#define i386_num_regs 16
+
+/* Defined in auto-generated file i386-linux.c.  */
+void init_registers_i386_linux (void);
+
+#define FT_CR_EAX 15
+#define FT_CR_ECX 14
+#define FT_CR_EDX 13
+#define FT_CR_EBX 12
+#define FT_CR_UESP 11
+#define FT_CR_EBP 10
+#define FT_CR_ESI 9
+#define FT_CR_EDI 8
+#define FT_CR_EIP 7
+#define FT_CR_EFL 6
+#define FT_CR_DS 5
+#define FT_CR_ES 4
+#define FT_CR_FS 3
+#define FT_CR_GS 2
+#define FT_CR_SS 1
+#define FT_CR_CS 0
+
+/* Mapping between the general-purpose registers in jump tracepoint
+   format and GDB's register array layout.  */
+
+static const int i386_ft_collect_regmap[] =
+{
+  FT_CR_EAX * 4, FT_CR_ECX * 4, FT_CR_EDX * 4, FT_CR_EBX * 4,
+  FT_CR_UESP * 4, FT_CR_EBP * 4, FT_CR_ESI * 4, FT_CR_EDI * 4,
+  FT_CR_EIP * 4, FT_CR_EFL * 4, FT_CR_CS * 4, FT_CR_SS * 4,
+  FT_CR_DS * 4, FT_CR_ES * 4, FT_CR_FS * 4, FT_CR_GS * 4
+};
+
+void
+supply_fast_tracepoint_registers (struct regcache *regcache,
+				  const unsigned char *buf)
+{
+  int i;
+
+  for (i = 0; i < i386_num_regs; i++)
+    {
+      int regval;
+
+      if (i >= I386_CS_REGNUM && i <= I386_GS_REGNUM)
+	regval = *(short *) (((char *) buf) + i386_ft_collect_regmap[i]);
+      else
+	regval = *(int *) (((char *) buf) + i386_ft_collect_regmap[i]);
+
+      supply_register (regcache, i, &regval);
+    }
+}
+
+/* This is only needed because reg-i386-linux-lib.o references it.  We
+   may use it proper at some point.  */
+const char *gdbserver_xmltarget;
+
+void
+initialize_low_tracepoint (void)
+{
+  init_registers_i386_linux ();
+}
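
In the i386 variant above, segment register slots are read as 16-bit
values and everything else as 32-bit values.  A variation on that
read, written with memcpy so that it does not depend on the alignment
of the buffer, might look as follows.  sketch_read_slot is a
hypothetical name, a little-endian host is assumed, and unlike the
patch (which reads through a signed short) this sketch zero-extends
the 16-bit value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one 4-byte slot of a collected register
   block.  Segment registers only carry 16 significant bits, so the
   upper half of their slot is ignored.  */
static int32_t
sketch_read_slot (const unsigned char *slot, int is_segment)
{
  assert (slot != NULL);

  if (is_segment)
    {
      uint16_t narrow;

      /* Little-endian: the low 16 bits come first.  */
      memcpy (&narrow, slot, sizeof (narrow));
      return narrow;
    }
  else
    {
      int32_t wide;

      memcpy (&wide, slot, sizeof (wide));
      return wide;
    }
}
```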
Index: src/gdb/gdbserver/configure.ac
===================================================================
--- src.orig/gdb/gdbserver/configure.ac	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/configure.ac	2010-06-01 13:57:18.000000000 +0100
@@ -245,11 +245,43 @@ fi
 GDBSERVER_DEPFILES="$srv_regobj $srv_tgtobj $srv_hostio_err_objs $srv_thread_depfiles"
 GDBSERVER_LIBS="$srv_libs"
 
+dnl Check whether the target supports __sync_*_compare_and_swap.
+AC_CACHE_CHECK([whether the target supports __sync_*_compare_and_swap],
+		gdbsrv_cv_have_sync_builtins, [
+AC_TRY_LINK([], [int foo, bar; bar = __sync_val_compare_and_swap(&foo, 0, 1);],
+		gdbsrv_cv_have_sync_builtins=yes,
+		gdbsrv_cv_have_sync_builtins=no)])
+if test $gdbsrv_cv_have_sync_builtins = yes; then
+  AC_DEFINE(HAVE_SYNC_BUILTINS, 1,
+    [Define to 1 if the target supports __sync_*_compare_and_swap])
+fi
+
+dnl Check for -fvisibility=hidden support in the compiler.
+saved_cflags="$CFLAGS"
+CFLAGS="$CFLAGS -fvisibility=hidden"
+AC_COMPILE_IFELSE(AC_LANG_PROGRAM([]),
+		        [gdbsrv_cv_have_visibility_hidden=yes],
+	        	[gdbsrv_cv_have_visibility_hidden=no])
+CFLAGS="$saved_cflags"
+
+IPA_DEPFILES=""
+
+# Rather than allowing to build a broken IPA, we simply disable it if
+# we don't find a compiler supporting all the features we need.
+if test "$ipa_obj" != "" \
+   -a "$gdbsrv_cv_have_sync_builtins" = yes \
+   -a "$gdbsrv_cv_have_visibility_hidden" = yes; then
+   IPA_DEPFILES="$ipa_obj"
+   extra_libraries="libinproctrace.so"
+fi
+
 AC_SUBST(GDBSERVER_DEPFILES)
 AC_SUBST(GDBSERVER_LIBS)
 AC_SUBST(USE_THREAD_DB)
 AC_SUBST(srv_xmlbuiltin)
 AC_SUBST(srv_xmlfiles)
+AC_SUBST(IPA_DEPFILES)
+AC_SUBST(extra_libraries)
 
 AC_OUTPUT(Makefile,
 [case x$CONFIG_HEADERS in
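
The first configure check above probes for the GCC
__sync_*_compare_and_swap builtins.  As a rough illustration of what
such a builtin is useful for, here is a sketch of a lock-free bump
allocator over a fixed buffer.  It is not the trace_buffer_alloc
from the patch; the sketch_* names and the 64-byte buffer are
assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size buffer shared between several writers.  */
static unsigned char sketch_buffer[64];
static volatile uintptr_t sketch_free_offset;

/* Reserve AMT bytes, or return NULL if the buffer is exhausted.  */
static unsigned char *
sketch_alloc (uintptr_t amt)
{
  assert (amt > 0);

  for (;;)
    {
      uintptr_t old = sketch_free_offset;

      if (amt > sizeof (sketch_buffer) - old)
	return NULL;

      /* Commit the reservation only if nobody else moved the free
	 offset since it was read; otherwise retry with a fresh
	 value.  */
      if (__sync_val_compare_and_swap (&sketch_free_offset,
				       old, old + amt) == old)
	return sketch_buffer + old;
    }
}
```

A writer that loses the race simply loops and tries again, so no
lock is held while another thread is stopped at an arbitrary point.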
Index: src/gdb/gdbserver/configure.srv
===================================================================
--- src.orig/gdb/gdbserver/configure.srv	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/configure.srv	2010-06-01 13:57:18.000000000 +0100
@@ -10,6 +10,8 @@
 #			target method.
 #   srv_xmlfiles	All XML files which should be available for
 #			gdbserver in this configuration.
+#   ipa_obj		Any other target-specific modules appropriate
+#			for this target's in-process agent.
 #
 # In addition, on GNU/Linux the following shell variables will be set:
 #   srv_linux_regsets	Set to "yes" if ptrace(PTRACE_GETREGS) and friends
@@ -27,6 +29,9 @@ srv_i386_linux_regobj="i386-linux.o i386
 srv_amd64_regobj="amd64.o amd64-avx.o"
 srv_amd64_linux_regobj="amd64-linux.o amd64-avx-linux.o"
 
+ipa_i386_linux_regobj=i386-linux-ipa.o
+ipa_amd64_linux_regobj=amd64-linux-ipa.o
+
 srv_i386_32bit_xmlfiles="i386/32bit-core.xml i386/32bit-sse.xml i386/32bit-avx.xml"
 srv_i386_64bit_xmlfiles="i386/64bit-core.xml i386/64bit-sse.xml i386/64bit-avx.xml"
 srv_i386_xmlfiles="i386/i386.xml i386/i386-avx.xml i386/i386-mmx.xml $srv_i386_32bit_xmlfiles"
@@ -86,6 +91,7 @@ case "${target}" in
 			srv_linux_usrregs=yes
 			srv_linux_regsets=yes
 			srv_linux_thread_db=yes
+			ipa_obj="${ipa_i386_linux_regobj} linux-i386-ipa.o"
 			;;
   i[34567]86-*-mingw32ce*)
 			srv_regobj="$srv_i386_regobj"
@@ -230,6 +236,7 @@ case "${target}" in
 			srv_linux_usrregs=yes # This is for i386 progs.
 			srv_linux_regsets=yes
 			srv_linux_thread_db=yes
+			ipa_obj="${ipa_amd64_linux_regobj} linux-amd64-ipa.o"
 			;;
   x86_64-*-mingw*)	srv_regobj="$srv_amd64_regobj"
 			srv_tgtobj="i386-low.o i387-fp.o win32-low.o win32-i386-low.o"
Index: src/gdb/gdbserver/linux-low.c
===================================================================
--- src.orig/gdb/gdbserver/linux-low.c	2010-06-01 13:44:36.000000000 +0100
+++ src/gdb/gdbserver/linux-low.c	2010-06-01 13:57:18.000000000 +0100
@@ -127,6 +127,10 @@ int stopping_threads;
 /* FIXME make into a target method?  */
 int using_threads = 1;
 
+/* True if we're presently stabilizing threads (moving them out of
+   jump pads).  */
+static int stabilizing_threads;
+
 /* This flag is true iff we've just created or attached to our first
    inferior but it has not stopped yet.  As soon as it does, we need
    to call the low target's arch_setup callback.  Doing this only on
@@ -170,6 +174,16 @@ supports_breakpoints (void)
   return (the_low_target.get_pc != NULL);
 }
 
+/* Returns true if this target can support fast tracepoints.  This
+   does not mean that the in-process agent has been loaded in the
+   inferior.  */
+
+static int
+supports_fast_tracepoints (void)
+{
+  return the_low_target.install_fast_tracepoint_jump_pad != NULL;
+}
+
 struct pending_signals
 {
   int signal;
@@ -846,6 +860,9 @@ linux_detach (int pid)
   thread_db_detach (process);
 #endif
 
+  /* Stabilize threads (move out of jump pads).  */
+  stabilize_threads ();
+
   find_inferior (&all_threads, linux_detach_one_lwp, &pid);
 
   the_target->mourn (process);
@@ -1127,6 +1144,8 @@ handle_tracepoints (struct lwp_info *lwp
   /* Do any necessary step collect actions.  */
   tpoint_related_event |= tracepoint_finished_step (tinfo, lwp->stop_pc);
 
+  tpoint_related_event |= handle_tracepoint_bkpts (tinfo, lwp->stop_pc);
+
   /* See if we just hit a tracepoint and do its main collect
      actions.  */
   tpoint_related_event |= tracepoint_was_hit (tinfo, lwp->stop_pc);
@@ -1134,6 +1153,7 @@ handle_tracepoints (struct lwp_info *lwp
   lwp->suspended--;
 
   gdb_assert (lwp->suspended == 0);
+  gdb_assert (!stabilizing_threads || lwp->collecting_fast_tracepoint);
 
   if (tpoint_related_event)
     {
@@ -1145,6 +1165,231 @@ handle_tracepoints (struct lwp_info *lwp
   return 0;
 }
 
+/* Convenience wrapper.  Returns true if LWP is presently collecting a
+   fast tracepoint.  */
+
+static int
+linux_fast_tracepoint_collecting (struct lwp_info *lwp,
+				  struct fast_tpoint_collect_status *status)
+{
+  CORE_ADDR thread_area;
+
+  if (the_low_target.get_thread_area == NULL)
+    return 0;
+
+  /* Get the thread area address.  This is used to recognize which
+     thread is which when tracing with the in-process agent library.
+     We don't read anything from the address, and treat it as opaque;
+     it's the address itself that we assume is unique per-thread.  */
+  if ((*the_low_target.get_thread_area) (lwpid_of (lwp), &thread_area) == -1)
+    return 0;
+
+  return fast_tracepoint_collecting (thread_area, lwp->stop_pc, status);
+}
+
+/* The reason we resume in the caller, is because we want to be able
+   to pass lwp->status_pending as WSTAT, and we need to clear
+   status_pending_p before resuming, otherwise, linux_resume_one_lwp
+   refuses to resume.  */
+
+static int
+maybe_move_out_of_jump_pad (struct lwp_info *lwp, int *wstat)
+{
+  struct thread_info *saved_inferior;
+
+  saved_inferior = current_inferior;
+  current_inferior = get_lwp_thread (lwp);
+
+  if ((wstat == NULL
+       || (WIFSTOPPED (*wstat) && WSTOPSIG (*wstat) != SIGTRAP))
+      && supports_fast_tracepoints ()
+      && in_process_agent_loaded ())
+    {
+      struct fast_tpoint_collect_status status;
+      int r;
+
+      if (debug_threads)
+	fprintf (stderr, "\
+Checking whether LWP %ld needs to move out of the jump pad.\n",
+		 lwpid_of (lwp));
+
+      r = linux_fast_tracepoint_collecting (lwp, &status);
+
+      if (wstat == NULL
+	  || (WSTOPSIG (*wstat) != SIGILL
+	      && WSTOPSIG (*wstat) != SIGFPE
+	      && WSTOPSIG (*wstat) != SIGSEGV
+	      && WSTOPSIG (*wstat) != SIGBUS))
+	{
+	  lwp->collecting_fast_tracepoint = r;
+
+	  if (r != 0)
+	    {
+	      if (r == 1 && lwp->exit_jump_pad_bkpt == NULL)
+		{
+		  /* Haven't executed the original instruction yet.
+		     Set breakpoint there, and wait till it's hit,
+		     then single-step until exiting the jump pad.  */
+		  lwp->exit_jump_pad_bkpt
+		    = set_breakpoint_at (status.adjusted_insn_addr, NULL);
+		}
+
+	      if (debug_threads)
+		fprintf (stderr, "\
+Checking whether LWP %ld needs to move out of the jump pad...it does\n",
+		 lwpid_of (lwp));
+
+	      return 1;
+	    }
+	}
+      else
+	{
+	  /* If we get a synchronous signal while collecting, *and*
+	     while executing the (relocated) original instruction,
+	     reset the PC to point at the tpoint address, before
+	     reporting to GDB.  Otherwise, it's an IPA lib bug: just
+	     report the signal to GDB, and pray for the best.  */
+
+	  lwp->collecting_fast_tracepoint = 0;
+
+	  if (r != 0
+	      && (status.adjusted_insn_addr <= lwp->stop_pc
+		  && lwp->stop_pc < status.adjusted_insn_addr_end))
+	    {
+	      siginfo_t info;
+	      struct regcache *regcache;
+
+	      /* The si_addr on a few signals references the address
+		 of the faulting instruction.  Adjust that as
+		 well.  */
+	      if ((WSTOPSIG (*wstat) == SIGILL
+		   || WSTOPSIG (*wstat) == SIGFPE
+		   || WSTOPSIG (*wstat) == SIGBUS
+		   || WSTOPSIG (*wstat) == SIGSEGV)
+		  && ptrace (PTRACE_GETSIGINFO, lwpid_of (lwp), 0, &info) == 0
+		  /* Final check just to make sure we don't clobber
+		     the siginfo of non-kernel-sent signals.  */
+		  && (uintptr_t) info.si_addr == lwp->stop_pc)
+		{
+		  info.si_addr = (void *) (uintptr_t) status.tpoint_addr;
+		  ptrace (PTRACE_SETSIGINFO, lwpid_of (lwp), 0, &info);
+		}
+
+	      regcache = get_thread_regcache (get_lwp_thread (lwp), 1);
+	      (*the_low_target.set_pc) (regcache, status.tpoint_addr);
+	      lwp->stop_pc = status.tpoint_addr;
+
+	      /* Cancel any fast tracepoint lock this thread was
+		 holding.  */
+	      force_unlock_trace_buffer ();
+	    }
+
+	  if (lwp->exit_jump_pad_bkpt != NULL)
+	    {
+	      if (debug_threads)
+		fprintf (stderr,
+			 "Cancelling fast exit-jump-pad: removing bkpt. "
+			 "stopping all threads momentarily.\n");
+
+	      stop_all_lwps (1, lwp);
+	      cancel_breakpoints ();
+
+	      delete_breakpoint (lwp->exit_jump_pad_bkpt);
+	      lwp->exit_jump_pad_bkpt = NULL;
+
+	      unstop_all_lwps (1, lwp);
+
+	      gdb_assert (lwp->suspended >= 0);
+	    }
+	}
+    }
+
+  if (debug_threads)
+    fprintf (stderr, "\
+Checking whether LWP %ld needs to move out of the jump pad...no\n",
+	     lwpid_of (lwp));
+  return 0;
+}
+
+/* Enqueue one signal in the "signals to report later when out of the
+   jump pad" list.  */
+
+static void
+enqueue_one_deferred_signal (struct lwp_info *lwp, int *wstat)
+{
+  struct pending_signals *p_sig;
+
+  if (debug_threads)
+    fprintf (stderr, "\
+Deferring signal %d for LWP %ld.\n", WSTOPSIG (*wstat), lwpid_of (lwp));
+
+  if (debug_threads)
+    {
+      struct pending_signals *sig;
+
+      for (sig = lwp->pending_signals_to_report;
+	   sig != NULL;
+	   sig = sig->prev)
+	fprintf (stderr,
+		 "   Already queued %d\n",
+		 sig->signal);
+
+      fprintf (stderr, "   (no more currently queued signals)\n");
+    }
+
+  p_sig = xmalloc (sizeof (*p_sig));
+  p_sig->prev = lwp->pending_signals_to_report;
+  p_sig->signal = WSTOPSIG (*wstat);
+  memset (&p_sig->info, 0, sizeof (siginfo_t));
+  ptrace (PTRACE_GETSIGINFO, lwpid_of (lwp), 0, &p_sig->info);
+
+  lwp->pending_signals_to_report = p_sig;
+}
+
+/* Dequeue one signal from the "signals to report later when out of
+   the jump pad" list.  */
+
+static int
+dequeue_one_deferred_signal (struct lwp_info *lwp, int *wstat)
+{
+  if (lwp->pending_signals_to_report != NULL)
+    {
+      struct pending_signals **p_sig;
+
+      p_sig = &lwp->pending_signals_to_report;
+      while ((*p_sig)->prev != NULL)
+	p_sig = &(*p_sig)->prev;
+
+      *wstat = W_STOPCODE ((*p_sig)->signal);
+      if ((*p_sig)->info.si_signo != 0)
+	ptrace (PTRACE_SETSIGINFO, lwpid_of (lwp), 0, &(*p_sig)->info);
+      free (*p_sig);
+      *p_sig = NULL;
+
+      if (debug_threads)
+	fprintf (stderr, "Reporting deferred signal %d for LWP %ld.\n",
+		 WSTOPSIG (*wstat), lwpid_of (lwp));
+
+      if (debug_threads)
+	{
+	  struct pending_signals *sig;
+
+	  for (sig = lwp->pending_signals_to_report;
+	       sig != NULL;
+	       sig = sig->prev)
+	    fprintf (stderr,
+		     "   Still queued %d\n",
+		     sig->signal);
+
+	  fprintf (stderr, "   (no more queued signals)\n");
+	}
+
+      return 1;
+    }
+
+  return 0;
+}
+
 /* Arrange for a breakpoint to be hit again later.  We don't keep the
    SIGTRAP status and don't forward the SIGTRAP signal to the LWP.  We
    will handle the current event, eventually we will resume this LWP,
@@ -1226,6 +1471,21 @@ linux_wait_for_event_1 (ptid_t ptid, int
     {
       requested_child = find_lwp_pid (ptid);
 
+      if (!stopping_threads
+	  && requested_child->status_pending_p
+	  && requested_child->collecting_fast_tracepoint)
+	{
+	  enqueue_one_deferred_signal (requested_child,
+				       &requested_child->status_pending);
+	  requested_child->status_pending_p = 0;
+	  requested_child->status_pending = 0;
+	  linux_resume_one_lwp (requested_child, 0, 0, NULL);
+	}
+
+      if (requested_child->suspended
+	  && requested_child->status_pending_p)
+	fatal ("requesting an event out of a suspended child?");
+
       if (requested_child->status_pending_p)
 	event_child = requested_child;
     }
@@ -1601,6 +1861,113 @@ gdb_wants_all_stopped (void)
   for_each_inferior (&all_lwps, gdb_wants_lwp_stopped);
 }
 
+static void move_out_of_jump_pad_callback (struct inferior_list_entry *entry);
+static int stuck_in_jump_pad_callback (struct inferior_list_entry *entry,
+				       void *data);
+static int lwp_running (struct inferior_list_entry *entry, void *data);
+static ptid_t linux_wait_1 (ptid_t ptid,
+			    struct target_waitstatus *ourstatus,
+			    int target_options);
+
+/* Stabilize threads (move out of jump pads).
+
+   If a thread is midway collecting a fast tracepoint, we need to
+   finish the collection and move it out of the jump pad before
+   reporting the signal.
+
+   This avoids recursion while collecting (when a signal arrives
+   midway, and the signal handler itself collects), which would trash
+   the trace buffer.  In case the user set a breakpoint in a signal
+   handler, this avoids the backtrace showing the jump pad, etc.
+   Most importantly, there are certain things we can't do safely if
+   threads are stopped in a jump pad (or in its callees).  For
+   example:
+
+     - starting a new trace run.  A thread still collecting the
+   previous run could trash the trace buffer when resumed.  The trace
+   buffer control structures would have been reset, but the thread
+   would have no way to tell.  The thread could even be midway through
+   a memcpy to the buffer, in which case, when resumed, it would
+   clobber the trace buffer that had been set up for the new run.
+
+     - we can't rewrite/reuse the jump pads for new tracepoints
+   safely.  Say you do tstart while a thread is stopped midway while
+   collecting.  When the thread is later resumed, it finishes the
+   collection, and returns to the jump pad, to execute the original
+   instruction that was under the tracepoint jump at the time the
+   older run had been started.  If the jump pad had been rewritten
+   since for something else in the new run, the thread would now
+   execute the wrong / random instructions.  */
+
+static void
+linux_stabilize_threads (void)
+{
+  struct thread_info *save_inferior;
+  struct lwp_info *lwp_stuck;
+
+  lwp_stuck
+    = (struct lwp_info *) find_inferior (&all_lwps,
+					 stuck_in_jump_pad_callback, NULL);
+  if (lwp_stuck != NULL)
+    {
+      fprintf (stderr, "can't stabilize, LWP %ld is stuck in jump pad\n",
+	       lwpid_of (lwp_stuck));
+      return;
+    }
+
+  save_inferior = current_inferior;
+
+  stabilizing_threads = 1;
+
+  /* Kick 'em all.  */
+  for_each_inferior (&all_lwps, move_out_of_jump_pad_callback);
+
+  /* Loop until all are stopped out of the jump pads.  */
+  while (find_inferior (&all_lwps, lwp_running, NULL) != NULL)
+    {
+      struct target_waitstatus ourstatus;
+      struct lwp_info *lwp;
+      ptid_t ptid;
+      int wstat;
+
+      /* Note that we go through the full wait event loop.  While
+	 moving threads out of jump pad, we need to be able to step
+	 over internal breakpoints and such.  */
+      ptid = linux_wait_1 (minus_one_ptid, &ourstatus, 0);
+
+      if (ourstatus.kind == TARGET_WAITKIND_STOPPED)
+	{
+	  lwp = get_thread_lwp (current_inferior);
+
+	  /* Lock it.  */
+	  lwp->suspended++;
+
+	  if (ourstatus.value.sig != TARGET_SIGNAL_0
+	      || current_inferior->last_resume_kind == resume_stop)
+	    {
+	      wstat = W_STOPCODE (target_signal_to_host (ourstatus.value.sig));
+	      enqueue_one_deferred_signal (lwp, &wstat);
+	    }
+	}
+    }
+
+  find_inferior (&all_lwps, unsuspend_one_lwp, NULL);
+
+  stabilizing_threads = 0;
+
+  current_inferior = save_inferior;
+
+  lwp_stuck
+    = (struct lwp_info *) find_inferior (&all_lwps,
+					 stuck_in_jump_pad_callback, NULL);
+  if (lwp_stuck != NULL)
+    {
+      if (debug_threads)
+	fprintf (stderr, "couldn't stabilize, LWP %ld got stuck in jump pad\n",
+		 lwpid_of (lwp_stuck));
+    }
+}
+
 /* Wait for process, returns status.  */
 
 static ptid_t
@@ -1623,6 +1990,8 @@ linux_wait_1 (ptid_t ptid,
     options |= WNOHANG;
 
 retry:
+  bp_explains_trap = 0;
+  trace_event = 0;
   ourstatus->kind = TARGET_WAITKIND_IGNORE;
 
   /* If we were only supposed to resume one thread, only wait for
@@ -1765,8 +2134,110 @@ retry:
       /* We have some other signal, possibly a step-over dance was in
 	 progress, and it should be cancelled too.  */
       step_over_finished = finish_step_over (event_child);
+    }
+
+  /* We have all the data we need.  Either report the event to GDB, or
+     resume threads and keep waiting for more.  */
+
+  /* If we're collecting a fast tracepoint, finish the collection and
+     move out of the jump pad before delivering a signal.  See
+     linux_stabilize_threads.  */
+
+  if (WIFSTOPPED (w)
+      && WSTOPSIG (w) != SIGTRAP
+      && supports_fast_tracepoints ()
+      && in_process_agent_loaded ())
+    {
+      if (debug_threads)
+	fprintf (stderr,
+		 "Got signal %d for LWP %ld.  Check if we need "
+		 "to defer or adjust it.\n",
+		 WSTOPSIG (w), lwpid_of (event_child));
+
+      /* Allow debugging the jump pad itself.  */
+      if (current_inferior->last_resume_kind != resume_step
+	  && maybe_move_out_of_jump_pad (event_child, &w))
+	{
+	  enqueue_one_deferred_signal (event_child, &w);
+
+	  if (debug_threads)
+	    fprintf (stderr,
+		     "Signal %d for LWP %ld deferred (in jump pad)\n",
+		     WSTOPSIG (w), lwpid_of (event_child));
+
+	  linux_resume_one_lwp (event_child, 0, 0, NULL);
+	  goto retry;
+	}
+    }
+
+  if (event_child->collecting_fast_tracepoint)
+    {
+      if (debug_threads)
+	fprintf (stderr, "\
+LWP %ld was trying to move out of the jump pad (%d).  \
+Check if we're already there.\n",
+		 lwpid_of (event_child),
+		 event_child->collecting_fast_tracepoint);
 
-      trace_event = 0;
+      trace_event = 1;
+
+      event_child->collecting_fast_tracepoint
+	= linux_fast_tracepoint_collecting (event_child, NULL);
+
+      if (event_child->collecting_fast_tracepoint != 1)
+	{
+	  /* No longer need this breakpoint.  */
+	  if (event_child->exit_jump_pad_bkpt != NULL)
+	    {
+	      if (debug_threads)
+		fprintf (stderr,
+			 "No longer need exit-jump-pad bkpt; removing it. "
+			 "stopping all threads momentarily.\n");
+
+	      /* Other running threads could hit this breakpoint.
+		 We don't handle moribund locations like GDB does,
+		 instead we always pause all threads when removing
+		 breakpoints, so that any step-over or
+		 decr_pc_after_break adjustment is always taken
+		 care of while the breakpoint is still
+		 inserted.  */
+	      stop_all_lwps (1, event_child);
+	      cancel_breakpoints ();
+
+	      delete_breakpoint (event_child->exit_jump_pad_bkpt);
+	      event_child->exit_jump_pad_bkpt = NULL;
+
+	      unstop_all_lwps (1, event_child);
+
+	      gdb_assert (event_child->suspended >= 0);
+	    }
+	}
+
+      if (event_child->collecting_fast_tracepoint == 0)
+	{
+	  if (debug_threads)
+	    fprintf (stderr,
+		     "fast tracepoint finished "
+		     "collecting successfully.\n");
+
+	  /* We may have a deferred signal to report.  */
+	  if (dequeue_one_deferred_signal (event_child, &w))
+	    {
+	      if (debug_threads)
+		fprintf (stderr, "dequeued one signal.\n");
+	    }
+	  else
+	    {
+	      if (debug_threads)
+		fprintf (stderr, "no deferred signals.\n");
+	      if (stabilizing_threads)
+		{
+		  ourstatus->kind = TARGET_WAITKIND_STOPPED;
+		  ourstatus->value.sig = TARGET_SIGNAL_0;
+		  return ptid_of (event_child);
+		}
+	    }
+	}
     }
 
   /* Check whether GDB would be interested in this event.  */
@@ -1877,7 +2348,7 @@ retry:
 
   /* Alright, we're going to report a stop.  */
 
-  if (!non_stop)
+  if (!non_stop && !stabilizing_threads)
     {
       /* In all-stop, stop all threads.  */
       stop_all_lwps (0, NULL);
@@ -1902,6 +2373,9 @@ retry:
 	 See the comment in cancel_breakpoints_callback to find out
 	 why.  */
       find_inferior (&all_lwps, cancel_breakpoints_callback, event_child);
+
+      /* Stabilize threads (move out of jump pads).  */
+      stabilize_threads ();
     }
   else
     {
@@ -1939,6 +2413,9 @@ retry:
 
   gdb_assert (ptid_equal (step_over_bkpt, null_ptid));
 
+  if (stabilizing_threads)
+    return ptid_of (event_child);
+
   if (!non_stop)
     {
       /* From GDB's perspective, all-stop mode always stops all
@@ -2210,6 +2687,82 @@ wait_for_sigstop (struct inferior_list_e
     }
 }
 
+/* Returns true if LWP ENTRY is stopped in a jump pad, and we can't
+   move it out, because we need to report the stop event to GDB.  For
+   example, if the user puts a breakpoint in the jump pad, it's
+   because she wants to debug it.  */
+
+static int
+stuck_in_jump_pad_callback (struct inferior_list_entry *entry, void *data)
+{
+  struct lwp_info *lwp = (struct lwp_info *) entry;
+  struct thread_info *thread = get_lwp_thread (lwp);
+
+  gdb_assert (lwp->suspended == 0);
+  gdb_assert (lwp->stopped);
+
+  /* Allow debugging the jump pad, gdb_collect, etc.  */
+  return (supports_fast_tracepoints ()
+	  && in_process_agent_loaded ()
+	  && (gdb_breakpoint_here (lwp->stop_pc)
+	      || lwp->stopped_by_watchpoint
+	      || thread->last_resume_kind == resume_step)
+	  && linux_fast_tracepoint_collecting (lwp, NULL));
+}
+
+static void
+move_out_of_jump_pad_callback (struct inferior_list_entry *entry)
+{
+  struct lwp_info *lwp = (struct lwp_info *) entry;
+  struct thread_info *thread = get_lwp_thread (lwp);
+  int *wstat;
+
+  gdb_assert (lwp->suspended == 0);
+  gdb_assert (lwp->stopped);
+
+  wstat = lwp->status_pending_p ? &lwp->status_pending : NULL;
+
+  /* Allow debugging the jump pad, gdb_collect, etc.  */
+  if (!gdb_breakpoint_here (lwp->stop_pc)
+      && !lwp->stopped_by_watchpoint
+      && thread->last_resume_kind != resume_step
+      && maybe_move_out_of_jump_pad (lwp, wstat))
+    {
+      if (debug_threads)
+	fprintf (stderr,
+		 "LWP %ld needs stabilizing (in jump pad)\n",
+		 lwpid_of (lwp));
+
+      if (wstat)
+	{
+	  lwp->status_pending_p = 0;
+	  enqueue_one_deferred_signal (lwp, wstat);
+
+	  if (debug_threads)
+	    fprintf (stderr,
+		     "Signal %d for LWP %ld deferred "
+		     "(in jump pad)\n",
+		     WSTOPSIG (*wstat), lwpid_of (lwp));
+	}
+
+      linux_resume_one_lwp (lwp, 0, 0, NULL);
+    }
+  else
+    lwp->suspended++;
+}
+
+static int
+lwp_running (struct inferior_list_entry *entry, void *data)
+{
+  struct lwp_info *lwp = (struct lwp_info *) entry;
+
+  if (lwp->dead)
+    return 0;
+  if (lwp->stopped)
+    return 0;
+  return 1;
+}
+
 /* Stop all lwps that aren't stopped yet, except EXCEPT, if not NULL.
    If SUSPEND, then also increase the suspend count of every LWP,
    except EXCEPT.  */
@@ -2236,10 +2789,15 @@ linux_resume_one_lwp (struct lwp_info *l
 		      int step, int signal, siginfo_t *info)
 {
   struct thread_info *saved_inferior;
+  int fast_tp_collecting;
 
   if (lwp->stopped == 0)
     return;
 
+  fast_tp_collecting = lwp->collecting_fast_tracepoint;
+
+  gdb_assert (!stabilizing_threads || fast_tp_collecting);
+
   /* Cancel actions that rely on GDB not changing the PC (e.g., the
      user used the "jump" command, or "set $pc = foo").  */
   if (lwp->stop_pc != get_pc (lwp))
@@ -2253,8 +2811,10 @@ linux_resume_one_lwp (struct lwp_info *l
      signal.  Also enqueue the signal if we are waiting to reinsert a
      breakpoint; it will be picked up again below.  */
   if (signal != 0
-      && (lwp->status_pending_p || lwp->pending_signals != NULL
-	  || lwp->bp_reinsert != 0))
+      && (lwp->status_pending_p
+	  || lwp->pending_signals != NULL
+	  || lwp->bp_reinsert != 0
+	  || fast_tp_collecting))
     {
       struct pending_signals *p_sig;
       p_sig = xmalloc (sizeof (*p_sig));
@@ -2303,11 +2863,14 @@ linux_resume_one_lwp (struct lwp_info *l
 
       if (lwp->bp_reinsert != 0 && can_hardware_single_step ())
 	{
-	  if (step == 0)
-	    fprintf (stderr, "BAD - reinserting but not stepping.\n");
-	  if (lwp->suspended)
-	    fprintf (stderr, "BAD - reinserting and suspended(%d).\n",
-		     lwp->suspended);
+	  if (fast_tp_collecting == 0)
+	    {
+	      if (step == 0)
+		fprintf (stderr, "BAD - reinserting but not stepping.\n");
+	      if (lwp->suspended)
+		fprintf (stderr, "BAD - reinserting and suspended(%d).\n",
+			 lwp->suspended);
+	    }
 
 	  step = 1;
 	}
@@ -2316,6 +2879,33 @@ linux_resume_one_lwp (struct lwp_info *l
       signal = 0;
     }
 
+  if (fast_tp_collecting == 1)
+    {
+      if (debug_threads)
+	fprintf (stderr, "\
+lwp %ld wants to get out of fast tracepoint jump pad (exit-jump-pad-bkpt)\n",
+		 lwpid_of (lwp));
+
+      /* Postpone any pending signal.  It was enqueued above.  */
+      signal = 0;
+    }
+  else if (fast_tp_collecting == 2)
+    {
+      if (debug_threads)
+	fprintf (stderr, "\
+lwp %ld wants to get out of fast tracepoint jump pad single-stepping\n",
+		 lwpid_of (lwp));
+
+      if (can_hardware_single_step ())
+	step = 1;
+      else
+	fatal ("moving out of jump pad single-stepping"
+	       " not implemented on this target");
+
+      /* Postpone any pending signal.  It was enqueued above.  */
+      signal = 0;
+    }
+
   /* If we have while-stepping actions in this thread set it stepping.
      If we have a signal to deliver, it may or may not be set to
      SIG_IGN, we don't know.  Assume so, and allow collecting
@@ -2341,9 +2931,12 @@ linux_resume_one_lwp (struct lwp_info *l
       fprintf (stderr, "  resuming from pc 0x%lx\n", (long) pc);
     }
 
-  /* If we have pending signals, consume one unless we are trying to reinsert
-     a breakpoint.  */
-  if (lwp->pending_signals != NULL && lwp->bp_reinsert == 0)
+  /* If we have pending signals, consume one unless we are trying to
+     reinsert a breakpoint or we're trying to finish a fast tracepoint
+     collect.  */
+  if (lwp->pending_signals != NULL
+      && lwp->bp_reinsert == 0
+      && fast_tp_collecting == 0)
     {
       struct pending_signals **p_sig;
 
@@ -2440,6 +3033,23 @@ linux_set_resume_request (struct inferio
 
 	  lwp->resume = &r->resume[ndx];
 	  thread->last_resume_kind = lwp->resume->kind;
+
+	  /* If we had a deferred signal to report, dequeue one now.
+	     This can happen if LWP gets more than one signal while
+	     trying to get out of a jump pad.  */
+	  if (lwp->stopped
+	      && !lwp->status_pending_p
+	      && dequeue_one_deferred_signal (lwp, &lwp->status_pending))
+	    {
+	      lwp->status_pending_p = 1;
+
+	      if (debug_threads)
+		fprintf (stderr,
+			 "Dequeueing deferred signal %d for LWP %ld, "
+			 "leaving status pending.\n",
+			 WSTOPSIG (lwp->status_pending), lwpid_of (lwp));
+	    }
+
 	  return 0;
 	}
     }
@@ -2556,7 +3166,7 @@ need_step_over_p (struct inferior_list_e
   current_inferior = thread;
 
   /* We can only step over breakpoints we know about.  */
-  if (breakpoint_here (pc))
+  if (breakpoint_here (pc) || fast_tracepoint_jump_here (pc))
     {
       /* Don't step over a breakpoint that GDB expects to hit
 	 though.  */
@@ -2645,6 +3255,7 @@ start_step_over (struct lwp_info *lwp)
 
   lwp->bp_reinsert = pc;
   uninsert_breakpoints_at (pc);
+  uninsert_fast_tracepoint_jumps_at (pc);
 
   if (can_hardware_single_step ())
     {
@@ -2681,6 +3292,7 @@ finish_step_over (struct lwp_info *lwp)
       /* Reinsert any breakpoint at LWP->BP_REINSERT.  Note that there
 	 may be no breakpoint to reinsert there by now.  */
       reinsert_breakpoints_at (lwp->bp_reinsert);
+      reinsert_fast_tracepoint_jumps_at (lwp->bp_reinsert);
 
       lwp->bp_reinsert = 0;
 
@@ -4425,6 +5037,23 @@ linux_unpause_all (int unfreeze)
   unstop_all_lwps (unfreeze, NULL);
 }
 
+static int
+linux_install_fast_tracepoint_jump_pad (CORE_ADDR tpoint, CORE_ADDR tpaddr,
+					CORE_ADDR collector,
+					CORE_ADDR lockaddr,
+					ULONGEST orig_size,
+					CORE_ADDR *jump_entry,
+					unsigned char *jjump_pad_insn,
+					ULONGEST *jjump_pad_insn_size,
+					CORE_ADDR *adjusted_insn_addr,
+					CORE_ADDR *adjusted_insn_addr_end)
+{
+  return (*the_low_target.install_fast_tracepoint_jump_pad)
+    (tpoint, tpaddr, collector, lockaddr, orig_size,
+     jump_entry, jjump_pad_insn, jjump_pad_insn_size,
+     adjusted_insn_addr, adjusted_insn_addr_end);
+}
+
 static struct target_ops linux_target_ops = {
   linux_create_inferior,
   linux_attach,
@@ -4478,7 +5107,9 @@ static struct target_ops linux_target_op
   NULL,
   linux_pause_all,
   linux_unpause_all,
-  linux_cancel_breakpoints
+  linux_cancel_breakpoints,
+  linux_stabilize_threads,
+  linux_install_fast_tracepoint_jump_pad
 };
 
 static void
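
As a standalone illustration of the deferred-signal list used in the
linux-low.c changes above: new entries are pushed at the head through
`prev`, and the dequeue walks to the end of the chain, so signals come
back oldest first.  This is only a minimal sketch, not the gdbserver
code.  The names `struct deferred_sig`, `enqueue_sig` and `dequeue_sig`
are hypothetical, and the siginfo payload and the ptrace calls are left
out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct pending_signals; the siginfo
   payload is omitted.  */
struct deferred_sig
{
  int signal;
  struct deferred_sig *prev;
};

/* Push SIGNAL at the head of *LIST (newest entry first).  Returns 0
   on success, -1 if the allocation fails.  */
static int
enqueue_sig (struct deferred_sig **list, int signal)
{
  struct deferred_sig *p = malloc (sizeof (*p));

  if (p == NULL)
    return -1;
  p->signal = signal;
  p->prev = *list;
  *list = p;
  return 0;
}

/* Remove the oldest entry (the end of the prev chain) and store its
   signal number in *SIGNAL.  Returns 1 if an entry was dequeued, 0 if
   the list was empty.  */
static int
dequeue_sig (struct deferred_sig **list, int *signal)
{
  struct deferred_sig **p = list;

  if (*p == NULL)
    return 0;

  /* Walk to the link that points at the oldest entry.  */
  while ((*p)->prev != NULL)
    p = &(*p)->prev;

  *signal = (*p)->signal;
  free (*p);
  *p = NULL;
  return 1;
}
```

Walking a pointer-to-pointer keeps the removal of the last element and
of the only element on the same code path, at the cost of a linear scan
per dequeue.  That seems acceptable for the short queues expected here.
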
Index: src/gdb/gdbserver/linux-low.h
===================================================================
--- src.orig/gdb/gdbserver/linux-low.h	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/linux-low.h	2010-06-01 13:57:18.000000000 +0100
@@ -120,6 +120,22 @@ struct linux_target_ops
 
   /* Returns true if the low target supports tracepoints.  */
   int (*supports_tracepoints) (void);
+
+  /* Fill ADDRP with the thread area address of LWPID.  Returns 0 on
+     success, -1 on failure.  */
+  int (*get_thread_area) (int lwpid, CORE_ADDR *addrp);
+
+  /* Install a fast tracepoint jump pad.  See target.h for
+     comments.  */
+  int (*install_fast_tracepoint_jump_pad) (CORE_ADDR tpoint, CORE_ADDR tpaddr,
+					   CORE_ADDR collector,
+					   CORE_ADDR lockaddr,
+					   ULONGEST orig_size,
+					   CORE_ADDR *jump_entry,
+					   unsigned char *jjump_pad_insn,
+					   ULONGEST *jjump_pad_insn_size,
+					   CORE_ADDR *adjusted_insn_addr,
+					   CORE_ADDR *adjusted_insn_addr_end);
 };
 
 extern struct linux_target_ops the_low_target;
@@ -201,6 +217,22 @@ struct lwp_info
      and then processed and cleared in linux_resume_one_lwp.  */
   struct thread_resume *resume;
 
+  /* True if it is known that this lwp is presently collecting a fast
+     tracepoint (it is in the jump pad or in some code that will
+     return to the jump pad).  Normally, we won't care about this, but
+     we will if a signal arrives at this lwp while it is
+     collecting.  */
+  int collecting_fast_tracepoint;
+
+  /* If this is non-zero, it points to a chain of signals which need
+     to be reported to GDB.  These were deferred because the thread
+     was doing a fast tracepoint collect when they arrived.  */
+  struct pending_signals *pending_signals_to_report;
+
+  /* When collecting_fast_tracepoint is first found to be 1, we insert
+     an exit-jump-pad-quickly breakpoint.  This is it.  */
+  struct breakpoint *exit_jump_pad_bkpt;
+
   /* True if the LWP was seen stop at an internal breakpoint and needs
      stepping over later when it is resumed.  */
   int need_step_over;
@@ -223,6 +255,7 @@ int elf_64_file_p (const char *file);
 
 void linux_attach_lwp (unsigned long pid);
 struct lwp_info *find_lwp_pid (ptid_t ptid);
+int linux_get_thread_area (int lwpid, CORE_ADDR *area);
 
 /* From thread-db.c  */
 int thread_db_init (int use_events);
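
The x86 code in this patch links a tracepoint to its jump pad, and the
jump pad back to the program, with a 5-byte `jmp rel32` (opcode 0xe9).
The displacement is relative to the end of the jump instruction itself.
The sketch below shows that arithmetic in isolation.  The helper
`encode_jmp_rel32` and the macro `JMP_REL32_SIZE` are hypothetical
names, and the range check is an assumption added for illustration; it
is not something the patch itself does.

```c
#include <stdint.h>

#define JMP_REL32_SIZE 5

/* Encode "jmp rel32" located at address FROM and targeting TO into
   BUF, which must hold JMP_REL32_SIZE bytes.  Returns 0 on success,
   -1 if the displacement does not fit in a signed 32-bit value.
   Hypothetical helper, for illustration only.  */
static int
encode_jmp_rel32 (uint64_t from, uint64_t to, unsigned char *buf)
{
  /* The displacement is measured from the address of the next
     instruction, i.e. FROM plus the size of the jump.  */
  int64_t delta = (int64_t) (to - (from + JMP_REL32_SIZE));
  uint32_t rel;

  if (delta < INT32_MIN || delta > INT32_MAX)
    return -1;

  rel = (uint32_t) (int32_t) delta;

  buf[0] = 0xe9;
  /* x86 immediates are little-endian; store byte by byte so the
     result does not depend on the host byte order.  */
  buf[1] = rel & 0xff;
  buf[2] = (rel >> 8) & 0xff;
  buf[3] = (rel >> 16) & 0xff;
  buf[4] = (rel >> 24) & 0xff;
  return 0;
}
```

For example, a jump placed at 0x1000 that targets 0x2000 gets the
displacement 0x2000 - 0x1005 = 0xffb, so the encoded bytes are
e9 fb 0f 00 00.
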
Index: src/gdb/gdbserver/linux-x86-low.c
===================================================================
--- src.orig/gdb/gdbserver/linux-x86-low.c	2010-06-01 13:44:36.000000000 +0100
+++ src/gdb/gdbserver/linux-x86-low.c	2010-06-01 13:57:18.000000000 +0100
@@ -40,6 +40,8 @@ void init_registers_amd64_avx_linux (voi
 /* Defined in auto-generated file i386-mmx-linux.c.  */
 void init_registers_i386_mmx_linux (void);
 
+static unsigned char jump_insn[] = { 0xe9, 0, 0, 0, 0 };
+
 /* Backward compatibility for gdb without XML support.  */
 
 static const char *xmltarget_i386_linux_no_xml = "@<target>\
@@ -191,6 +193,53 @@ ps_get_thread_area (const struct ps_proc
     return PS_OK;
   }
 }
+
+/* Get the thread area address.  This is used to recognize which
+   thread is which when tracing with the in-process agent library.  We
+   don't read anything from the address, and treat it as opaque; it's
+   the address itself that we assume is unique per-thread.  */
+
+static int
+x86_get_thread_area (int lwpid, CORE_ADDR *addr)
+{
+#ifdef __x86_64__
+  int use_64bit = register_size (0) == 8;
+
+  if (use_64bit)
+    {
+      void *base;
+      if (ptrace (PTRACE_ARCH_PRCTL, lwpid, &base, ARCH_GET_FS) == 0)
+	{
+	  *addr = (CORE_ADDR) (uintptr_t) base;
+	  return 0;
+	}
+
+      return -1;
+    }
+#endif
+
+  {
+    struct lwp_info *lwp = find_lwp_pid (pid_to_ptid (lwpid));
+    struct regcache *regcache = get_thread_regcache (get_lwp_thread (lwp), 1);
+    unsigned int desc[4];
+    ULONGEST gs = 0;
+    const int reg_thread_area = 3; /* bits to scale down register value.  */
+    int idx;
+
+    collect_register_by_name (regcache, "gs", &gs);
+
+    idx = gs >> reg_thread_area;
+
+    if (ptrace (PTRACE_GET_THREAD_AREA,
+		lwpid_of (lwp), (void *) (long) idx, (unsigned long) &desc) < 0)
+      return -1;
+
+    *addr = desc[1];
+    return 0;
+  }
+}
+
+
 
 static int
 i386_cannot_store_register (int regno)
@@ -1041,6 +1090,386 @@ x86_supports_tracepoints (void)
   return 1;
 }
 
+static void
+append_insns (CORE_ADDR *to, size_t len, const unsigned char *buf)
+{
+  write_inferior_memory (*to, buf, len);
+  *to += len;
+}
+
+static int
+push_opcode (unsigned char *buf, char *op)
+{
+  unsigned char *buf_org = buf;
+
+  while (1)
+    {
+      char *endptr;
+      unsigned long ul = strtoul (op, &endptr, 16);
+
+      if (endptr == op)
+	break;
+
+      *buf++ = ul;
+      op = endptr;
+    }
+
+  return buf - buf_org;
+}
+
+#ifdef __x86_64__
+
+/* Build a jump pad that saves registers and calls a collection
+   function.  Writes a jump instruction to the jump pad to
+   JJUMP_PAD_INSN.  The caller is responsible for writing it in at
+   the tracepoint address.  */
+
+static int
+amd64_install_fast_tracepoint_jump_pad (CORE_ADDR tpoint, CORE_ADDR tpaddr,
+					CORE_ADDR collector,
+					CORE_ADDR lockaddr,
+					ULONGEST orig_size,
+					CORE_ADDR *jump_entry,
+					unsigned char *jjump_pad_insn,
+					ULONGEST *jjump_pad_insn_size,
+					CORE_ADDR *adjusted_insn_addr,
+					CORE_ADDR *adjusted_insn_addr_end)
+{
+  unsigned char buf[40];
+  int i, offset;
+  CORE_ADDR buildaddr = *jump_entry;
+
+  /* Build the jump pad.  */
+
+  /* First, do tracepoint data collection.  Save registers.  */
+  i = 0;
+  /* Need to ensure stack pointer saved first.  */
+  buf[i++] = 0x54; /* push %rsp */
+  buf[i++] = 0x55; /* push %rbp */
+  buf[i++] = 0x57; /* push %rdi */
+  buf[i++] = 0x56; /* push %rsi */
+  buf[i++] = 0x52; /* push %rdx */
+  buf[i++] = 0x51; /* push %rcx */
+  buf[i++] = 0x53; /* push %rbx */
+  buf[i++] = 0x50; /* push %rax */
+  buf[i++] = 0x41; buf[i++] = 0x57; /* push %r15 */
+  buf[i++] = 0x41; buf[i++] = 0x56; /* push %r14 */
+  buf[i++] = 0x41; buf[i++] = 0x55; /* push %r13 */
+  buf[i++] = 0x41; buf[i++] = 0x54; /* push %r12 */
+  buf[i++] = 0x41; buf[i++] = 0x53; /* push %r11 */
+  buf[i++] = 0x41; buf[i++] = 0x52; /* push %r10 */
+  buf[i++] = 0x41; buf[i++] = 0x51; /* push %r9 */
+  buf[i++] = 0x41; buf[i++] = 0x50; /* push %r8 */
+  buf[i++] = 0x9c; /* pushfq */
+  buf[i++] = 0x48; /* movabs <addr>,%rdi */
+  buf[i++] = 0xbf;
+  *((unsigned long *)(buf + i)) = (unsigned long) tpaddr;
+  i += sizeof (unsigned long);
+  buf[i++] = 0x57; /* push %rdi */
+  append_insns (&buildaddr, i, buf);
+
+  /* Stack space for the collecting_t object.  */
+  i = 0;
+  i += push_opcode (&buf[i], "48 83 ec 18");	/* sub $0x18,%rsp */
+  i += push_opcode (&buf[i], "48 b8");          /* mov <tpoint>,%rax */
+  memcpy (buf + i, &tpoint, 8);
+  i += 8;
+  i += push_opcode (&buf[i], "48 89 04 24");    /* mov %rax,(%rsp) */
+  i += push_opcode (&buf[i],
+		    "64 48 8b 04 25 00 00 00 00"); /* mov %fs:0x0,%rax */
+  i += push_opcode (&buf[i], "48 89 44 24 08"); /* mov %rax,0x8(%rsp) */
+  append_insns (&buildaddr, i, buf);
+
+  /* spin-lock.  */
+  i = 0;
+  i += push_opcode (&buf[i], "48 be");		/* movabs <lockaddr>,%rsi */
+  memcpy (&buf[i], (void *) &lockaddr, 8);
+  i += 8;
+  i += push_opcode (&buf[i], "48 89 e1");       /* mov %rsp,%rcx */
+  i += push_opcode (&buf[i], "31 c0");		/* xor %eax,%eax */
+  i += push_opcode (&buf[i], "f0 48 0f b1 0e"); /* lock cmpxchg %rcx,(%rsi) */
+  i += push_opcode (&buf[i], "48 85 c0");	/* test %rax,%rax */
+  i += push_opcode (&buf[i], "75 f4");		/* jne <again> */
+  append_insns (&buildaddr, i, buf);
+
+  /* Set up the gdb_collect call.  */
+  /* At this point, (stack pointer + 0x18) is the base of our saved
+     register block.  */
+
+  i = 0;
+  i += push_opcode (&buf[i], "48 89 e6");	/* mov %rsp,%rsi */
+  i += push_opcode (&buf[i], "48 83 c6 18");	/* add $0x18,%rsi */
+
+  /* tpoint address may be 64-bit wide.  */
+  i += push_opcode (&buf[i], "48 bf");		/* movabs <addr>,%rdi */
+  memcpy (buf + i, &tpoint, 8);
+  i += 8;
+  append_insns (&buildaddr, i, buf);
+
+  /* The collector function lives in the shared library, so it may
+     be more than 31 bits away from the jump pad.  */
+  i = 0;
+  i += push_opcode (&buf[i], "48 b8");          /* mov $collector,%rax */
+  memcpy (buf + i, &collector, 8);
+  i += 8;
+  i += push_opcode (&buf[i], "ff d0");          /* callq *%rax */
+  append_insns (&buildaddr, i, buf);
+
+  /* Clear the spin-lock.  */
+  i = 0;
+  i += push_opcode (&buf[i], "31 c0");		/* xor %eax,%eax */
+  i += push_opcode (&buf[i], "48 a3");		/* mov %rax, lockaddr */
+  memcpy (buf + i, &lockaddr, 8);
+  i += 8;
+  append_insns (&buildaddr, i, buf);
+
+  /* Remove stack that had been used for the collecting_t object.  */
+  i = 0;
+  i += push_opcode (&buf[i], "48 83 c4 18");	/* add $0x18,%rsp */
+  append_insns (&buildaddr, i, buf);
+
+  /* Restore register state.  */
+  i = 0;
+  buf[i++] = 0x48; /* add $0x8,%rsp */
+  buf[i++] = 0x83;
+  buf[i++] = 0xc4;
+  buf[i++] = 0x08;
+  buf[i++] = 0x9d; /* popfq */
+  buf[i++] = 0x41; buf[i++] = 0x58; /* pop %r8 */
+  buf[i++] = 0x41; buf[i++] = 0x59; /* pop %r9 */
+  buf[i++] = 0x41; buf[i++] = 0x5a; /* pop %r10 */
+  buf[i++] = 0x41; buf[i++] = 0x5b; /* pop %r11 */
+  buf[i++] = 0x41; buf[i++] = 0x5c; /* pop %r12 */
+  buf[i++] = 0x41; buf[i++] = 0x5d; /* pop %r13 */
+  buf[i++] = 0x41; buf[i++] = 0x5e; /* pop %r14 */
+  buf[i++] = 0x41; buf[i++] = 0x5f; /* pop %r15 */
+  buf[i++] = 0x58; /* pop %rax */
+  buf[i++] = 0x5b; /* pop %rbx */
+  buf[i++] = 0x59; /* pop %rcx */
+  buf[i++] = 0x5a; /* pop %rdx */
+  buf[i++] = 0x5e; /* pop %rsi */
+  buf[i++] = 0x5f; /* pop %rdi */
+  buf[i++] = 0x5d; /* pop %rbp */
+  buf[i++] = 0x5c; /* pop %rsp */
+  append_insns (&buildaddr, i, buf);
+
+  /* Now, adjust the original instruction to execute in the jump
+     pad.  */
+  *adjusted_insn_addr = buildaddr;
+  relocate_instruction (&buildaddr, tpaddr);
+  *adjusted_insn_addr_end = buildaddr;
+
+  /* Finally, write a jump back to the program.  */
+  offset = (tpaddr + orig_size) - (buildaddr + sizeof (jump_insn));
+  memcpy (buf, jump_insn, sizeof (jump_insn));
+  memcpy (buf + 1, &offset, 4);
+  append_insns (&buildaddr, sizeof (jump_insn), buf);
+
+  /* The jump pad is now built.  Wire in a jump to our jump pad.  This
+     is always done last (by our caller actually), so that we can
+     install fast tracepoints with threads running.  This relies on
+     the agent's atomic write support.  */
+  offset = *jump_entry - (tpaddr + sizeof (jump_insn));
+  memcpy (buf, jump_insn, sizeof (jump_insn));
+  memcpy (buf + 1, &offset, 4);
+  memcpy (jjump_pad_insn, buf, sizeof (jump_insn));
+  *jjump_pad_insn_size = sizeof (jump_insn);
+
+  /* Return the end address of our pad.  */
+  *jump_entry = buildaddr;
+
+  return 0;
+}
+
+#endif /* __x86_64__ */
+
+/* Build a jump pad that saves registers and calls a collection
+   function.  Writes a jump instruction to the jump pad to
+   JJUMP_PAD_INSN.  The caller is responsible for writing it in at
+   the tracepoint address.  */
+
+static int
+i386_install_fast_tracepoint_jump_pad (CORE_ADDR tpoint, CORE_ADDR tpaddr,
+				       CORE_ADDR collector,
+				       CORE_ADDR lockaddr,
+				       ULONGEST orig_size,
+				       CORE_ADDR *jump_entry,
+				       unsigned char *jjump_pad_insn,
+				       ULONGEST *jjump_pad_insn_size,
+				       CORE_ADDR *adjusted_insn_addr,
+				       CORE_ADDR *adjusted_insn_addr_end)
+{
+  unsigned char buf[0x100];
+  int i, offset;
+  CORE_ADDR buildaddr = *jump_entry;
+
+  /* Build the jump pad.  */
+
+  /* First, do tracepoint data collection.  Save registers.  */
+  i = 0;
+  buf[i++] = 0x60; /* pushad */
+  buf[i++] = 0x68; /* push tpaddr aka $pc */
+  *((int *)(buf + i)) = (int) tpaddr;
+  i += 4;
+  buf[i++] = 0x9c; /* pushf */
+  buf[i++] = 0x1e; /* push %ds */
+  buf[i++] = 0x06; /* push %es */
+  buf[i++] = 0x0f; /* push %fs */
+  buf[i++] = 0xa0;
+  buf[i++] = 0x0f; /* push %gs */
+  buf[i++] = 0xa8;
+  buf[i++] = 0x16; /* push %ss */
+  buf[i++] = 0x0e; /* push %cs */
+  append_insns (&buildaddr, i, buf);
+
+  /* Stack space for the collecting_t object.  */
+  i = 0;
+  i += push_opcode (&buf[i], "83 ec 08");	/* sub    $0x8,%esp */
+
+  /* Build the object.  */
+  i += push_opcode (&buf[i], "b8");		/* mov    <tpoint>,%eax */
+  memcpy (buf + i, &tpoint, 4);
+  i += 4;
+  i += push_opcode (&buf[i], "89 04 24");	   /* mov %eax,(%esp) */
+
+  i += push_opcode (&buf[i], "65 a1 00 00 00 00"); /* mov %gs:0x0,%eax */
+  i += push_opcode (&buf[i], "89 44 24 04");	   /* mov %eax,0x4(%esp) */
+  append_insns (&buildaddr, i, buf);
+
+  /* spin-lock.  Note this is using cmpxchg, which leaves i386 behind.
+     If we cared for it, this could be using xchg alternatively.  */
+
+  i = 0;
+  i += push_opcode (&buf[i], "31 c0");		/* xor %eax,%eax */
+  i += push_opcode (&buf[i], "f0 0f b1 25");    /* lock cmpxchg
+						   %esp,<lockaddr> */
+  memcpy (&buf[i], (void *) &lockaddr, 4);
+  i += 4;
+  i += push_opcode (&buf[i], "85 c0");		/* test %eax,%eax */
+  i += push_opcode (&buf[i], "75 f2");		/* jne <again> */
+  append_insns (&buildaddr, i, buf);
+
+
+  /* Set up arguments to the gdb_collect call.  */
+  i = 0;
+  i += push_opcode (&buf[i], "89 e0");		/* mov %esp,%eax */
+  i += push_opcode (&buf[i], "83 c0 08");	/* add $0x08,%eax */
+  i += push_opcode (&buf[i], "89 44 24 fc");	/* mov %eax,-0x4(%esp) */
+  append_insns (&buildaddr, i, buf);
+
+  i = 0;
+  i += push_opcode (&buf[i], "83 ec 08");	/* sub $0x8,%esp */
+  append_insns (&buildaddr, i, buf);
+
+  i = 0;
+  i += push_opcode (&buf[i], "c7 04 24");       /* movl <addr>,(%esp) */
+  memcpy (&buf[i], (void *) &tpoint, 4);
+  i += 4;
+  append_insns (&buildaddr, i, buf);
+
+  buf[0] = 0xe8; /* call <reladdr> */
+  offset = collector - (buildaddr + sizeof (jump_insn));
+  memcpy (buf + 1, &offset, 4);
+  append_insns (&buildaddr, 5, buf);
+  /* Clean up after the call.  */
+  buf[0] = 0x83; /* add $0x8,%esp */
+  buf[1] = 0xc4;
+  buf[2] = 0x08;
+  append_insns (&buildaddr, 3, buf);
+
+
+  /* Clear the spin-lock.  This would need the LOCK prefix on older
+     broken archs.  */
+  i = 0;
+  i += push_opcode (&buf[i], "31 c0");		/* xor %eax,%eax */
+  i += push_opcode (&buf[i], "a3");		/* mov %eax, lockaddr */
+  memcpy (buf + i, &lockaddr, 4);
+  i += 4;
+  append_insns (&buildaddr, i, buf);
+
+
+  /* Remove stack that had been used for the collecting_t object.  */
+  i = 0;
+  i += push_opcode (&buf[i], "83 c4 08");	/* add $0x08,%esp */
+  append_insns (&buildaddr, i, buf);
+
+  i = 0;
+  buf[i++] = 0x83; /* add $0x4,%esp (no pop of %cs, assume unchanged) */
+  buf[i++] = 0xc4;
+  buf[i++] = 0x04;
+  buf[i++] = 0x17; /* pop %ss */
+  buf[i++] = 0x0f; /* pop %gs */
+  buf[i++] = 0xa9;
+  buf[i++] = 0x0f; /* pop %fs */
+  buf[i++] = 0xa1;
+  buf[i++] = 0x07; /* pop %es */
+  buf[i++] = 0x1f; /* pop %ds */
+  buf[i++] = 0x9d; /* popf */
+  buf[i++] = 0x83; /* add $0x4,%esp (pop of tpaddr aka $pc) */
+  buf[i++] = 0xc4;
+  buf[i++] = 0x04;
+  buf[i++] = 0x61; /* popad */
+  append_insns (&buildaddr, i, buf);
+
+  /* Now, adjust the original instruction to execute in the jump
+     pad.  */
+  *adjusted_insn_addr = buildaddr;
+  relocate_instruction (&buildaddr, tpaddr);
+  *adjusted_insn_addr_end = buildaddr;
+
+  /* Write the jump back to the program.  */
+  offset = (tpaddr + orig_size) - (buildaddr + sizeof (jump_insn));
+  memcpy (buf, jump_insn, sizeof (jump_insn));
+  memcpy (buf + 1, &offset, 4);
+  append_insns (&buildaddr, sizeof (jump_insn), buf);
+
+  /* The jump pad is now built.  Wire in a jump to our jump pad.  This
+     is always done last (by our caller actually), so that we can
+     install fast tracepoints with threads running.  This relies on
+     the agent's atomic write support.  */
+  offset = *jump_entry - (tpaddr + sizeof (jump_insn));
+  memcpy (buf, jump_insn, sizeof (jump_insn));
+  memcpy (buf + 1, &offset, 4);
+  memcpy (jjump_pad_insn, buf, sizeof (jump_insn));
+  *jjump_pad_insn_size = sizeof (jump_insn);
+
+  /* Return the end address of our pad.  */
+  *jump_entry = buildaddr;
+
+  return 0;
+}
+
+static int
+x86_install_fast_tracepoint_jump_pad (CORE_ADDR tpoint, CORE_ADDR tpaddr,
+				      CORE_ADDR collector,
+				      CORE_ADDR lockaddr,
+				      ULONGEST orig_size,
+				      CORE_ADDR *jump_entry,
+				      unsigned char *jjump_pad_insn,
+				      ULONGEST *jjump_pad_insn_size,
+				      CORE_ADDR *adjusted_insn_addr,
+				      CORE_ADDR *adjusted_insn_addr_end)
+{
+#ifdef __x86_64__
+  if (register_size (0) == 8)
+    return amd64_install_fast_tracepoint_jump_pad (tpoint, tpaddr,
+						   collector, lockaddr,
+						   orig_size, jump_entry,
+						   jjump_pad_insn,
+						   jjump_pad_insn_size,
+						   adjusted_insn_addr,
+						   adjusted_insn_addr_end);
+#endif
+
+  return i386_install_fast_tracepoint_jump_pad (tpoint, tpaddr,
+						collector, lockaddr,
+						orig_size, jump_entry,
+						jjump_pad_insn,
+						jjump_pad_insn_size,
+						adjusted_insn_addr,
+						adjusted_insn_addr_end);
+}
+
 /* This is initialized assuming an amd64 target.
    x86_arch_setup will correct it for i386 or amd64 targets.  */
 
@@ -1073,5 +1502,7 @@ struct linux_target_ops the_low_target =
   x86_linux_new_thread,
   x86_linux_prepare_to_resume,
   x86_linux_process_qsupported,
-  x86_supports_tracepoints
+  x86_supports_tracepoints,
+  x86_get_thread_area,
+  x86_install_fast_tracepoint_jump_pad
 };
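Both jump pad builders above compute their branch displacements the same way: the offset of a 5-byte relative jump is the target address minus the address of the byte following the jump.  A minimal standalone sketch of that arithmetic follows.  The name encode_jmp_rel32 is hypothetical and not part of the patch, and the range check and the explicit little-endian byte stores are additions for illustration (the patch copies a host int with memcpy instead).

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  Encode a 5-byte x86
   "jmp rel32" that is placed at INSN_ADDR and lands on TARGET.
   Returns 0 on success, -1 if TARGET is out of rel32 range.  */
static int
encode_jmp_rel32 (uint64_t insn_addr, uint64_t target, unsigned char out[5])
{
  /* The displacement is relative to the end of the jump itself.  */
  int64_t delta = (int64_t) (target - (insn_addr + 5));
  uint32_t rel;

  if (delta < INT32_MIN || delta > INT32_MAX)
    return -1;

  rel = (uint32_t) (int32_t) delta;
  out[0] = 0xe9;		/* jmp rel32 opcode */
  out[1] = rel & 0xff;		/* displacement, little-endian */
  out[2] = (rel >> 8) & 0xff;
  out[3] = (rel >> 16) & 0xff;
  out[4] = (rel >> 24) & 0xff;
  return 0;
}
```

The same formula serves for both directions: the jump from the tracepoint address into the pad, and the jump from the end of the pad back to tpaddr + orig_size.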
Index: src/gdb/gdbserver/mem-break.c
===================================================================
--- src.orig/gdb/gdbserver/mem-break.c	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/mem-break.c	2010-06-01 13:57:18.000000000 +0100
@@ -137,8 +137,10 @@ set_raw_breakpoint_at (CORE_ADDR where)
   bp->pc = where;
   bp->refcount = 1;
 
-  err = (*the_target->read_memory) (where, bp->old_data,
-				    breakpoint_len);
+  /* Note that there can be fast tracepoint jumps installed in the
+     same memory range, so to get at the original memory, we need to
+     use read_inferior_memory, which masks those out.  */
+  err = read_inferior_memory (where, bp->old_data, breakpoint_len);
   if (err != 0)
     {
       if (debug_threads)
@@ -169,6 +171,302 @@ set_raw_breakpoint_at (CORE_ADDR where)
   return bp;
 }
 
+/* Notice that breakpoint traps are always installed on top of fast
+   tracepoint jumps.  This holds even if the fast tracepoint is
+   installed after the breakpoint was.  This means that a stopping
+   breakpoint or tracepoint has higher "priority".  In turn, this
+   allows having fast and slow tracepoints (and breakpoints) at the
+   same address behave correctly.  */
+
+
+/* A fast tracepoint jump.  */
+
+struct fast_tracepoint_jump
+{
+  struct fast_tracepoint_jump *next;
+
+  /* A reference count.  GDB can install more than one fast tracepoint
+     at the same address (each with its own action list, for
+     example).  */
+  int refcount;
+
+  /* The fast tracepoint's insertion address.  There can only be one
+     of these for a given PC.  */
+  CORE_ADDR pc;
+
+  /* Non-zero if this fast tracepoint jump is currently inserted in
+     the inferior.  */
+  int inserted;
+
+  /* The length of the jump instruction.  */
+  int length;
+
+  /* A poor-man's flexible array member, holding both the jump
+     instruction to insert, and a copy of the instruction that would
+     be in memory had there not been a jump there (the shadow memory
+     of the tracepoint jump).  */
+  unsigned char insn_and_shadow[0];
+};
+
+/* Fast tracepoint FP's jump instruction to insert.  */
+#define fast_tracepoint_jump_insn(fp) \
+  ((fp)->insn_and_shadow + 0)
+
+/* The shadow memory of fast tracepoint jump FP.  */
+#define fast_tracepoint_jump_shadow(fp) \
+  ((fp)->insn_and_shadow + (fp)->length)
+
+
+/* Return the fast tracepoint jump set at WHERE.  */
+
+static struct fast_tracepoint_jump *
+find_fast_tracepoint_jump_at (CORE_ADDR where)
+{
+  struct process_info *proc = current_process ();
+  struct fast_tracepoint_jump *jp;
+
+  for (jp = proc->fast_tracepoint_jumps; jp != NULL; jp = jp->next)
+    if (jp->pc == where)
+      return jp;
+
+  return NULL;
+}
+
+int
+fast_tracepoint_jump_here (CORE_ADDR where)
+{
+  struct fast_tracepoint_jump *jp = find_fast_tracepoint_jump_at (where);
+
+  return (jp != NULL);
+}
+
+int
+delete_fast_tracepoint_jump (struct fast_tracepoint_jump *todel)
+{
+  struct fast_tracepoint_jump *bp, **bp_link;
+  int ret;
+  struct process_info *proc = current_process ();
+
+  bp = proc->fast_tracepoint_jumps;
+  bp_link = &proc->fast_tracepoint_jumps;
+
+  while (bp)
+    {
+      if (bp == todel)
+	{
+	  if (--bp->refcount == 0)
+	    {
+	      struct fast_tracepoint_jump *prev_bp_link = *bp_link;
+
+	      /* Unlink it.  */
+	      *bp_link = bp->next;
+
+	      /* Since there can be breakpoints inserted in the same
+		 address range, we use `write_inferior_memory', which
+		 takes care of layering breakpoints on top of fast
+		 tracepoints, and on top of the buffer we pass it.
+		 This works because we've already unlinked the fast
+		 tracepoint jump above.  Also note that we need to
+		 pass the current shadow contents, because
+		 write_inferior_memory updates any shadow memory with
+		 what we pass here, and we want that to be a nop.  */
+	      ret = write_inferior_memory (bp->pc,
+					   fast_tracepoint_jump_shadow (bp),
+					   bp->length);
+	      if (ret != 0)
+		{
+		  /* Something went wrong, relink the jump.  */
+		  *bp_link = prev_bp_link;
+
+		  if (debug_threads)
+		    fprintf (stderr,
+			     "Failed to uninsert fast tracepoint jump "
+			     "at 0x%s (%s) while deleting it.\n",
+			     paddress (bp->pc), strerror (ret));
+		  return ret;
+		}
+
+	      free (bp);
+	    }
+
+	  return 0;
+	}
+      else
+	{
+	  bp_link = &bp->next;
+	  bp = *bp_link;
+	}
+    }
+
+  warning ("Could not find fast tracepoint jump in list.");
+  return ENOENT;
+}
+
+struct fast_tracepoint_jump *
+set_fast_tracepoint_jump (CORE_ADDR where,
+			  unsigned char *insn, ULONGEST length)
+{
+  struct process_info *proc = current_process ();
+  struct fast_tracepoint_jump *jp;
+  int err;
+
+  /* We refcount fast tracepoint jumps.  Check if we already know
+     about a jump at this address.  */
+  jp = find_fast_tracepoint_jump_at (where);
+  if (jp != NULL)
+    {
+      jp->refcount++;
+      return jp;
+    }
+
+  /* We don't, so create a new object.  Double the length, because the
+     flexible array member holds both the jump insn, and the
+     shadow.  */
+  jp = xcalloc (1, sizeof (*jp) + (length * 2));
+  jp->pc = where;
+  jp->length = length;
+  memcpy (fast_tracepoint_jump_insn (jp), insn, length);
+  jp->refcount = 1;
+
+  /* Note that there can be trap breakpoints inserted in the same
+     address range.  To access the original memory contents, we use
+     `read_inferior_memory', which masks out breakpoints.  */
+  err = read_inferior_memory (where,
+			      fast_tracepoint_jump_shadow (jp), jp->length);
+  if (err != 0)
+    {
+      if (debug_threads)
+	fprintf (stderr,
+		 "Failed to read shadow memory of"
+		 " fast tracepoint at 0x%s (%s).\n",
+		 paddress (where), strerror (err));
+      free (jp);
+      return NULL;
+    }
+
+  /* Link the jump in.  */
+  jp->inserted = 1;
+  jp->next = proc->fast_tracepoint_jumps;
+  proc->fast_tracepoint_jumps = jp;
+
+  /* Since there can be trap breakpoints inserted in the same address
+     range, we use `write_inferior_memory', which takes care of
+     layering breakpoints on top of fast tracepoints, and on top of the
+     buffer we pass it.  This works because we've already linked in
+     the fast tracepoint jump above.  Also note that we need to pass
+     the current shadow contents, because write_inferior_memory
+     updates any shadow memory with what we pass here, and we want
+     that to be a nop.  */
+  err = write_inferior_memory (where, fast_tracepoint_jump_shadow (jp), length);
+  if (err != 0)
+    {
+      if (debug_threads)
+	fprintf (stderr,
+		 "Failed to insert fast tracepoint jump at 0x%s (%s).\n",
+		 paddress (where), strerror (err));
+
+      /* Unlink it.  */
+      proc->fast_tracepoint_jumps = jp->next;
+      free (jp);
+
+      return NULL;
+    }
+
+  return jp;
+}
+
+void
+uninsert_fast_tracepoint_jumps_at (CORE_ADDR pc)
+{
+  struct fast_tracepoint_jump *jp;
+  int err;
+
+  jp = find_fast_tracepoint_jump_at (pc);
+  if (jp == NULL)
+    {
+      /* This can happen when we remove all breakpoints while handling
+	 a step-over.  */
+      if (debug_threads)
+	fprintf (stderr,
+		 "Could not find fast tracepoint jump at 0x%s "
+		 "in list (uninserting).\n",
+		 paddress (pc));
+      return;
+    }
+
+  if (jp->inserted)
+    {
+      jp->inserted = 0;
+
+      /* Since there can be trap breakpoints inserted in the same
+	 address range, we use `write_inferior_memory', which
+	 takes care of layering breakpoints on top of fast
+	 tracepoints, and on top of the buffer we pass it.  This works
+	 because we've already marked the fast tracepoint jump
+	 uninserted above.  Also note that we need to
+	 pass the current shadow contents, because
+	 write_inferior_memory updates any shadow memory with what we
+	 pass here, and we want that to be a nop.  */
+      err = write_inferior_memory (jp->pc,
+				   fast_tracepoint_jump_shadow (jp),
+				   jp->length);
+      if (err != 0)
+	{
+	  jp->inserted = 1;
+
+	  if (debug_threads)
+	    fprintf (stderr,
+		     "Failed to uninsert fast tracepoint jump at 0x%s (%s).\n",
+		     paddress (pc), strerror (err));
+	}
+    }
+}
+
+void
+reinsert_fast_tracepoint_jumps_at (CORE_ADDR where)
+{
+  struct fast_tracepoint_jump *jp;
+  int err;
+
+  jp = find_fast_tracepoint_jump_at (where);
+  if (jp == NULL)
+    {
+      /* This can happen when we remove breakpoints when a tracepoint
+	 hit causes a tracing stop, while handling a step-over.  */
+      if (debug_threads)
+	fprintf (stderr,
+		 "Could not find fast tracepoint jump at 0x%s "
+		 "in list (reinserting).\n",
+		 paddress (where));
+      return;
+    }
+
+  if (jp->inserted)
+    error ("Jump already inserted at reinsert time.");
+
+  jp->inserted = 1;
+
+  /* Since there can be trap breakpoints inserted in the same address
+     range, we use `write_inferior_memory', which takes care of
+     layering breakpoints on top of fast tracepoints, and on top of
+     the buffer we pass it.  This works because we've already marked
+     the fast tracepoint jump inserted above.  Also note that we need
+     to pass the current shadow contents, because
+     write_inferior_memory updates any shadow memory with what we pass
+     here, and we want that to be a nop.  */
+  err = write_inferior_memory (where,
+			       fast_tracepoint_jump_shadow (jp), jp->length);
+  if (err != 0)
+    {
+      jp->inserted = 0;
+
+      if (debug_threads)
+	fprintf (stderr,
+		 "Failed to reinsert fast tracepoint jump at 0x%s (%s).\n",
+		 paddress (where), strerror (err));
+    }
+}
+
 struct breakpoint *
 set_breakpoint_at (CORE_ADDR where, int (*handler) (CORE_ADDR))
 {
@@ -215,8 +513,17 @@ delete_raw_breakpoint (struct process_in
 
 	      *bp_link = bp->next;
 
-	      ret = (*the_target->write_memory) (bp->pc, bp->old_data,
-						 breakpoint_len);
+	      /* Since there can be fast tracepoint jumps inserted in
+		 the same address range, we use `write_inferior_memory',
+		 which takes care of layering breakpoints on top of
+		 fast tracepoints, and on top of the buffer we pass
+		 it.  This works because we've already unlinked the
+		 raw breakpoint above.  Also note that we need
+		 to pass the current shadow contents, because
+		 write_inferior_memory updates any shadow memory with
+		 what we pass here, and we want that to be a nop.  */
+	      ret = write_inferior_memory (bp->pc, bp->old_data,
+					   breakpoint_len);
 	      if (ret != 0)
 		{
 		  /* Something went wrong, relink the breakpoint.  */
@@ -426,8 +733,16 @@ uninsert_raw_breakpoint (struct raw_brea
       int err;
 
       bp->inserted = 0;
-      err = (*the_target->write_memory) (bp->pc, bp->old_data,
-					 breakpoint_len);
+      /* Since there can be fast tracepoint jumps inserted in the same
+	 address range, we use `write_inferior_memory', which takes
+	 care of layering breakpoints on top of fast tracepoints, and
+	 on top of the buffer we pass it.  This works because we've
+	 already marked the breakpoint uninserted above.  Also note
+	 that we need to pass the current shadow contents, because
+	 write_inferior_memory updates any shadow memory with what we
+	 pass here, and we want that to be a nop.  */
+      err = write_inferior_memory (bp->pc, bp->old_data,
+				   breakpoint_len);
       if (err != 0)
 	{
 	  bp->inserted = 1;
@@ -621,9 +936,39 @@ check_mem_read (CORE_ADDR mem_addr, unsi
 {
   struct process_info *proc = current_process ();
   struct raw_breakpoint *bp = proc->raw_breakpoints;
+  struct fast_tracepoint_jump *jp = proc->fast_tracepoint_jumps;
   CORE_ADDR mem_end = mem_addr + mem_len;
   int disabled_one = 0;
 
+  for (; jp != NULL; jp = jp->next)
+    {
+      CORE_ADDR bp_end = jp->pc + jp->length;
+      CORE_ADDR start, end;
+      int copy_offset, copy_len, buf_offset;
+
+      if (mem_addr >= bp_end)
+	continue;
+      if (jp->pc >= mem_end)
+	continue;
+
+      start = jp->pc;
+      if (mem_addr > start)
+	start = mem_addr;
+
+      end = bp_end;
+      if (end > mem_end)
+	end = mem_end;
+
+      copy_len = end - start;
+      copy_offset = start - jp->pc;
+      buf_offset = start - mem_addr;
+
+      if (jp->inserted)
+	memcpy (buf + buf_offset,
+		fast_tracepoint_jump_shadow (jp) + copy_offset,
+		copy_len);
+    }
+
   for (; bp != NULL; bp = bp->next)
     {
       CORE_ADDR bp_end = bp->pc + breakpoint_len;
@@ -665,9 +1010,42 @@ check_mem_write (CORE_ADDR mem_addr, uns
 {
   struct process_info *proc = current_process ();
   struct raw_breakpoint *bp = proc->raw_breakpoints;
+  struct fast_tracepoint_jump *jp = proc->fast_tracepoint_jumps;
   CORE_ADDR mem_end = mem_addr + mem_len;
   int disabled_one = 0;
 
+  /* First fast tracepoint jumps, then breakpoint traps on top.  */
+
+  for (; jp != NULL; jp = jp->next)
+    {
+      CORE_ADDR jp_end = jp->pc + jp->length;
+      CORE_ADDR start, end;
+      int copy_offset, copy_len, buf_offset;
+
+      if (mem_addr >= jp_end)
+	continue;
+      if (jp->pc >= mem_end)
+	continue;
+
+      start = jp->pc;
+      if (mem_addr > start)
+	start = mem_addr;
+
+      end = jp_end;
+      if (end > mem_end)
+	end = mem_end;
+
+      copy_len = end - start;
+      copy_offset = start - jp->pc;
+      buf_offset = start - mem_addr;
+
+      memcpy (fast_tracepoint_jump_shadow (jp) + copy_offset,
+	      buf + buf_offset, copy_len);
+      if (jp->inserted)
+	memcpy (buf + buf_offset,
+		fast_tracepoint_jump_insn (jp) + copy_offset, copy_len);
+    }
+
   for (; bp != NULL; bp = bp->next)
     {
       CORE_ADDR bp_end = bp->pc + breakpoint_len;
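The check_mem_read and check_mem_write changes above clip each fast tracepoint jump against the requested memory range, then either substitute the shadow bytes (reads) or record the new bytes in the shadow and keep the jump in place (writes).  Below is a standalone sketch of that clipping and layering for a single jump.  The struct jump_sketch type and the three function names are hypothetical simplifications of the patch's structures, with fixed-size arrays in place of the trailing array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct fast_tracepoint_jump.  */
struct jump_sketch
{
  unsigned long pc;
  size_t length;
  int inserted;
  unsigned char insn[8];	/* the jump written to the inferior */
  unsigned char shadow[8];	/* what memory would hold without it */
};

/* Clip [MEM_ADDR, MEM_ADDR + MEM_LEN) against JP.  Returns the number
   of overlapping bytes, and the offsets of the overlap within the
   jump and within the caller's buffer.  */
static size_t
clip_overlap (const struct jump_sketch *jp, unsigned long mem_addr,
	      size_t mem_len, size_t *jump_off, size_t *buf_off)
{
  unsigned long jp_end = jp->pc + jp->length;
  unsigned long mem_end = mem_addr + mem_len;
  unsigned long start, end;

  *jump_off = *buf_off = 0;
  if (mem_addr >= jp_end || jp->pc >= mem_end)
    return 0;

  start = jp->pc > mem_addr ? jp->pc : mem_addr;
  end = jp_end < mem_end ? jp_end : mem_end;
  *jump_off = start - jp->pc;
  *buf_off = start - mem_addr;
  return end - start;
}

/* After a raw read into BUF, hide an inserted jump from the reader.  */
static void
mask_jump_on_read (const struct jump_sketch *jp, unsigned long mem_addr,
		   unsigned char *buf, size_t mem_len)
{
  size_t jo, bo;
  size_t n = clip_overlap (jp, mem_addr, mem_len, &jo, &bo);

  if (n != 0 && jp->inserted)
    memcpy (buf + bo, jp->shadow + jo, n);
}

/* Before a raw write of BUF, remember the new "original" bytes in the
   shadow, then put the jump bytes back on top if it is inserted.  */
static void
layer_jump_on_write (struct jump_sketch *jp, unsigned long mem_addr,
		     unsigned char *buf, size_t mem_len)
{
  size_t jo, bo;
  size_t n = clip_overlap (jp, mem_addr, mem_len, &jo, &bo);

  if (n == 0)
    return;
  memcpy (jp->shadow + jo, buf + bo, n);
  if (jp->inserted)
    memcpy (buf + bo, jp->insn + jo, n);
}
```

This is why the insert and uninsert paths can simply write the current shadow contents: the shadow update becomes a no-op, and the inserted flag alone decides whether the jump bytes end up in the inferior.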
Index: src/gdb/gdbserver/mem-break.h
===================================================================
--- src.orig/gdb/gdbserver/mem-break.h	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/mem-break.h	2010-06-01 13:57:18.000000000 +0100
@@ -24,6 +24,7 @@
 
 /* Breakpoints are opaque.  */
 struct breakpoint;
+struct fast_tracepoint_jump;
 
 /* Create a new GDB breakpoint at WHERE.  Returns -1 if breakpoints
    are not supported on this target, 0 otherwise.  */
@@ -116,4 +117,30 @@ void free_all_breakpoints (struct proces
 
 void validate_breakpoints (void);
 
+/* Insert a fast tracepoint jump at WHERE, using instruction INSN, of
+   LENGTH bytes.  */
+
+struct fast_tracepoint_jump *set_fast_tracepoint_jump (CORE_ADDR where,
+						       unsigned char *insn,
+						       ULONGEST length);
+
+/* Delete fast tracepoint jump TODEL from our tables, and uninsert it
+   from memory.  */
+
+int delete_fast_tracepoint_jump (struct fast_tracepoint_jump *todel);
+
+/* Returns true if there's a fast tracepoint jump set at WHERE.  */
+
+int fast_tracepoint_jump_here (CORE_ADDR);
+
+/* Uninsert fast tracepoint jumps at PC (and change their status to
+   uninserted).  This still leaves the tracepoints in the table.  */
+
+void uninsert_fast_tracepoint_jumps_at (CORE_ADDR pc);
+
+/* Reinsert fast tracepoint jumps at WHERE (and change their status to
+   inserted).  */
+
+void reinsert_fast_tracepoint_jumps_at (CORE_ADDR where);
+
 #endif /* MEM_BREAK_H */
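The jump objects declared opaque in this header are reference counted and keep the jump instruction and its shadow copy in one trailing array of twice the instruction length.  The following is a small standalone sketch of that layout using a C99 flexible array member.  The struct jump_rec type and its helpers are hypothetical names for illustration, and unlike set_fast_tracepoint_jump, the shadow here is passed in by the caller instead of being read from the inferior.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record, for illustration only.  */
struct jump_rec
{
  int refcount;
  size_t length;
  /* LENGTH bytes of jump insn, then LENGTH bytes of shadow.  */
  unsigned char insn_and_shadow[];
};

static unsigned char *
jump_rec_insn (struct jump_rec *r)
{
  return r->insn_and_shadow;
}

static unsigned char *
jump_rec_shadow (struct jump_rec *r)
{
  return r->insn_and_shadow + r->length;
}

/* Allocate a record holding INSN and the original bytes ORIG, both
   LENGTH bytes long.  Returns NULL on allocation failure.  */
static struct jump_rec *
jump_rec_new (const unsigned char *insn, const unsigned char *orig,
	      size_t length)
{
  struct jump_rec *r = calloc (1, sizeof (*r) + 2 * length);

  if (r == NULL)
    return NULL;
  r->refcount = 1;
  r->length = length;
  memcpy (jump_rec_insn (r), insn, length);
  memcpy (jump_rec_shadow (r), orig, length);
  return r;
}

/* Drop one reference.  Returns 1 if the record was freed.  */
static int
jump_rec_unref (struct jump_rec *r)
{
  if (--r->refcount > 0)
    return 0;
  free (r);
  return 1;
}
```

A second tracepoint at the same address would only bump refcount, which mirrors the early return in set_fast_tracepoint_jump when find_fast_tracepoint_jump_at succeeds.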
Index: src/gdb/gdbserver/regcache.c
===================================================================
--- src.orig/gdb/gdbserver/regcache.c	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/regcache.c	2010-06-01 13:57:18.000000000 +0100
@@ -30,6 +30,8 @@ static int num_registers;
 
 const char **gdbserver_expedite_regs;
 
+#ifndef IN_PROCESS_AGENT
+
 struct regcache *
 get_thread_regcache (struct thread_info *thread, int fetch)
 {
@@ -82,9 +84,12 @@ regcache_invalidate (void)
   for_each_inferior (&all_threads, regcache_invalidate_one);
 }
 
+#endif
+
 struct regcache *
 init_register_cache (struct regcache *regcache, unsigned char *regbuf)
 {
+#ifndef IN_PROCESS_AGENT
   if (regbuf == NULL)
     {
       /* Make sure to zero-initialize the register cache when it is
@@ -95,6 +100,11 @@ init_register_cache (struct regcache *re
       regcache->registers_owned = 1;
     }
   else
+#else
+  if (regbuf == NULL)
+    fatal ("init_register_cache: can't allocate memory from the heap");
+  else
+#endif
     {
       regcache->registers = regbuf;
       regcache->registers_owned = 0;
@@ -105,6 +115,8 @@ init_register_cache (struct regcache *re
   return regcache;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 struct regcache *
 new_register_cache (void)
 {
@@ -122,11 +134,14 @@ free_register_cache (struct regcache *re
 {
   if (regcache)
     {
-      free (regcache->registers);
+      if (regcache->registers_owned)
+	free (regcache->registers);
       free (regcache);
     }
 }
 
+#endif
+
 void
 regcache_cpy (struct regcache *dst, struct regcache *src)
 {
@@ -134,6 +149,7 @@ regcache_cpy (struct regcache *dst, stru
   dst->registers_valid = src->registers_valid;
 }
 
+#ifndef IN_PROCESS_AGENT
 static void
 realloc_register_cache (struct inferior_list_entry *thread_p)
 {
@@ -146,15 +162,18 @@ realloc_register_cache (struct inferior_
   free_register_cache (regcache);
   set_inferior_regcache_data (thread, new_register_cache ());
 }
+#endif
 
 void
 set_register_cache (struct reg *regs, int n)
 {
   int offset, i;
 
+#ifndef IN_PROCESS_AGENT
   /* Before changing the register cache internal layout, flush the
      contents of valid caches back to the threads.  */
   regcache_invalidate ();
+#endif
 
   reg_defs = regs;
   num_registers = n;
@@ -172,8 +191,10 @@ set_register_cache (struct reg *regs, in
   if (2 * register_bytes + 32 > PBUFSIZ)
     fatal ("Register packet size exceeds PBUFSIZ.");
 
+#ifndef IN_PROCESS_AGENT
   /* Re-allocate all pre-existing register caches.  */
   for_each_inferior (&all_threads, realloc_register_cache);
+#endif
 }
 
 int
@@ -182,6 +203,8 @@ register_cache_size (void)
   return register_bytes;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 void
 registers_to_string (struct regcache *regcache, char *buf)
 {
@@ -236,6 +259,8 @@ find_register_by_number (int n)
   return &reg_defs[n];
 }
 
+#endif
+
 int
 register_size (int n)
 {
@@ -266,6 +291,8 @@ supply_regblock (struct regcache *regcac
     memset (regcache->registers, 0, register_bytes);
 }
 
+#ifndef IN_PROCESS_AGENT
+
 void
 supply_register_by_name (struct regcache *regcache,
 			 const char *name, const void *buf)
@@ -273,12 +300,16 @@ supply_register_by_name (struct regcache
   supply_register (regcache, find_regno (name), buf);
 }
 
+#endif
+
 void
 collect_register (struct regcache *regcache, int n, void *buf)
 {
   memcpy (buf, register_data (regcache, n, 1), register_size (n));
 }
 
+#ifndef IN_PROCESS_AGENT
+
 void
 collect_register_as_string (struct regcache *regcache, int n, char *buf)
 {
@@ -318,3 +349,5 @@ regcache_write_pc (struct regcache *regc
     internal_error (__FILE__, __LINE__,
 		    "regcache_write_pc: Unable to update PC");
 }
+
+#endif
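The free_register_cache change above only frees the register buffer when the cache allocated it, since init_register_cache can also be handed a caller-provided buffer.  A minimal standalone sketch of that ownership flag follows.  The struct regcache_sketch type and both function names are hypothetical, not gdbserver's.

```c
#include <stdlib.h>

/* Hypothetical, reduced version of a register cache.  */
struct regcache_sketch
{
  unsigned char *registers;
  int registers_owned;
};

/* Use REGBUF if given, otherwise allocate SIZE zeroed bytes and
   remember that we own them.  Returns 0 on success, -1 on failure.  */
static int
regcache_sketch_init (struct regcache_sketch *rc, unsigned char *regbuf,
		      size_t size)
{
  if (regbuf == NULL)
    {
      rc->registers = calloc (1, size);
      if (rc->registers == NULL)
	return -1;
      rc->registers_owned = 1;
    }
  else
    {
      rc->registers = regbuf;
      rc->registers_owned = 0;
    }
  return 0;
}

/* Release the buffer only if this cache allocated it.  */
static void
regcache_sketch_release (struct regcache_sketch *rc)
{
  if (rc->registers_owned)
    free (rc->registers);
  rc->registers = NULL;
  rc->registers_owned = 0;
}
```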
Index: src/gdb/gdbserver/remote-utils.c
===================================================================
--- src.orig/gdb/gdbserver/remote-utils.c	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/remote-utils.c	2010-06-01 13:57:18.000000000 +0100
@@ -1127,7 +1127,7 @@ write_enn (char *buf)
 }
 
 void
-convert_int_to_ascii (unsigned char *from, char *to, int n)
+convert_int_to_ascii (const unsigned char *from, char *to, int n)
 {
   int nib;
   int ch;
@@ -1144,7 +1144,7 @@ convert_int_to_ascii (unsigned char *fro
 
 
 void
-convert_ascii_to_int (char *from, unsigned char *to, int n)
+convert_ascii_to_int (const char *from, unsigned char *to, int n)
 {
   int nib1, nib2;
   while (n--)
@@ -1354,7 +1354,7 @@ decode_m_packet (char *from, CORE_ADDR *
 
 void
 decode_M_packet (char *from, CORE_ADDR *mem_addr_ptr, unsigned int *len_ptr,
-		 unsigned char *to)
+		 unsigned char **to_p)
 {
   int i = 0;
   char ch;
@@ -1372,12 +1372,15 @@ decode_M_packet (char *from, CORE_ADDR *
       *len_ptr |= fromhex (ch) & 0x0f;
     }
 
-  convert_ascii_to_int (&from[i++], to, *len_ptr);
+  if (*to_p == NULL)
+    *to_p = xmalloc (*len_ptr);
+
+  convert_ascii_to_int (&from[i++], *to_p, *len_ptr);
 }
 
 int
 decode_X_packet (char *from, int packet_len, CORE_ADDR *mem_addr_ptr,
-		 unsigned int *len_ptr, unsigned char *to)
+		 unsigned int *len_ptr, unsigned char **to_p)
 {
   int i = 0;
   char ch;
@@ -1395,8 +1398,11 @@ decode_X_packet (char *from, int packet_
       *len_ptr |= fromhex (ch) & 0x0f;
     }
 
+  if (*to_p == NULL)
+    *to_p = xmalloc (*len_ptr);
+
   if (remote_unescape_input ((const gdb_byte *) &from[i], packet_len - i,
-			     to, *len_ptr) != *len_ptr)
+			     *to_p, *len_ptr) != *len_ptr)
     return -1;
 
   return 0;
@@ -1565,6 +1571,101 @@ look_up_one_symbol (const char *name, CO
   return 1;
 }
 
+/* Relocate an instruction to execute at a different address.  OLDLOC
+   is the address in the inferior memory where the instruction to
+   relocate is currently at.  On input, TO points to the destination
+   where we want the instruction to be copied (and possibly adjusted)
+   to.  On output, it points to one past the end of the resulting
+   instruction(s).  The effect of executing the instruction at TO
+   shall be the same as if executing it at OLDLOC.  For example, call
+   instructions that implicitly push the return address on the stack
+   should be adjusted to return to the instruction after OLDLOC;
+   relative branches, and other PC-relative instructions need the
+   offset adjusted; etc.  Returns 0 on success, -1 on failure.  */
+
+int
+relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc)
+{
+  char own_buf[266];
+  int len;
+  ULONGEST written = 0;
+
+  /* Send the request.  */
+  strcpy (own_buf, "qRelocInsn:");
+  sprintf (own_buf, "qRelocInsn:%s;%s", paddress (oldloc),
+	   paddress (*to));
+  if (putpkt (own_buf) < 0)
+    return -1;
+
+  /* FIXME:  Eventually add buffer overflow checking (to getpkt?)  */
+  len = getpkt (own_buf);
+  if (len < 0)
+    return -1;
+
+  /* We ought to handle pretty much any packet at this point while we
+     wait for the qRelocInsn "response".  That requires re-entering
+     the main loop.  For now, this is an adequate approximation; allow
+     GDB to access memory.  */
+  while (own_buf[0] == 'm' || own_buf[0] == 'M' || own_buf[0] == 'X')
+    {
+      CORE_ADDR mem_addr;
+      unsigned char *mem_buf = NULL;
+      unsigned int mem_len;
+
+      if (own_buf[0] == 'm')
+	{
+	  decode_m_packet (&own_buf[1], &mem_addr, &mem_len);
+	  mem_buf = xmalloc (mem_len);
+	  if (read_inferior_memory (mem_addr, mem_buf, mem_len) == 0)
+	    convert_int_to_ascii (mem_buf, own_buf, mem_len);
+	  else
+	    write_enn (own_buf);
+	}
+      else if (own_buf[0] == 'X')
+	{
+	  if (decode_X_packet (&own_buf[1], len - 1, &mem_addr,
+			       &mem_len, &mem_buf) < 0
+	      || write_inferior_memory (mem_addr, mem_buf, mem_len) != 0)
+	    write_enn (own_buf);
+	  else
+	    write_ok (own_buf);
+	}
+      else
+	{
+	  decode_M_packet (&own_buf[1], &mem_addr, &mem_len, &mem_buf);
+	  if (write_inferior_memory (mem_addr, mem_buf, mem_len) == 0)
+	    write_ok (own_buf);
+	  else
+	    write_enn (own_buf);
+	}
+      free (mem_buf);
+      if (putpkt (own_buf) < 0)
+	return -1;
+      len = getpkt (own_buf);
+      if (len < 0)
+	return -1;
+    }
+
+  if (own_buf[0] == 'E')
+    {
+      warning ("An error occurred while relocating an instruction: %s\n",
+	       own_buf);
+      return -1;
+    }
+
+  if (strncmp (own_buf, "qRelocInsn:", strlen ("qRelocInsn:")) != 0)
+    {
+      warning ("Malformed response to qRelocInsn, ignoring: %s\n",
+	       own_buf);
+      return -1;
+    }
+
+  unpack_varlen_hex (own_buf + strlen ("qRelocInsn:"), &written);
+
+  *to += written;
+  return 0;
+}
+
 void
 monitor_output (const char *msg)
 {
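relocate_instruction above sends "qRelocInsn:" followed by the old and new addresses in hex, separated by a semicolon, and expects a reply that starts with the same prefix followed by a hex byte count.  Here is a standalone sketch of just the formatting and parsing, with bounds checking on the request buffer.  The names build_reloc_request and parse_reloc_reply are hypothetical, and strtoull stands in for gdbserver's unpack_varlen_hex.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper.  Format a qRelocInsn request into BUF of SIZE
   bytes.  Returns 0 on success, -1 if the packet would not fit.  */
static int
build_reloc_request (char *buf, size_t size,
		     unsigned long long from, unsigned long long to)
{
  int n = snprintf (buf, size, "qRelocInsn:%llx;%llx", from, to);

  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Hypothetical helper.  Parse a "qRelocInsn:<hex>" reply, storing the
   number of bytes written at the destination in *ADJUSTED_SIZE.
   Returns 0 on success, -1 on an error or malformed reply.  */
static int
parse_reloc_reply (const char *reply, unsigned long long *adjusted_size)
{
  static const char prefix[] = "qRelocInsn:";
  size_t plen = sizeof (prefix) - 1;
  char *end;

  if (strncmp (reply, prefix, plen) != 0)
    return -1;

  *adjusted_size = strtoull (reply + plen, &end, 16);
  if (end == reply + plen)
    return -1;			/* no hex digits after the prefix */
  return 0;
}
```

On success the caller would advance its build address by the parsed size, as the patch does with *to += written.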
Index: src/gdb/gdbserver/server.c
===================================================================
--- src.orig/gdb/gdbserver/server.c	2010-06-01 13:44:39.000000000 +0100
+++ src/gdb/gdbserver/server.c	2010-06-01 14:00:12.000000000 +0100
@@ -920,6 +920,9 @@ handle_query (char *own_buf, int packet_
 	 we access breakpoint shadows.  */
       validate_breakpoints ();
 
+      if (target_supports_tracepoints ())
+	tracepoint_look_up_symbols ();
+
       if (target_running () && the_target->look_up_symbols != NULL)
 	(*the_target->look_up_symbols) ();
 
@@ -1338,6 +1341,7 @@ handle_query (char *own_buf, int packet_
       && (own_buf[10] == ':' || own_buf[10] == '\0'))
     {
       char *p = &own_buf[10];
+      int gdb_supports_qRelocInsn = 0;
 
       /* Start processing qSupported packet.  */
       target_process_qsupported (NULL);
@@ -1372,6 +1376,11 @@ handle_query (char *own_buf, int packet_
 		  if (target_supports_multi_process ())
 		    multi_process = 1;
 		}
+	      else if (strcmp (p, "qRelocInsn+") == 0)
+		{
+		  /* GDB supports relocate instruction requests.  */
+		  gdb_supports_qRelocInsn = 1;
+		}
 	      else
 		target_process_qsupported (p);
 
@@ -1422,6 +1431,8 @@ handle_query (char *own_buf, int packet_
 	  strcat (own_buf, ";TraceStateVariables+");
 	  strcat (own_buf, ";TracepointSource+");
 	  strcat (own_buf, ";DisconnectedTracing+");
+	  if (gdb_supports_qRelocInsn && target_supports_fast_tracepoints ())
+	    strcat (own_buf, ";FastTracepoints+");
 	}
 
       return;
@@ -2122,6 +2133,7 @@ handle_status (char *own_buf)
   else
     {
       pause_all (0);
+      stabilize_threads ();
       gdb_wants_all_threads_stopped ();
 
       if (all_threads.head)
@@ -2821,7 +2833,7 @@ process_serial_event (void)
       break;
     case 'M':
       require_running (own_buf);
-      decode_M_packet (&own_buf[1], &mem_addr, &len, mem_buf);
+      decode_M_packet (&own_buf[1], &mem_addr, &len, &mem_buf);
       if (write_memory (mem_addr, mem_buf, len) == 0)
 	write_ok (own_buf);
       else
@@ -2830,7 +2842,7 @@ process_serial_event (void)
     case 'X':
       require_running (own_buf);
       if (decode_X_packet (&own_buf[1], packet_len - 1,
-			   &mem_addr, &len, mem_buf) < 0
+			   &mem_addr, &len, &mem_buf) < 0
 	  || write_memory (mem_addr, mem_buf, len) != 0)
 	write_enn (own_buf);
       else
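The handle_query change above advertises FastTracepoints+ only when GDB listed qRelocInsn+ in its qSupported packet and the target can do fast tracepoints.  A standalone sketch of that decision is below.  has_feature and advertise_fast_tracepoints are hypothetical names, and the feature list is assumed to be the semicolon-separated text after "qSupported:".

```c
#include <string.h>

/* Hypothetical helper.  Return 1 if FEATURE appears as a whole
   ';'-separated item in LIST, 0 otherwise.  */
static int
has_feature (const char *list, const char *feature)
{
  size_t flen = strlen (feature);

  while (*list != '\0')
    {
      size_t n = strcspn (list, ";");

      if (n == flen && strncmp (list, feature, n) == 0)
	return 1;
      list += n;
      if (*list == ';')
	list++;
    }
  return 0;
}

/* Fast tracepoints need GDB to relocate instructions for us, so both
   sides have to agree before the feature is announced.  */
static int
advertise_fast_tracepoints (const char *gdb_features,
			    int target_supports_fast_tracepoints)
{
  return (has_feature (gdb_features, "qRelocInsn+")
	  && target_supports_fast_tracepoints);
}
```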
Index: src/gdb/gdbserver/server.h
===================================================================
--- src.orig/gdb/gdbserver/server.h	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/server.h	2010-06-01 13:57:18.000000000 +0100
@@ -221,6 +221,7 @@ struct dll_info
 struct sym_cache;
 struct breakpoint;
 struct raw_breakpoint;
+struct fast_tracepoint_jump;
 struct process_info_private;
 
 struct process_info
@@ -244,6 +245,9 @@ struct process_info
   /* The list of raw memory breakpoints.  */
   struct raw_breakpoint *raw_breakpoints;
 
+  /* The list of installed fast tracepoints.  */
+  struct fast_tracepoint_jump *fast_tracepoint_jumps;
+
   /* Private target data.  */
   struct process_info_private *private;
 };
@@ -379,8 +383,8 @@ void initialize_async_io (void);
 void enable_async_io (void);
 void disable_async_io (void);
 void check_remote_input_interrupt_request (void);
-void convert_ascii_to_int (char *from, unsigned char *to, int n);
-void convert_int_to_ascii (unsigned char *from, char *to, int n);
+void convert_ascii_to_int (const char *from, unsigned char *to, int n);
+void convert_int_to_ascii (const unsigned char *from, char *to, int n);
 void new_thread_notify (int id);
 void dead_thread_notify (int id);
 void prepare_resume_reply (char *buf, ptid_t ptid,
@@ -391,9 +395,9 @@ void decode_address (CORE_ADDR *addrp, c
 void decode_m_packet (char *from, CORE_ADDR * mem_addr_ptr,
 		      unsigned int *len_ptr);
 void decode_M_packet (char *from, CORE_ADDR * mem_addr_ptr,
-		      unsigned int *len_ptr, unsigned char *to);
+		      unsigned int *len_ptr, unsigned char **to_p);
 int decode_X_packet (char *from, int packet_len, CORE_ADDR * mem_addr_ptr,
-		     unsigned int *len_ptr, unsigned char *to);
+		     unsigned int *len_ptr, unsigned char **to_p);
 int decode_xfer_write (char *buf, int packet_len, char **annex,
 		       CORE_ADDR *offset, unsigned int *len,
 		       unsigned char *data);
@@ -412,6 +416,8 @@ char *unpack_varlen_hex (char *buff,  UL
 void clear_symbol_cache (struct sym_cache **symcache_p);
 int look_up_one_symbol (const char *name, CORE_ADDR *addrp, int may_ask_gdb);
 
+int relocate_instruction (CORE_ADDR *to, CORE_ADDR oldloc);
+
 void monitor_output (const char *msg);
 
 char *xml_escape_text (const char *text);
@@ -507,11 +513,15 @@ char *phex_nz (ULONGEST l, int sizeof_l)
 
 /* Functions from tracepoint.c */
 
+int in_process_agent_loaded (void);
+
 void initialize_tracepoint (void);
 
 extern int tracing;
 extern int disconnected_tracing;
 
+void tracepoint_look_up_symbols (void);
+
 void stop_tracing (void);
 
 int handle_tracepoint_general_set (char *own_buf);
@@ -532,6 +542,37 @@ int fetch_traceframe_registers (int tfnu
 				struct regcache *regcache,
 				int regnum);
 
+/* If a thread is determined to be collecting a fast tracepoint, this
+   structure holds the collect status.  */
+
+struct fast_tpoint_collect_status
+{
+  /* The tracepoint that is presently being collected.  */
+  int tpoint_num;
+  CORE_ADDR tpoint_addr;
+
+  /* The address range in the jump pad to which the original
+     instruction (the one the tracepoint jump was inserted over)
+     was relocated.  */
+  CORE_ADDR adjusted_insn_addr;
+  CORE_ADDR adjusted_insn_addr_end;
+};
+
+int fast_tracepoint_collecting (CORE_ADDR thread_area,
+				CORE_ADDR stop_pc,
+				struct fast_tpoint_collect_status *status);
+void force_unlock_trace_buffer (void);
+
+int handle_tracepoint_bkpts (struct thread_info *tinfo, CORE_ADDR stop_pc);
+
+#ifdef IN_PROCESS_AGENT
+void initialize_low_tracepoint (void);
+void supply_fast_tracepoint_registers (struct regcache *regcache,
+				       const unsigned char *regs);
+#else
+void stop_tracing (void);
+#endif
+
 /* Version information, from version.c.  */
 extern const char version[];
 extern const char host_name[];
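As an illustration of how the adjusted_insn_addr range in struct fast_tpoint_collect_status might be consumed, here is a small standalone sketch.  It assumes the range is half-open (the end address is one byte past the relocated instruction), as the patch documents for the same-named fields in struct tracepoint.  The CORE_ADDR typedef and the pc_in_relocated_insn helper are assumptions made for this example and are not in the patch.

```c
#include <stdint.h>

/* Assumption for this sketch only; GDBserver defines its own
   CORE_ADDR.  */
typedef uint64_t CORE_ADDR;

/* Local copy of the structure from the patch.  */
struct fast_tpoint_collect_status
{
  int tpoint_num;
  CORE_ADDR tpoint_addr;
  CORE_ADDR adjusted_insn_addr;
  CORE_ADDR adjusted_insn_addr_end;
};

/* Hypothetical helper: return 1 if PC lies within the relocated
   copy of the original instruction, treating the range as
   [adjusted_insn_addr, adjusted_insn_addr_end).  */
static int
pc_in_relocated_insn (const struct fast_tpoint_collect_status *status,
                      CORE_ADDR pc)
{
  return (status->adjusted_insn_addr <= pc
          && pc < status->adjusted_insn_addr_end);
}
```

A thread stopped with its PC inside that range has finished collecting and is executing the relocated instruction in the jump pad, which is the kind of distinction the stabilizing code needs to make.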
Index: src/gdb/gdbserver/target.h
===================================================================
--- src.orig/gdb/gdbserver/target.h	2010-06-01 13:44:36.000000000 +0100
+++ src/gdb/gdbserver/target.h	2010-06-01 13:57:18.000000000 +0100
@@ -324,6 +324,31 @@ struct target_ops
 
   /* Cancel all pending breakpoints hits in all threads.  */
   void (*cancel_breakpoints) (void);
+
+  /* Stabilize all threads.  That is, force them out of jump pads.  */
+  void (*stabilize_threads) (void);
+
+  /* Install a fast tracepoint jump pad.  TPOINT is the address of the
+     tracepoint internal object as used by the IPA.  TPADDR is the
+     address of the tracepoint.  COLLECTOR is the address of the
+     function the jump pad redirects to.  LOCKADDR is the address of
+     the jump pad lock object.  ORIG_SIZE is the size in bytes of the
+     instruction at TPADDR.  JUMP_ENTRY points to the address of the
+     jump pad entry, and on return holds the address past the end of
+     the created jump pad.  JJUMP_PAD_INSN is a buffer containing a
+     copy of the instruction at TPADDR.  ADJUSTED_INSN_ADDR and
+     ADJUSTED_INSN_ADDR_END are output parameters that return the
+     address range where the instruction at TPADDR was relocated
+     to.  */
+  int (*install_fast_tracepoint_jump_pad) (CORE_ADDR tpoint, CORE_ADDR tpaddr,
+					   CORE_ADDR collector,
+					   CORE_ADDR lockaddr,
+					   ULONGEST orig_size,
+					   CORE_ADDR *jump_entry,
+					   unsigned char *jjump_pad_insn,
+					   ULONGEST *jjump_pad_insn_size,
+					   CORE_ADDR *adjusted_insn_addr,
+					   CORE_ADDR *adjusted_insn_addr_end);
 };
 
 extern struct target_ops *the_target;
@@ -378,6 +403,9 @@ void set_target_ops (struct target_ops *
   (the_target->supports_tracepoints			\
    ? (*the_target->supports_tracepoints) () : 0)
 
+#define target_supports_fast_tracepoints()		\
+  (the_target->install_fast_tracepoint_jump_pad != NULL)
+
 #define thread_stopped(thread) \
   (*the_target->thread_stopped) (thread)
 
@@ -402,6 +430,28 @@ void set_target_ops (struct target_ops *
 	(*the_target->cancel_breakpoints) ();  	\
     } while (0)
 
+#define stabilize_threads()			\
+  do						\
+    {						\
+      if (the_target->stabilize_threads)     	\
+	(*the_target->stabilize_threads) ();  	\
+    } while (0)
+
+#define install_fast_tracepoint_jump_pad(tpoint, tpaddr,		\
+					 collector, lockaddr,		\
+					 orig_size,			\
+					 jump_entry, jjump_pad_insn,	\
+					 jjump_pad_insn_size,		\
+					 adjusted_insn_addr,		\
+					 adjusted_insn_addr_end)	\
+  (*the_target->install_fast_tracepoint_jump_pad) (tpoint, tpaddr,	\
+						   collector,lockaddr,	\
+						   orig_size, jump_entry, \
+						   jjump_pad_insn,	\
+						   jjump_pad_insn_size, \
+						   adjusted_insn_addr,	\
+						   adjusted_insn_addr_end)
+
 /* Start non-stop mode, returns 0 on success, -1 on failure.   */
 
 int start_non_stop (int nonstop);
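The target.h additions follow the existing optional-hook convention: a NULL function pointer means "not supported", and the wrapper macros either test the pointer or skip the call.  A reduced standalone sketch of that pattern is below; every demo_* name is hypothetical and exists only for this example.

```c
#include <stddef.h>

/* Hypothetical, cut-down analogue of struct target_ops with only
   the two hooks relevant here.  */
struct demo_target_ops
{
  void (*stabilize_threads) (void);
  int (*install_fast_tracepoint_jump_pad) (void);
};

static struct demo_target_ops *demo_target;

/* Counts how often the stabilize hook actually ran.  */
static int stabilize_calls;

static void
demo_stabilize (void)
{
  stabilize_calls++;
}

static int
demo_install (void)
{
  return 0;
}

/* Capability test: the feature exists iff the hook is filled in.  */
#define demo_supports_fast_tracepoints() \
  (demo_target->install_fast_tracepoint_jump_pad != NULL)

/* Optional hook: silently a no-op on targets that leave it NULL.  */
#define demo_stabilize_threads()                \
  do                                            \
    {                                           \
      if (demo_target->stabilize_threads)       \
        (*demo_target->stabilize_threads) ();   \
    } while (0)
```

The do/while (0) wrapper keeps the macro usable as a single statement, and targets that do not implement jump pads need no changes at all, since their hooks stay NULL.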
Index: src/gdb/gdbserver/tracepoint.c
===================================================================
--- src.orig/gdb/gdbserver/tracepoint.c	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/tracepoint.c	2010-06-01 13:57:18.000000000 +0100
@@ -21,11 +21,45 @@
 #include <fcntl.h>
 #include <unistd.h>
 #include <sys/time.h>
+#include <stddef.h>
+#if HAVE_MALLOC_H
+#include <malloc.h>
+#endif
+#if HAVE_STDINT_H
+#include <stdint.h>
+#endif
+
+/* This file is built for both GDBserver and the in-process
+   agent (IPA), a shared library that includes a tracing agent that is
+   loaded by the inferior to support fast tracepoints.  Fast
+   tracepoints (or more accurately, jump based tracepoints) are
+   implemented by patching the tracepoint location with a jump into a
+   small trampoline function whose job is to save the register state,
+   call the in-process tracing agent, and then execute the original
+   instruction that was under the tracepoint jump (possibly adjusted,
+   if PC-relative, or some such).
+
+   The current synchronization design is pull based.  That means,
+   GDBserver does most of the work, by peeking/poking at the inferior
+   agent's memory directly for downloading tracepoint and associated
+   objects, and for uploading trace frames.  Whenever the IPA needs
+   something from GDBserver (trace buffer is full, tracing stopped for
+   some reason, etc.) the IPA calls a corresponding hook function
+   where GDBserver has placed a breakpoint.
+
+   Each of the agents has its own trace buffer.  When browsing the
+   trace frames built from slow and fast tracepoints from GDB (tfind
+   mode), there's no guarantee the user is seeing the trace frames in
+   strict chronological creation order, although GDBserver tries to
+   keep the order relatively reasonable, by syncing the trace buffers
+   at appropriate times.
+
+*/
 
-static void trace_debug_1 (const char *, ...) ATTR_FORMAT (printf, 1, 2);
+static void trace_vdebug (const char *, ...) ATTR_FORMAT (printf, 1, 2);
 
 static void
-trace_debug_1 (const char *fmt, ...)
+trace_vdebug (const char *fmt, ...)
 {
   char buf[1024];
   va_list ap;
@@ -36,12 +70,254 @@ trace_debug_1 (const char *fmt, ...)
   va_end (ap);
 }
 
-#define trace_debug(FMT, args...)		\
+#define trace_debug_1(level, fmt, args...)	\
   do {						\
-    if (debug_threads)				\
-      trace_debug_1 ((FMT), ##args);		\
+    if (level <= debug_threads)			\
+      trace_vdebug ((fmt), ##args);		\
   } while (0)
 
+#define trace_debug(FMT, args...)		\
+  trace_debug_1 (1, FMT, ##args)
+
+#if defined(__GNUC__)
+#  define ATTR_USED __attribute__((used))
+#  define ATTR_NOINLINE __attribute__((noinline))
+#  define ATTR_CONSTRUCTOR __attribute__ ((constructor))
+#else
+#  define ATTR_USED
+#  define ATTR_NOINLINE
+#  define ATTR_CONSTRUCTOR
+#endif
+
+/* Make sure the functions the IPA needs to export (symbols GDBserver
+   needs to query GDB about) are exported.  */
+
+#ifdef IN_PROCESS_AGENT
+# if defined _WIN32 || defined __CYGWIN__
+#   define IP_AGENT_EXPORT __declspec(dllexport) ATTR_USED
+# else
+#   if __GNUC__ >= 4
+#     define IP_AGENT_EXPORT \
+  __attribute__ ((visibility("default"))) ATTR_USED
+#   else
+#     define IP_AGENT_EXPORT ATTR_USED
+#   endif
+# endif
+#else
+#  define IP_AGENT_EXPORT
+#endif
+
+/* Prefix exported symbols, for good citizenship.  All the symbols
+   that need exporting are defined in this module.  */
+#ifdef IN_PROCESS_AGENT
+# define gdb_tp_heap_buffer gdb_agent_gdb_tp_heap_buffer
+# define gdb_jump_pad_buffer gdb_agent_gdb_jump_pad_buffer
+# define gdb_jump_pad_buffer_end gdb_agent_gdb_jump_pad_buffer_end
+# define collecting gdb_agent_collecting
+# define gdb_collect gdb_agent_gdb_collect
+# define stop_tracing gdb_agent_stop_tracing
+# define flush_trace_buffer gdb_agent_flush_trace_buffer
+# define about_to_request_buffer_space gdb_agent_about_to_request_buffer_space
+# define trace_buffer_is_full gdb_agent_trace_buffer_is_full
+# define stopping_tracepoint gdb_agent_stopping_tracepoint
+# define expr_eval_result gdb_agent_expr_eval_result
+# define error_tracepoint gdb_agent_error_tracepoint
+# define tracepoints gdb_agent_tracepoints
+# define tracing gdb_agent_tracing
+# define trace_buffer_ctrl gdb_agent_trace_buffer_ctrl
+# define trace_buffer_ctrl_curr gdb_agent_trace_buffer_ctrl_curr
+# define trace_buffer_lo gdb_agent_trace_buffer_lo
+# define trace_buffer_hi gdb_agent_trace_buffer_hi
+# define traceframe_read_count gdb_agent_traceframe_read_count
+# define traceframe_write_count gdb_agent_traceframe_write_count
+# define traceframes_created gdb_agent_traceframes_created
+# define trace_state_variables gdb_agent_trace_state_variables
+#endif
+
+#ifndef IN_PROCESS_AGENT
+
+/* Addresses of in-process agent's symbols GDBserver cares about.  */
+
+struct ipa_sym_addresses
+{
+  CORE_ADDR addr_gdb_tp_heap_buffer;
+  CORE_ADDR addr_gdb_jump_pad_buffer;
+  CORE_ADDR addr_gdb_jump_pad_buffer_end;
+  CORE_ADDR addr_collecting;
+  CORE_ADDR addr_gdb_collect;
+  CORE_ADDR addr_stop_tracing;
+  CORE_ADDR addr_flush_trace_buffer;
+  CORE_ADDR addr_about_to_request_buffer_space;
+  CORE_ADDR addr_trace_buffer_is_full;
+  CORE_ADDR addr_stopping_tracepoint;
+  CORE_ADDR addr_expr_eval_result;
+  CORE_ADDR addr_error_tracepoint;
+  CORE_ADDR addr_tracepoints;
+  CORE_ADDR addr_tracing;
+  CORE_ADDR addr_trace_buffer_ctrl;
+  CORE_ADDR addr_trace_buffer_ctrl_curr;
+  CORE_ADDR addr_trace_buffer_lo;
+  CORE_ADDR addr_trace_buffer_hi;
+  CORE_ADDR addr_traceframe_read_count;
+  CORE_ADDR addr_traceframe_write_count;
+  CORE_ADDR addr_traceframes_created;
+  CORE_ADDR addr_trace_state_variables;
+};
+
+#define STRINGIZE_1(STR) #STR
+#define STRINGIZE(STR) STRINGIZE_1(STR)
+#define IPA_SYM(SYM)				\
+  {							\
+    STRINGIZE (gdb_agent_ ## SYM),			\
+    offsetof (struct ipa_sym_addresses, addr_ ## SYM)	\
+  }
+
+static struct
+{
+  const char *name;
+  int offset;
+  int required;
+} symbol_list[] = {
+  IPA_SYM(gdb_tp_heap_buffer),
+  IPA_SYM(gdb_jump_pad_buffer),
+  IPA_SYM(gdb_jump_pad_buffer_end),
+  IPA_SYM(collecting),
+  IPA_SYM(gdb_collect),
+  IPA_SYM(stop_tracing),
+  IPA_SYM(flush_trace_buffer),
+  IPA_SYM(about_to_request_buffer_space),
+  IPA_SYM(trace_buffer_is_full),
+  IPA_SYM(stopping_tracepoint),
+  IPA_SYM(expr_eval_result),
+  IPA_SYM(error_tracepoint),
+  IPA_SYM(tracepoints),
+  IPA_SYM(tracing),
+  IPA_SYM(trace_buffer_ctrl),
+  IPA_SYM(trace_buffer_ctrl_curr),
+  IPA_SYM(trace_buffer_lo),
+  IPA_SYM(trace_buffer_hi),
+  IPA_SYM(traceframe_read_count),
+  IPA_SYM(traceframe_write_count),
+  IPA_SYM(traceframes_created),
+  IPA_SYM(trace_state_variables),
+};
+
+struct ipa_sym_addresses ipa_sym_addrs;
+
+int all_tracepoint_symbols_looked_up;
+
+int
+in_process_agent_loaded (void)
+{
+  return all_tracepoint_symbols_looked_up;
+}
+
+static int read_inferior_integer (CORE_ADDR symaddr, int *val);
+
+static void
+write_e_ipa_not_loaded (char *buffer)
+{
+  sprintf (buffer,
+	   "E.In-process agent library not loaded in process.  "
+	   "Dynamic tracepoints unavailable.");
+}
+
+static int
+maybe_write_ipa_not_loaded (char *buffer)
+{
+  if (!in_process_agent_loaded ())
+    {
+      write_e_ipa_not_loaded (buffer);
+      return 1;
+    }
+  return 0;
+}
+
+/* Cache all future symbols that the tracepoints module might request.
+   We can not request symbols at arbitrary states in the remote
+   protocol, only when the client tells us that new symbols are
+   available.  So when we load the in-process library, make sure to
+   check the entire list.  */
+
+void
+tracepoint_look_up_symbols (void)
+{
+  int all_ok;
+  int i;
+
+  if (all_tracepoint_symbols_looked_up)
+    return;
+
+  all_ok = 1;
+  for (i = 0; i < sizeof (symbol_list) / sizeof (symbol_list[0]); i++)
+    {
+      CORE_ADDR *addrp =
+	(CORE_ADDR *) ((char *) &ipa_sym_addrs + symbol_list[i].offset);
+
+      if (look_up_one_symbol (symbol_list[i].name, addrp, 1) == 0)
+	{
+	  if (debug_threads)
+	    fprintf (stderr, "symbol `%s' not found\n", symbol_list[i].name);
+	  all_ok = 0;
+	}
+    }
+
+  all_tracepoint_symbols_looked_up = all_ok;
+}
+
+#endif
+
+/* GDBserver places a breakpoint on the IPA's version (which is a nop)
+   of the "stop_tracing" function.  When this breakpoint is hit,
+   tracing stopped in the IPA for some reason.  E.g., due to
+   tracepoint reaching the pass count, hitting conditional expression
+   evaluation error, etc.
+
+   The IPA's trace buffer is never in circular tracing mode: instead,
+   GDBserver's is, and whenever the in-process buffer fills, it calls
+   "flush_trace_buffer", which triggers an internal breakpoint.
+   GDBserver reacts to this breakpoint by pulling the data collected
+   in the meantime.  Discarding old frames is always handled on the
+   GDBserver side.  */
+
+#ifdef IN_PROCESS_AGENT
+int debug_threads = 0;
+
+int
+read_inferior_memory (CORE_ADDR memaddr, unsigned char *myaddr, int len)
+{
+  memcpy (myaddr, (void *) (uintptr_t) memaddr, len);
+  return 0;
+}
+
+/* Call this in the functions where GDBserver places a breakpoint, so
+   that the compiler doesn't try to be clever and skip calling the
+   function at all.  This is necessary, even if we tell the compiler
+   to not inline said functions.  */
+
+#if defined(__GNUC__)
+#  define UNKNOWN_SIDE_EFFECTS() asm ("")
+#else
+#  define UNKNOWN_SIDE_EFFECTS() do {} while (0)
+#endif
+
+IP_AGENT_EXPORT void ATTR_USED ATTR_NOINLINE
+stop_tracing (void)
+{
+  /* GDBserver places breakpoint here.  */
+  UNKNOWN_SIDE_EFFECTS();
+}
+
+IP_AGENT_EXPORT void ATTR_USED ATTR_NOINLINE
+flush_trace_buffer (void)
+{
+  /* GDBserver places breakpoint here.  */
+  UNKNOWN_SIDE_EFFECTS();
+}
+
+#endif
+
+#ifndef IN_PROCESS_AGENT
 static int
 tracepoint_handler (CORE_ADDR address)
 {
@@ -50,6 +326,65 @@ tracepoint_handler (CORE_ADDR address)
   return 0;
 }
 
+/* Breakpoint at "stop_tracing" in the inferior lib.  */
+struct breakpoint *stop_tracing_bkpt;
+static int stop_tracing_handler (CORE_ADDR);
+
+/* Breakpoint at "flush_trace_buffer" in the inferior lib.  */
+struct breakpoint *flush_trace_buffer_bkpt;
+static int flush_trace_buffer_handler (CORE_ADDR);
+
+static void download_tracepoints (void);
+static void download_trace_state_variables (void);
+static void upload_fast_traceframes (void);
+
+static int
+read_inferior_integer (CORE_ADDR symaddr, int *val)
+{
+  return read_inferior_memory (symaddr, (unsigned char *) val,
+			       sizeof (*val));
+}
+
+static int
+read_inferior_uinteger (CORE_ADDR symaddr, unsigned int *val)
+{
+  return read_inferior_memory (symaddr, (unsigned char *) val,
+			       sizeof (*val));
+}
+
+static int
+read_inferior_data_pointer (CORE_ADDR symaddr, CORE_ADDR *val)
+{
+  void *pval = (void *) (uintptr_t) val;
+  int ret;
+
+  ret = read_inferior_memory (symaddr, (unsigned char *) &pval, sizeof (pval));
+  *val = (uintptr_t) pval;
+  return ret;
+}
+
+static int
+write_inferior_data_pointer (CORE_ADDR symaddr, CORE_ADDR val)
+{
+  void *pval = (void *) (uintptr_t) val;
+  return write_inferior_memory (symaddr,
+				(unsigned char *) &pval, sizeof (pval));
+}
+
+static int
+write_inferior_integer (CORE_ADDR symaddr, int val)
+{
+  return write_inferior_memory (symaddr, (unsigned char *) &val, sizeof (val));
+}
+
+static int
+write_inferior_uinteger (CORE_ADDR symaddr, unsigned int val)
+{
+  return write_inferior_memory (symaddr, (unsigned char *) &val, sizeof (val));
+}
+
+#endif
+
 /* This enum must exactly match what is documented in
    gdb/doc/agentexpr.texi, including all the numerical values.  */
 
@@ -198,12 +533,6 @@ struct eval_expr_action
   struct agent_expr *expr;
 };
 
-/* An 'L' (collect static trace data) action.  */
-struct collect_static_trace_data_action
-{
-  struct tracepoint_action base;
-};
-
 /* This structure describes a piece of the source-level definition of
    the tracepoint.  The contents are not interpreted by the target,
    but preserved verbatim for uploading upon reconnection.  */
@@ -224,6 +553,15 @@ struct source_string
   struct source_string *next;
 };
 
+enum tracepoint_type
+{
+  /* Trap based tracepoint.  */
+  trap_tracepoint,
+
+  /* A fast tracepoint implemented with a jump instead of a trap.  */
+  fast_tracepoint,
+};
+
 struct tracepoint_hit_ctx;
 
 /* The definition of a tracepoint.  */
@@ -248,6 +586,9 @@ struct tracepoint
      tracepoints may share an address.  */
   CORE_ADDR address;
 
+  /* Tracepoint type.  */
+  enum tracepoint_type type;
+
   /* True if the tracepoint is currently enabled.  */
   int enabled;
 
@@ -266,32 +607,61 @@ struct tracepoint
   /* The list of actions to take when the tracepoint triggers.  */
   int numactions;
   struct tracepoint_action **actions;
-  /* Same, but in string/packet form.  */
-  char **actions_str;
-
-  /* The list of actions to take while in a stepping loop.  */
-  int num_step_actions;
-  struct tracepoint_action **step_actions;
-  /* Same, but in string/packet form.  */
-  char **step_actions_str;
 
   /* Count of the times we've hit this tracepoint during the run.
      Note that while-stepping steps are not counted as "hits".  */
   long hit_count;
 
+  /* Link to the next tracepoint in the list.  */
+  struct tracepoint *next;
+
+#ifndef IN_PROCESS_AGENT
+  /* The list of actions to take when the tracepoint triggers, in
+     string/packet form.  */
+  char **actions_str;
+
   /* The collection of strings that describe the tracepoint as it was
      entered into GDB.  These are not used by the target, but are
      reported back to GDB upon reconnection.  */
   struct source_string *source_strings;
 
-  /* Handle returned by the breakpoint module when we inserted the
-     trap.  NULL if we haven't inserted it yet.  */
+  /* The number of bytes displaced by fast tracepoints.  It may subsume
+     multiple instructions, for multi-byte fast tracepoints.  This
+     field is only valid for fast tracepoints.  */
+  int orig_size;
+
+  /* Only for fast tracepoints.  */
+  CORE_ADDR obj_addr_on_target;
+
+  /* Address range where the original instruction under a fast
+     tracepoint was relocated to.  (_end is actually one byte past
+     the end).  */
+  CORE_ADDR adjusted_insn_addr;
+  CORE_ADDR adjusted_insn_addr_end;
+
+  /* The address range of the piece of the jump pad buffer that was
+     assigned to this fast tracepoint.  (_end is actually one byte
+     past the end).  */
+  CORE_ADDR jump_pad;
+  CORE_ADDR jump_pad_end;
+
+  /* The list of actions to take while in a stepping loop.  These
+     fields are only valid for patch-based tracepoints.  */
+  int num_step_actions;
+  struct tracepoint_action **step_actions;
+  /* Same, but in string/packet form.  */
+  char **step_actions_str;
+
+  /* Handle returned by the breakpoint or tracepoint module when we
+     inserted the trap or jump.  NULL if we haven't inserted it
+     yet.  */
   void *handle;
+#endif
 
-  /* Link to the next tracepoint in the list.  */
-  struct tracepoint *next;
 };
 
+#ifndef IN_PROCESS_AGENT
+
 /* Given `while-stepping', a thread may be collecting data for more
    than one tracepoint simultaneously.  On the other hand, the same
    tracepoint with a while-stepping action may be hit by more than one
@@ -314,22 +684,28 @@ struct wstep_state
   long current_step;
 };
 
-/* The linked list of all tracepoints.  */
+#endif
+
+/* The linked list of all tracepoints.  Marked explicitly as used as
+   the in-process library doesn't use it for the fast tracepoints
+   support.  */
+IP_AGENT_EXPORT struct tracepoint *tracepoints ATTR_USED;
 
-static struct tracepoint *tracepoints;
+#ifndef IN_PROCESS_AGENT
 
 /* Pointer to the last tracepoint in the list, new tracepoints are
    linked in at the end.  */
 
 static struct tracepoint *last_tracepoint;
+#endif
 
 /* The first tracepoint to exceed its pass count.  */
 
-static struct tracepoint *stopping_tracepoint;
+IP_AGENT_EXPORT struct tracepoint *stopping_tracepoint;
 
 /* True if the trace buffer is full or otherwise no longer usable.  */
 
-static int trace_buffer_is_full;
+IP_AGENT_EXPORT int trace_buffer_is_full;
 
 /* Enumeration of the different kinds of things that can happen during
    agent expression evaluation.  */
@@ -349,6 +725,8 @@ enum eval_result_type
 
 static enum eval_result_type expr_eval_result = expr_eval_no_error;
 
+#ifndef IN_PROCESS_AGENT
+
 static const char *eval_result_names[] =
   {
     "terror:in the attic",  /* this should never be reported */
@@ -361,6 +739,8 @@ static const char *eval_result_names[] =
     "terror:divide by zero"
   };
 
+#endif
+
 /* The tracepoint in which the error occurred.  */
 
 static struct tracepoint *error_tracepoint;
@@ -393,7 +773,11 @@ struct trace_state_variable
 
 /* Linked list of all trace state variables.  */
 
-static struct trace_state_variable *trace_state_variables;
+#ifdef IN_PROCESS_AGENT
+struct trace_state_variable *alloced_trace_state_variables;
+#endif
+
+IP_AGENT_EXPORT struct trace_state_variable *trace_state_variables;
 
 /* The results of tracing go into a fixed-size space known as the
    "trace buffer".  Because usage follows a limited number of
@@ -453,7 +837,7 @@ struct traceframe
   /* The base of the trace data, which is contiguous from this point.  */
   unsigned char data[0];
 
-} ATTR_PACKED traceframe_t;
+} ATTR_PACKED;
 
 /* The traceframe to be used as the source of data to send back to
    GDB.  A value of -1 means to get data from the live program.  */
@@ -464,7 +848,9 @@ int current_traceframe = -1;
    when it fills, the oldest trace frames are discarded in order to
    make room.  */
 
+#ifndef IN_PROCESS_AGENT
 static int circular_trace_buffer;
+#endif
 
 /* Pointer to the block of memory that traceframes all go into.  */
 
@@ -475,31 +861,182 @@ static unsigned char *trace_buffer_lo;
 
 static unsigned char *trace_buffer_hi;
 
-/* Pointer to the first trace frame in the buffer.  In the
-   non-circular case, this is equal to trace_buffer_lo, otherwise it
-   moves around in the buffer.  */
-
-static unsigned char *trace_buffer_start;
-
-/* Pointer to the free part of the trace buffer.  Note that we clear
-   several bytes at and after this pointer, so that traceframe
-   scans/searches terminate properly.  */
+/* Control structure holding the read/write/etc. pointers into the
+   trace buffer.  We need more than one of these to implement a
+   transaction-like mechanism that guarantees that both GDBserver and the
+   in-process agent can try to change the trace buffer
+   simultaneously.  */
+
+struct trace_buffer_control
+{
+  /* Pointer to the first trace frame in the buffer.  In the
+     non-circular case, this is equal to trace_buffer_lo, otherwise it
+     moves around in the buffer.  */
+  unsigned char *start;
+
+  /* Pointer to the free part of the trace buffer.  Note that we clear
+     several bytes at and after this pointer, so that traceframe
+     scans/searches terminate properly.  */
+  unsigned char *free;
+
+  /* Pointer to the byte after the end of the free part.  Note that
+     this may be smaller than trace_buffer_free in the circular case,
+     and means that the free part is in two pieces.  Initially it is
+     equal to trace_buffer_hi, then is generally equivalent to
+     trace_buffer_start.  */
+  unsigned char *end_free;
+
+  /* Pointer to the wraparound.  If not equal to trace_buffer_hi, then
+     this is the point at which the trace data breaks, and resumes at
+     trace_buffer_lo.  */
+  unsigned char *wrap;
+};
 
-static unsigned char *trace_buffer_free;
+/* Same as above, to be used by GDBserver when updating the in-process
+   agent.  */
+struct ipa_trace_buffer_control
+{
+  uintptr_t start;
+  uintptr_t free;
+  uintptr_t end_free;
+  uintptr_t wrap;
+};
 
-/* Pointer to the byte after the end of the free part.  Note that this
-   may be smaller than trace_buffer_free in the circular case, and
-   means that the free part is in two pieces.  Initially it is equal
-   to trace_buffer_hi, then is generally equivalent to
-   trace_buffer_start.  */
 
-static unsigned char *trace_buffer_end_free;
+/* We have possibly both GDBserver and an inferior thread accessing
+   the same IPA trace buffer memory.  The IPA is the producer (tries
+   to put new frames in the buffer), while GDBserver occasionally
+   consumes them, that is, flushes the IPA's buffer into its own
+   buffer.  Both sides need to update the trace buffer control
+   pointers (current head, tail, etc.).  We can't use a global lock to
+   synchronize the accesses, as otherwise we could deadlock GDBserver
+   (if the thread holding the lock stops for a signal, say).  So
+   instead of that, we use a transaction scheme where GDBserver writes
+   always prevail over the IPA's writes, and we have the IPA detect
+   the commit failure/overwrite, and retry the whole attempt.  This is
+   mainly implemented by having a global token object that represents
+   who wrote last to the buffer control structure.  We need to freeze
+   any inferior writing to the buffer while GDBserver touches memory,
+   so that the inferior can correctly detect that GDBserver had been
+   there, otherwise, it could mistakenly think its commit was
+   successful; that's implemented by simply having GDBserver set a
+   breakpoint the inferior hits if it is in the critical region.
+
+   There are three cycling trace buffer control structure copies
+   (buffer head, tail, etc.), with the token object including an index
+   indicating which is the current live copy.  The IPA tentatively builds
+   an updated copy in a non-current control structure, while GDBserver
+   always clobbers the current version directly.  The IPA then tries
+   to atomically "commit" its version; if GDBserver clobbered the
+   structure meanwhile, that will fail, and the IPA restarts the
+   allocation process.
+
+   Listing the steps in further detail, we have:
+
+  In-process agent (producer):
+
+  - passes by `about_to_request_buffer_space' breakpoint/lock
+
+  - reads current token, extracts current trace buffer control index,
+    and starts tentatively updating the rightmost one (0->1, 1->2,
+    2->0).  Note that only one inferior thread is executing this code
+    at any given time, due to an outer lock in the jump pads.
+
+  - updates counters, and tries to commit the token.
+
+  - passes by second `about_to_request_buffer_space' breakpoint/lock,
+    leaving the sync region.
+
+  - checks if the update was effective.
+
+  - if trace buffer was found full, hits flush_trace_buffer
+    breakpoint, and restarts afterwards.
+
+  GDBserver (consumer):
+
+  - sets `about_to_request_buffer_space' breakpoint/lock.
+
+  - updates the token unconditionally, using the current buffer
+    control index, since it knows that the IP agent always writes to
+    the rightmost, and due to the breakpoint, at most one IP thread
+    can try to update the trace buffer concurrently to GDBserver, so
+    there will be no danger of trace buffer control index wrap making
+    the IPA write to the same index as GDBserver.
+
+  - flushes the IP agent's trace buffer completely, and updates the
+    current trace buffer control structure.  GDBserver *always* wins.
+
+  - removes the `about_to_request_buffer_space' breakpoint.
+
+The token is stored in the `trace_buffer_ctrl_curr' variable.
+Internally, its bits are defined as:
+
+ |-------------+-----+-------------+--------+-------------+--------------|
+ | Bit offsets |  31 |   30 - 20   |   19   |    18-8     |     7-0      |
+ |-------------+-----+-------------+--------+-------------+--------------|
+ | What        | GSB | PC (11-bit) | unused | CC (11-bit) | TBCI (8-bit) |
+ |-------------+-----+-------------+--------+-------------+--------------|
+
+ GSB  - GDBserver Stamp Bit
+ PC   - Previous Counter
+ CC   - Current Counter
+ TBCI - Trace Buffer Control Index
+
+
+An IPA update of `trace_buffer_ctrl_curr' does:
+
+    - read CC from the current token, save as PC.
+    - updates pointers
+    - atomically tries to write PC+1,CC
+
+A GDBserver update of `trace_buffer_ctrl_curr' does:
+
+    - reads PC and CC from the current token.
+    - updates pointers
+    - writes GSB,PC,CC
+*/
+
+/* These are the bits of `trace_buffer_ctrl_curr' that are reserved
+   for the counters described below.  The cleared bits are used to
+   hold the index of the items of the `trace_buffer_ctrl' array that
+   is "current".  */
+#define GDBSERVER_FLUSH_COUNT_MASK        0xfffffff0
+
+/* `trace_buffer_ctrl_curr' contains two counters.  The `previous'
+   counter, and the `current' counter.  */
+
+#define GDBSERVER_FLUSH_COUNT_MASK_PREV   0x7ff00000
+#define GDBSERVER_FLUSH_COUNT_MASK_CURR   0x0007ff00
+
+/* When GDBserver updates the IP agent's `trace_buffer_ctrl_curr', it
+   always stamps this bit as set.  */
+#define GDBSERVER_UPDATED_FLUSH_COUNT_BIT 0x80000000
+
+#ifdef IN_PROCESS_AGENT
+IP_AGENT_EXPORT struct trace_buffer_control trace_buffer_ctrl[3];
+IP_AGENT_EXPORT unsigned int trace_buffer_ctrl_curr;
+
+# define TRACE_BUFFER_CTRL_CURR \
+  (trace_buffer_ctrl_curr & ~GDBSERVER_FLUSH_COUNT_MASK)
+
+#else
+
+/* The GDBserver side agent only needs one instance of this object, as
+   it doesn't need to sync with itself.  Define it as array anyway so
+   that the rest of the code base doesn't need to care for the
+   difference.  */
+struct trace_buffer_control trace_buffer_ctrl[1];
+# define TRACE_BUFFER_CTRL_CURR 0
+#endif
 
-/* Pointer to the wraparound.  If not equal to trace_buffer_hi, then
-   this is the point at which the trace data breaks, and resumes at
-   trace_buffer_lo.  */
+/* These are convenience macros used to access the current trace
+   buffer control in effect.  */
+#define trace_buffer_start (trace_buffer_ctrl[TRACE_BUFFER_CTRL_CURR].start)
+#define trace_buffer_free (trace_buffer_ctrl[TRACE_BUFFER_CTRL_CURR].free)
+#define trace_buffer_end_free \
+  (trace_buffer_ctrl[TRACE_BUFFER_CTRL_CURR].end_free)
+#define trace_buffer_wrap (trace_buffer_ctrl[TRACE_BUFFER_CTRL_CURR].wrap)
 
-static unsigned char *trace_buffer_wrap;
 
 /* Macro that returns a pointer to the first traceframe in the buffer.  */
 
@@ -519,10 +1056,11 @@ static unsigned char *trace_buffer_wrap;
 			     : 0)))
 
 /* The difference between these counters represents the total number
-   of complete traceframes present in the trace buffer.  */
+   of complete traceframes present in the trace buffer.  The IP agent
+   writes to the write count, GDBserver writes to read count.  */
 
-static unsigned int traceframe_write_count;
-static unsigned int traceframe_read_count;
+IP_AGENT_EXPORT unsigned int traceframe_write_count;
+IP_AGENT_EXPORT unsigned int traceframe_read_count;
 
 /* Convenience macro.  */
 
@@ -532,7 +1070,9 @@ static unsigned int traceframe_read_coun
 /* The count of all traceframes created in the current run, including
    ones that were discarded to make room.  */
 
-static int traceframes_created;
+IP_AGENT_EXPORT int traceframes_created;
+
+#ifndef IN_PROCESS_AGENT
 
 /* Read-only regions are address ranges whose contents don't change,
    and so can be read from target memory even while looking at a trace
@@ -556,9 +1096,13 @@ struct readonly_region
 
 static struct readonly_region *readonly_regions;
 
+#endif
+
 /* The global that controls tracing overall.  */
 
-int tracing;
+IP_AGENT_EXPORT int tracing;
+
+#ifndef IN_PROCESS_AGENT
 
 /* Controls whether tracing should continue after GDB disconnects.  */
 
@@ -572,18 +1116,37 @@ static const char *tracing_stop_reason =
 
 static int tracing_stop_tpnum;
 
+#endif
+
 /* Functions local to this file.  */
 
 /* Base "class" for tracepoint type specific data to be passed down to
-   collect_data_at_tracepoint. */
+   collect_data_at_tracepoint.  */
 struct tracepoint_hit_ctx
 {
-  /* empty */
+  enum tracepoint_type type;
 };
 
-/* Trap tracepoint specific data to be passed down to
+#ifdef IN_PROCESS_AGENT
+
+/* Fast/jump tracepoint specific data to be passed down to
    collect_data_at_tracepoint.  */
+struct fast_tracepoint_ctx
+{
+  struct tracepoint_hit_ctx base;
+
+  struct regcache regcache;
+  int regcache_initted;
+  unsigned char *regspace;
+
+  unsigned char *regs;
+  struct tracepoint *tpoint;
+};
 
+#else
+
+/* Trap tracepoint specific data to be passed down to
+   collect_data_at_tracepoint.  */
 struct trap_tracepoint_ctx
 {
   struct tracepoint_hit_ctx base;
@@ -591,8 +1154,12 @@ struct trap_tracepoint_ctx
   struct regcache *regcache;
 };
 
+#endif
+
+#ifndef IN_PROCESS_AGENT
 static struct agent_expr *parse_agent_expr (char **actparm);
 static char *unparse_agent_expr (struct agent_expr *aexpr);
+#endif
 static enum eval_result_type eval_agent_expr (struct tracepoint_hit_ctx *ctx,
 					      struct traceframe *tframe,
 					      struct agent_expr *aexpr,
@@ -602,29 +1169,49 @@ static int agent_mem_read (struct tracef
 			   unsigned char *to, CORE_ADDR from, ULONGEST len);
 static int agent_tsv_read (struct traceframe *tframe, int n);
 
+#ifndef IN_PROCESS_AGENT
 static CORE_ADDR traceframe_get_pc (struct traceframe *tframe);
 static int traceframe_read_tsv (int num, LONGEST *val);
+#endif
 
 static int condition_true_at_tracepoint (struct tracepoint_hit_ctx *ctx,
 					 struct tracepoint *tpoint);
 
+#ifndef IN_PROCESS_AGENT
 static void clear_readonly_regions (void);
 static void clear_installed_tracepoints (void);
+#endif
 
 static void collect_data_at_tracepoint (struct tracepoint_hit_ctx *ctx,
 					CORE_ADDR stop_pc,
 					struct tracepoint *tpoint);
-
+#ifndef IN_PROCESS_AGENT
 static void collect_data_at_step (struct tracepoint_hit_ctx *ctx,
 				  CORE_ADDR stop_pc,
 				  struct tracepoint *tpoint, int current_step);
-
+#endif
 static void do_action_at_tracepoint (struct tracepoint_hit_ctx *ctx,
 				     CORE_ADDR stop_pc,
 				     struct tracepoint *tpoint,
 				     struct traceframe *tframe,
 				     struct tracepoint_action *taction);
 
+#ifndef IN_PROCESS_AGENT
+static struct tracepoint *fast_tracepoint_from_ipa_tpoint_address (CORE_ADDR);
+#endif
+
+#if defined(__GNUC__)
+#  define memory_barrier() asm volatile ("" : : : "memory")
+#else
+#  define memory_barrier() do {} while (0)
+#endif
+
+/* We only build the IPA if this builtin is supported, and there are
+   no uses of this in GDBserver itself, so we're safe in defining this
+   unconditionally.  */
+#define cmpxchg(mem, oldval, newval) \
+  __sync_val_compare_and_swap (mem, oldval, newval)
+
 /* Record that an error occurred during expression evaluation.  */
 
 static void
@@ -634,7 +1221,17 @@ record_tracepoint_error (struct tracepoi
   trace_debug ("Tracepoint %d at %s %s eval reports error %d",
 	       tpoint->number, paddress (tpoint->address), which, rtype);
 
-  expr_eval_result = rtype;
+#ifdef IN_PROCESS_AGENT
+  /* Only record the first error we get.  */
+  if (cmpxchg (&expr_eval_result,
+	       expr_eval_no_error,
+	       rtype) != expr_eval_no_error)
+    return;
+#else
+  if (expr_eval_result != expr_eval_no_error)
+    return;
+#endif
+
   error_tracepoint = tpoint;
 }
 
@@ -654,6 +1251,45 @@ clear_trace_buffer (void)
   traceframes_created = 0;
 }
 
+#ifndef IN_PROCESS_AGENT
+
+static void
+clear_inferior_trace_buffer (void)
+{
+  CORE_ADDR ipa_trace_buffer_lo;
+  CORE_ADDR ipa_trace_buffer_hi;
+  struct traceframe ipa_traceframe = { 0 };
+  struct ipa_trace_buffer_control ipa_trace_buffer_ctrl;
+
+  read_inferior_data_pointer (ipa_sym_addrs.addr_trace_buffer_lo,
+			      &ipa_trace_buffer_lo);
+  read_inferior_data_pointer (ipa_sym_addrs.addr_trace_buffer_hi,
+			      &ipa_trace_buffer_hi);
+
+  ipa_trace_buffer_ctrl.start = ipa_trace_buffer_lo;
+  ipa_trace_buffer_ctrl.free = ipa_trace_buffer_lo;
+  ipa_trace_buffer_ctrl.end_free = ipa_trace_buffer_hi;
+  ipa_trace_buffer_ctrl.wrap = ipa_trace_buffer_hi;
+
+  /* Write the reset control block into the IPA's `trace_buffer_ctrl'.  */
+  write_inferior_memory (ipa_sym_addrs.addr_trace_buffer_ctrl,
+			 (unsigned char *) &ipa_trace_buffer_ctrl,
+			 sizeof (ipa_trace_buffer_ctrl));
+
+  write_inferior_uinteger (ipa_sym_addrs.addr_trace_buffer_ctrl_curr, 0);
+
+  /* A traceframe with zeroed fields marks the end of trace data.  */
+  write_inferior_memory (ipa_trace_buffer_lo,
+			 (unsigned char *) &ipa_traceframe,
+			 sizeof (ipa_traceframe));
+
+  write_inferior_uinteger (ipa_sym_addrs.addr_traceframe_write_count, 0);
+  write_inferior_uinteger (ipa_sym_addrs.addr_traceframe_read_count, 0);
+  write_inferior_integer (ipa_sym_addrs.addr_traceframes_created, 0);
+}
+
+#endif
+
 static void
 init_trace_buffer (unsigned char *buf, int bufsize)
 {
@@ -663,6 +1299,18 @@ init_trace_buffer (unsigned char *buf, i
   clear_trace_buffer ();
 }
 
+#ifdef IN_PROCESS_AGENT
+
+IP_AGENT_EXPORT void ATTR_USED ATTR_NOINLINE
+about_to_request_buffer_space (void)
+{
+  /* GDBserver places a breakpoint here while it goes about flushing
+     data at arbitrary times.  */
+  UNKNOWN_SIDE_EFFECTS();
+}
+
+#endif
+
 /* Carve out a piece of the trace buffer, returning NULL in case of
    failure.  */
 
@@ -670,8 +1318,17 @@ static void *
 trace_buffer_alloc (size_t amt)
 {
   unsigned char *rslt;
+  struct trace_buffer_control *tbctrl;
+  unsigned int curr;
+#ifdef IN_PROCESS_AGENT
+  unsigned int prev, prev_filtered;
+  unsigned int commit_count;
+  unsigned int commit;
+  unsigned int readout;
+#else
   struct traceframe *oldest;
   unsigned char *new_start;
+#endif
 
   trace_debug ("Want to allocate %ld+%ld bytes in trace buffer",
 	       (long) amt, (long) sizeof (struct traceframe));
@@ -679,14 +1336,42 @@ trace_buffer_alloc (size_t amt)
   /* Account for the EOB marker.  */
   amt += sizeof (struct traceframe);
 
+#ifdef IN_PROCESS_AGENT
+ again:
+  memory_barrier ();
+
+  /* Read the current token and extract the index to try to write to,
+     storing it in CURR.  */
+  prev = trace_buffer_ctrl_curr;
+  prev_filtered = prev & ~GDBSERVER_FLUSH_COUNT_MASK;
+  curr = prev_filtered + 1;
+  if (curr > 2)
+    curr = 0;
+
+  about_to_request_buffer_space ();
+
+  /* Start out with a copy of the current state.  GDBserver may be
+     midway writing to the PREV_FILTERED TBC, but, that's OK, we won't
+     be able to commit anyway if that happens.  */
+  trace_buffer_ctrl[curr]
+    = trace_buffer_ctrl[prev_filtered];
+  trace_debug ("trying curr=%u", curr);
+#else
+  /* The GDBserver's agent doesn't need all that syncing, and always
+     updates TBC 0 (there's only one, mind you).  */
+  curr = 0;
+#endif
+  tbctrl = &trace_buffer_ctrl[curr];
+
   /* Offsets are easier to grok for debugging than raw addresses,
      especially for the small trace buffer sizes that are useful for
      testing.  */
-  trace_debug ("Trace buffer start=%d free=%d endfree=%d wrap=%d hi=%d",
-	       (int) (trace_buffer_start - trace_buffer_lo),
-	       (int) (trace_buffer_free - trace_buffer_lo),
-	       (int) (trace_buffer_end_free - trace_buffer_lo),
-	       (int) (trace_buffer_wrap - trace_buffer_lo),
+  trace_debug ("Trace buffer [%d] start=%d free=%d endfree=%d wrap=%d hi=%d",
+	       curr,
+	       (int) (tbctrl->start - trace_buffer_lo),
+	       (int) (tbctrl->free - trace_buffer_lo),
+	       (int) (tbctrl->end_free - trace_buffer_lo),
+	       (int) (tbctrl->wrap - trace_buffer_lo),
 	       (int) (trace_buffer_hi - trace_buffer_lo));
 
   /* The algorithm here is to keep trying to get a contiguous block of
@@ -698,9 +1383,9 @@ trace_buffer_alloc (size_t amt)
   while (1)
     {
       /* First, if we have two free parts, try the upper one first.  */
-      if (trace_buffer_end_free < trace_buffer_free)
+      if (tbctrl->end_free < tbctrl->free)
 	{
-	  if (trace_buffer_free + amt <= trace_buffer_hi)
+	  if (tbctrl->free + amt <= trace_buffer_hi)
 	    /* We have enough in the upper part.  */
 	    break;
 	  else
@@ -710,15 +1395,32 @@ trace_buffer_alloc (size_t amt)
 		 space later, if/when the wrapped-around traceframe is
 		 discarded.  */
 	      trace_debug ("Upper part too small, setting wraparound");
-	      trace_buffer_wrap = trace_buffer_free;
-	      trace_buffer_free = trace_buffer_lo;
+	      tbctrl->wrap = tbctrl->free;
+	      tbctrl->free = trace_buffer_lo;
 	    }
 	}
 
       /* The normal case.  */
-      if (trace_buffer_free + amt <= trace_buffer_end_free)
+      if (tbctrl->free + amt <= tbctrl->end_free)
 	break;
 
+#ifdef IN_PROCESS_AGENT
+      /* The IP Agent's buffer is always circular.  It isn't used
+	 currently, but `circular_trace_buffer' could represent
+	 GDBserver's mode.  If we didn't find space, ask GDBserver to
+	 flush.  */
+
+      flush_trace_buffer ();
+      memory_barrier ();
+      if (tracing)
+	{
+	  trace_debug ("gdbserver flushed buffer, retrying");
+	  goto again;
+	}
+
+      /* GDBserver cancelled the tracing.  Bail out as well.  */
+      return NULL;
+#else
       /* If we're here, then neither part is big enough, and
 	 non-circular trace buffers are now full.  */
       if (!circular_trace_buffer)
@@ -741,45 +1443,111 @@ trace_buffer_alloc (size_t amt)
 	  return NULL;
 	}
 
+      /* We don't run this code in the in-process agent currently.
+	 E.g., we could leave the in-process agent in autonomous
+	 circular mode if we only have fast tracepoints.  If we do
+	 that, then this bit becomes racy with GDBserver, which also
+	 writes to this counter.  */
       --traceframe_write_count;
 
       new_start = (unsigned char *) NEXT_TRACEFRAME (oldest);
       /* If we freed the traceframe that wrapped around, go back
 	 to the non-wrap case.  */
-      if (new_start < trace_buffer_start)
+      if (new_start < tbctrl->start)
 	{
 	  trace_debug ("Discarding past the wraparound");
-	  trace_buffer_wrap = trace_buffer_hi;
+	  tbctrl->wrap = trace_buffer_hi;
 	}
-      trace_buffer_start = new_start;
-      trace_buffer_end_free = trace_buffer_start;
+      tbctrl->start = new_start;
+      tbctrl->end_free = tbctrl->start;
 
       trace_debug ("Discarded a traceframe\n"
-		   "Trace buffer, start=%d free=%d endfree=%d wrap=%d hi=%d",
-		   (int) (trace_buffer_start - trace_buffer_lo),
-		   (int) (trace_buffer_free - trace_buffer_lo),
-		   (int) (trace_buffer_end_free - trace_buffer_lo),
-		   (int) (trace_buffer_wrap - trace_buffer_lo),
+		   "Trace buffer [%d], start=%d free=%d "
+		   "endfree=%d wrap=%d hi=%d",
+		   curr,
+		   (int) (tbctrl->start - trace_buffer_lo),
+		   (int) (tbctrl->free - trace_buffer_lo),
+		   (int) (tbctrl->end_free - trace_buffer_lo),
+		   (int) (tbctrl->wrap - trace_buffer_lo),
 		   (int) (trace_buffer_hi - trace_buffer_lo));
 
       /* Now go back around the loop.  The discard might have resulted
 	 in either one or two pieces of free space, so we want to try
 	 both before freeing any more traceframes.  */
+#endif
     }
 
   /* If we get here, we know we can provide the asked-for space.  */
 
-  rslt = trace_buffer_free;
+  rslt = tbctrl->free;
 
   /* Adjust the request back down, now that we know we have space for
-     the marker.  */
-  trace_buffer_free += (amt - sizeof (struct traceframe));
+     the marker, but don't commit to AMT yet, we may still need to
+     restart the operation if GDBserver touches the trace buffer
+     (obviously only important in the in-process agent's version).  */
+  tbctrl->free += (amt - sizeof (struct traceframe));
+
+  /* Or not.  If GDBserver changed the trace buffer behind our back,
+     we get to restart a new allocation attempt.  */
+
+#ifdef IN_PROCESS_AGENT
+  /* Build the tentative token.  */
+  commit_count = (((prev & 0x0007ff00) + 0x100) & 0x0007ff00);
+  commit = (((prev & 0x0007ff00) << 12)
+	    | commit_count
+	    | curr);
+
+  /* Try to commit it.  */
+  readout = cmpxchg (&trace_buffer_ctrl_curr, prev, commit);
+  if (readout != prev)
+    {
+      trace_debug ("GDBserver has touched the trace buffer, restarting."
+		   " (prev=%08x, commit=%08x, readout=%08x)",
+		   prev, commit, readout);
+      goto again;
+    }
+
+  /* Hold your horses here.  Even if that change was committed,
+     GDBserver could come in and clobber it.  We need to hold here
+     to be able to tell whether GDBserver clobbered it before or
+     after we committed the change.  Whenever GDBserver goes about
+     touching the IPA buffer, it sets a breakpoint in this routine,
+     so we have a sync point here.  */
+  about_to_request_buffer_space ();
+
+  /* Check if the change has been effective, even if GDBserver stopped
+     us at the breakpoint.  */
+
+  {
+    unsigned int refetch;
+
+    memory_barrier ();
+
+    refetch = trace_buffer_ctrl_curr;
+
+    if ((refetch == commit
+	 || ((refetch & 0x7ff00000) >> 12) == commit_count))
+      {
+	/* effective */
+	trace_debug ("change is effective: (prev=%08x, commit=%08x, "
+		     "readout=%08x, refetch=%08x)",
+		     prev, commit, readout, refetch);
+      }
+    else
+      {
+	trace_debug ("GDBserver has touched the trace buffer, not effective."
+		     " (prev=%08x, commit=%08x, readout=%08x, refetch=%08x)",
+		     prev, commit, readout, refetch);
+	goto again;
+      }
+  }
+#endif
 
   /* We have a new piece of the trace buffer.  Hurray!  */
 
   /* Add an EOB marker just past this allocation.  */
-  ((struct traceframe *) trace_buffer_free)->tpnum = 0;
-  ((struct traceframe *) trace_buffer_free)->data_size = 0;
+  ((struct traceframe *) tbctrl->free)->tpnum = 0;
+  ((struct traceframe *) tbctrl->free)->data_size = 0;
 
   /* Adjust the request back down, now that we know we have space for
      the marker.  */
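For readers following the commit step in trace_buffer_alloc above, here is
a minimal standalone sketch of how the token appears to be built and
published.  The mask values are copied from the patch; the helper names
(next_index, make_commit_token, try_commit) are made up for illustration
and are not part of GDBserver.  It assumes a compiler that provides
__sync_val_compare_and_swap, as the patch itself does.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout of the 32-bit token, as given by the patch's masks.  */
#define FLUSH_COUNT_MASK_PREV 0x7ff00000u  /* previous counter (PC) */
#define FLUSH_COUNT_MASK_CURR 0x0007ff00u  /* current counter (CC) */
#define TBCI_MASK             0x0000000fu  /* trace buffer control index */

/* Hypothetical helper: pick the next of the three control blocks,
   wrapping from 2 back to 0.  */
static uint32_t
next_index (uint32_t prev)
{
  uint32_t idx = (prev & TBCI_MASK) + 1;

  return idx > 2 ? 0 : idx;
}

/* Hypothetical helper: build the token the agent tries to commit.
   The old CC moves into the PC field, CC is bumped by one (wrapping
   inside its 11-bit field), and the low bits select NEW_INDEX.  */
static uint32_t
make_commit_token (uint32_t prev, uint32_t new_index)
{
  uint32_t cc = prev & FLUSH_COUNT_MASK_CURR;
  uint32_t next_cc = (cc + 0x100u) & FLUSH_COUNT_MASK_CURR;

  assert (new_index <= 2);
  return (cc << 12) | next_cc | new_index;
}

/* Hypothetical helper: publish COMMIT only if the token still holds
   PREV.  Returns 1 on success, 0 if someone else changed it.  */
static int
try_commit (uint32_t *token, uint32_t prev, uint32_t commit)
{
  return __sync_val_compare_and_swap (token, prev, commit) == prev;
}
```

When try_commit fails, the caller is expected to re-read the token and
start the allocation attempt over, which is what the `goto again' does
in the patch.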
@@ -788,17 +1556,21 @@ trace_buffer_alloc (size_t amt)
   if (debug_threads)
     {
       trace_debug ("Allocated %d bytes", (int) amt);
-      trace_debug ("Trace buffer start=%d free=%d endfree=%d wrap=%d hi=%d",
-		   (int) (trace_buffer_start - trace_buffer_lo),
-		   (int) (trace_buffer_free - trace_buffer_lo),
-		   (int) (trace_buffer_end_free - trace_buffer_lo),
-		   (int) (trace_buffer_wrap - trace_buffer_lo),
+      trace_debug ("Trace buffer [%d] start=%d free=%d "
+		   "endfree=%d wrap=%d hi=%d",
+		   curr,
+		   (int) (tbctrl->start - trace_buffer_lo),
+		   (int) (tbctrl->free - trace_buffer_lo),
+		   (int) (tbctrl->end_free - trace_buffer_lo),
+		   (int) (tbctrl->wrap - trace_buffer_lo),
 		   (int) (trace_buffer_hi - trace_buffer_lo));
     }
 
   return rslt;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 /* Return the total free space.  This is not necessarily the largest
    block we can allocate, because of the two-part case.  */
 
@@ -834,6 +1606,9 @@ add_tracepoint (int num, CORE_ADDR addr)
   tpoint->num_step_actions = 0;
   tpoint->step_actions = NULL;
   tpoint->step_actions_str = NULL;
+  /* Start all off as regular (slow) tracepoints.  */
+  tpoint->type = trap_tracepoint;
+  tpoint->orig_size = -1;
   tpoint->source_strings = NULL;
   tpoint->handle = NULL;
   tpoint->next = NULL;
@@ -849,6 +1624,8 @@ add_tracepoint (int num, CORE_ADDR addr)
   return tpoint;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 /* Return the tracepoint with the given number and address, or NULL.  */
 
 static struct tracepoint *
@@ -884,6 +1661,8 @@ find_next_tracepoint_by_number (struct t
   return NULL;
 }
 
+#endif
+
 static char *
 save_string (const char *str, size_t len)
 {
@@ -1018,6 +1797,8 @@ add_tracepoint_action (struct tracepoint
     }
 }
 
+#endif
+
 /* Find or create a trace state variable with the given number.  */
 
 static struct trace_state_variable *
@@ -1025,6 +1806,13 @@ get_trace_state_variable (int num)
 {
   struct trace_state_variable *tsv;
 
+#ifdef IN_PROCESS_AGENT
+  /* Search for an existing variable.  */
+  for (tsv = alloced_trace_state_variables; tsv; tsv = tsv->next)
+    if (tsv->number == num)
+      return tsv;
+#endif
+
   /* Search for an existing variable.  */
   for (tsv = trace_state_variables; tsv; tsv = tsv->next)
     if (tsv->number == num)
@@ -1036,7 +1824,7 @@ get_trace_state_variable (int num)
 /* Find or create a trace state variable with the given number.  */
 
 static struct trace_state_variable *
-create_trace_state_variable (int num)
+create_trace_state_variable (int num, int gdb)
 {
   struct trace_state_variable *tsv;
 
@@ -1051,9 +1839,18 @@ create_trace_state_variable (int num)
   tsv->value = 0;
   tsv->getter = NULL;
   tsv->name = NULL;
-  tsv->next = trace_state_variables;
-  trace_state_variables = tsv;
-
+#ifdef IN_PROCESS_AGENT
+  if (!gdb)
+    {
+      tsv->next = alloced_trace_state_variables;
+      alloced_trace_state_variables = tsv;
+    }
+  else
+#endif
+    {
+      tsv->next = trace_state_variables;
+      trace_state_variables = tsv;
+    }
   return tsv;
 }
 
@@ -1178,6 +1975,8 @@ finish_traceframe (struct traceframe *tf
   ++traceframes_created;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 /* Given a traceframe number NUM, find the NUMth traceframe in the
    buffer.  */
 
@@ -1278,6 +2077,10 @@ find_next_traceframe_by_tracepoint (int 
   return NULL;
 }
 
+#endif
+
+#ifndef IN_PROCESS_AGENT
+
 /* Clear all past trace state.  */
 
 static void
@@ -1322,6 +2125,7 @@ cmd_qtinit (char *packet)
     }
 
   clear_trace_buffer ();
+  clear_inferior_trace_buffer ();
 
   write_ok (packet);
 }
@@ -1357,7 +2161,16 @@ clear_installed_tracepoints (void)
 	  continue;
 	}
 
-      delete_breakpoint (tpoint->handle);
+      switch (tpoint->type)
+	{
+	case trap_tracepoint:
+	  delete_breakpoint (tpoint->handle);
+	  break;
+	case fast_tracepoint:
+	  delete_fast_tracepoint_jump (tpoint->handle);
+	  break;
+	}
+
       tpoint->handle = NULL;
     }
 
@@ -1421,7 +2234,14 @@ cmd_qtdp (char *own_buf)
       while (*packet == ':')
 	{
 	  ++packet;
-	  if (*packet == 'X')
+	  if (*packet == 'F')
+	    {
+	      tpoint->type = fast_tracepoint;
+	      ++packet;
+	      packet = unpack_varlen_hex (packet, &count);
+	      tpoint->orig_size = count;
+	    }
+	  else if (*packet == 'X')
 	    {
 	      actparm = (char *) packet;
 	      tpoint->cond = parse_agent_expr (&actparm);
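As a rough illustration of the new `F' option handled in cmd_qtdp above:
the option letter is followed by a hex number giving the size of the
original instruction.  The sketch below is standalone and uses strtoul as
a stand-in for the patch's unpack_varlen_hex; parse_fast_option is a
hypothetical name, not a GDBserver function.

```c
#include <stdlib.h>

/* Hypothetical helper: if P points at an `F' option, mark the
   tracepoint as fast, store the hex-encoded original instruction
   size in *ORIG_SIZE, and return a pointer just past the number.
   Otherwise leave *ORIG_SIZE alone and return P unchanged.  */
static const char *
parse_fast_option (const char *p, int *is_fast, unsigned long *orig_size)
{
  char *end;

  *is_fast = 0;
  if (*p != 'F')
    return p;

  /* Stand-in for unpack_varlen_hex: parse hex digits after the `F'.  */
  *orig_size = strtoul (p + 1, &end, 16);
  *is_fast = 1;
  return end;
}
```

The returned pointer lands on whatever follows the number, so a caller
looping on `:' separators can continue with the next option.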
@@ -1437,10 +2257,11 @@ cmd_qtdp (char *own_buf)
       if (*packet == '-')
 	trace_debug ("Also has actions\n");
 
-      trace_debug ("Defined tracepoint %d at 0x%s, "
+      trace_debug ("Defined %stracepoint %d at 0x%s, "
 		   "enabled %d step %ld pass %ld",
-		   tpoint->number, paddress (tpoint->address),
-		   tpoint->enabled,
+		   tpoint->type == fast_tracepoint ? "fast "
+		   : "",
+		   tpoint->number, paddress (tpoint->address), tpoint->enabled,
 		   tpoint->step_count, tpoint->pass_count);
     }
   else if (tpoint)
@@ -1540,7 +2361,7 @@ cmd_qtdv (char *own_buf)
   nbytes = unhexify (varname, packet, nbytes);
   varname[nbytes] = '\0';
 
-  tsv = create_trace_state_variable (num);
+  tsv = create_trace_state_variable (num, 1);
   tsv->initial_value = (LONGEST) val;
   tsv->name = varname;
 
@@ -1649,21 +2470,135 @@ in_readonly_region (CORE_ADDR addr, ULON
   return 0;
 }
 
-static void
-cmd_qtstart (char *packet)
-{
-  struct tracepoint *tpoint;
-  int slow_tracepoint_count;
+/* The maximum size of a jump pad entry.  */
+static const int max_jump_pad_size = 0x100;
 
-  trace_debug ("Starting the trace");
-
-  slow_tracepoint_count = 0;
+static CORE_ADDR gdb_jump_pad_head;
 
-  *packet = '\0';
+/* Return the address of the next free jump space.  */
 
-  /* Pause all threads temporarily while we patch tracepoints.  */
+static CORE_ADDR
+get_jump_space_head (void)
+{
+  if (gdb_jump_pad_head == 0)
+    {
+      if (read_inferior_data_pointer (ipa_sym_addrs.addr_gdb_jump_pad_buffer,
+				      &gdb_jump_pad_head))
+	fatal ("error extracting jump_pad_buffer");
+    }
+
+  return gdb_jump_pad_head;
+}
+
+/* Reserve USED bytes from the jump space.  */
+
+static void
+claim_jump_space (ULONGEST used)
+{
+  trace_debug ("claim_jump_space reserves %s bytes at %s",
+	       pulongest (used), paddress (gdb_jump_pad_head));
+  gdb_jump_pad_head += used;
+}
+
+/* Sort tracepoints by PC, using a bubble sort.  */
+
+static void
+sort_tracepoints (void)
+{
+  struct tracepoint *lst, *tmp, *prev = NULL;
+  int i, j, n = 0;
+
+  if (tracepoints == NULL)
+    return;
+
+  /* Count nodes.  */
+  for (tmp = tracepoints; tmp->next; tmp = tmp->next)
+    n++;
+
+  for (i = 0; i < n - 1; i++)
+    for (j = 0, lst = tracepoints;
+	 lst && lst->next && (j <= n - 1 - i);
+	 j++)
+      {
+	/* If we're at beginning, the start node is the prev
+	   node.  */
+	if (j == 0)
+	  prev = lst;
+
+	/* Compare neighbors.  */
+	if (lst->next->address < lst->address)
+	  {
+	    struct tracepoint *p;
+
+	    /* Swap'em.  */
+	    tmp = (lst->next ? lst->next->next : NULL);
+
+	    if (j == 0 && prev == tracepoints)
+	      tracepoints = lst->next;
+
+	    p = lst->next;
+	    prev->next = lst->next;
+	    lst->next->next = lst;
+	    lst->next = tmp;
+	    prev = p;
+	  }
+	else
+	  {
+	    lst = lst->next;
+	    /* Keep track of the previous node.  We need it if we need
+	       to swap nodes.  */
+	    if (j != 0)
+	      prev = prev->next;
+	  }
+      }
+}
+
+#define MAX_JUMP_SIZE 20
+
+static void
+cmd_qtstart (char *packet)
+{
+  struct tracepoint *tpoint, *prev_ftpoint;
+  int slow_tracepoint_count, fast_count;
+  CORE_ADDR jump_entry;
+
+  /* The jump to the jump pad of the last fast tracepoint
+     installed.  */
+  unsigned char fjump[MAX_JUMP_SIZE];
+  ULONGEST fjump_size;
+
+  trace_debug ("Starting the trace");
+
+  slow_tracepoint_count = fast_count = 0;
+
+  /* Sort tracepoints by ascending address.  This makes installing
+     fast tracepoints at the same address easier to handle.  */
+  sort_tracepoints ();
+
+  /* Pause all threads temporarily while we patch tracepoints.  */
+  pause_all (0);
+
+  /* Get threads out of jump pads.  Safe to do here, since this is a
+     top level command.  And, required to do here, since we're
+     deleting/rewriting jump pads.  */
+
+  stabilize_threads ();
+
+  /* Freeze threads.  */
   pause_all (1);
 
+  /* Sync the fast tracepoints list in the inferior ftlib.  */
+  if (in_process_agent_loaded ())
+    {
+      download_tracepoints ();
+      download_trace_state_variables ();
+    }
+
+  /* No previous fast tpoint yet.  */
+  prev_ftpoint = NULL;
+
+  *packet = '\0';
+
   /* Install tracepoints.  */
   for (tpoint = tracepoints; tpoint; tpoint = tpoint->next)
     {
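The point of sort_tracepoints in the hunk above is that tracepoints at
the same address end up adjacent, so cmd_qtstart can reuse the previous
jump pad.  For comparison only, here is a standalone sketch of the same
ordering done with an insertion sort on a singly linked list.  This is a
different technique from the bubble sort in the patch, and struct tp_node
and sort_by_address are made-up names for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for struct tracepoint.  */
struct tp_node
{
  uint64_t address;
  int number;
  struct tp_node *next;
};

/* Sort a singly linked list by ascending address and return the new
   head.  Nodes with equal addresses keep their original relative
   order, so same-address tracepoints stay adjacent and in definition
   order.  */
static struct tp_node *
sort_by_address (struct tp_node *head)
{
  struct tp_node *sorted = NULL;

  while (head != NULL)
    {
      struct tp_node *node = head;
      struct tp_node **link = &sorted;

      head = head->next;

      /* Skip every already-sorted node whose address is not greater
	 than ours; inserting after them keeps the sort stable.  */
      while (*link != NULL && (*link)->address <= node->address)
	link = &(*link)->next;

      node->next = *link;
      *link = node;
    }

  return sorted;
}
```

Walking with a pointer to the link field avoids special-casing the list
head and does not need the node count up front.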
@@ -1673,15 +2608,83 @@ cmd_qtstart (char *packet)
       if (!tpoint->enabled)
 	continue;
 
-      ++slow_tracepoint_count;
+      if (tpoint->type == trap_tracepoint)
+	{
+	  ++slow_tracepoint_count;
 
-      /* Tracepoints are installed as memory breakpoints.  Just go
-	 ahead and install the trap.  The breakpoints module handles
-	 duplicated breakpoints, and the memory read routine handles
-	 un-patching traps from memory reads.  */
-      tpoint->handle = set_breakpoint_at (tpoint->address, tracepoint_handler);
+	  /* Tracepoints are installed as memory breakpoints.  Just go
+	     ahead and install the trap.  The breakpoints module
+	     handles duplicated breakpoints, and the memory read
+	     routine handles un-patching traps from memory reads.  */
+	  tpoint->handle = set_breakpoint_at (tpoint->address,
+					      tracepoint_handler);
+	}
+      else if (tpoint->type == fast_tracepoint)
+	{
+	  ++fast_count;
 
-      /* Any failure is sufficient cause to give up.  */
+	  if (maybe_write_ipa_not_loaded (packet))
+	    {
+	      trace_debug ("Requested a fast tracepoint, but fast "
+			   "tracepoints aren't supported.");
+	      break;
+	    }
+
+	  if (prev_ftpoint != NULL && prev_ftpoint->address == tpoint->address)
+	    {
+	      tpoint->handle = set_fast_tracepoint_jump (tpoint->address,
+							 fjump,
+							 fjump_size);
+	      tpoint->jump_pad = prev_ftpoint->jump_pad;
+	      tpoint->jump_pad_end = prev_ftpoint->jump_pad_end;
+	      tpoint->adjusted_insn_addr = prev_ftpoint->adjusted_insn_addr;
+	      tpoint->adjusted_insn_addr_end
+		= prev_ftpoint->adjusted_insn_addr_end;
+	    }
+	  else
+	    {
+	      CORE_ADDR jentry;
+	      int err = 0;
+
+	      prev_ftpoint = NULL;
+
+	      jentry = jump_entry = get_jump_space_head ();
+
+	      /* Install the jump pad.  */
+	      err = install_fast_tracepoint_jump_pad
+		(tpoint->obj_addr_on_target,
+		 tpoint->address,
+		 ipa_sym_addrs.addr_gdb_collect,
+		 ipa_sym_addrs.addr_collecting,
+		 tpoint->orig_size,
+		 &jentry,
+		 fjump, &fjump_size,
+		 &tpoint->adjusted_insn_addr,
+		 &tpoint->adjusted_insn_addr_end);
+
+	      /* Wire it in.  */
+	      if (!err)
+		tpoint->handle = set_fast_tracepoint_jump (tpoint->address,
+							   fjump, fjump_size);
+
+	      if (tpoint->handle != NULL)
+		{
+		  tpoint->jump_pad = jump_entry;
+		  tpoint->jump_pad_end = jentry;
+
+		  /* Pad to 8-byte alignment.  */
+		  jentry = ((jentry + 7) & ~0x7);
+		  claim_jump_space (jentry - jump_entry);
+
+		  /* So that we can handle multiple fast tracepoints
+		     at the same address easily.  */
+		  prev_ftpoint = tpoint;
+		}
+	    }
+	}
+
+      /* Any failure in the inner loop is sufficient cause to give
+	 up.  */
       if (tpoint->handle == NULL)
 	break;
     }
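The jump pad bookkeeping in the loop above amounts to a bump allocator:
each pad starts at the current head, and the head then advances past the
pad's end rounded up to 8 bytes.  A minimal standalone sketch, with
made-up names (jump_pad_head here is a local stand-in for
gdb_jump_pad_head, and align_up_8 and claim_pad do not exist in the
patch):

```c
#include <stdint.h>

/* Stand-in for gdb_jump_pad_head: next free address in the pad
   buffer.  */
static uint64_t jump_pad_head;

/* Round ADDR up to the next multiple of 8.  */
static uint64_t
align_up_8 (uint64_t addr)
{
  return (addr + 7) & ~(uint64_t) 7;
}

/* Hypothetical helper: a pad was written at [ENTRY, END).  Reserve it
   plus alignment padding, and return the number of bytes claimed.  */
static uint64_t
claim_pad (uint64_t entry, uint64_t end)
{
  uint64_t used = align_up_8 (end) - entry;

  jump_pad_head += used;
  return used;
}
```

Since the head only ever moves forward, space is not reclaimed per pad;
in the patch the head is simply read back from the inferior when it is
zero.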
@@ -1706,6 +2709,30 @@ cmd_qtstart (char *packet)
   /* Tracing is now active, hits will now start being logged.  */
   tracing = 1;
 
+  if (in_process_agent_loaded ())
+    {
+      if (write_inferior_integer (ipa_sym_addrs.addr_tracing, 1))
+	fatal ("Error setting tracing variable in lib");
+
+      if (write_inferior_data_pointer (ipa_sym_addrs.addr_stopping_tracepoint,
+				       0))
+	fatal ("Error clearing stopping_tracepoint variable in lib");
+
+      if (write_inferior_integer (ipa_sym_addrs.addr_trace_buffer_is_full, 0))
+	fatal ("Error clearing trace_buffer_is_full variable in lib");
+
+      stop_tracing_bkpt = set_breakpoint_at (ipa_sym_addrs.addr_stop_tracing,
+					     stop_tracing_handler);
+      if (stop_tracing_bkpt == NULL)
+	error ("Error setting stop_tracing breakpoint");
+
+      flush_trace_buffer_bkpt
+	= set_breakpoint_at (ipa_sym_addrs.addr_flush_trace_buffer,
+			     flush_trace_buffer_handler);
+      if (flush_trace_buffer_bkpt == NULL)
+	error ("Error setting flush_trace_buffer breakpoint");
+    }
+
   unpause_all (1);
 
   write_ok (packet);
@@ -1725,7 +2752,14 @@ stop_tracing (void)
 
   trace_debug ("Stopping the trace");
 
-  /* Pause all threads before removing breakpoints from memory.  */
+  /* Pause all threads before removing fast jumps from memory,
+     breakpoints, and touching IPA state variables (inferior memory).
+     Some thread may hit the internal tracing breakpoints, or be
+     collecting this moment, but that's ok, we don't release the
+     tpoint object's memory or the jump pads here (we only do that
+     when we're sure we can move all threads out of the jump pads).
+     We can't now, since we may be getting here due to the inferior
+     agent calling us.  */
   pause_all (1);
   /* Since we're removing breakpoints, cancel breakpoint hits,
      possibly related to the breakpoints we're about to delete.  */
@@ -1734,6 +2768,11 @@ stop_tracing (void)
   /* Stop logging. Tracepoints can still be hit, but they will not be
      recorded.  */
   tracing = 0;
+  if (in_process_agent_loaded ())
+    {
+      if (write_inferior_integer (ipa_sym_addrs.addr_tracing, 0))
+	fatal ("Error clearing tracing variable in lib");
+    }
 
   tracing_stop_reason = "t???";
   tracing_stop_tpnum = 0;
@@ -1757,11 +2796,13 @@ stop_tracing (void)
       tracing_stop_reason = eval_result_names[expr_eval_result];
       tracing_stop_tpnum = error_tracepoint->number;
     }
+#ifndef IN_PROCESS_AGENT
   else if (!gdb_connected ())
     {
       trace_debug ("Stopping the trace because GDB disconnected");
       tracing_stop_reason = "tdisconnected";
     }
+#endif
   else
     {
       trace_debug ("Stopping the trace because of a tstop command");
@@ -1774,9 +2815,49 @@ stop_tracing (void)
   /* Clear out the tracepoints.  */
   clear_installed_tracepoints ();
 
+  if (in_process_agent_loaded ())
+    {
+      /* Pull in fast tracepoint trace frames from the inferior lib
+	 buffer into our buffer, even if our buffer is already full,
+	 because we want to present the full number of created frames
+	 in addition to what fit in the trace buffer.  */
+      upload_fast_traceframes ();
+    }
+
+  if (stop_tracing_bkpt != NULL)
+    {
+      delete_breakpoint (stop_tracing_bkpt);
+      stop_tracing_bkpt = NULL;
+    }
+
+  if (flush_trace_buffer_bkpt != NULL)
+    {
+      delete_breakpoint (flush_trace_buffer_bkpt);
+      flush_trace_buffer_bkpt = NULL;
+    }
+
   unpause_all (1);
 }
 
+static int
+stop_tracing_handler (CORE_ADDR addr)
+{
+  trace_debug ("lib hit stop_tracing");
+
+  /* Don't actually handle it here.  When we stop tracing we remove
+     breakpoints from the inferior, and that is not allowed in a
+     breakpoint handler (as the caller is walking the breakpoint
+     list).  */
+  return 0;
+}
+
+static int
+flush_trace_buffer_handler (CORE_ADDR addr)
+{
+  trace_debug ("lib hit flush_trace_buffer");
+  return 0;
+}
+
 static void
 cmd_qtstop (char *packet)
 {
@@ -1877,6 +2958,15 @@ cmd_qtstatus (char *packet)
   trace_debug ("Returning trace status as %d, stop reason %s",
 	       tracing, tracing_stop_reason);
 
+  if (in_process_agent_loaded ())
+    {
+      pause_all (1);
+
+      upload_fast_traceframes ();
+
+      unpause_all (1);
+   }
+
   stop_reason_rsp = (char *) tracing_stop_reason;
 
   /* The user visible error string in terror needs to be hex encoded.
@@ -1930,6 +3020,8 @@ response_tracepoint (char *packet, struc
 	   paddress (tpoint->address),
 	   (tpoint->enabled ? 'E' : 'D'), tpoint->step_count,
 	   tpoint->pass_count);
+  if (tpoint->type == fast_tracepoint)
+    sprintf (packet + strlen (packet), ":F%x", tpoint->orig_size);
 
   if (tpoint->cond)
     {
@@ -2279,6 +3371,9 @@ handle_tracepoint_query (char *packet)
   return 0;
 }
 
+#endif
+#ifndef IN_PROCESS_AGENT
+
 /* Call this when thread TINFO has hit the tracepoint defined by
    TP_NUMBER and TP_ADDRESS, and that tracepoint has a while-stepping
    action.  This adds a while-stepping collecting state item to the
@@ -2351,6 +3446,11 @@ tracepoint_finished_step (struct thread_
   struct wstep_state **wstep_link;
   struct trap_tracepoint_ctx ctx;
 
+  /* Pull in fast tracepoint trace frames from the inferior lib buffer into
+     our buffer.  */
+  if (in_process_agent_loaded ())
+    upload_fast_traceframes ();
+
   /* Check if we were indeed collecting data for one of more
      tracepoints with a 'while-stepping' count.  */
   if (tinfo->while_stepping == NULL)
@@ -2374,6 +3474,7 @@ tracepoint_finished_step (struct thread_
 	       target_pid_to_str (tinfo->entry.id),
 	       wstep->tp_number, paddress (wstep->tp_address));
 
+  ctx.base.type = trap_tracepoint;
   ctx.regcache = get_thread_regcache (tinfo, 1);
 
   while (wstep != NULL)
@@ -2437,6 +3538,89 @@ tracepoint_finished_step (struct thread_
   return 1;
 }
 
+/* Handle any internal tracing control breakpoint hits.  That means
+   pulling traceframes from the IPA into our buffer, and syncing both
+   tracing agents when the IPA's tracing stops for some reason.  */
+
+int
+handle_tracepoint_bkpts (struct thread_info *tinfo, CORE_ADDR stop_pc)
+{
+  /* Pull in fast tracepoint trace frames from the inferior in-process
+     agent's buffer into our buffer.  */
+
+  if (!in_process_agent_loaded ())
+    return 0;
+
+  upload_fast_traceframes ();
+
+  /* Check if the in-process agent had decided we should stop
+     tracing.  */
+  if (stop_pc == ipa_sym_addrs.addr_stop_tracing)
+    {
+      int ipa_trace_buffer_is_full;
+      CORE_ADDR ipa_stopping_tracepoint;
+      int ipa_expr_eval_result;
+      CORE_ADDR ipa_error_tracepoint;
+
+      trace_debug ("lib stopped at stop_tracing");
+
+      read_inferior_integer (ipa_sym_addrs.addr_trace_buffer_is_full,
+			     &ipa_trace_buffer_is_full);
+
+      read_inferior_data_pointer (ipa_sym_addrs.addr_stopping_tracepoint,
+				  &ipa_stopping_tracepoint);
+      write_inferior_data_pointer (ipa_sym_addrs.addr_stopping_tracepoint, 0);
+
+      read_inferior_data_pointer (ipa_sym_addrs.addr_error_tracepoint,
+				  &ipa_error_tracepoint);
+      write_inferior_data_pointer (ipa_sym_addrs.addr_error_tracepoint, 0);
+
+      read_inferior_integer (ipa_sym_addrs.addr_expr_eval_result,
+			     &ipa_expr_eval_result);
+      write_inferior_integer (ipa_sym_addrs.addr_expr_eval_result, 0);
+
+      trace_debug ("lib: trace_buffer_is_full: %d, "
+		   "stopping_tracepoint: %s, "
+		   "expr_eval_result: %d, "
+		   "error_tracepoint: %s",
+		   ipa_trace_buffer_is_full,
+		   paddress (ipa_stopping_tracepoint),
+		   ipa_expr_eval_result,
+		   paddress (ipa_error_tracepoint));
+
+      if (debug_threads)
+	{
+	  if (ipa_trace_buffer_is_full)
+	    trace_debug ("lib stopped due to full buffer.");
+	  if (ipa_stopping_tracepoint)
+	    trace_debug ("lib stopped due to tpoint");
+	  if (ipa_error_tracepoint)
+	    trace_debug ("lib stopped due to error");
+	}
+
+      if (ipa_stopping_tracepoint != 0)
+	{
+	  stopping_tracepoint
+	    = fast_tracepoint_from_ipa_tpoint_address (ipa_stopping_tracepoint);
+	}
+      else if (ipa_expr_eval_result != expr_eval_no_error)
+	{
+	  expr_eval_result = ipa_expr_eval_result;
+	  error_tracepoint
+	    = fast_tracepoint_from_ipa_tpoint_address (ipa_error_tracepoint);
+	}
+      stop_tracing ();
+      return 1;
+    }
+  else if (stop_pc == ipa_sym_addrs.addr_flush_trace_buffer)
+    {
+      trace_debug ("lib stopped at flush_trace_buffer");
+      return 1;
+    }
+
+  return 0;
+}
+
 /* Return true if TINFO just hit a tracepoint.  Collect data if
    so.  */
 
@@ -2451,10 +3635,14 @@ tracepoint_was_hit (struct thread_info *
   if (!tracing)
     return 0;
 
+  ctx.base.type = trap_tracepoint;
   ctx.regcache = get_thread_regcache (tinfo, 1);
 
   for (tpoint = tracepoints; tpoint; tpoint = tpoint->next)
     {
+      /* Note that we collect fast tracepoints here as well.  We'll
+	 step over the fast tracepoint jump later, which avoids the
+	 double collect.  */
       if (tpoint->enabled && stop_pc == tpoint->address)
 	{
 	  trace_debug ("Thread %s at address of tracepoint %d at 0x%s",
@@ -2490,6 +3678,8 @@ tracepoint_was_hit (struct thread_info *
   return ret;
 }
 
+#endif
+
 /* Create a trace frame for the hit of the given tracepoint in the
    given thread.  */
 
@@ -2522,9 +3712,11 @@ collect_data_at_tracepoint (struct trace
     {
       for (acti = 0; acti < tpoint->numactions; ++acti)
 	{
+#ifndef IN_PROCESS_AGENT
 	  trace_debug ("Tracepoint %d at 0x%s about to do action '%s'",
 		       tpoint->number, paddress (tpoint->address),
 		       tpoint->actions_str[acti]);
+#endif
 
 	  do_action_at_tracepoint (ctx, stop_pc, tpoint, tframe,
 				   tpoint->actions[acti]);
@@ -2537,6 +3729,8 @@ collect_data_at_tracepoint (struct trace
     trace_buffer_is_full = 1;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 static void
 collect_data_at_step (struct tracepoint_hit_ctx *ctx,
 		      CORE_ADDR stop_pc,
@@ -2572,11 +3766,33 @@ collect_data_at_step (struct tracepoint_
     trace_buffer_is_full = 1;
 }
 
+#endif
+
 static struct regcache *
 get_context_regcache (struct tracepoint_hit_ctx *ctx)
 {
-  struct trap_tracepoint_ctx *tctx = (struct trap_tracepoint_ctx *) ctx;
-  struct regcache *regcache = tctx->regcache;
+  struct regcache *regcache = NULL;
+
+#ifdef IN_PROCESS_AGENT
+  if (ctx->type == fast_tracepoint)
+    {
+      struct fast_tracepoint_ctx *fctx = (struct fast_tracepoint_ctx *) ctx;
+      if (!fctx->regcache_initted)
+	{
+	  fctx->regcache_initted = 1;
+	  init_register_cache (&fctx->regcache, fctx->regspace);
+	  supply_regblock (&fctx->regcache, NULL);
+	  supply_fast_tracepoint_registers (&fctx->regcache, fctx->regs);
+	}
+      regcache = &fctx->regcache;
+    }
+#else
+  if (ctx->type == trap_tracepoint)
+    {
+      struct trap_tracepoint_ctx *tctx = (struct trap_tracepoint_ctx *) ctx;
+      regcache = tctx->regcache;
+    }
+#endif
 
   gdb_assert (regcache != NULL);
 
@@ -2640,18 +3856,24 @@ do_action_at_tracepoint (struct tracepoi
 	/* Copy the register data to the regblock.  */
 	regcache_cpy (&tregcache, context_regcache);
 
+#ifndef IN_PROCESS_AGENT
 	/* On some platforms, trap-based tracepoints will have the PC
 	   pointing to the next instruction after the trap, but we
 	   don't want the user or GDB trying to guess whether the
 	   saved PC needs adjusting; so always record the adjusted
 	   stop_pc.  Note that we can't use tpoint->address instead,
-	   since it will be wrong for while-stepping actions.  */
+	   since it will be wrong for while-stepping actions.  This
+	   adjustment is a nop for fast tracepoints collected from the
+	   in-process lib (but not if GDBserver is collecting one
+	   preemptively), since the PC had already been adjusted to
+	   contain the tracepoint's address by the jump pad.  */
 	trace_debug ("Storing stop pc (0x%s) in regblock",
 		     paddress (tpoint->address));
 
 	/* This changes the regblock, not the thread's
 	   regcache.  */
 	regcache_write_pc (&tregcache, stop_pc);
+#endif
       }
       break;
     case 'X':
@@ -2699,6 +3921,8 @@ condition_true_at_tracepoint (struct tra
   return (value ? 1 : 0);
 }
 
+#ifndef IN_PROCESS_AGENT
+
 /* The packet form of an agent expression consists of an 'X', number
    of bytes in expression, a comma, and then the bytes.  */
 
@@ -2734,6 +3958,8 @@ unparse_agent_expr (struct agent_expr *a
   return rslt;
 }
 
+#endif
+
 /* The agent expression evaluator, as specified by the GDB docs. It
    returns 0 if everything went OK, and a nonzero error code
    otherwise.  */
@@ -3181,6 +4407,8 @@ agent_tsv_read (struct traceframe *tfram
   return 0;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 static unsigned char *
 traceframe_find_block_type (unsigned char *database, unsigned int datasize,
 			    int tfnum, char type_wanted)
@@ -3226,11 +4454,6 @@ traceframe_find_block_type (unsigned cha
 	  memcpy (&mlen, dataptr, sizeof (mlen));
 	  dataptr += (sizeof (mlen) + mlen);
 	  break;
-	case 'S':
-	  /* Skip over the static trace data block.  */
-	  memcpy (&mlen, dataptr, sizeof (mlen));
-	  dataptr += (sizeof (mlen) + mlen);
-	  break;
 	case 'V':
 	  /* Skip over the TSV block.  */
 	  dataptr += (sizeof (int) + sizeof (LONGEST));
@@ -3425,6 +4648,826 @@ traceframe_read_tsv (int tsvnum, LONGEST
   return 1;
 }
 
+/* Return the first fast tracepoint whose jump pad contains PC.  */
+
+static struct tracepoint *
+fast_tracepoint_from_jump_pad_address (CORE_ADDR pc)
+{
+  struct tracepoint *tpoint;
+
+  for (tpoint = tracepoints; tpoint; tpoint = tpoint->next)
+    if (tpoint->type == fast_tracepoint)
+      if (tpoint->jump_pad <= pc && pc < tpoint->jump_pad_end)
+	return tpoint;
+
+  return NULL;
+}
+
+/* Return GDBserver's tracepoint that matches the IP Agent's
+   tracepoint object that lives at IPA_TPOINT_OBJ in the IP Agent's
+   address space.  */
+
+static struct tracepoint *
+fast_tracepoint_from_ipa_tpoint_address (CORE_ADDR ipa_tpoint_obj)
+{
+  struct tracepoint *tpoint;
+
+  for (tpoint = tracepoints; tpoint; tpoint = tpoint->next)
+    if (tpoint->type == fast_tracepoint)
+      if (tpoint->obj_addr_on_target == ipa_tpoint_obj)
+	return tpoint;
+
+  return NULL;
+}
+
+#endif
+
+/* The type of the object that is used to synchronize fast tracepoint
+   collection.  */
+
+typedef struct collecting_t
+{
+  /* The address of the IPA's tracepoint object currently collecting.  */
+  uintptr_t tpoint;
+
+  /* A number that GDBserver can use to identify the thread that is
+     presently holding the collect lock.  This need not be (and usually
+     is not) the thread ID, as getting the current thread ID usually
+     requires a system call, which we want to avoid like the plague.
+     Usually this is the thread's TCB, found in the TLS (pseudo-)
+     register, which is readable with a single insn on several
+     architectures.  */
+  uintptr_t thread_area;
+} collecting_t;
+
+#ifndef IN_PROCESS_AGENT
+
+void
+force_unlock_trace_buffer (void)
+{
+  write_inferior_data_pointer (ipa_sym_addrs.addr_collecting, 0);
+}
+
+/* Check if the thread identified by THREAD_AREA which is stopped at
+   STOP_PC, is presently locking the fast tracepoint collection, and
+   if so, gather some status of said collection.  Returns 0 if the
+   thread isn't collecting or in the jump pad at all.  1, if in the
+   jump pad (or within gdb_collect) and hasn't executed the adjusted
+   original insn yet (can set a breakpoint there and run to it).  2,
+   if presently executing the adjusted original insn --- in which
+   case, if we want to move the thread out of the jump pad, we need to
+   single-step it until this function returns 0.  */
+
+int
+fast_tracepoint_collecting (CORE_ADDR thread_area,
+			    CORE_ADDR stop_pc,
+			    struct fast_tpoint_collect_status *status)
+{
+  CORE_ADDR ipa_collecting;
+  CORE_ADDR ipa_gdb_jump_pad_buffer, ipa_gdb_jump_pad_buffer_end;
+  struct tracepoint *tpoint;
+  int needs_breakpoint;
+
+  /* The thread THREAD_AREA is either:
+
+      0. not collecting at all, that is, neither within the jump pad
+	 nor within gdb_collect or one of its callees.
+
+      1. in the jump pad, not having reached gdb_collect yet.
+
+      2. within gdb_collect (out of the jump pad; `collecting' is set).
+
+      3. in the jump pad, after gdb_collect has returned, possibly
+	 executing the adjusted insns.
+
+      For cases 1 and 3, `collecting' may or may not be set.  The jump
+      pad doesn't have any complicated jump logic, so we can tell if
+      the thread is executing the adjusted original insn or not by
+      just matching STOP_PC with known jump pad addresses.  If it isn't
+      yet executing the original insn, set a breakpoint there, and let
+      the thread run to it, so as to quickly step over a possible
+      (many insns) gdb_collect call.  Otherwise, or when the
+      breakpoint is hit, only a few insns are left to be executed
+      in the jump pad.  Single-step the thread until it leaves the
+      jump pad.  */
+
+ again:
+  tpoint = NULL;
+  needs_breakpoint = 0;
+  trace_debug ("fast_tracepoint_collecting");
+
+  if (read_inferior_data_pointer (ipa_sym_addrs.addr_gdb_jump_pad_buffer,
+				  &ipa_gdb_jump_pad_buffer))
+    fatal ("error extracting `gdb_jump_pad_buffer'");
+  if (read_inferior_data_pointer (ipa_sym_addrs.addr_gdb_jump_pad_buffer_end,
+				  &ipa_gdb_jump_pad_buffer_end))
+    fatal ("error extracting `gdb_jump_pad_buffer_end'");
+
+  if (ipa_gdb_jump_pad_buffer <= stop_pc && stop_pc < ipa_gdb_jump_pad_buffer_end)
+    {
+      /* We can tell which tracepoint(s) the thread is collecting by
+	 matching the jump pad address back to the tracepoint.  */
+      tpoint = fast_tracepoint_from_jump_pad_address (stop_pc);
+      if (tpoint == NULL)
+	{
+	  warning ("in jump pad, but no matching tpoint?");
+	  return 0;
+	}
+      else
+	{
+	  trace_debug ("in jump pad of tpoint (%d, %s); jump_pad(%s, %s); "
+		       "adj_insn(%s, %s)",
+		       tpoint->number, paddress (tpoint->address),
+		       paddress (tpoint->jump_pad),
+		       paddress (tpoint->jump_pad_end),
+		       paddress (tpoint->adjusted_insn_addr),
+		       paddress (tpoint->adjusted_insn_addr_end));
+	}
+
+      /* Definitely in the jump pad.  May or may not need
+	 fast-exit-jump-pad breakpoint.  */
+      if (tpoint->jump_pad <= stop_pc
+	  && stop_pc < tpoint->adjusted_insn_addr)
+	needs_breakpoint = 1;
+    }
+  else
+    {
+      collecting_t ipa_collecting_obj;
+
+      /* If `collecting' is set/locked, then the THREAD_AREA thread
+	 may or may not be the one holding the lock.  We have to read
+	 the lock to find out.  */
+
+      if (read_inferior_data_pointer (ipa_sym_addrs.addr_collecting,
+				      &ipa_collecting))
+	{
+	  trace_debug ("fast_tracepoint_collecting:"
+		       " failed reading 'collecting' in the inferior");
+	  return 0;
+	}
+
+      if (!ipa_collecting)
+	{
+	  trace_debug ("fast_tracepoint_collecting: not collecting"
+		       " (and nobody is).");
+	  return 0;
+	}
+
+      /* Some thread is collecting.  Check which.  */
+      if (read_inferior_memory (ipa_collecting,
+				(unsigned char *) &ipa_collecting_obj,
+				sizeof (ipa_collecting_obj)) != 0)
+	goto again;
+
+      if (ipa_collecting_obj.thread_area != thread_area)
+	{
+	  trace_debug ("fast_tracepoint_collecting: not collecting "
+		       "(another thread is)");
+	  return 0;
+	}
+
+      tpoint
+	= fast_tracepoint_from_ipa_tpoint_address (ipa_collecting_obj.tpoint);
+      if (tpoint == NULL)
+	{
+	  warning ("fast_tracepoint_collecting: collecting, "
+		   "but tpoint %s not found?",
+		   paddress ((CORE_ADDR) ipa_collecting_obj.tpoint));
+	  return 0;
+	}
+
+      /* The thread is within `gdb_collect', skip over the rest of
+	 fast tracepoint collection quickly using a breakpoint.  */
+      needs_breakpoint = 1;
+    }
+
+  /* The caller wants a bit of status detail.  */
+  if (status != NULL)
+    {
+      status->tpoint_num = tpoint->number;
+      status->tpoint_addr = tpoint->address;
+      status->adjusted_insn_addr = tpoint->adjusted_insn_addr;
+      status->adjusted_insn_addr_end = tpoint->adjusted_insn_addr_end;
+    }
+
+  if (needs_breakpoint)
+    {
+      /* Hasn't executed the original instruction yet.  Set breakpoint
+	 there, and wait till it's hit, then single-step until exiting
+	 the jump pad.  */
+
+      trace_debug ("\
+fast_tracepoint_collecting, returning continue-until-break at %s",
+		   paddress (tpoint->adjusted_insn_addr));
+
+      return 1; /* continue */
+    }
+  else
+    {
+      /* Just single-step until exiting the jump pad.  */
+
+      trace_debug ("fast_tracepoint_collecting, returning "
+		   "need-single-step (%s-%s)",
+		   paddress (tpoint->adjusted_insn_addr),
+		   paddress (tpoint->adjusted_insn_addr_end));
+
+      return 2; /* single-step */
+    }
+}
+
+#endif
+
+#ifdef IN_PROCESS_AGENT
+
+/* The global fast tracepoint collect lock.  Points to a collecting_t
+   object built on the stack by the jump pad, if presently locked;
+   NULL if it isn't locked.  Note that this lock *must* be set while
+   executing any function other than the jump pad.  See
+   fast_tracepoint_collecting.  */
+static collecting_t * ATTR_USED collecting;
+
+/* This routine is called, from asm, by the jump pads of fast
+   tracepoints.  It is therefore on the critical path of every fast
+   tracepoint hit.  */
+
+IP_AGENT_EXPORT void ATTR_USED
+gdb_collect (struct tracepoint *tpoint, unsigned char *regs)
+{
+  struct fast_tracepoint_ctx ctx;
+
+  /* Don't do anything until the trace run is completely set up.  */
+  if (!tracing)
+    return;
+
+  ctx.base.type = fast_tracepoint;
+  ctx.regs = regs;
+  ctx.regcache_initted = 0;
+  ctx.tpoint = tpoint;
+
+  /* Wrap the regblock in a register cache (in the stack, we don't
+     want to malloc here).  */
+  ctx.regspace = alloca (register_cache_size ());
+  if (ctx.regspace == NULL)
+    {
+      trace_debug ("Trace buffer block allocation failed, skipping");
+      return;
+    }
+
+  /* Test the condition if present, and collect if true.  */
+  if (tpoint->cond == NULL
+      || condition_true_at_tracepoint ((struct tracepoint_hit_ctx *) &ctx,
+				       tpoint))
+    {
+      collect_data_at_tracepoint ((struct tracepoint_hit_ctx *) &ctx,
+				  tpoint->address, tpoint);
+
+      /* Note that this will cause original insns to be written back
+	 to where we jumped from, but that's OK because we're jumping
+	 back to the next whole instruction.  This will go badly if
+	 instruction restoration is not atomic though.  */
+      if (stopping_tracepoint
+	  || trace_buffer_is_full
+	  || expr_eval_result != expr_eval_no_error)
+	stop_tracing ();
+    }
+  else
+    {
+      /* If there was a condition and it evaluated to false, the only
+	 way we would stop tracing is if there was an error during
+	 condition expression evaluation.  */
+      if (expr_eval_result != expr_eval_no_error)
+	stop_tracing ();
+    }
+}
+
+#endif
+
+#ifndef IN_PROCESS_AGENT
+
+/* We'll need to adjust these when we consider bi-arch setups, and big
+   endian machines.  */
+
+static int
+write_inferior_data_ptr (CORE_ADDR where, CORE_ADDR ptr)
+{
+  return write_inferior_memory (where,
+				(unsigned char *) &ptr, sizeof (void *));
+}
+
+/* The base pointer of the IPA's heap.  This is the only memory the
+   IPA is allowed to use.  The IPA should _not_ call the inferior's
+   `malloc' during operation.  That'd be slow, and, most importantly,
+   it may not be safe.  We may be collecting a tracepoint in a signal
+   handler, for example.  */
+static CORE_ADDR target_tp_heap;
+
+/* Allocate at least SIZE bytes of memory from the IPA heap, aligned
+   to 8 bytes.  */
+
+static CORE_ADDR
+target_malloc (ULONGEST size)
+{
+  CORE_ADDR ptr;
+
+  if (target_tp_heap == 0)
+    {
+      /* We have the pointer *address*, need what it points to.  */
+      if (read_inferior_data_pointer (ipa_sym_addrs.addr_gdb_tp_heap_buffer,
+				      &target_tp_heap))
+	fatal ("could not get target heap head pointer");
+    }
+
+  ptr = target_tp_heap;
+  target_tp_heap += size;
+
+  /* Pad to 8-byte alignment.  */
+  target_tp_heap = ((target_tp_heap + 7) & ~0x7);
+
+  return ptr;
+}
+
+static CORE_ADDR
+download_agent_expr (struct agent_expr *expr)
+{
+  CORE_ADDR expr_addr;
+  CORE_ADDR expr_bytes;
+
+  expr_addr = target_malloc (sizeof (*expr));
+  write_inferior_memory (expr_addr, (unsigned char *) expr, sizeof (*expr));
+
+  expr_bytes = target_malloc (expr->length);
+  write_inferior_data_ptr (expr_addr + offsetof (struct agent_expr, bytes),
+			   expr_bytes);
+  write_inferior_memory (expr_bytes, expr->bytes, expr->length);
+
+  return expr_addr;
+}
+
+/* Align V up to a multiple of N.  N must be a power of two.  */
+#define UALIGN(V, N) (((V) + ((N) - 1)) & ~((N) - 1))
+
+static void
+download_tracepoints (void)
+{
+  CORE_ADDR tpptr = 0, prev_tpptr = 0;
+  struct tracepoint *tpoint;
+
+  /* Start out empty.  */
+  write_inferior_data_ptr (ipa_sym_addrs.addr_tracepoints, 0);
+
+  for (tpoint = tracepoints; tpoint; tpoint = tpoint->next)
+    {
+      struct tracepoint target_tracepoint;
+
+      if (tpoint->type != fast_tracepoint)
+	continue;
+
+      target_tracepoint = *tpoint;
+
+      prev_tpptr = tpptr;
+      tpptr = target_malloc (sizeof (*tpoint));
+      tpoint->obj_addr_on_target = tpptr;
+
+      if (prev_tpptr == 0)
+	{
+	  /* First object in list, set the head pointer in the
+	     inferior.  */
+	  write_inferior_data_ptr (ipa_sym_addrs.addr_tracepoints, tpptr);
+	}
+      else
+	{
+	  write_inferior_data_ptr (prev_tpptr + offsetof (struct tracepoint,
+							  next),
+				   tpptr);
+	}
+
+      /* Write the whole object.  We'll fix up its pointers in a bit.
+	 Assume no next for now.  This is fixed up above on the next
+	 iteration, if there's any.  */
+      target_tracepoint.next = NULL;
+      /* Need to clear this here too, since we're downloading the
+	 tracepoints before clearing our own copy.  */
+      target_tracepoint.hit_count = 0;
+
+      write_inferior_memory (tpptr, (unsigned char *) &target_tracepoint,
+			     sizeof (target_tracepoint));
+
+      if (tpoint->cond)
+	write_inferior_data_ptr (tpptr + offsetof (struct tracepoint,
+						   cond),
+				 download_agent_expr (tpoint->cond));
+
+      if (tpoint->numactions)
+	{
+	  int i;
+	  CORE_ADDR actions_array;
+
+	  /* The pointers array.  */
+	  actions_array
+	    = target_malloc (sizeof (*tpoint->actions) * tpoint->numactions);
+	  write_inferior_data_ptr (tpptr + offsetof (struct tracepoint,
+						     actions),
+				   actions_array);
+
+	  /* Now for each pointer, download the action.  */
+	  for (i = 0; i < tpoint->numactions; i++)
+	    {
+	      CORE_ADDR ipa_action = 0;
+	      struct tracepoint_action *action = tpoint->actions[i];
+
+	      switch (action->type)
+		{
+		case 'M':
+		  ipa_action
+		    = target_malloc (sizeof (struct collect_memory_action));
+		  write_inferior_memory (ipa_action,
+					 (unsigned char *) action,
+					 sizeof (struct collect_memory_action));
+		  break;
+		case 'R':
+		  ipa_action
+		    = target_malloc (sizeof (struct collect_registers_action));
+		  write_inferior_memory (ipa_action,
+					 (unsigned char *) action,
+					 sizeof (struct collect_registers_action));
+		  break;
+		case 'X':
+		  {
+		    CORE_ADDR expr;
+		    struct eval_expr_action *eaction
+		      = (struct eval_expr_action *) action;
+
+		    ipa_action = target_malloc (sizeof (*eaction));
+		    write_inferior_memory (ipa_action,
+					   (unsigned char *) eaction,
+					   sizeof (*eaction));
+
+		    expr = download_agent_expr (eaction->expr);
+		    write_inferior_data_ptr
+		      (ipa_action + offsetof (struct eval_expr_action, expr),
+		       expr);
+		    break;
+		  }
+		default:
+		  trace_debug ("unknown trace action '%c', ignoring",
+			       action->type);
+		  break;
+		}
+
+	      if (ipa_action != 0)
+		write_inferior_data_ptr
+		  (actions_array + i * sizeof (*tpoint->actions),
+		   ipa_action);
+	    }
+	}
+    }
+}
+
+static void
+download_trace_state_variables (void)
+{
+  CORE_ADDR ptr = 0, prev_ptr = 0;
+  struct trace_state_variable *tsv;
+
+  /* Start out empty.  */
+  write_inferior_data_ptr (ipa_sym_addrs.addr_trace_state_variables, 0);
+
+  for (tsv = trace_state_variables; tsv != NULL; tsv = tsv->next)
+    {
+      struct trace_state_variable target_tsv;
+
+      /* TSV's with a getter have been initialized equally in both the
+	 inferior and GDBserver.  Skip them.  */
+      if (tsv->getter != NULL)
+	continue;
+
+      target_tsv = *tsv;
+
+      prev_ptr = ptr;
+      ptr = target_malloc (sizeof (*tsv));
+
+      if (prev_ptr == 0)
+	{
+	  /* First object in list, set the head pointer in the
+	     inferior.  */
+
+	  write_inferior_data_ptr (ipa_sym_addrs.addr_trace_state_variables,
+				   ptr);
+	}
+      else
+	{
+	  write_inferior_data_ptr (prev_ptr
+				   + offsetof (struct trace_state_variable,
+					       next),
+				   ptr);
+	}
+
+      /* Write the whole object.  We'll fix up its pointers in a bit.
+	 Assume no next, fixup when needed.  */
+      target_tsv.next = NULL;
+
+      write_inferior_memory (ptr, (unsigned char *) &target_tsv,
+			     sizeof (target_tsv));
+
+      if (tsv->name != NULL)
+	{
+	  size_t size = strlen (tsv->name) + 1;
+	  CORE_ADDR name_addr = target_malloc (size);
+	  write_inferior_memory (name_addr,
+				 (unsigned char *) tsv->name, size);
+	  write_inferior_data_ptr (ptr
+				   + offsetof (struct trace_state_variable,
+					       name),
+				   name_addr);
+	}
+
+      if (tsv->getter != NULL)
+	{
+	  fatal ("what to do with these?");
+	}
+    }
+
+  if (ptr != 0)
+    {
+      /* Fixup the next pointer in the last item in the list.  */
+      write_inferior_data_ptr (ptr + offsetof (struct trace_state_variable,
+					       next), 0);
+    }
+}
+
+/* Upload complete trace frames out of the IP Agent's trace buffer
+   into GDBserver's trace buffer.  This always uploads either all or
+   no trace frames.  This is the counterpart of
+   `trace_alloc_trace_buffer'.  See its description of the atomic
+   synching mechanism.  */
+
+static void
+upload_fast_traceframes (void)
+{
+  unsigned int ipa_traceframe_read_count, ipa_traceframe_write_count;
+  unsigned int ipa_traceframe_read_count_racy, ipa_traceframe_write_count_racy;
+  CORE_ADDR tf;
+  struct ipa_trace_buffer_control ipa_trace_buffer_ctrl;
+  unsigned int curr_tbctrl_idx;
+  unsigned int ipa_trace_buffer_ctrl_curr;
+  unsigned int ipa_trace_buffer_ctrl_curr_old;
+  CORE_ADDR ipa_trace_buffer_ctrl_addr;
+  struct breakpoint *about_to_request_buffer_space_bkpt;
+  CORE_ADDR ipa_trace_buffer_lo;
+  CORE_ADDR ipa_trace_buffer_hi;
+
+  if (read_inferior_uinteger (ipa_sym_addrs.addr_traceframe_read_count,
+			      &ipa_traceframe_read_count_racy))
+    {
+      /* This will happen in most targets if the current thread is
+	 running.  */
+      return;
+    }
+
+  if (read_inferior_uinteger (ipa_sym_addrs.addr_traceframe_write_count,
+			      &ipa_traceframe_write_count_racy))
+    return;
+
+  trace_debug ("ipa_traceframe_count (racy area): %d (w=%d, r=%d)",
+	       ipa_traceframe_write_count_racy - ipa_traceframe_read_count_racy,
+	       ipa_traceframe_write_count_racy, ipa_traceframe_read_count_racy);
+
+  if (ipa_traceframe_write_count_racy == ipa_traceframe_read_count_racy)
+    return;
+
+  about_to_request_buffer_space_bkpt
+    = set_breakpoint_at (ipa_sym_addrs.addr_about_to_request_buffer_space,
+			 NULL);
+
+  if (read_inferior_uinteger (ipa_sym_addrs.addr_trace_buffer_ctrl_curr,
+			      &ipa_trace_buffer_ctrl_curr))
+    return;
+
+  ipa_trace_buffer_ctrl_curr_old = ipa_trace_buffer_ctrl_curr;
+
+  curr_tbctrl_idx = ipa_trace_buffer_ctrl_curr & ~GDBSERVER_FLUSH_COUNT_MASK;
+
+  {
+    unsigned int prev, counter;
+
+    /* Update the token, with new counters, and the GDBserver stamp
+       bit.  Always reuse the current TBC index.  */
+    prev = ipa_trace_buffer_ctrl_curr & 0x0007ff00;
+    counter = (prev + 0x100) & 0x0007ff00;
+
+    ipa_trace_buffer_ctrl_curr = (0x80000000
+				  | (prev << 12)
+				  | counter
+				  | curr_tbctrl_idx);
+  }
+
+  if (write_inferior_uinteger (ipa_sym_addrs.addr_trace_buffer_ctrl_curr,
+			       ipa_trace_buffer_ctrl_curr))
+    return;
+
+  trace_debug ("Lib: Committed %08x -> %08x",
+	       ipa_trace_buffer_ctrl_curr_old,
+	       ipa_trace_buffer_ctrl_curr);
+
+  /* Re-read these, now that we've installed the
+     `about_to_request_buffer_space' breakpoint/lock.  A thread could
+     have finished a traceframe between the last read of these
+     counters and setting the breakpoint above.  If we start
+     uploading, we never want to leave this function with
+     traceframe_read_count != 0, otherwise, GDBserver could end up
+     incrementing the counter tokens more than once (due to event loop
+     nesting), which would break the IP agent's "effective" detection
+     (see trace_alloc_trace_buffer).  */
+  if (read_inferior_uinteger (ipa_sym_addrs.addr_traceframe_read_count,
+			      &ipa_traceframe_read_count))
+    return;
+  if (read_inferior_uinteger (ipa_sym_addrs.addr_traceframe_write_count,
+			      &ipa_traceframe_write_count))
+    return;
+
+  if (debug_threads)
+    {
+      trace_debug ("ipa_traceframe_count (blocked area): %d (w=%d, r=%d)",
+		   ipa_traceframe_write_count - ipa_traceframe_read_count,
+		   ipa_traceframe_write_count, ipa_traceframe_read_count);
+
+      if (ipa_traceframe_write_count != ipa_traceframe_write_count_racy
+	  || ipa_traceframe_read_count != ipa_traceframe_read_count_racy)
+	trace_debug ("note that ipa_traceframe_count's parts changed");
+    }
+
+  /* Get the address of the current TBC object (the IP agent has an
+     array of 3 such objects).  The index is stored in the TBC
+     token.  */
+  ipa_trace_buffer_ctrl_addr = ipa_sym_addrs.addr_trace_buffer_ctrl;
+  ipa_trace_buffer_ctrl_addr
+    += sizeof (struct ipa_trace_buffer_control) * curr_tbctrl_idx;
+
+  if (read_inferior_memory (ipa_trace_buffer_ctrl_addr,
+			    (unsigned char *) &ipa_trace_buffer_ctrl,
+			    sizeof (struct ipa_trace_buffer_control)))
+    return;
+
+  if (read_inferior_data_pointer (ipa_sym_addrs.addr_trace_buffer_lo,
+				  &ipa_trace_buffer_lo))
+    return;
+  if (read_inferior_data_pointer (ipa_sym_addrs.addr_trace_buffer_hi,
+				  &ipa_trace_buffer_hi))
+    return;
+
+  /* Offsets are easier to grok for debugging than raw addresses,
+     especially for the small trace buffer sizes that are useful for
+     testing.  */
+  trace_debug ("Lib: Trace buffer [%d] start=%d free=%d "
+	       "endfree=%d wrap=%d hi=%d",
+	       curr_tbctrl_idx,
+	       (int) (ipa_trace_buffer_ctrl.start - ipa_trace_buffer_lo),
+	       (int) (ipa_trace_buffer_ctrl.free - ipa_trace_buffer_lo),
+	       (int) (ipa_trace_buffer_ctrl.end_free - ipa_trace_buffer_lo),
+	       (int) (ipa_trace_buffer_ctrl.wrap - ipa_trace_buffer_lo),
+	       (int) (ipa_trace_buffer_hi - ipa_trace_buffer_lo));
+
+  /* Note that the IPA's buffer is always circular.  */
+
+#define IPA_FIRST_TRACEFRAME() (ipa_trace_buffer_ctrl.start)
+
+#define IPA_NEXT_TRACEFRAME_1(TF, TFOBJ)		\
+  ((TF) + sizeof (struct traceframe) + (TFOBJ)->data_size)
+
+#define IPA_NEXT_TRACEFRAME(TF, TFOBJ)					\
+  (IPA_NEXT_TRACEFRAME_1 (TF, TFOBJ)					\
+   - ((IPA_NEXT_TRACEFRAME_1 (TF, TFOBJ) >= ipa_trace_buffer_ctrl.wrap) \
+      ? (ipa_trace_buffer_ctrl.wrap - ipa_trace_buffer_lo)		\
+      : 0))
+
+  tf = IPA_FIRST_TRACEFRAME ();
+
+  while (ipa_traceframe_write_count - ipa_traceframe_read_count)
+    {
+      struct tracepoint *tpoint;
+      struct traceframe *tframe;
+      unsigned char *block;
+      struct traceframe ipa_tframe;
+
+      if (read_inferior_memory (tf, (unsigned char *) &ipa_tframe,
+				offsetof (struct traceframe, data)))
+	error ("Uploading: couldn't read traceframe at %s\n", paddress (tf));
+
+      if (ipa_tframe.tpnum == 0)
+	fatal ("Uploading: No (more) fast traceframes, but "
+	       "ipa_traceframe_count == %u??\n",
+	       ipa_traceframe_write_count - ipa_traceframe_read_count);
+
+      /* Note that this will be incorrect for multi-location
+	 tracepoints...  */
+      tpoint = find_next_tracepoint_by_number (NULL, ipa_tframe.tpnum);
+
+      tframe = add_traceframe (tpoint);
+      if (tframe == NULL)
+	{
+	  trace_buffer_is_full = 1;
+	  trace_debug ("Uploading: trace buffer is full");
+	}
+      else
+	{
+	  /* Copy the whole set of blocks in one go for now.  FIXME:
+	     split this in smaller blocks.  */
+	  block = add_traceframe_block (tframe, ipa_tframe.data_size);
+	  if (block != NULL)
+	    {
+	      if (read_inferior_memory (tf + offsetof (struct traceframe, data),
+					block, ipa_tframe.data_size))
+		error ("Uploading: Couldn't read traceframe data at %s\n",
+		       paddress (tf + offsetof (struct traceframe, data)));
+	    }
+	  else
+	    trace_debug ("Uploading: traceframe didn't fit");
+	  finish_traceframe (tframe);
+	}
+
+      tf = IPA_NEXT_TRACEFRAME (tf, &ipa_tframe);
+
+      /* If we freed the traceframe that wrapped around, go back
+	 to the non-wrap case.  */
+      if (tf < ipa_trace_buffer_ctrl.start)
+	{
+	  trace_debug ("Lib: Discarding past the wraparound");
+	  ipa_trace_buffer_ctrl.wrap = ipa_trace_buffer_hi;
+	}
+      ipa_trace_buffer_ctrl.start = tf;
+      ipa_trace_buffer_ctrl.end_free = ipa_trace_buffer_ctrl.start;
+      ++ipa_traceframe_read_count;
+
+      if (ipa_trace_buffer_ctrl.start == ipa_trace_buffer_ctrl.free
+	  && ipa_trace_buffer_ctrl.start == ipa_trace_buffer_ctrl.end_free)
+	{
+	  trace_debug ("Lib: buffer is fully empty.  "
+		       "Trace buffer [%d] start=%d free=%d endfree=%d",
+		       curr_tbctrl_idx,
+		       (int) (ipa_trace_buffer_ctrl.start
+			      - ipa_trace_buffer_lo),
+		       (int) (ipa_trace_buffer_ctrl.free
+			      - ipa_trace_buffer_lo),
+		       (int) (ipa_trace_buffer_ctrl.end_free
+			      - ipa_trace_buffer_lo));
+
+	  ipa_trace_buffer_ctrl.start = ipa_trace_buffer_lo;
+	  ipa_trace_buffer_ctrl.free = ipa_trace_buffer_lo;
+	  ipa_trace_buffer_ctrl.end_free = ipa_trace_buffer_hi;
+	  ipa_trace_buffer_ctrl.wrap = ipa_trace_buffer_hi;
+	}
+
+      trace_debug ("Uploaded a traceframe\n"
+		   "Lib: Trace buffer [%d] start=%d free=%d "
+		   "endfree=%d wrap=%d hi=%d",
+		   curr_tbctrl_idx,
+		   (int) (ipa_trace_buffer_ctrl.start - ipa_trace_buffer_lo),
+		   (int) (ipa_trace_buffer_ctrl.free - ipa_trace_buffer_lo),
+		   (int) (ipa_trace_buffer_ctrl.end_free - ipa_trace_buffer_lo),
+		   (int) (ipa_trace_buffer_ctrl.wrap - ipa_trace_buffer_lo),
+		   (int) (ipa_trace_buffer_hi - ipa_trace_buffer_lo));
+    }
+
+  if (write_inferior_memory (ipa_trace_buffer_ctrl_addr,
+			     (unsigned char *) &ipa_trace_buffer_ctrl,
+			     sizeof (struct ipa_trace_buffer_control)))
+    return;
+
+  write_inferior_integer (ipa_sym_addrs.addr_traceframe_read_count,
+			  ipa_traceframe_read_count);
+
+  trace_debug ("Done uploading traceframes [%d]\n", curr_tbctrl_idx);
+
+  pause_all (1);
+  cancel_breakpoints ();
+
+  delete_breakpoint (about_to_request_buffer_space_bkpt);
+  about_to_request_buffer_space_bkpt = NULL;
+
+  unpause_all (1);
+
+  if (trace_buffer_is_full)
+    stop_tracing ();
+}
+#endif
+
+#ifdef IN_PROCESS_AGENT
+
+#include <sys/mman.h>
+#include <fcntl.h>
+
+IP_AGENT_EXPORT char *gdb_tp_heap_buffer;
+IP_AGENT_EXPORT char *gdb_jump_pad_buffer;
+IP_AGENT_EXPORT char *gdb_jump_pad_buffer_end;
+
+static void __attribute__ ((constructor))
+initialize_tracepoint_ftlib (void)
+{
+  initialize_tracepoint ();
+}
+
+#endif
+
 static LONGEST
 tsv_get_timestamp (void)
 {
@@ -3448,7 +5491,31 @@ initialize_tracepoint (void)
      uploaded to GDB upon connection and become one of its trace state
      variables.  (In case you're wondering, if GDB already has a trace
      variable numbered 1, it will be renumbered.)  */
-  create_trace_state_variable (1);
+  create_trace_state_variable (1, 0);
   set_trace_state_variable_name (1, "trace_timestamp");
   set_trace_state_variable_getter (1, tsv_get_timestamp);
+
+#ifdef IN_PROCESS_AGENT
+  {
+    int pagesize;
+    pagesize = sysconf (_SC_PAGE_SIZE);
+    if (pagesize == -1)
+      fatal ("sysconf");
+
+    gdb_tp_heap_buffer = xmalloc (5 * 1024 * 1024);
+
+    /* Allocate scratch buffer aligned on a page boundary.  */
+    gdb_jump_pad_buffer = memalign (pagesize, pagesize * 20);
+    gdb_jump_pad_buffer_end = gdb_jump_pad_buffer + pagesize * 20;
+
+    /* Make it writable and executable.  */
+    if (mprotect (gdb_jump_pad_buffer, pagesize * 20,
+		  PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
+      fatal ("\
+initialize_tracepoint: mprotect(%p, %d, PROT_READ|PROT_WRITE|PROT_EXEC) failed with %s",
+	     gdb_jump_pad_buffer, pagesize * 20, strerror (errno));
+  }
+
+  initialize_low_tracepoint ();
+#endif
 }
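
A note on the traceframe walk in the upload loop above: IPA_NEXT_TRACEFRAME
advances past the frame header and payload, and pulls the result back by
(wrap - lo) whenever it lands at or beyond the wrap point.  The same
arithmetic can be sketched with plain offsets, taking ipa_trace_buffer_lo as
offset 0.  The struct and function names below are hypothetical stand-ins
for illustration, not code from the patch, and the real struct traceframe
has a different layout.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the patch's struct traceframe:
   a fixed header followed by DATA_SIZE bytes of payload.  */
struct demo_frame_header
{
  int tpnum;
  unsigned int data_size;
};

/* Return the buffer offset of the frame that follows the frame at
   OFFSET.  WRAP is the offset at which the used region of the buffer
   ends.  A result at or beyond WRAP is pulled back by WRAP, which is
   what IPA_NEXT_TRACEFRAME does when the buffer base is offset 0.  */
static size_t
demo_next_frame (size_t offset, unsigned int data_size, size_t wrap)
{
  size_t next = offset + sizeof (struct demo_frame_header) + data_size;

  if (next >= wrap)
    next -= wrap;
  return next;
}
```

A frame that ends exactly at the wrap point therefore yields offset 0 for
its successor, which matches the "go back to the non-wrap case" handling
that follows the macro call in the loop.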
Index: src/gdb/gdbserver/utils.c
===================================================================
--- src.orig/gdb/gdbserver/utils.c	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/utils.c	2010-06-01 13:57:18.000000000 +0100
@@ -28,6 +28,14 @@
 #include <malloc.h>
 #endif
 
+#ifdef IN_PROCESS_AGENT
+#  define PREFIX "ipa: "
+#  define TOOLNAME "GDBserver in-process agent"
+#else
+#  define PREFIX "gdbserver: "
+#  define TOOLNAME "GDBserver"
+#endif
+
 /* Generally useful subroutines used throughout the program.  */
 
 static void malloc_failure (size_t size) ATTR_NORETURN;
@@ -35,7 +43,7 @@ static void malloc_failure (size_t size)
 static void
 malloc_failure (size_t size)
 {
-  fprintf (stderr, "gdbserver: ran out of memory while trying to allocate %lu bytes\n",
+  fprintf (stderr, PREFIX "ran out of memory while trying to allocate %lu bytes\n",
 	   (unsigned long) size);
   exit (1);
 }
@@ -107,6 +115,8 @@ xstrdup (const char *s)
   return ret;
 }
 
+#ifndef IN_PROCESS_AGENT
+
 /* Free a standard argv vector.  */
 
 void
@@ -124,6 +134,8 @@ freeargv (char **vector)
     }
 }
 
+#endif
+
 /* Print the system error message for errno, and also mention STRING
    as the file name for which the error was encountered.
    Then return to command level.  */
@@ -153,13 +165,19 @@ perror_with_name (const char *string)
 void
 error (const char *string,...)
 {
+#ifndef IN_PROCESS_AGENT
   extern jmp_buf toplevel;
+#endif
   va_list args;
   va_start (args, string);
   fflush (stdout);
   vfprintf (stderr, string, args);
   fprintf (stderr, "\n");
+#ifndef IN_PROCESS_AGENT
   longjmp (toplevel, 1);
+#else
+  exit (1);
+#endif
 }
 
 /* Print an error message and exit reporting failure.
@@ -172,7 +190,7 @@ fatal (const char *string,...)
 {
   va_list args;
   va_start (args, string);
-  fprintf (stderr, "gdbserver: ");
+  fprintf (stderr, PREFIX);
   vfprintf (stderr, string, args);
   fprintf (stderr, "\n");
   va_end (args);
@@ -185,7 +203,7 @@ warning (const char *string,...)
 {
   va_list args;
   va_start (args, string);
-  fprintf (stderr, "gdbserver: ");
+  fprintf (stderr, PREFIX);
   vfprintf (stderr, string, args);
   fprintf (stderr, "\n");
   va_end (args);
@@ -200,7 +218,7 @@ internal_error (const char *file, int li
   va_start (args, fmt);
 
   fprintf (stderr,  "\
-%s:%d: A problem internal to GDBserver has been detected.\n", file, line);
+%s:%d: A problem internal to " TOOLNAME " has been detected.\n", file, line);
   vfprintf (stderr, fmt, args);
   fprintf (stderr, "\n");
   va_end (args);
@@ -208,7 +226,7 @@ internal_error (const char *file, int li
 }
 
 /* Temporary storage using circular buffer.  */
-#define NUMCELLS 4
+#define NUMCELLS 10
 #define CELLSIZE 50
 
 /* Return the next entry in the circular buffer.  */
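
A note on the PREFIX and TOOLNAME macros in the utils.c changes above: they
rely on the compiler joining adjacent string literals, so the per-build
prefix becomes part of the format string at compile time.  A minimal
sketch of the idiom follows; demo_format_oom is a hypothetical helper
written for this note, and it formats into a caller-supplied buffer rather
than printing to stderr so that the result is easy to inspect.

```c
#include <stdio.h>

#ifdef IN_PROCESS_AGENT
#  define PREFIX "ipa: "
#else
#  define PREFIX "gdbserver: "
#endif

/* Format an out-of-memory message into BUF.  PREFIX and the format
   string are adjacent string literals, so the compiler joins them into
   one literal; no run-time concatenation takes place.  Returns the
   snprintf result.  */
static int
demo_format_oom (char *buf, size_t buflen, unsigned long size)
{
  return snprintf (buf, buflen,
		   PREFIX "ran out of memory while trying to allocate %lu bytes",
		   size);
}
```

Built without -DIN_PROCESS_AGENT the message starts with "gdbserver: ";
built with it, the same source line produces the "ipa: " prefix.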
Index: src/gdb/gdbserver/configure
===================================================================
--- src.orig/gdb/gdbserver/configure	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/configure	2010-06-01 13:57:18.000000000 +0100
@@ -590,6 +590,8 @@ ac_includes_default="\
 #endif"
 
 ac_subst_vars='LTLIBOBJS
+extra_libraries
+IPA_DEPFILES
 srv_xmlfiles
 srv_xmlbuiltin
 USE_THREAD_DB
@@ -4468,6 +4470,73 @@ fi
 GDBSERVER_DEPFILES="$srv_regobj $srv_tgtobj $srv_hostio_err_objs $srv_thread_depfiles"
 GDBSERVER_LIBS="$srv_libs"
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the target supports __sync_*_compare_and_swap" >&5
+$as_echo_n "checking whether the target supports __sync_*_compare_and_swap... " >&6; }
+if test "${gdbsrv_cv_have_sync_builtins+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+int foo, bar; bar = __sync_val_compare_and_swap(&foo, 0, 1);
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  gdbsrv_cv_have_sync_builtins=yes
+else
+  gdbsrv_cv_have_sync_builtins=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gdbsrv_cv_have_sync_builtins" >&5
+$as_echo "$gdbsrv_cv_have_sync_builtins" >&6; }
+if test $gdbsrv_cv_have_sync_builtins = yes; then
+
+$as_echo "#define HAVE_SYNC_BUILTINS 1" >>confdefs.h
+
+fi
+
+saved_cflags="$CFLAGS"
+CFLAGS="$CFLAGS -fvisibility=hidden"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  gdbsrv_cv_have_visibility_hidden=yes
+else
+  gdbsrv_cv_have_visibility_hidden=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+CFLAGS="$saved_cflags"
+
+IPA_DEPFILES=""
+
+# Rather than allowing to build a broken IPA, we simply disable it if
+# we don't find a compiler supporting all the features we need.
+if test "$ipa_obj" != "" \
+   -a "$gdbsrv_cv_have_sync_builtins" = yes \
+   -a "$gdbsrv_cv_have_visibility_hidden" = yes; then
+   IPA_DEPFILES="$ipa_obj"
+   extra_libraries="libinproctrace.so"
+fi
+
+
+
 
 
 
Index: src/gdb/gdbserver/config.in
===================================================================
--- src.orig/gdb/gdbserver/config.in	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/gdbserver/config.in	2010-06-01 13:57:18.000000000 +0100
@@ -112,6 +112,9 @@
 /* Define to 1 if you have the <string.h> header file. */
 #undef HAVE_STRING_H
 
+/* Define to 1 if the target supports __sync_*_compare_and_swap */
+#undef HAVE_SYNC_BUILTINS
+
 /* Define to 1 if you have the <sys/file.h> header file. */
 #undef HAVE_SYS_FILE_H
 
Index: src/gdb/NEWS
===================================================================
--- src.orig/gdb/NEWS	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/NEWS	2010-06-01 13:57:18.000000000 +0100
@@ -33,8 +33,10 @@ qRelocInsn
 
 * New features in the GDB remote stub, GDBserver
 
-  - GDBserver now support tracepoints.  The feature is currently
-    supported by the i386-linux and amd64-linux builds.
+  - GDBserver now supports tracepoints (including fast tracepoints).
+    The feature is currently supported by the i386-linux and
+    amd64-linux builds.  See the "Tracepoints support in gdbserver"
+    section in the manual for more information.
 
   - GDBserver now supports x86_64 Windows 64-bit debugging.
 
Index: src/gdb/doc/gdb.texinfo
===================================================================
--- src.orig/gdb/doc/gdb.texinfo	2010-06-01 12:55:36.000000000 +0100
+++ src/gdb/doc/gdb.texinfo	2010-06-01 13:57:18.000000000 +0100
@@ -9440,6 +9440,9 @@ Some targets may support @dfn{fast trace
 a different way (such as with a jump instead of a trap), that is
 faster but possibly restricted in where they may be installed.
 
+@code{gdbserver} supports tracepoints on some target systems.
+@xref{Server,,Tracepoints support in @code{gdbserver}}.
+
 This section describes commands to set tracepoints and associated
 conditions and actions.
 
@@ -15692,6 +15695,82 @@ of a multi-process mode debug session.
 
 @end table
 
+@subsection Tracepoints support in @code{gdbserver}
+@cindex tracepoints support in @code{gdbserver}
+
+On some targets, @code{gdbserver} supports tracepoints and fast
+tracepoints.
+
+For fast tracepoints to work, a special library called the
+@dfn{in-process agent} (IPA) must be loaded in the inferior process.
+This library is built and distributed as an integral part of
+@code{gdbserver}.
+
+There are several ways to load the in-process agent in your program:
+
+@table @code
+@item Specifying it as a dependency at link time
+
+You can link your program dynamically with the in-process agent
+library.  On most systems, this is accomplished by adding
+@code{-linproctrace} to the link command.
+
+@item Using the system's preloading mechanisms
+
+You can force loading the in-process agent at startup time by using
+your system's support for preloading shared libraries.  Many Unixes
+support the concept of preloading user-defined libraries.  In most
+cases, you do that by specifying @code{LD_PRELOAD=libinproctrace.so}
+in the environment.  See also the description of @code{gdbserver}'s
+@option{--wrapper} command line option.
+
+@item Using @value{GDBN} to force loading the agent at run time
+
+On some systems, you can force the inferior to load a shared library,
+by calling a dynamic loader function in the inferior that takes care
+of dynamically looking up and loading a shared library.  On most Unix
+systems, the function is @code{dlopen}.  You'll use the @code{call}
+command for that.  For example:
+
+@smallexample
+(@value{GDBP}) call dlopen ("libinproctrace.so", ...)
+@end smallexample
+
+Note that on most Unix systems, for the @code{dlopen} function to be
+available, the program needs to be linked with @code{-ldl}.
+@end table
+
+On systems that have a userspace dynamic loader, like most Unix
+systems, when you connect to @code{gdbserver} using @code{target
+remote}, you'll find that the program is stopped at the dynamic
+loader's entry point, and no shared library has been loaded in the
+program's address space yet, including the in-process agent.  In that
+case, before being able to use any of the fast tracepoints features,
+you need to let the loader run and load the shared libraries.  The
+simplest way to do that is to run the program to the main
+procedure.  E.g., if debugging a C or C@t{++} program, start
+@code{gdbserver} like so:
+
+@smallexample
+$ gdbserver :9999 myprogram
+@end smallexample
+
+Start @value{GDBN}, connect to @code{gdbserver}, and run to @code{main}:
+
+@smallexample
+$ gdb myprogram
+(@value{GDBP}) target remote myhost:9999
+0x00007f215893ba60 in ?? () from /lib64/ld-linux-x86-64.so.2
+(@value{GDBP}) b main
+(@value{GDBP}) continue
+@end smallexample
+
+The in-process tracing agent library should now be loaded into the
+process; you can confirm it with the @code{info sharedlibrary}
+command, which will list @file{libinproctrace.so} as loaded in the
+process.  You are now ready to install fast tracepoints and start
+tracing.
+
 @node Remote Configuration
 @section Remote Configuration
 

