This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 0/2] Better handling of slow remote transfers


On 08/18/2015 05:50 PM, Sandra Loosemore wrote:

> I would be happy to try to help debug what is going wrong here, but I'm
> not a GDB expert by any means and I'm totally unfamiliar with this part 
> of the code and don't know where to start.  If somebody has a theory on 
> this, is there someplace I could put in printfs to trace what is 
> happening?  From the previous discussion, it seemed like the flow 
> through the different layers of abstraction to what the RSP was actually 
> doing was very obscure, in terms of where the chunking is happening, etc.

That was just a misunderstanding.  It's not really obscure.  GDB has a file io
abstraction that wires the open/close/pread/pwrite syscalls to RSP packets.
Callers just need to use the target_fileio_open/pread etc. functions instead
of the usual syscalls.  Whenever BFD or GDB accesses a file that is prefixed
target: or remote:, the open/read/write requests go through those calls.
For remote targets, those end up in remote.c:remote_hostio_pread, etc.  The only
chunking going on is the server trimming the pread results to the maximum packet
size it can return to gdb.  Both gdb and gdbserver have that maximum buffer
size hardcoded at 16384 for historical reasons.

>> Your PandaBoard takes 8 seconds.  That doesn't seem so bad to me.
>> If this Altera board is the only one with the massive slowdown then
>> I don't think we should delay 7.10 any further on this issue--and I
>> certainly don't think we should lose the functionality that the
>> default sysroot of "target:" brings.
> 
> FWIW, I have just tried on a random sampling of our MIPS target boards 
> as well, with the same test program and same scenario of continuing to a 
> breakpoint on main.  The timings were 1:40 (SEAD-3 LX110 board, M14Kc 
> microMips), 1:38 (MALTA 74Kc MIPS16), and 1:20 (MALTA 5Kef MIPS64).  I 
> could try some PowerPC boards as well if we need more datapoints.

Thanks for doing this.  As yet another data point, it'd be interesting to
know whether Gary's rate-limit patch speeds up any other of these boards
as well.

Did you try getting the nfs mount out of the picture?  That is, copy the
debug-build files to the target, and have gdb pull them automatically.

It's expected that reading a file out of remote filesystem
is going to be slower than not reading it.  Obviously some slowdown
is expected.  There's no free lunch.

The fact that Gary's chunk size limiting patch made things much
faster on the nios2 board is still mysterious to me.  I'd expect the
slowness to be latency bound, given the request/response nature of the RSP,
but then I'd expect that more chunking would slow things down, not speed
it up.

I'm trying to build a more complete picture of the potential for
trouble.  The more data, the better...

So tonight I played with this.  To amplify latency issues against a remote target,
I've built gdbserver on gcc76 on the gcc compile farm.  Anyone can request
access to the gcc compile farm machines and to reproduce this.  That's an x86-64
in France almost 2000km away from me.  I've got a ping time of ~85ms to
that machine, and I'm behind a ~16Mbit ADSL.

As an illustration, here's what it takes to download the gdbserver binary using ssh/scp.
Not the fastest file transfer method, but the most convenient:

$ time scp gcc76.fsffrance.org:/home/palves/gdb/build/gdb/gdbserver/gdbserver /tmp/gdbserver
gdbserver             100% 1451KB   1.4MB/s   00:01

real    0m1.296s
user    0m0.015s
sys     0m0.013s

Now let's compare with GDB's file retrieval:

[palves@gcc76] $ ./gdbserver --multi :9999

[palves@home] $ ssh -L 9999:localhost:9999 gcc76.fsffrance.org
[palves@home] $ cat /tmp/remote-get
define getfile
       shell date
       remote get /home/palves/gdb/build/gdb/gdbserver/gdbserver /tmp/gdbserver
       shell date
       end
[palves@home] $ ./gdb -data-directory=data-directory -ex "tar extended-rem :9999" -x /tmp/remote-get

(gdb) getfile
Detaching after fork from child process 31567.
Wed Aug 19 01:04:13 WEST 2015
Detaching after fork from child process 31568.
Wed Aug 19 01:04:23 WEST 2015
(gdb)

~10s GDB/vFile vs ~1.3s scp/ssh.  Not that impressive.

So then I tried bumping the RSP max packet buffer size.  That required
a patch to gdb and another to gdbserver.  I bumped the buffer size 16 times.
The result was:

(gdb) getfile
Wed Aug 19 01:13:01 WEST 2015
Wed Aug 19 01:13:04 WEST 2015
(gdb) getfile
Wed Aug 19 01:13:12 WEST 2015
Wed Aug 19 01:13:14 WEST 2015
(gdb) getfile
Wed Aug 19 01:13:19 WEST 2015
Wed Aug 19 01:13:21 WEST 2015

So 2s/3s.  Not bad.

Bumping the packet buffer size to 4MB, to make sure
a single vFile:pread covers the ~1.5MB file, I consistently
get ~1s:

(gdb) getfile
Wed Aug 19 01:17:52 WEST 2015
Wed Aug 19 01:17:53 WEST 2015
(gdb) getfile
Wed Aug 19 01:17:55 WEST 2015
Wed Aug 19 01:17:56 WEST 2015
(gdb) getfile
Wed Aug 19 01:17:57 WEST 2015
Wed Aug 19 01:17:58 WEST 2015
(gdb)

which is then in the ball park of scp.  Seems clear to me that for
whole-file transfers, the bottleneck (on this use case) is the number
of chunks / round trips.  For file transfer, obviously a stream-based
transfer, like scp's, that avoids roundtrip hiccups, wins.

For the use case of connecting to gdbserver with "target remote",
and having gdb fetch the binary out of gdbserver with vFile, I tried
something else.

This time, I'll simply debug gdbserver itself remotely, letting gdb
fetch the gdbserver binary, and run to main.  Once main is reached,
quit:

[palves@gcc76] $ ./gdbserver :9999 ./gdbserver
[palves@home] $ ssh -L 9999:localhost:9999 gcc76.fsffrance.org
[palves@home] time ./gdb -data-directory=data-directory -ex "tar rem :9999" -ex "b main" -ex "c" -ex "set confirm off" -ex "quit"

Pristine gdb 7.10.50.20150816-cvs gets me:
...
Remote debugging using :9999
Reading symbols from target:/home/palves/gdb/build/gdb/gdbserver/gdbserver...done.
Reading symbols from target:/lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
0x00007ffff7ddd190 in ?? () from target:/lib64/ld-linux-x86-64.so.2
Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635.
Continuing.

Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635
3635    ../../../src/gdb/gdbserver/server.c: No such file or directory.
/home/palves/gdb/build/gdb/gdbserver/gdbserver: No such file or directory.

real    0m55.315s
user    0m0.400s
sys     0m0.087s

Bumping the remote max packet buffer size to (16384*16) helps a bit,
though not impressive:

real    0m51.600s
user    0m0.404s
sys     0m0.085s

Bumping to 0x100000 helps a bit further:

real    0m48.335s
user    0m0.421s
sys     0m0.091s

Looking at "set debug remote 1" output, I noticed that gdb often issues
a sequence of small vFile:pread requests for the same file.  Ctrl-c'ing
gdb while it was doing that and getting a backtrace shows that those requests
often came from bfd.  So I thought that maybe adding a readahead buffer/cache
to gdb's vFile:pread requests could help.  And indeed, that dropped the
time for the same use case further down to:

real    0m29.055s
user    0m2.625s
sys     0m0.090s

I added a few counters to show cache hit/miss, and got:

 readahead cache miss 44
 readahead cache hit 377

Not sure exactly what makes the "user" time go up that much (maybe
reading too much ahead causes too much RSP escaping, etc.?)
If I repeat the same leaving gdbserver's maximum buffer size
set at the original 16384, I still get <30s:

real    0m27.169s
user    0m0.464s
sys     0m0.061s

though notice the user time goes down again.  Maybe this indicates
that bigger chunk sizes indeed increase CPU usage, which on slower boards
would have a more profound effect than on the x86_64 machines I'm using.
An observation in line with Sandra's observation of Gary's patch speeding
up the nios2 board.  Or maybe that's just a coincidence.

BTW, the transfers seem to be always interruptible for me, even without
Gary's patch, and even the slow ones.

And finally, here's the time using a local sysroot and specifying a local
program:

$ time $g -ex "set sysroot /" -ex "tar rem :9999" -ex "b main" -ex "c" -ex "set confirm off" -ex "quit" /tmp/gdbserver
Reading symbols from /tmp/gdbserver...done.
Remote debugging using :9999
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib64/ld-2.18.so.debug...done.
done.
0x00007ffff7ddd190 in ?? ()
Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635.
Continuing.

Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635
3635    ../../../src/gdb/gdbserver/server.c: No such file or directory.

real    0m7.385s
user    0m0.132s
sys     0m0.028s

I'm attaching the 3 patches that I used.  I wouldn't be surprised
if they contain obvious blunders; it's getting quite late here...

Thanks,
Pedro Alves

>From 5fec4911ec7cd9ea2ee4b5d3be0502e7a7df1f9f Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 18 Aug 2015 22:53:56 +0100
Subject: [PATCH 1/3] gdb: remove packet size limit

---
 gdb/remote.c | 36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index ca1f0df..a839adf 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -894,14 +894,6 @@ get_memory_packet_size (struct memory_packet_config *config)
   struct remote_state *rs = get_remote_state ();
   struct remote_arch_state *rsa = get_remote_arch_state ();
 
-  /* NOTE: The somewhat arbitrary 16k comes from the knowledge (folk
-     law?) that some hosts don't cope very well with large alloca()
-     calls.  Eventually the alloca() code will be replaced by calls to
-     xmalloc() and make_cleanups() allowing this restriction to either
-     be lifted or removed.  */
-#ifndef MAX_REMOTE_PACKET_SIZE
-#define MAX_REMOTE_PACKET_SIZE 16384
-#endif
   /* NOTE: 20 ensures we can write at least one byte.  */
 #ifndef MIN_REMOTE_PACKET_SIZE
 #define MIN_REMOTE_PACKET_SIZE 20
@@ -910,7 +902,7 @@ get_memory_packet_size (struct memory_packet_config *config)
   if (config->fixed_p)
     {
       if (config->size <= 0)
-	what_they_get = MAX_REMOTE_PACKET_SIZE;
+	what_they_get = 16384;
       else
 	what_they_get = config->size;
     }
@@ -929,8 +921,6 @@ get_memory_packet_size (struct memory_packet_config *config)
 	  && what_they_get > rsa->actual_register_packet_size)
 	what_they_get = rsa->actual_register_packet_size;
     }
-  if (what_they_get > MAX_REMOTE_PACKET_SIZE)
-    what_they_get = MAX_REMOTE_PACKET_SIZE;
   if (what_they_get < MIN_REMOTE_PACKET_SIZE)
     what_they_get = MIN_REMOTE_PACKET_SIZE;
 
@@ -3911,6 +3901,7 @@ remote_check_symbols (void)
   char *msg, *reply, *tmp;
   struct bound_minimal_symbol sym;
   int end;
+  struct cleanup *old_chain;
 
   /* The remote side has no concept of inferiors that aren't running
      yet, it only knows about running processes.  If we're connected
@@ -3929,7 +3920,8 @@ remote_check_symbols (void)
 
   /* Allocate a message buffer.  We can't reuse the input buffer in RS,
      because we need both at the same time.  */
-  msg = alloca (get_remote_packet_size ());
+  msg = xmalloc (get_remote_packet_size ());
+  old_chain = make_cleanup (xfree, msg);
 
   /* Invite target to request symbol lookups.  */
 
@@ -3967,6 +3959,8 @@ remote_check_symbols (void)
       getpkt (&rs->buf, &rs->buf_size, 0);
       reply = rs->buf;
     }
+
+  do_cleanups (old_chain);
 }
 
 static struct serial *
@@ -4089,13 +4083,6 @@ remote_packet_size (const struct protocol_feature *feature,
       return;
     }
 
-  if (packet_size > MAX_REMOTE_PACKET_SIZE)
-    {
-      warning (_("limiting remote suggested packet size (%d bytes) to %d"),
-	       packet_size, MAX_REMOTE_PACKET_SIZE);
-      packet_size = MAX_REMOTE_PACKET_SIZE;
-    }
-
   /* Record the new maximum packet size.  */
   rs->explicit_packet_size = packet_size;
 }
@@ -7645,7 +7632,8 @@ putpkt_binary (const char *buf, int cnt)
   struct remote_state *rs = get_remote_state ();
   int i;
   unsigned char csum = 0;
-  char *buf2 = alloca (cnt + 6);
+  char *buf2 = xmalloc (cnt + 6);
+  struct cleanup *old_chain = make_cleanup (xfree, buf2);
 
   int ch;
   int tcount = 0;
@@ -7738,6 +7726,7 @@ putpkt_binary (const char *buf, int cnt)
 	    case '+':
 	      if (remote_debug)
 		fprintf_unfiltered (gdb_stdlog, "Ack\n");
+	      do_cleanups (old_chain);
 	      return 1;
 	    case '-':
 	      if (remote_debug)
@@ -7746,7 +7735,10 @@ putpkt_binary (const char *buf, int cnt)
 	    case SERIAL_TIMEOUT:
 	      tcount++;
 	      if (tcount > 3)
-		return 0;
+		{
+		  do_cleanups (old_chain);
+		  return 0;
+		}
 	      break;		/* Retransmit buffer.  */
 	    case '$':
 	      {
@@ -7833,6 +7825,8 @@ putpkt_binary (const char *buf, int cnt)
 	}
 #endif
     }
+
+  do_cleanups (old_chain);
   return 0;
 }
 
-- 
1.9.3

>From 0becb8d88cc5e364ce413258de32d326c6395ad6 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Wed, 19 Aug 2015 00:04:18 +0200
Subject: [PATCH 2/3] gdbserver: bump max packet buffer size

---
 gdb/gdbserver/server.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h
index 9080151..9ec3708 100644
--- a/gdb/gdbserver/server.h
+++ b/gdb/gdbserver/server.h
@@ -127,6 +127,8 @@ extern int handle_target_event (int err, gdb_client_data client_data);
 /* Buffer sizes for transferring memory, registers, etc.   Set to a constant
    value to accomodate multiple register formats.  This value must be at least
    as large as the largest register set supported by gdbserver.  */
-#define PBUFSIZ 16384
-
+//#define PBUFSIZ (16384)
+#define PBUFSIZ (16384*16)
+//#define PBUFSIZ 0x100000
+//#define PBUFSIZ 0x400000
 #endif /* SERVER_H */
-- 
1.9.3

>From d426ce0a36830378a8ec2e2cbdd94d9fcb47b891 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Tue, 18 Aug 2015 23:27:20 +0100
Subject: [PATCH 3/3] add readahead cache to gdb's vFile:pread

---
 gdb/remote.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 99 insertions(+), 3 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index a839adf..8a739c8 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -10311,6 +10311,26 @@ remote_hostio_send_command (int command_bytes, int which_packet,
   return ret;
 }
 
+struct readahead_cache
+{
+  int fd;
+  gdb_byte *buf;
+  ULONGEST offset;
+  size_t bufsize;
+};
+
+static struct readahead_cache *readahead_cache;
+
+static void
+readahead_cache_invalidate (void)
+{
+  if (readahead_cache != NULL)
+    readahead_cache->fd = -1;
+}
+
+static unsigned int readahead_cache_hit_count;
+static unsigned int readahead_cache_miss_count;
+
 /* Set the filesystem remote_hostio functions that take FILENAME
    arguments will use.  Return 0 on success, or -1 if an error
    occurs (and set *REMOTE_ERRNO).  */
@@ -10325,6 +10345,8 @@ remote_hostio_set_filesystem (struct inferior *inf, int *remote_errno)
   char arg[9];
   int ret;
 
+  readahead_cache_invalidate ();
+
   if (packet_support (PACKET_vFile_setfs) == PACKET_DISABLE)
     return 0;
 
@@ -10389,6 +10411,9 @@ remote_hostio_pwrite (struct target_ops *self,
   int left = get_remote_packet_size ();
   int out_len;
 
+  if (readahead_cache != NULL && readahead_cache->fd == fd)
+    readahead_cache_invalidate ();
+
   remote_buffer_add_string (&p, &left, "vFile:pwrite:");
 
   remote_buffer_add_int (&p, &left, fd);
@@ -10407,9 +10432,9 @@ remote_hostio_pwrite (struct target_ops *self,
 /* Implementation of to_fileio_pread.  */
 
 static int
-remote_hostio_pread (struct target_ops *self,
-		     int fd, gdb_byte *read_buf, int len,
-		     ULONGEST offset, int *remote_errno)
+remote_hostio_pread_1 (struct target_ops *self,
+		       int fd, gdb_byte *read_buf, int len,
+		       ULONGEST offset, int *remote_errno)
 {
   struct remote_state *rs = get_remote_state ();
   char *p = rs->buf;
@@ -10443,6 +10468,74 @@ remote_hostio_pread (struct target_ops *self,
   return ret;
 }
 
+static int
+remote_hostio_pread_from_cache (struct readahead_cache *cache,
+				int fd, gdb_byte *read_buf, int len,
+				ULONGEST offset)
+{
+  if (cache->fd == fd
+      && cache->offset <= offset
+      && offset < cache->offset + cache->bufsize)
+    {
+      ULONGEST max = cache->offset + cache->bufsize;
+
+      if (offset + len > max)
+	len = max - offset;
+
+      if (remote_debug)
+	fprintf_unfiltered (gdb_stdlog,
+			    "readahead cache hit %d\n",
+			    ++readahead_cache_hit_count);
+      memcpy (read_buf, cache->buf + offset - cache->offset, len);
+      return len;
+    }
+
+  return -1;
+}
+
+static int
+remote_hostio_pread (struct target_ops *self,
+		     int fd, gdb_byte *read_buf, int len,
+		     ULONGEST offset, int *remote_errno)
+{
+  int ret;
+
+  if (readahead_cache != NULL)
+    {
+      ret = remote_hostio_pread_from_cache (readahead_cache, fd,
+					    read_buf, len, offset);
+      if (ret >= 0)
+	return ret;
+
+      readahead_cache_invalidate ();
+    }
+
+  if (remote_debug)
+    fprintf_unfiltered (gdb_stdlog, "readahead cache miss %d\n",
+			++readahead_cache_miss_count);
+
+  if (readahead_cache == NULL)
+    readahead_cache = XCNEW (struct readahead_cache);
+  readahead_cache->fd = fd;
+  readahead_cache->offset = offset;
+  readahead_cache->bufsize = get_remote_packet_size ();
+  readahead_cache->buf = xrealloc (readahead_cache->buf, readahead_cache->bufsize);
+
+  ret = remote_hostio_pread_1 (self,
+			       readahead_cache->fd,
+			       readahead_cache->buf, readahead_cache->bufsize,
+			       readahead_cache->offset, remote_errno);
+  if (ret <= 0)
+    {
+      readahead_cache_invalidate ();
+      return ret;
+    }
+
+  readahead_cache->bufsize = ret;
+  return remote_hostio_pread_from_cache (readahead_cache, fd,
+					 read_buf, len, offset);
+}
+
 /* Implementation of to_fileio_close.  */
 
 static int
@@ -10452,6 +10545,9 @@ remote_hostio_close (struct target_ops *self, int fd, int *remote_errno)
   char *p = rs->buf;
   int left = get_remote_packet_size () - 1;
 
+  if (readahead_cache != NULL && readahead_cache->fd == fd)
+    readahead_cache_invalidate ();
+
   remote_buffer_add_string (&p, &left, "vFile:close:");
 
   remote_buffer_add_int (&p, &left, fd);
-- 
1.9.3


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]