This is the mail archive of the libc-hacker@sourceware.cygnus.com mailing list for the glibc project.



[Bill Paul <wpaul@CTR.COLUMBIA.EDU>] Re: easy DoS in most RPC apps



I'm appending an explanation to the patch Zack sent around yesterday.

Andreas
-- 
 Andreas Jaeger   aj@arthur.rhein-neckar.de    jaeger@informatik.uni-kl.de
  for pgp-key finger ajaeger@alma.student.uni-kl.de


Of all the gin joints in all the towns in all the world, Scott Stone had
to walk into mine and say:

> On Sun, 17 May 1998, David LeBlanc wrote:
>
> > At 02:35 AM 5/15/98 +0200, Peter van Dijk wrote:
> > >Finally, I'm quite sure of this: the bug is in Sun's RPC code.
> > >Investigations show Linux, FreeBSD, SunOS, System V and NeXTstep machines
> > >are affected, which means we've got a _big_ problem here.
> >
> > If that's the case, then any ports of these utilities running on Windows NT
> > would also exhibit the same problem - we're all running off of pretty much
> > the same Sun ONC RPC code.
> >
>
> The FreeBSD people have already made a patch for this, check their home
> site.  I'm going to attempt to port the patch to Linux, as the base code
> should be about the same.. the fix is to a couple of rpc-related files in
> the C libraries.

(I thought I sent a message about this to the list on Friday but it seems
not to have made it. Either I sent it off into /dev/null without realizing
it or gremlins ate it. In either case, I'll try again.)

The modifications I made to the FreeBSD RPC library prevent an attacker
from completely wedging a stream-based RPC service for an indefinite
period. However, more really ought to be done to avoid the problem
completely.

The real problem is in the XDR record marking code which is used for
the TCP transport. (In RPC 4.0, TCP is the only transport affected. In
TI-RPC, any 'virtual circuit' transport including but not limited to TCP
is affected.) The set_input_fragment() routine in src/lib/libc/xdr/xdr_rec.c
attempts to read a record header which is supposed to specify the size
of the record that follows. Unfortunately, this routine performs no
sanity checking: if you telnet to a TCP service and send a few carriage
returns, set_input_fragment() misinterprets them as a ridiculously
large record size. This in turn causes the fill_input_buffer() routine
to try reading a ridiculously large amount of data from the network.
This is why the service stays wedged until you disconnect.
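
To make the misinterpretation concrete, here is a minimal sketch of how a
4-byte record mark is decoded. It assumes the telnet client sends each
carriage return as CR LF, so the first four bytes on the wire are
0x0D 0x0A 0x0D 0x0A. The function name decode_record_mark() is
hypothetical; only the LAST_FRAG bit and the big-endian layout come from
the record marking scheme described above.

```c
#include <stdint.h>

/* Top bit of the record mark flags the last fragment of a record. */
#define LAST_FRAG UINT32_C(0x80000000)

/*
 * Hypothetical helper: decode a 4-byte record mark received in network
 * (big-endian) byte order. Returns the fragment length and stores the
 * last-fragment flag in *last_frag.
 */
static uint32_t decode_record_mark(const unsigned char b[4], int *last_frag)
{
    uint32_t header = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                      ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];

    *last_frag = (header & LAST_FRAG) != 0;
    return header & ~LAST_FRAG;   /* remaining 31 bits are the length */
}
```

Fed the bytes CR LF CR LF, this yields a length of 0x0D0A0D0A, which is
218762506 bytes (roughly 218 MB), with the last-fragment flag clear.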

The patch I made to fix this is as follows:

*** xdr_rec.c.orig      Fri May 15 17:43:57 1998
--- xdr_rec.c   Fri May 15 17:47:58 1998
***************
*** 550,555 ****
--- 550,561 ----
                return (FALSE);
        header = (long)ntohl(header);
        rstrm->last_frag = ((header & LAST_FRAG) == 0) ? FALSE : TRUE;
+       /*
+        * Sanity check. Try not to accept wildly incorrect
+        * record sizes.
+        */
+       if ((header & (~LAST_FRAG)) > rstrm->recvsize)
+               return(FALSE);
        rstrm->fbtbc = header & (~LAST_FRAG);
        return (TRUE);
  }
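
For readers who want to try the check outside the library, the following
is a standalone sketch of the same test. The function name
check_fragment() and the use of uint32_t are assumptions made for
illustration; the real code works on the rstrm structure and its
recvsize and fbtbc fields as shown in the patch.

```c
#include <stdint.h>

#define LAST_FRAG UINT32_C(0x80000000)

/*
 * Hypothetical standalone version of the sanity check: accept the
 * fragment only if its declared length fits in the receive buffer.
 * Returns 1 and stores the length in *fbtbc on success, 0 otherwise
 * (leaving *fbtbc untouched).
 */
static int check_fragment(uint32_t header, uint32_t recvsize, uint32_t *fbtbc)
{
    uint32_t len = header & ~LAST_FRAG;

    if (len > recvsize)
        return 0;          /* wildly incorrect size: refuse it */
    *fbtbc = len;
    return 1;
}
```

With an illustrative receive size of 4000 bytes, the header 0x0D0A0D0A
(four bytes of CR LF CR LF) is refused, while 0x80000064 (last fragment,
100 bytes) is accepted.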


The next change relates to the svc_tcp.c module directly. The
svctcp_recv() routine calls xdr_callmsg() to attempt to decode the RPC
message header that should accompany every RPC request. With the UDP
transport, a datagram that doesn't contain a valid header is dropped on
the floor. With TCP, the connection is left open to attempt to receive
another request that may be pending. In my view, if no valid message
header is found where there should have been one, the connection should be
dropped. The following patch to src/lib/libc/rpc/svc_tcp.c does this:

*** svc_tcp.c.orig      Fri May 15 17:11:21 1998
--- svc_tcp.c   Fri May 15 17:09:02 1998
***************
*** 404,409 ****
--- 404,410 ----
                cd->x_id = msg->rm_xid;
                return (TRUE);
        }
+       cd->strm_stat = XPRT_DIED;      /* XXXX */
        return (FALSE);
  }


This marks the transport handle as dead if xdr_callmsg() fails, which
in turn will cause the dispatcher to drop the connection.
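
The interaction between the receive routine and the dispatcher can be
modelled with a toy example. Everything below is a hypothetical
stand-in (toy_xprt, toy_recv, toy_getreq and the TOY_* constants are not
the real SVCXPRT API); it only assumes that the dispatcher checks the
transport status after each receive attempt and destroys the transport
when it sees the "died" state.

```c
/* Toy status values mirroring XPRT_DIED, XPRT_MOREREQS and XPRT_IDLE. */
enum toy_stat { TOY_DIED, TOY_MOREREQS, TOY_IDLE };

struct toy_xprt {
    enum toy_stat strm_stat;  /* models cd->strm_stat */
    int header_ok;            /* 1 if the next call header decodes cleanly */
    int open;                 /* 1 while the connection is still open */
    int handled;              /* number of requests dispatched */
};

/* Models the patched receive routine: a bad header marks the transport dead. */
static int toy_recv(struct toy_xprt *x)
{
    if (x->header_ok)
        return 1;
    x->strm_stat = TOY_DIED;
    return 0;
}

/* Models the dispatcher loop: drop the connection once the status is "died". */
static void toy_getreq(struct toy_xprt *x)
{
    enum toy_stat stat;

    do {
        if (toy_recv(x))
            x->handled++;     /* the service routine would run here */
        stat = x->strm_stat;
        if (stat == TOY_DIED) {
            x->open = 0;      /* stands in for destroying the transport */
            break;
        }
    } while (stat == TOY_MOREREQS);
}
```

In this model a transport with header_ok set to 0 ends up closed after a
single toy_getreq() call, whereas without the assignment in toy_recv() it
would stay open and idle.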

With these patches, you have 35 seconds to supply a valid record
containing an RPC message header and request, otherwise the session
is disconnected. If you enter garbage data, the connection is dropped
immediately.

As far as I know, this bug is likely present in all Sun-derived ONC RPC
implementations, including TI-RPC from ONC+, which is what you'll find in
Solaris 2.x and I think AIX 4.2 and up. TI-RPC uses the same XDR record
marking code, although it has an svc_vc.c module to handle virtual circuit
transports as opposed to a transport-specific svc_tcp.c module. Mind you,
this observation is based on the TI-RPC 2.3 source, which is quite old.

I do not consider the bug completely fixed, though. These patches
only work around the immediate problem. A proper fix would allow the
service to continue to handle new requests even while waiting for the 35
second timeout to expire, and would apply a more intelligent sanity check
in set_input_fragment(). I think one solution would be to modify readtcp()
so that it monitors the other transport handles in addition to the
current socket that it's reading from, but I still have to do some tests
to see if this idea is really practical.
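
One way the readtcp() idea might look is sketched below with poll().
This is only an illustration of the approach under stated assumptions,
not a tested change to the RPC library: wait_for_current(), drain_one()
and MAX_FDS are hypothetical names, and the timeout restarts on every
poll() call instead of tracking a single overall deadline.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 64   /* arbitrary limit for this sketch */

/* Example callback: consume pending bytes; nonzero means "drop this fd". */
static int drain_one(int fd)
{
    char buf[64];
    ssize_t n = read(fd, buf, sizeof buf);

    return n <= 0;
}

/*
 * Wait until `current` is readable, servicing any other descriptor that
 * becomes ready in the meantime. Returns 1 when `current` is ready,
 * 0 on timeout, -1 on error. Dropped descriptors are set to -1 in
 * others[], which poll() then ignores.
 */
static int wait_for_current(int current, int *others, int nothers,
                            int timeout_ms, int (*service)(int fd))
{
    struct pollfd pfds[MAX_FDS];
    int i, n;

    if (nothers < 0 || nothers >= MAX_FDS)
        return -1;

    for (;;) {
        pfds[0].fd = current;
        pfds[0].events = POLLIN;
        pfds[0].revents = 0;
        for (i = 0; i < nothers; i++) {
            pfds[i + 1].fd = others[i];
            pfds[i + 1].events = POLLIN;
            pfds[i + 1].revents = 0;
        }

        n = poll(pfds, (nfds_t)(nothers + 1), timeout_ms);
        if (n <= 0)
            return n;                      /* 0 = timeout, -1 = error */

        for (i = 0; i < nothers; i++) {
            if (pfds[i + 1].revents != 0 && service(others[i]) != 0)
                others[i] = -1;            /* stop watching this one */
        }
        if (pfds[0].revents != 0)
            return 1;                      /* caller may now read `current` */
    }
}
```

A caller would pass the socket it is blocked on as `current` and the
remaining transport descriptors in `others`, with a callback that
performs the real request handling in place of drain_one().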

-Bill

--
=============================================================================
-Bill Paul            (212) 854-6020 | System Manager, Master of Unix-Fu
Work:         wpaul@ctr.columbia.edu | Center for Telecommunications Research
Home:  wpaul@skynet.ctr.columbia.edu | Columbia University, New York City
=============================================================================
  "Now, that's "Open" as used in the sentence "Open your wallet", right?"
=============================================================================



