This is the mail archive of the
ecos-discuss@sourceware.org
mailing list for the eCos project.
Re: SYN problem with new TCP/IP stack
On 2006-02-04, Grant Edwards <grante@visi.com> wrote:
>> After switching from the "old" TCP/IP stack to the "new" one,
>> we've run into a problem which is causing customers to
>> complain.
>>
>> Here's the scenario:
>>
>> 1) Host opens TCP connection to eCos application.
>>
>> 2) Somebody pushes the reset button on the host.
>>
>> 3) The host reboots and attempts to open a TCP connection to
>> the eCos application _using_the_same_source_port_ (and the
>> same destination port).
>>
>> 4a) When the _old_ stack received a SYN for an already-open
>> connection, it sent an ACK with the sequence number from
>> the old connection. The host saw this and sent a RST
>> (which drops the connection), then attempted to re-open the
>> connection (which succeeded). All was good.
>>
>> 4b) When the _new_ stack receives a SYN for an already-open
>> connection, it just ignores it. So, after a timeout of a
>> minute or so, the host sends it again. Again it's ignored.
>> This goes on for 10-15 minutes at which point the eCos
>> stack times out the connection and closes it. Only then
>> can the host open a new connection.
>
> According to my reading of RFC793, page 34 describes this
> scenario exactly and requires the behavior of the old network
> stack.
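For reference, the rule in question can be sketched roughly as below. This is only an illustration of the RFC 793 segment acceptability test and of the "send an ACK unless RST is set" response to an unacceptable segment; it is not code from either eCos stack. The names segment_acceptable() and needs_resync_ack() are made up for this sketch, and it assumes receive windows well below 2^31 so that the wrap-around comparisons hold.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sequence numbers wrap modulo 2^32, so compare them via the sign of
 * the difference (the cast relies on the usual two's-complement
 * conversion, as the BSD SEQ_LT-style macros do).
 */
static bool seq_lt(uint32_t a, uint32_t b)  { return (int32_t)(a - b) < 0; }
static bool seq_leq(uint32_t a, uint32_t b) { return (int32_t)(a - b) <= 0; }

/* True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd). */
static bool in_window(uint32_t seq, uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    return seq_leq(rcv_nxt, seq) && seq_lt(seq, rcv_nxt + rcv_wnd);
}

/*
 * RFC 793 acceptability test.  seg_len counts SYN and FIN as one
 * sequence number each, so a bare SYN has seg_len == 1.
 */
static bool segment_acceptable(uint32_t seg_seq, uint32_t seg_len,
                               uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    if (seg_len == 0)
        return rcv_wnd == 0 ? seg_seq == rcv_nxt
                            : in_window(seg_seq, rcv_nxt, rcv_wnd);
    if (rcv_wnd == 0)
        return false;   /* data can never fit in a zero window */
    return in_window(seg_seq, rcv_nxt, rcv_wnd) ||
           in_window(seg_seq + seg_len - 1, rcv_nxt, rcv_wnd);
}

/*
 * Hypothetical helper: an unacceptable segment on a synchronized
 * connection should be answered with an ACK, unless it carries RST.
 */
static bool needs_resync_ack(uint32_t seg_seq, uint32_t seg_len, bool rst,
                             uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    return !rst && !segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd);
}
```

With made-up numbers, a rebooted host's SYN at sequence 1000 arriving on a connection with rcv_nxt 5000 and a 4096-byte window is unacceptable, so needs_resync_ack() returns true. That ACK is what lets the host notice the stale connection and send its RST.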
I've got a fix for the new TCP/IP stack that seems to work. The
bug is in tcp_input.c (no surprise).
At one point it is determined (correctly) that the sequence
number in the SYN is outside the receive window, so the ACKNOW
flag is set at line 1635:
1619          /*
1620           * Following if statement from Stevens, vol. 2, p. 960.
1621           */
1622          if (todrop > tlen
1623              || (todrop == tlen && (thflags & TH_FIN) == 0)) {
1624                  /*
1625                   * Any valid FIN must be to the left of the window.
1626                   * At this point the FIN must be a duplicate or out
1627                   * of sequence; drop it.
1628                   */
1629                  thflags &= ~TH_FIN;
1630
1631                  /*
1632                   * Send an ACK to resynchronize and drop any data.
1633                   * But keep on processing for RST or ACK.
1634                   */
1635                  tp->t_flags |= TF_ACKNOW;
1636                  todrop = tlen;
1637                  tcpstat.tcps_rcvduppack++;
1638                  tcpstat.tcps_rcvdupbyte += todrop;
1639          } else {
1640                  tcpstat.tcps_rcvpartduppack++;
1641                  tcpstat.tcps_rcvpartdupbyte += todrop;
1642          }
Now we rattle on down through the tcp_input function a ways and
end up jumping to "drop" at line 1739:
1729          /*
1730           * If the ACK bit is off: if in SYN-RECEIVED state or SENDSYN
1731           * flag is on (half-synchronized state), then queue data for
1732           * later processing; else drop segment and return.
1733           */
1734          if ((thflags & TH_ACK) == 0) {
1735                  if (tp->t_state == TCPS_SYN_RECEIVED ||
1736                      (tp->t_flags & TF_NEEDSYN))
1737                          goto step6;
1738                  else
1739                          goto drop;
1740          }
The code at "drop:" cleans up a little and returns without ever
calling tcp_output(tp) to actually send the ACK that's required
by the RFC and was requested by the setting of the TF_ACKNOW bit.
2398  drop:
2399          /*
2400           * Drop space held by incoming segment and return.
2401           */
2402  #ifdef TCPDEBUG
2403          if (tp == 0 || (tp->t_inpcb->inp_socket->so_options & SO_DEBUG))
2404                  tcp_trace(TA_DROP, ostate, tp, (void *)tcp_saveipgen,
2405                            &tcp_savetcp, 0);
2406  #endif
2407          m_freem(m);
2408          /* destroy temporarily created socket */
2409          if (dropsocket)
2410                  (void) soabort(so);
2411          return;
2412  }
The result is that the ACK required by the TCP RFC is never sent
(the SYN packet is just ignored), and the host just sits there
sending SYNs. Then the host's owner gets annoyed and calls
customer support, yadda yadda, and here we are.
Adding the following code immediately after the drop: label at
line 2398 fixes the problem.
        /* tp may be NULL here (the TCPDEBUG code at drop: tests for it) */
        if (tp && (tp->t_flags & TF_ACKNOW))
                (void)tcp_output(tp);
I haven't noticed any side-effects, and I think the _worst_
that could happen is that an extra ACK would be sent out. I've
been watching for that and haven't seen it happen. Actually, I
don't see how it could happen. Sending an ACK clears the
ACKNOW flag, so tcp_output() wouldn't be called at this point
unless somebody set the ACKNOW flag and an ACK never got sent.
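To make that argument concrete, here is a small stand-alone model of the control flow. It is only a sketch and not the real stack: struct model_tcpcb, model_tcp_output(), model_tcp_input() and the MODEL_TF_ACKNOW value are all invented for illustration, and the model assumes that sending an ACK clears the flag, as described above. The NULL check on tp is an extra precaution in the model.

```c
#include <stddef.h>

#define MODEL_TF_ACKNOW 0x0001u   /* illustrative value, stands in for TF_ACKNOW */

struct model_tcpcb {
    unsigned t_flags;
    int      acks_sent;           /* counts ACKs "transmitted" by the model */
};

/* Stand-in for tcp_output(): send a pending ACK and clear the flag. */
static void model_tcp_output(struct model_tcpcb *tp)
{
    if (tp->t_flags & MODEL_TF_ACKNOW) {
        tp->acks_sent++;
        tp->t_flags &= ~MODEL_TF_ACKNOW;
    }
}

/*
 * Stand-in for the relevant path through tcp_input():
 *  - an out-of-window segment requests an immediate ACK;
 *  - a segment without the ACK bit jumps straight to drop;
 *  - with apply_fix set, drop sends any ACK that is still pending.
 */
static void model_tcp_input(struct model_tcpcb *tp, int out_of_window,
                            int has_ack_bit, int apply_fix)
{
    if (tp != NULL && out_of_window)
        tp->t_flags |= MODEL_TF_ACKNOW;

    if (!has_ack_bit)
        goto drop;

    if (tp != NULL)
        model_tcp_output(tp);     /* normal path ends by calling output */
    return;

drop:
    if (apply_fix && tp != NULL && (tp->t_flags & MODEL_TF_ACKNOW))
        model_tcp_output(tp);
    return;
}
```

Feeding the model an out-of-window segment without the ACK bit (the stale SYN case) sends nothing when apply_fix is 0 and leaves the flag set. With apply_fix set to 1 it sends exactly one ACK and clears the flag, so a later in-window segment that reaches drop does not produce a second one.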
I haven't traced the flow through the old stack's tcp_input()
to find where the difference is. I need to get a fix out to
the customer first...
--
Grant Edwards
grante at visi.com
Yow! .. If I cover this entire WALL with MAZOLA, do I have to
give my AGENT ten per cent??